The absolutely ridiculous math of April 2022 AMA answers #4 and #6
Warning: This post contains a lot of math, and lots of estimations.
In the April 2022 AMA, questions #4 and #6 asked about appeals. Those questions and answers are below:
4) Why aren't appeals being used as a way to identify, evaluate, and educate poor reviewers?
Great minds think alike! We are waiting to get more appeals processed to start identifying major trends and seeing how we can use this data to support the community as a whole. As of today we have 20K appeals processed, and we think we need about 150K appeals processed to kick off the analysis.
6) What percentage of Appeals have been approved and are you tabulating the rejection reasons?
Love the data questions! We’ve processed about 20K of the appeals submitted since Dec 2021 and we are seeing about a 50/50 split across appeals being accepted and rejected. Also, we are definitely tracking the rejection reasons and we still want to process much more for us to start thinking about how to use the trends to inform what educational materials we hope to create (see #4 response above).
The glaring problem with these answers revolves around the amount of time it took Niantic to review the initial 20k nominations, as well as how long it will take them to process the appeal backlog and the 150k appeals to meet their goal.
Niantic announced they were beginning to review appeals on February 8th, 2022 per this post, and the next day (February 9th, 2022) many users, myself included, began to have their appeals processed. Going by the date the 20k figure was first revealed on Twitter (March 30th, 2022), we also know it took nearly two months for Niantic to review those 20k appeals, a rough average of about 10k per month. At that rate, to reach the goal of 150K, Niantic intends to spend approximately THIRTEEN MORE MONTHS reviewing the remaining 130k appeals before they actually use that data???
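That arithmetic can be sanity-checked in a few lines (every figure here is an estimate from this post, not an official Niantic number):

```python
# Rough review-rate math; all inputs are estimates from the post.
appeals_processed = 20_000   # processed between roughly Feb 8 and Mar 30, 2022
months_elapsed = 2           # about two months of reviewing
rate_per_month = appeals_processed / months_elapsed  # ~10k per month

goal = 150_000
remaining = goal - appeals_processed                 # 130k appeals left
months_to_goal = remaining / rate_per_month

print(rate_per_month)   # 10000.0 appeals per month
print(months_to_goal)   # 13.0 more months to reach the 150k goal
```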
This isn't even considering the fact that the only reason we have this many appeals submitted is because Niantic intentionally allowed (or at least ignored) a bug that persisted for several months and let users appeal one nomination per day instead of one per month. We also know, thanks to this tweet, that most of those 20k appeals are from mid-December, telling us Niantic still has several months' worth of one-per-day appeals to get through. Niantic activated appeals on December 8th, 2021, so let's estimate that the "mid-December" appeals comprise the 12 days from December 8th through December 20th. Since people could submit one appeal per day instead of one per month, the volume of appeals Niantic received in those estimated 12 days is approximately the volume they would have expected over an entire year had appeals been one-per-month as intended. In other words, the 20k appeals reviewed represent roughly one year's worth of anticipated appeals. Because a year's worth of anticipated appeals is approximately 20k, had the one-per-day bug not occurred, it could have taken SEVEN AND A HALF YEARS for Niantic to even receive 150k appeals at a constant rate. Was the plan to wait seven and a half years to actually make use of appeals data? Does Niantic not understand that by the time they got even halfway there, the older data would be stale as the climate of Wayfarer changes? It just makes no sense.
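The seven-and-a-half-year figure follows directly from treating 20k as one year's worth of one-per-month appeals (an assumption of this post, not anything Niantic has stated):

```python
# Assumption: ~12 days of one-per-day appeals equals roughly one year
# of intended one-per-month appeals, so 20k ~ one year's expected volume.
yearly_appeals = 20_000
goal = 150_000
years_to_receive = goal / yearly_appeals
print(years_to_receive)   # 7.5 years just to *receive* 150k appeals
```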
Niantic disabled appeals on February 15th, 2022, just a week after they began to process them. This means that between when appeals were activated (Dec 8th) and paused (Feb 15th), users could have submitted a maximum of 69 appeals, with a 70th available once appeals were reactivated on March 30th, 2022. Continuing our earlier assumption, those 12 days of appeals represent approximately 17% of the duration appeals were available. If appeals were submitted at the same rate for that entire duration (admittedly unlikely), Niantic could need approximately 10 more months to finish reviewing the current backlog. And because it will take approximately 10 more months to finish the current backlog, we must also account for the additional appeals submitted during those 10 months, and so on, which compounds to roughly another 2 months, for a total of approximately 12 months to clear the appeal backlog.
The point of the above, though, is that while it may take approximately 12 more months for Niantic to review the backlog at the current rate, we established at the beginning of this post that it will take about 13 months of reviewing at the current rate for Niantic to meet their goal of 150k appeals processed. They may feasibly meet the goal a year from now if my backlog estimations were low (although that is still far too long to wait before taking action), but if my estimations were accurate or high, Niantic will run out of appeals to review before hitting their goal. At that point, the rate of appeals processed will drop drastically, limited by how often users can submit appeals. If we again use our estimation that 12 days of appeals produced 20k appeals, approximately 1700 appeals were submitted per day in mid-December, and if the rate stays the same under a one-per-month allocation, Niantic should expect approximately 1700 appeals per month going forward. If my estimations were in fact correct or high, a shortfall as small as 10k appeals toward the 150k goal could take an additional 6-12 months to close at a rate of 1700 appeals submitted per month.
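The 1700-per-day estimate works out like this (again built on this post's 12-day assumption for "mid-December"):

```python
appeals_received = 20_000            # estimated appeals from mid-December
days = 12                            # assumed span of "mid-December"
per_day = appeals_received / days    # ~1667, rounded to ~1700 in the post
print(round(per_day, -2))            # 1700.0 appeals/day under one-per-day
# Under the intended one-per-month cap, the same user behavior would
# produce roughly that many appeals per *month* instead of per day.
```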
Additionally, going by the exact dates that appeals began to be processed (Feb 9th) and the 20k figure was revealed on Twitter (March 30th), we get a difference of 50 days. This means approximately 400 appeals are being answered per day on average. In practice, though, there seem to be days when no appeals are processed at all, making the number processed on active days a bit higher. Let's say the appeal reviewers work 5 days a week and accordingly process about 560 appeals per day on average. Personally, I can do a few hundred reviews myself in a matter of hours (and frequently stream myself doing so on the Wayfarer Discussion Discord), and frankly I would expect a paid and trained Niantic employee to be similarly efficient and produce far more over a full workday. A pair of full-time employees should easily be capable of pushing those numbers, if not more. If two people have 8 hours to do 560 appeals, each is doing about 280 per day, with about 1.7 minutes per appeal. Considering that appeals seem to be outsourced to cheap labor (appeals only tend to get processed during the hours of the Indian workday), there really is no excuse for how slowly the process is moving and how few resources are seemingly being put into it.
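For the throughput numbers, a quick sketch (the 5-day work week and the two-employee split are my assumptions, not known facts):

```python
processed = 20_000
calendar_days = 50                          # Feb 9 to Mar 30, 2022
per_calendar_day = processed / calendar_days    # 400 per day on average

workdays = calendar_days * 5 / 7            # assume a 5-day work week
per_workday = processed / workdays          # 560 per working day

employees = 2                               # hypothetical pair of full-timers
per_employee = per_workday / employees      # 280 appeals each
minutes_per_appeal = 8 * 60 / per_employee  # ~1.7 minutes per appeal
```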
Overall, the AMA answers, and specifically the goal of reaching 150k appeals processed before utilizing appeal data, are unacceptable. Not only is the 150k goal absurdly high, it's fairly clear Niantic did not have a feasible plan to use appeal data in a meaningful way when they initiated the appeals process, nor have they formed one since. This is especially true given that one-per-day appeals were never intended. The plan for using appeal data absolutely needs to change. Unfortunately, answer #6 showed that when they do reach their data goal, their only plan is to create new educational tools. That could mean a lot of things, but I have a feeling we will just end up with something as useless and horrible as this infamous infographic. This is tone deaf, mostly because the majority of items seeing success in appeals are things reviewers should already have known from reading the criteria and passing the test. In many cases, the tools to know these answers already exist; reviewers are just not reading or following them. Creating new materials will not magically solve the problem of people not reading or following the rules in the first place. The issue is a failure to keep reviewers educated on existing/updated criteria, a failure to ensure that reviewers (and abusers) follow that criteria, a failure to teach what each rejection reason is appropriate for, and a need for clarification or improvements to problematic rejection reasons. Even though those are well-known and longstanding issues, it seems we may have to wait a year or more for Niantic to even look at the data before trying to fix them. I understand that data can lead to better solutions, but there is plenty of existing data and history to go off of now.
Before concluding, I want to point out some potential mathematical caveats that I did consider, so we can try to avoid arguments about them in the comments. While these are all worth considering and likely affect the math, there was no good or easy way to estimate them, and most of them likely balance each other out anyway:
- The 20k appeals processed number is likely an estimate itself. It could be significantly more or less than 20k, which would mess with almost all of the math. (Personally I would guess it was low and that Niantic rounded up to 20k, which is why I chose to stick to a rough 10k per month processed estimate for that example, as I felt it would be more accurate over time).
- The 20k appeals processed number was stated to be "mostly mid December 2021", implying it includes appeals from outside of mid-December. Personally I would interpret "mid-December" to be the middle third (10 days) of December. By utilizing an estimate of 12 days worth of appeals in my math, I tried to account for that discrepancy.
- The rate at which appeals were submitted when the feature launched likely started high and got lower as time went on because the feature was new and people had a backlog of rejections to appeal.
- Conversely, some people may have appealed more later on than at the launch of the feature as with time the feature itself became more widely known and it became common knowledge that you could submit appeals once per day.
- The frequency at which a user used their once-per-day appeal is likely lower than how often they will now use their once-per-month appeal, as it was easy to forget to appeal something every day and easier to run out of rejections to appeal.
- Any estimation involving Niantic reaching the end of the backlog or their 150k goal could easily be thrown off by them allowing more frequent appeal accruals, which is something they said they would consider as time goes on.
- Employees reviewing appeals were likely new when this process started and may have become more efficient over time; alternatively, as they learn more nuances of reviewing, it could be taking them longer as they become more thorough.
- On average, reviewing an appeal likely takes longer and requires more attention to detail than a regular user review, even when considering the likely possibility that some people appealed their blatantly ineligible coal.
Thanks for reading, sorry in advance if my math sucked.
Thanks, this was a good read. 560 reviews within 8 hours does sound pretty slow. I think the 200-reviews-in-2-hours pace (or 100 reviews per hour) you mentioned becomes pretty much the norm after doing it for a little while, especially with the endless stream of playgrounds to review.
The best I managed within a day was 1200 reviews, but I dedicated the entire day to it, working in maybe 2-hour intervals with coffee/food/jogging breaks between sessions. Essentially, I started reviewing first thing in the morning with a coffee cup in hand and ended the session by hitting the bed. I was streaming the process via Discord, and talking with others kept the flow going too. I don't think it would've been manageable without someone to talk to while reviewing for that long.
Also, there was the issue of trying to avoid hitting the cooldown timer. I think I hit the 4-hour cooldown at least once during that session, but it seems manageable by waiting at least 1 minute between reviews. This does slow down rating quite a bit, since sometimes there are 100% clear cases that would take at most 10 seconds to see are legit.
@AisforAndis-ING Just want to clarify: where did "150K appeals in backlog" come from? There is a "need to process about 150K to start analysis" line in the AMA, but the backlog might contain less or more than that.
Then, why "SEVEN AND A HALF YEARS for Niantic to even receive 150k appeals to meet that goal if they stayed at a constant rate"? Not everyone appealed nominations every day, so the rate will not decrease thirtyfold.
Also, I believe that with 1 appeal per month, players will pick rejected nominations more carefully - the kind local reviewers always deny but that should be 100% accepted. This can influence the analysis of when/how to educate reviewers.
I am definitely a proponent of the idea that reviewing can be done quickly, but I don't think that's what's best for appeals. If we are expecting sub-1 or sub-2 minute decisions for nominations which, by definition, require a closer look, we're going to end up with too many "bad" rejections and unhappy Wayfarers. Additionally, the folks handling these reviews are starting from square one and don't have the years of experience, or likely the passion for the effort, that many of us have. I think sometimes about what it would mean to do reviewing as my job, and even if it were something I literally started tomorrow, I suspect my motivation would slip very far within a short period. Those of us who review a lot and have done so over long periods regularly take breaks from the system, either slowing to just a few reviews for a while or stopping altogether. Compare that to being required to review every single day for 6-8 hours. It just isn't comparable. All that to say, I think spending 2 to 5 minutes per review, in order to give it a fair look while maintaining a steady and sustainable pace, is very reasonable. I also think it will take new folks some time to work down to the 2-minute mark.
So what does this equate to? Between 12 and 30 reviews an hour per person for 7-8 hours a day, or between 84 and 240 reviews per person per day. Now consider that each appeal may require 3 reviews, so even with 3 people hired, that range is still the total number of appeals resolved per day. Is that enough? Not when we're submitting 1 per day, or when the goal is a baseline of 150k completed reviews to start analysis. So how many teams of 3 are needed, and how many is Niantic willing to pay for? Hopefully Niantic can find ways to complete the analysis on a smaller set of completed reviews, or hire enough people to reach the numbers needed.
I don't think you fully read my post, but to answer your questions anyways:
where "150K appeals in backlog" came from?
I never said that, although I did do the math to try to estimate how long the backlog is. We know 20k appeals were reviewed and we know the approximate time period those reviews were from. The backlog can be estimated based on that.
why "SEVEN AND A HALF YEARS for Niantic to even receive 150k appeals to meet that goal if they stayed at a constant rate"?
The seven and a half years estimate is based on how long it would have taken players to submit 150k appeals to Niantic based on the one appeal per month system which was the original intention.
Also, I believe, with 1 appeal per month players will pick rejected nominations more carefully, like, what local reviewers always deny, and what should be 100% accepted. This can influence the analysis - when/how to educate reviewers.
You're right that players would choose their appeals more carefully; however, I would also be willing to bet that players already did that at the beginning, not knowing how long the once-per-day appeals would last. If anything, this led to more coal being submitted for appeals as people ran out of "quality" rejected nominations. It's an interesting topic, but aside from impacting the statistics of how many were accepted or rejected, I don't see how it would affect any of the math I did.
560 reviews per day sounds slow, but I don't think it's that slow in practice. Realistically, it's an average. There are going to be appeals where reviewers clearly got it wrong, appeals where they clearly got it right, and a lot in between. The obvious ones will be really quick - and I would bet that with the flood of older appeals being submitted, updated street views and satellite views might make some previously questionable nominations far more obvious. The obvious appeals can be processed quickly, leaving more time for the difficult ones that require more attention.
The point was more about the amount getting done. I hate to be a "free labor" complainer, because frankly I enjoy reviewing, but think of all the reviewers who review every day and the thousands upon thousands of reviews they complete. Regardless of how fast Niantic's employees are or how many they have, they're only getting about 400-560 reviews done per day on average? And they have an arbitrary goal of 150,000 before they are willing to look at the data? That's not right.
I completely agree about the time it takes to review a single nomination, and as I said in the prior comment, realistically it's going to come down to an average. We've all gotten clearly bad rejections, and I'm sure there are people who appealed the submission of their pet cat. Those allow for quick decisions that bring the average time down, leaving the Niantic reviewers more time for the appeals that really need attention to detail.
The point is that the overall quantity of reviews that are coming out are low, and that the goal of 150k is far too high.
Also, given that we have seen some wildly different outcomes on nearly identical appealed nominations, I'm going to guess it's not a process where a group has to agree on them.
The only problem with your math is that we don't know anything about what happens behind the scenes.
How many people have they hired? How are they trained?
Maybe they follow a training manual, shadowing another reviewer, and then ramp up to speed, so the number of reviews per day is not linear and they are now reviewing 10x the number of the first days.
When they are able to process all the appeals in a timely manner, they can allow more frequent appeals, as hinted in the initial announcement.
We all expect that during this time they are training an AI to process reviews and appeals. The current human reviewers might see data suggested by such an AI hinting at the quality of the nomination, so a person can decide faster whether it's a good nomination while training the AI at the same time.
There are too many holes here to assess the situation.
If your goal is to identify individual reviewers who are problematic, then I don't think 150K appeals is a very high bar at all - remember that reviews and reviewers are spread out across the globe. Allow me to grab the back of an envelope and scribble some numbers.
IIRC, your local review area is nine L6 S2 cells. There are around 24K cells of that size on Earth. However, about 70% of the Earth is covered by oceans. There are lots of cells that straddle a coastline, though, so let's make a rough estimate and eliminate 65% of the cells, which brings us down to about 8400.
From here it gets messy, because you have to think about urban/suburban/rural splits and where most of the activity is. I'd be willing to bet that the vast majority of the world's wayspots are concentrated in maybe 3000 of those cells - a gut estimate based on no data whatsoever, but let's roll with it. So 150K appeal reviews would mostly happen in 3K cells, or 50 appeal reviews per cell on average.
Let's turn our attention to reviewers. How many appeal mismatches would it take to establish a pattern of "bad reviewing"? Clearly the answer is more than one, but what's reasonable? Assume that most rejections won't be appealed so every reviewer will only have a percentage of their rejections appealed. I would say that having half of your appealed rejections overturned is a reasonable bar with a minimum of, say, twenty appeals.
How many appeals does it take to achieve that? Now we get into really unknown territory where we need to think about reviewers per review, reviewers per cell, etc., and we just don't have those data. So I'm going to make up a number and say that on average a cell has four times the number of reviewers needed for consensus, meaning each reviewer sees about 25% of the candidates in that cell. So 50 appeal reviews per cell would be about 12.5 appeal reviews per reviewer on average.
How many cells does the average reviewer see? I'd guess that the majority are concentrated in maybe 2-4 cells due to low-density areas, oceans, etc. So that would be 25-50 appeals per reviewer with 150K appeals processed.
Bottom line: I've guesstimated or completely made up a lot of numbers in here, and it's late so I haven't checked my work, but 150K appeals processed seems like around what I would expect to be a minimum dataset for there to be meaningful data about reviewers and their rejections.
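For what it's worth, the guesstimate chain above reduces to a few lines (every input is a stated guess from this comment, not measured data):

```python
l6_cells = 6 * 4 ** 6            # exact count of level-6 S2 cells: 24,576
land_cells = 24_000 * 0.35       # drop ~65% for oceans -> 8,400
active_cells = 3_000             # guess: cells holding most wayspot activity

appeals = 150_000
per_cell = appeals / active_cells             # 50 appeal reviews per cell
reviewer_share = 0.25            # 4x the reviewers needed for consensus
per_reviewer_per_cell = per_cell * reviewer_share   # 12.5 per reviewer per cell
cells_per_reviewer = (2, 4)      # guess: most activity in 2-4 cells
# -> roughly 25 to 50 appeal datapoints per reviewer at 150k processed
```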
When the speculation begins with other purposes, guesswork of nonlinear rate of review, and (clearly and overtly stated) made-up numbers, it’s not math any more, it’s imagination.
Yeah, I misread "150K backlog"; your estimate is 12 months to get rid of the backlog (i.e. 120K), kinda close. I hate math, but if I extrapolate my own data - 3 appeals sent in December, 2 approved in February/March - the backlog would be 30K))
Perhaps a poll on forum - how many appealed / how many processed - would help to get better estimate/distribution.
However, let's not forget it's a 50/50 split, so _at least_ 50% of the appeals is submitter education (I mean, it's not future infographics, it's a directed educational message NOW). Which is huge. I'm tired of seeing the "little library is acceptable on property edge" statement everywhere; maybe a personal appeal response can help with that. Then, back to the 50% accepted - obviously, reviewers should be educated (I'd love to see some real data here, like the top 5 overturned rejection reasons, or the top 5 successfully appealed nomination types), but maybe the system needs to be tweaked. Or maybe submitters provided more details with the appeal - and that links back to submitter education.
My point is, "reviewer education" is probably not the first goal of the appeal system.
"How many people have they hired?"
I'm gonna go out on a limb and say anywhere between 0 and 1 intern
@holdthebeer-ING Some things will be overturned on appeal even though reviewers did the right thing on the original submission.
Remember that there's a possibly-large time lag between the initial review and the appeal review, which means that the people processing appeals can be working with different information than the initial reviewers had. Lots of stuff gets rejected in areas of new construction, for example, that could easily be accepted on appeal because maps, satellite view, and/or street view have been updated in the interim.
Right, not going into details (like reviewers trusting construction site nominations here), the appeals analysis might reveal lots of interesting tendencies, and "educate reviewers" might not even be in the top 3.
For example, engagement: if submitters get approved appeals (or even rejected appeals with personal message) - will they nominate more? Will they nominate better? Can one personal review be part of "warm welcome" treatment?
Or, can appeals help to detect botnets / coordinated review?
Or, back to rejections and score system - say, if nomination was rejected with many evenly distributed random reasons (i.e. not one specific reason) - does it indicate that reviewers simply did not agree that nomination meets exploration/exercise criteria, and it's time to add the "does not meet criteria" back into system?
You bring up some good points, but you overlooked two key factors: upgrades and the number of reviewers it takes for a nomination to get resolved.
As I'm sure you know, upgrades push your nominations to a far wider range of reviewers throughout your country, expanding them past your local review area. A common complaint about upgrades is that they "doom" your nomination (generally due to people not knowing or caring about local nuances) and have much higher rates of wrongful rejection. Because of this, I would guess that a significant portion of appeals were nominations that were previously upgraded. This also significantly narrows the gap in the global distribution of reviewers, especially in larger countries.
The second thing to consider is that 150k appeals processed does not equal 150k user reviews analyzed. While the exact number of people it takes to review a nomination for it to come to a conclusion is not known, let's take a guess and say it takes 30 people reviewing your nomination for it to be processed. That would mean that by evaluating 150k appeals, they are actually evaluating 4.5 million user reviews. Presumably some of the appealed nominations were correctly denied, so let's cut that in half to factor that in, and we're now at 2.25 million data points. Now, the likelihood that everyone who contributed to an overturned appeal got it wrong is low, but presumably at least half of them had to be wrong for those reviews to originally result in a rejection, which would still leave at least 1.125 million incorrect user reviews. Personally, I think that is more than enough data to start looking at trends and making improvements.
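The datapoint multiplication above looks like this (the 30-reviewer consensus figure and both 50% splits are guesses from this comment):

```python
appeals = 150_000
reviewers_per_nomination = 30     # guessed votes needed to resolve a nomination
datapoints = appeals * reviewers_per_nomination       # 4.5 million reviews

wrongly_rejected_share = 0.5      # assume half the appeals were valid
contested = datapoints * wrongly_rejected_share       # 2.25 million
wrong_vote_share = 0.5            # at least half must have voted to reject
incorrect_reviews = contested * wrong_vote_share      # 1.125 million
```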
Something else to keep in mind is that some reviewers review a lot and some review only a little. People who review a lot and are frequently reviewing incorrectly are likely to appear early in the data collection as problematic, whereas more reviews lead to more data and help identify users who either make mistakes less often or simply don't review as much and therefore don't contribute to as many bad rejections. While obviously we want to make sure that everyone is reviewing correctly, identifying and rectifying the reviews of high-volume bad reviewers will likely go a long way toward stopping bad rejections by itself. Keep in mind that in many cases, one or two votes could make all the difference between an acceptance and a rejection.
@AisforAndis-ING You are correct that I glossed over upgrades. I also barely handwaved at home and bonus locations.
I didn't really skip over the number of reviewers required to reach agreement, though, because I was looking at it from the perspective of scoring individual reviewers rather than looking at general trends. If the point is to educate/penalize poor reviewers then the only number that matters is the number of appeals for things that each reviewer voted on. It doesn't matter if ten or a hundred or a million people review each submission-- if one candidate is appealed then everyone who voted on it gets one datapoint of agreement or disagreement with a Niantic decision.
You are also correct that not everyone reviews at the same cadence. That too is OK, I think. If someone reviews like one thing per week then there will never be enough datapoints to understand their quality, but their impact on the overall system is minimal.
I honestly think Niantic can get better data about trends by mining the raw review data to look for patterns without factoring appeals into the mix. If I was King of Data at Niantic I would be aggressively targeting botnets and colluders as my top priority, and I don't think appeal data are necessary for that.