Rating accuracy

BdoggerX
Do you think an 800 rapid/600 blitz player is the same skill level as someone with that rating a year ago or five years ago? i.e. do you think the “average” player is getting better with more people playing, or have ratings been consistent through time?
justbefair

How would anyone know?

It seems plausible that a large share of the many millions who have joined in the last few years are quite different from those who joined in the first decade of chess.com, many of whom may already have had some playing experience.

ghost_of_llama

In just 2 posts, 3 related but distinct ideas have been brought up. They're often confused for each other so these topics often become muddled.

Addressing the question in the first sentence of the OP, which I'll call the skill-to-point ratio, as far as I can tell it's been essentially constant over the last few years... i.e. the same amount of skill results in the same rating.

Average rating and the skill of incoming new players are independent of this. This independence is not obvious, and whether / how they're related can be its own discussion. Having looked into the math and run some simulations, I believe I have a strong argument for them being independent.

Going back further in time than a few years, and depending on the time control, the ratings are definitely not the same. The two biggest reasons for this are:

1) In the past Chess.com has artificially raised all ratings of a particular time control.

2) The classification of 10 minute games (rapid or blitz) was changed making comparisons between some old and new accounts difficult.

ghost_of_llama

I will add that... AFAIK, to the best of my memory, etc, blitz ratings have never been artificially changed overnight (which has happened in the past with rapid, bullet, and daily IIRC).

For blitz, at the very top, ratings have crept up... however I don't think the skill-to-point ratio has changed very much for anyone else (i.e. I don't think there has been significant inflation/deflation for blitz).

Of course I can't be sure. The site is ~16 years old and I don't have a lot of data. I think the top ratings have increased because, for example, in the beginning there weren't enough GMs for Hikaru to be rated accurately. For 99% of people this was never a problem.

One way to try tracking things is looking at rating history graphs of long time members who were already experienced in chess and adults at the time they joined.

Also part of the discussion are the multiple inflationary and deflationary effects that are quite real, but it seems are often small and/or balance each other... as a simple example when someone cheats to win many games, and then is banned, they're removing rating points from the population, which deflates ratings... however chess.com awards at least some of those points back, which counters this.

calbitt5750
It seems like if a lot of new inexperienced players were coming on (say 400-700), the percentile rankings of current players above 700 would increase. But I’m 837 rapid at the moment and 69.6 percentile and haven’t seen my percentile increase at any better rate than the rating. But I agree you can’t be sure without a historical graph showing the percentile of 837 a year, two or three ago. Maybe that’s discoverable on site some way, but I don’t know. If 837 was only 60 percentile a year ago, that would indicate some inflation, wouldn’t it?
ghost_of_llama
calbitt5750 wrote:
If 837 was only 60 percentile a year ago, that would indicate some inflation, wouldn’t it?

That's the intuitive conclusion, sure, and it's an interesting question worth investigating i.e. whether new players or a changing average causes inflation/deflation... sorry this is such a long reply but I try to do a good job of making the answer easy to understand at the end...

anyway... it turns out the answer to the question I quoted is "no."

-

Imagine a magical device that can weigh your skill the same way a normal scale weighs a body. After your skill is "weighed" the device gives you a chess rating. In reality we can't do this, we have to make a guess using your results against other players... so the intuition is that when these players change, your rating should also change... and sure if you only ever played completely new accounts your rating would be random, but in practice this is not how it works.

I often think a good way to make it easier to understand is using an economy as an analogy. A simple example to start things off: you go to different websites and have different rapid ratings on each of them. It's the same as going to different countries with an amount of gold, and each of them exchanges that gold for a different amount of the local currency. The gold is your chess skill, and the local currency is the website's rating.

When we talk about inflation or deflation what we're talking about is whether the gold (or your skill in chess) can be accurately represented by the same number year after year... and so now it's easy to understand when I say the only way inflation/deflation can exist is if the total amount of gold changes, the total amount of currency changes, or both change but the ratio is not preserved.

(We can substitute any intrinsically valuable thing, like food, for gold.) For example if you flood an economy with lots of food, or if you print / destroy a lot of money, then that can cause inflation / deflation.

-

Ok, finally getting to the point... as long as we create and destroy exactly the right amount of rating for each player who joins or quits, then inflation will never happen... of course mathematicians knew this, and so that's exactly what rating systems (like Glicko on chess.com) work very hard to do. When two long-time active players play, the number of points gained by one player is equal to the number of points lost by the other... this makes sense because after the game the total amount of skill hasn't changed so you don't want to create or destroy rating points. However notice what happens to new players... they can gain or lose 100s of points in a single game even when their opponent wins or loses only 10. As long as you can quickly get new players to their correct rating then there won't be any inflation / deflation because the point-to-skill ratio remains constant.
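
To make the "create and destroy points" idea concrete, here is a minimal sketch. It is not chess.com's actual Glicko code: it uses a plain Elo-style update, and the step sizes (20 for an established player, 200 for a new one) are made-up stand-ins for the way Glicko moves uncertain ratings faster. It only shows that a game between two established players leaves the total unchanged, while a game involving a new player does not.

```python
def expected_score(rating_a, rating_b):
    """Elo expected score (0..1) for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k_a, k_b):
    """Return both new ratings after one game.

    score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss.
    k_a and k_b are hypothetical step sizes, one per player.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k_a * (score_a - exp_a)
    new_b = rating_b + k_b * (exp_a - score_a)
    return new_a, new_b


# Two established players with the same step size: one gains what the other loses.
a, b = update(1500, 1500, 1, k_a=20, k_b=20)
established_change = (a + b) - (1500 + 1500)

# A new player (large step) loses to an established player (small step):
# the newcomer drops far more than the opponent gains, so points are destroyed.
newcomer, veteran = update(1200, 1200, 0, k_a=200, k_b=20)
provisional_change = (newcomer + veteran) - (1200 + 1200)

print(established_change)   # 0.0
print(provisional_change)   # -90.0
```

If the newcomer really was overrated at 1200, destroying those 90 points is exactly what keeps the remaining points worth the same amount of skill.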

And now you (hopefully) have an intuition for why the point-to-skill ratio is independent from the percentile. If 1 million new users joined chess.com tomorrow, all with the skill of a 1200 rated player, as long as we start them all at 1200 then no inflation or deflation would happen even though the average and percentile would change a lot.

BdoggerX
Thanks for the interesting thoughts. I was hoping I could feel better about my poor ratings if there was some consensus that the overall “average” player has improved over time! Seems more probable that if there has been any change at all it’s inflationary rather than deflationary.
JubilationTCornpone

It's my sense the population has improved--and certainly should have, considering the amount of instruction and computer help available nowadays. If people actually haven't improved with all that, it would say something rather unfortunate about all of us--though that doesn't prove it one way or the other.

What I think I've observed--definitely and measurably--is that the average rapid rating on chess.com has come down about 200 points in the past four years. I mean, it is as low as 650 now, and it didn't used to be that low. This can be considered point deflation, or others have said that a lot of beginners joined during the pandemic and they just aren't as good. But on the third hand, who says they aren't as good just because they joined during the pandemic?! So, I do think we can say the average number is lower, at least for rapid on chess.com, but we can't say the reason.

Then, we have the usual comparison of this site to USCF (and FIDE). It's usually said that this site is several hundred points overrated compared to USCF. But that has been said for years. So if there's a fall of 200 points on this site, then shouldn't that be by now in line with USCF? Maybe. Unless this site has actually gotten weaker (because of all those pandemic newcomers). My own sense is, my USCF rating was around 1300. My chess.com rapid rating right now is around 1350--but it moves lower sometimes (not usually higher). Point being, it's pretty close to USCF, at least for me. But then again, I haven't played USCF in a while, and they've taken on a lot of schoolkids in the past ten years (scholastics), and those kids can be wildly underrated--so maybe that bar has moved too.

The one thing I thought I could pin something on is accuracy (as judged by the computer), but as someone (Elroch) noted the other day, even that changes as the computer is constantly being updated.

I feel like I know more than I used to. I feel like my opponents know more than they used to (just based on the moves they make). Yet the ratings aren't higher, and even are slightly lower. But it could be imagination or just because I want to believe that.

So I guess it's a long way of saying there's really no way to know for sure.

JubilationTCornpone
calbitt5750 wrote:
It seems like if a lot of new inexperienced players were coming on (say 400-700), the percentile rankings of current players above 700 would increase. But I’m 837 rapid at the moment and 69.6 percentile and haven’t seen my percentile increase at any better rate than the rating. But I agree you can’t be sure without a historical graph showing the percentile of 837 a year, two or three ago. Maybe that’s discoverable on site some way, but I don’t know. If 837 was only 60 percentile a year ago, that would indicate some inflation, wouldn’t it?

There's definitely a discoverable way. Chess.com could publish this data. They could at least say what rating is median now, vs what rating was median two, four, or six years ago...whatever they are willing to do. And since they moved 10 minute games from blitz to rapid, they should not publish this data on "blitz" or "rapid"--since that's contaminated. They could publish data on exact time controls--at least the most popular ones. But I don't think they want to.

To your specific question, I believe you have it backwards. If 837 was 60th percentile a year ago, and is 69th percentile now, that would indicate rating deflation rather than inflation. Because if a higher percentile performance is associated with a constant rating, then it follows a constant percentile performance is associated with a lower rating. But again, maybe the whole population has moved.

ghost_of_llama

Again, the point-to-skill ratio is not the same as average or percentile.

If every player rated over 2500 (or under 1000) quit the site tomorrow, your rating wouldn't change (so no inflation or deflation) but your percentile would. They're independent.

And we can do it the other way too. If chess.com decided everyone on the site is underrated by 100 points, and gave everyone 100 points overnight, that would "inflate" ratings, but your percentile would stay the same.

And another example: if everyone gained 100 points' worth of skill overnight, then ratings would be deflated (the same number would now represent more skill), but the average rating and percentiles would stay the same.
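
These scenarios are easy to check with a toy pool. The ten ratings below are invented purely for illustration, and `percentile` here just means "share of the pool rated strictly below you", which may not match how the site computes it.

```python
def mean(values):
    return sum(values) / len(values)


def percentile(ratings, value):
    """Percentage of the pool rated strictly below `value`."""
    below = sum(1 for r in ratings if r < value)
    return 100.0 * below / len(ratings)


# Hypothetical pool; "my" rating is the 1200 entry.
ratings = [400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000, 2600]
my_rating = 1200
base_mean = mean(ratings)                       # 1340.0
base_percentile = percentile(ratings, my_rating)  # 40.0

# Scenario 1: everyone rated over 2500 quits.
# My rating is untouched, but the average and my percentile both move.
remaining = [r for r in ratings if r <= 2500]
quit_mean = mean(remaining)
quit_percentile = percentile(remaining, my_rating)

# Scenario 2: the site hands everyone 100 points overnight.
# Every rating is inflated by 100, yet my percentile is unchanged.
boosted = [r + 100 for r in ratings]
boosted_percentile = percentile(boosted, my_rating + 100)

print(base_mean, base_percentile)
print(quit_mean, quit_percentile)
print(boosted_percentile)
```

The third scenario (everyone gains skill overnight) needs no code: none of the numbers in `ratings` change, so the average and percentiles are identical, and only what each number means has shifted.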

There always seems to be a lot of confusion about this topic.

ghost_of_llama
JubilationTCornpone wrote:

This can be considered point deflation

Only if it's caused by everyone losing, on average, 200 points... but there is simply no mechanism that removes points from the pool on that scale.

, or others have said that a lot of beginners joined during the pandemic and they just aren't as good. But on the third hand, who says they aren't as good just because they joined during the pandemic?!

Because people who play out of boredom aren't as skilled as people who are passionate about the game.

So, I do think we can say the average number is lower, at least for rapid on chess.com, but we can't say the reason.

Then, we have the usual comparison of this site to USCF (and FIDE). It's usually said that this site is several hundred points overrated compared to USCF. But that has been said for years. So if there's a fall of 200 points on this site, then shouldn't that be by now in line with USCF?

If everyone lost 200 points, then yes, but that's not what caused the average to drop.

JubilationTCornpone
ghost_of_llama wrote:

If everyone lost 200 points, then yes, but that's not what caused the average to drop.

Alright, but what did cause the average to drop?

By the way, I did like a lot of your economic analysis, but it doesn't necessarily support your point. For example, while all the things you mentioned as variables and unknowns are such at least for purpose of this discussion, we do know quite a bit too. There is more gold above ground now than in 1960. There are more dollars in circulation than in 1960, glossing over whether we mean monetary base, money supply, and by what measure, etc., the growth in dollars has exceeded the growth in gold, not just over time, but consistently every year, with predictable results (dollar cost of gold moves from ~$35 to ~$1950). So it's not an exact science, but it also isn't a total mystery.

But if you know the mechanism by which the average dropped, I'd like to know it. If you already said, I may have missed it.

ghost_of_llama
JubilationTCornpone wrote:
ghost_of_llama wrote:

If everyone lost 200 points, then yes, but that's not what caused the average to drop.

Alright, but what did cause the average to drop?

By the way, I did like a lot of your economic analysis, but it doesn't necessarily support your point. For example, while all the things you mentioned as variables and unknowns are such at least for purpose of this discussion, we do know quite a bit too. There is more gold above ground now than in 1960. There are more dollars in circulation than in 1960, glossing over whether we mean monetary base, money supply, and by what measure, etc., the growth in dollars has exceeded the growth in gold, not just over time, but consistently every year, with predictable results (dollar cost of gold moves from ~$35 to ~$1950). So it's not an exact science, but it also isn't a total mystery.

But if you know the mechanism by which the average dropped, I'd like to know it. If you already said, I may have missed it.

The same way average height decreases when short people join a group.

The picture below doesn't prove anything, it's just interesting. Back when this was first going on I was interested in the topic and grabbed a few data points across 1 year.

[Image not preserved: graph of a few data points collected over one year]

https://www.chess.com/leaderboard/live

And currently the blitz stat says 22.7 million with an average rating of 652.

As I recall, around 6-8 months ago when I was casually checking (without recording it) the number of users went up and at the same time the average rating had gone up too, and I hadn't seen that before, so that was neat.

ghost_of_llama
JubilationTCornpone wrote:

By the way, I did like a lot of your economic analysis, but it doesn't necessarily support your point.

It's just an analogy to help people understand in terms of something more concrete instead of it being some nebulous abstract thing.

When I wanted to figure out how new players joining would affect ratings I wrote a program and simulated it. The result was the average can go up, down, or stay the same. It all depends on how efficiently the initial rating period moves players to their correct rating, because as I said, after two players are established, (essentially) the number of points one player wins is equal to the number of points the other loses... when points and skill remain constant there can't possibly be inflation / deflation.
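
For anyone who wants to try something similar, here is a rough sketch of that kind of simulation. It is not the program described above, and every number in it is an assumption: Elo-style updates instead of Glicko, 200 established players who are already rated correctly at 1000, 100 newcomers whose true skill is 600, a step size of 20 for established players, and a separate `provisional_k` for a player's first 10 games. The function returns average rating minus average skill, so a positive result means points are worth less skill than they should be (inflation).

```python
import random


def expected(r_a, r_b):
    """Elo expected score for the first player."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def simulate(start_rating, provisional_k, seed=0, games=20000):
    """Return mean(rating) - mean(true skill) after `games` random pairings."""
    rng = random.Random(seed)
    # Each player is [true_skill, rating, games_played].
    players = [[1000.0, 1000.0, 100] for _ in range(200)]             # established
    players += [[600.0, float(start_rating), 0] for _ in range(100)]  # newcomers
    for _ in range(games):
        a, b = rng.sample(players, 2)
        # The result depends on true skill, not on the current rating.
        score_a = 1.0 if rng.random() < expected(a[0], b[0]) else 0.0
        exp_a = expected(a[1], b[1])
        k_a = provisional_k if a[2] < 10 else 20.0
        k_b = provisional_k if b[2] < 10 else 20.0
        a[1] += k_a * (score_a - exp_a)
        b[1] += k_b * (exp_a - score_a)
        a[2] += 1
        b[2] += 1
    n = len(players)
    return sum(p[1] for p in players) / n - sum(p[0] for p in players) / n


# Newcomers start 600 points too high and never get a fast initial period:
# every game is zero-sum, so the surplus points stay in the pool forever.
drift_no_provisional = simulate(start_rating=1200, provisional_k=20.0)

# Same newcomers, but their first 10 games move them quickly.
drift_fast_provisional = simulate(start_rating=1200, provisional_k=120.0)

# Newcomers start exactly at their true skill: nothing to correct.
drift_accurate_start = simulate(start_rating=600, provisional_k=20.0)

print(drift_no_provisional, drift_fast_provisional, drift_accurate_start)
```

In this toy setup the first case stays stuck at 200 points of inflation, the third stays at zero, and the second lands somewhere in between depending on how well the fast initial period does its job.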

After players have an established rating, if they improve, then sure, that could cause deflation. There are inflationary effects too though, for example when a new player joins, loses 10 games, decides chess sucks, and quits forever, that adds points into the pool.

JubilationTCornpone
ghost_of_llama wrote:
 

Nice graph. Of course if all those new players had come in with a 1200 rating (as in the past would have been the case), then you'd expect to actually see rating inflation (more points associated with less ability). Since they allowed people to come in at 800, it seems likely they aren't actually that weak--some are, but in aggregate--so it had the opposite effect. So I think, maybe, we can say that the cause of inflation/deflation is allowing new players to select their starting rating even though they have no real idea what it should be.

In any event, I'm glad this discussion finally happened. I tried to start it a couple times and it went nowhere. For myself, I've decided to focus on getting an average accuracy over 80%. The rating can do what it likes.

ghost_of_llama

Yeah, starting new accounts higher than 600 and then having 10 million players join who are below that is definitely a source of inflation.

As these new players improve, they're also a source of deflation though. Improvement causes deflation... there are multiple sources of each (inflation / deflation) so it's hard to predict.

The Glicko rating system does a good job of trying to keep things stable though, and chess.com does things like refunding (at least some) points lost to cheaters. Once you've established a certain ratio of skill vs rating it's hard to change.