Why can I beat 1000 rated bots but cannot beat 400 rated humans?

usbspeakers

Doesn't seem that both these things can be true. I easily beat 900-1000 rated bots but lose close to 100% of my games against 400 human players. 

One of the rating systems must be wrong. Even if the 400 rated humans were cheating they would be winning and going up in rating, unless cheating is so rampant at that level there are always new cheaters coming up to fill in the ranks? Or is the bot rating system just off by a factor of over 300%? 

justbefair
usbspeakers wrote:

Doesn't seem that both these things can be true. I easily beat 900-1000 rated bots but lose close to 100% of my games against 400 human players.

One of the rating systems must be wrong. Even if the 400 rated humans were cheating they would be winning and going up in rating, unless cheating is so rampant at that level there are always new cheaters coming up to fill in the ranks? Or is the bot rating system just off by a factor of over 300%?

Yes. Many of the bots' ratings are very questionable. People learn techniques to use against them.

jobieone89

Sorry to bring up an old thread, but is there a definitive reason? I just played the 1000-rated bots for the first time since I started a few weeks ago and breezed through them, yet when I play 400-rated players it's a tough game. Is the rating system really that far out?

My local chess club wants people rated about 1000 to join as beginners, which just feels very far away.

InfinityPhoenix7

Bots are weird. I can beat 2000 bots, yet I sometimes can't beat 800 players.

jobieone89

I'd love to know if there is any science behind it, or whether they are just programmed to make mistakes every so often.

magipi
jobieone89 wrote:

Sorry to bring up an old thread, but is there a definitive reason? I just played the 1000-rated bots for the first time since I started a few weeks ago and breezed through them, yet when I play 400-rated players it's a tough game. Is the rating system really that far out?

Bot ratings are not ratings at all, just a number that's written there. Win or lose, that number doesn't change. Some of them are horribly off, and in general low rated bots are ridiculous.

jobieone89

Oh I see, OK, that makes slightly more sense. It's frustrating to have any number at all.

RyanZ_MD

I beat 1800 bots as a 1200, so they are probably really overrated. But I don't understand how some of the 800s won against a 2000 bot. I have never done that. So either I am a really bad 1200, or bots are just not a good way of measuring skill.

VerifiedChessYarshe

Skill issue, plus humans and bots are different creatures.

basketstorm

In discussing bot ratings, it's important to understand that bots operate within their own rating pool. Only bots exist in that pool. Integrating bots into the player pool would require frequent adjustments to the bots' ratings. Only then would they feel "real". Theoretically, it's doable.

But anyway, win probabilities among bots should align with their ratings (if you win 50% of your games against a 1200 bot, you should win about 64% of your games against an 1100 bot, etc.); otherwise it might indicate a miscalibration by the developers.

You need a lot of games to gather accurate statistics. It’s not accurate to claim, for example, "I'm rated 1200 but I defeated a 1600-rated bot, so the bot’s rating must be inflated by around 400 points." Ratings are about predictions and probabilities. If you consistently (100%) win against a bot, it suggests that the bot is too weak for your skill level, and you should choose a stronger opponent for a more accurate assessment. But if it is just one game, it doesn't mean anything yet.

To evaluate the accuracy of a bot’s rating, play against a relatively strong bot (for your skill level) multiple times to gather sufficient data. Using a rating table, you can then read off the rating difference between you and the bot from your score:
ELO difference              Your expected
(opponent minus you)        score
+800                         0.99%
+750                         1.32%
+700                         1.75%
+650                         2.32%
+600                         3.07%
+550                         4.05%
+500                         5.32%
+450                         6.98%
+400                         9.09%
+350                        11.77%
+300                        15.10%
+250                        19.17%
+200                        24.03%
+150                        29.66%
+100                        35.99%
+50                         42.85%
0                           50.00%
-50                         57.15%
-100                        64.01%
-150                        70.34%
-200                        75.97%
-250                        80.83%
-300                        84.90%
-350                        88.23%
-400                        90.91%
-450                        93.02%
-500                        94.68%
-550                        95.95%
-600                        96.93%
-650                        97.68%
-700                        98.25%
-750                        98.68%
-800                        99.01%
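
For anyone who would rather regenerate the table than trust pasted numbers, here is a minimal Python sketch. It assumes the standard Elo expectation formula, with the difference taken as the opponent's rating minus yours; the names `expected_score` and `build_table` are made up for this example.

```python
def expected_score(diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for a rating difference.

    diff is the opponent's rating minus your own, so a positive
    value means the opponent is rated higher than you.
    """
    return 1.0 / (1.0 + 10 ** (diff / 400.0))


def build_table(max_diff: int = 800, step: int = 50):
    """Return (difference, expected score) rows from +max_diff down to -max_diff."""
    return [(d, expected_score(d)) for d in range(max_diff, -max_diff - 1, -step)]


table = build_table()
for diff, score in table:
    # Print the score as a percentage with two decimals, like the table above.
    print(f"{diff:+5d}  {score * 100:6.2f}%")
```

Changing `step` gives a finer table if the 50-point steps are too coarse for your results.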

The more games you play, the more precise this estimate will be. However, you cannot directly compare bot ratings to human ratings due to the differing rating pools I mentioned.

So you have two options:
1) calculate your rating in "bot-ELO", or
2) convert the bot's rating to a chess.com rating. Chess.com reportedly uses the Glicko system, which affects rating dynamics but not the fundamental probability calculations.

So, for example, if you play 30 games against a chess.com bot rated at 1000 and win 22 of them with 1 draw (counted as 0.5), your score would be (22 + 0.5) / 30 = 0.75, or 75%. This corresponds to an approximate rating difference of -200 from the table, or more precisely -190.848.

The calculations are:
P = 1 / (1 + 10^(D / 400))
D = 400 * log10((1 - P) / P)
where P is your expected score (the probability in the table) and D is the rating difference (the bot's rating minus yours).

So, your "bot-ELO rating" in this example would be around 1000 + 190 = 1190. If your actual chess.com rating is 950, the bot’s "chess.com-converted" rating would be approximately 950 - 190 = 760. Judging by the comments in this thread, I'd expect a greater difference than in my example.
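
The worked example can be reproduced with a few lines of Python. This is only a sketch: the game record, the 950 site rating, and the listed 1000 are the illustrative numbers from this post rather than real data, and `rating_difference` is a hypothetical helper name.

```python
import math


def rating_difference(score: float) -> float:
    """Rating difference D (opponent minus you) implied by an average score P."""
    if not 0.0 < score < 1.0:
        # A 0% or 100% score gives no finite estimate, which is why
        # always beating a bot tells you nothing about its rating.
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10((1.0 - score) / score)


# Illustrative record from the example: 22 wins and 1 draw in 30 games.
wins, draws, games = 22, 1, 30
score = (wins + 0.5 * draws) / games   # 0.75
diff = rating_difference(score)        # negative: the bot is weaker than you

bot_listed_rating = 1000               # number shown next to the bot
my_site_rating = 950                   # assumed human-pool rating

my_bot_elo = bot_listed_rating - diff  # option 1: your rating in "bot-ELO"
bot_converted = my_site_rating + diff  # option 2: the bot in human-pool units

print(f"score={score:.2f}  D={diff:.3f}")
print(f"bot-ELO rating: {my_bot_elo:.0f}, converted bot rating: {bot_converted:.0f}")
```

The unrounded figures come out a point or so away from 1190 and 760 because the post rounds D to 190 before adding and subtracting.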

magipi
basketstorm wrote:

In discussing bot ratings, it's important to understand that bots operate within their own rating pool.

Are you sure about that?

It's more likely that bot ratings are not consistent. They are semi-random, based on the gut feeling of a developer. Some are blatantly overrated, others are not.

basketstorm

We can't definitively say that some bots are "overrated" and some are not, or that they aren't consistent among themselves, without gathering sufficient data. A couple of games, or even 10, isn't enough. And as I've mentioned, if you consistently win against a certain bot, that data isn't helpful either. You need to find a stronger bot that actually challenges you and play enough games against it (the more the better; at least 100 games, I'd say). Then pick another bot with a higher rating, do the same 100-game test, and compare your win frequencies against both bots with what ELO predicts. That's when you can say that you've spotted an inconsistency.
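
To get a feel for why a handful of games is not enough, here is a rough Python sketch of the uncertainty in an estimated rating difference. It rests on assumptions of mine, not on anything chess.com publishes: games are treated as independent, draws are ignored when estimating the spread (a plain binomial approximation), and the interval is a simple 95% normal approximation. The helper names are hypothetical.

```python
import math


def rating_difference(score: float) -> float:
    """Rating difference (opponent minus you) implied by an average score."""
    return 400.0 * math.log10((1.0 - score) / score)


def rating_interval(score: float, games: int, z: float = 1.96):
    """Approximate interval for the rating difference after `games` games.

    Uses a normal approximation to the binomial for the score, then maps
    both ends of the score interval through the Elo formula.
    """
    std_err = math.sqrt(score * (1.0 - score) / games)
    # Clamp so the logarithm stays finite when the interval touches 0 or 1.
    low_score = max(score - z * std_err, 1e-6)
    high_score = min(score + z * std_err, 1.0 - 1e-6)
    # A higher score means a weaker opponent, so the ends swap.
    return rating_difference(high_score), rating_difference(low_score)


widths = {}
for n in (10, 30, 100, 400):
    low, high = rating_interval(0.75, n)
    widths[n] = high - low
    print(f"{n:4d} games: difference between {low:8.1f} and {high:8.1f}")
```

Under these assumptions, a 75% score still leaves an interval well over 100 rating points wide after 100 games, so a 10-game sample says very little about a bot's rating.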

Rating difference suggests probability, which means you can win, but you can also lose. How often you win depends on the rating difference. Again, there's no point in comparing a bot's rating to a player's rating. A bot's rating is constant, unrelated to players and only shows differences between bots.

It's likely that more than just the developers' gut feelings were involved in rating the bots, because it’s easy to calibrate bots by running simulations where they play against each other and then precisely rate each bot (at different difficulty settings) using ELO. This obvious method would help ensure that their ratings are internally consistent. I don't see why a reputable company with a lot of technical resources would do otherwise and rely on gut feelings.
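
As an illustration of what such a calibration run could look like, here is a toy Python sketch. It is not a description of what chess.com actually does. It assumes three made-up bots with hidden "true" strengths, decides each game by a coin flip weighted with the Elo expectation (no draws), and rates the bots with plain incremental Elo updates using a fixed K-factor. All names and numbers are hypothetical.

```python
import itertools
import random


def expected_score(own: float, opponent: float) -> float:
    """Expected score of `own` against `opponent` under the Elo formula."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


def calibrate(true_strengths, rounds=3000, k=4.0, anchor=1000.0, seed=0):
    """Estimate ratings from simulated round-robin games between bots."""
    rng = random.Random(seed)
    names = list(true_strengths)
    # Every bot starts at the same arbitrary anchor; Elo only pins down
    # differences inside the pool, not the absolute level.
    ratings = {name: anchor for name in names}
    for _ in range(rounds):
        for a, b in itertools.combinations(names, 2):
            # Simulate the game from the hidden strengths (win or loss only).
            win_prob = expected_score(true_strengths[a], true_strengths[b])
            result_a = 1.0 if rng.random() < win_prob else 0.0
            # Standard Elo update: move ratings toward the observed result.
            delta = k * (result_a - expected_score(ratings[a], ratings[b]))
            ratings[a] += delta
            ratings[b] -= delta
    return ratings


estimates = calibrate({"weak": 800.0, "mid": 1000.0, "strong": 1200.0})
for name, rating in estimates.items():
    print(f"{name:7s} {rating:7.1f}")
```

The estimated gaps between bots should land near the hidden ones, but the pool's average stays wherever the anchor put it. That is the sense in which a bot-only pool can be internally consistent and still not line up with human ratings.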

haveyouseencyan

There are two bots, one at 800 and another at 900 that are quite tricky IMO. IamChristini and CDawgVA. You can beat those two easily? I doubt that, but perhaps.

Just keep playing and very soon you will be beating 400+ players, you are probably very close. I was stuck around 280 for quite a while, like a week or longer, then I jumped up to 400 in the space of like 2 days.

Highest rated bot I have beaten so far is 1,100 or 1,200 but I have not played any since I reached 400. As I said above, those two bots I mentioned IMO are under-rated.

So yea, the bot ratings are out of whack a bit. And you can manipulate bots into wins tbh, for example by trading pieces with the aggressive ones to nullify their attacks, then playing it out to the late game, where they will eventually blunder and you're ahead.

Last thing to add: anyone new joining chess.com will be down in the low elos, regardless of how good they are, just because they are new. Plus, there are some players there who just like bashing noobs.

AerryChris

Bots will sometimes just chuck in a blundered queen but otherwise play accurately.

basketstorm

I have to agree with the observation about CDawgVA "900." When I select just the "Engine" and set it to 1000, I end up with significantly more wins in the long run. So, there MIGHT be an inconsistency in the ratings between bots. However, the issue isn't the difference between player and bot ratings. I'm still convinced that player and bot ratings can't be directly compared, and that a bot's rating in "player rating" units can be estimated after a sufficient number of games using the ELO table.

magipi
basketstorm wrote:

It's likely that more than just the developers' gut feelings were involved in rating the bots, because it’s easy to calibrate bots by running simulations where they play against each other and then precisely rate each bot (at different difficulty settings) using ELO. This obvious method would help ensure that their ratings are internally consistent. I don't see why a reputable company with a lot of technical resources would do otherwise and rely on gut feelings.

Just because it would have been easy to do doesn't mean that they actually did it. Relying on gut feeling is faster, cheaper, and easier, and no one will complain anyway.

Strong indirect evidence: they could have calibrated bot ratings to be on par with human ratings, but they didn't. That shows that they didn't really care.

One last point: please don't use "ELO" in all caps. It's not an acronym, it's named after the inventor, professor Arpad Elo.