How do people calculate engine ELO Ratings?

USArmyParatrooper
If you took the top engines running on ideal platforms, and put them against all of the top human players, wouldn’t they just continually go undefeated?

Would that not make their hypothetical ELO ratings infinite?
sammy_boi

No matter how large the rating difference, a win earns a positive number of points (even if websites round it off to zero in some instances, the formula itself awards at least some small fraction of a point every time). After infinitely many wins the Elo rating would indeed be infinite.
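To make the "small fraction of a point" claim concrete, here is a minimal sketch of the usual logistic Elo update. The function names and the K-factor of 10 are assumptions for illustration; sites and federations use their own K values and rounding.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score (between 0 and 1) under the logistic Elo formula."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def gain_for_win(rating: float, opponent: float, k: float = 10.0) -> float:
    """Points earned for a win; k=10 is an assumed K-factor."""
    return k * (1.0 - expected_score(rating, opponent))


if __name__ == "__main__":
    # The gain shrinks as the gap grows, but it stays above zero.
    for gap in (0, 400, 800, 1600):
        print(gap, gain_for_win(2000 + gap, 2000))
```

With equal ratings a win is worth half of K, and at a 400-point gap it is worth K/11, so under these assumptions the favorite still gains a little under one point.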

This is OK, though, once you understand that Elo is not a measurement like height or speed. It's only useful under certain conditions: the opponents are not hugely mismatched, the players in the pool play often, and they face a variety of opponents. When those conditions hold, the rating is a good predictor of your performance within that population.

Historically there have been times when a person's rating was unreliable. There was a guy named Bloodgood who played only in the prison system, and also faked a bunch of tournament wins to get a really high USCF rating. When it got so high that he qualified for the US Championship, they froze his rating.

https://en.wikipedia.org/wiki/Claude_Bloodgood#High_rank_possibly_via_manipulation

sammy_boi

So you might ask, if engines don't play people, can we directly compare the ratings? Are engine ratings on the same scale as FIDE ratings?

The answer is no, they're different, because the players compete in separate pools. But because the best engines used to be "only" as strong as human GMs, and some engines today still are, their ratings aren't completely arbitrary. We can assume engine and FIDE ratings are more or less comparable, and the best engines really are 600-800 points better than strong GMs.

mgx9600

Assume Elo is based on expected outcomes under a normal distribution (which, if I remember correctly, is how the original Elo system was defined). Then you can get an Elo rating even by playing very weak players, because the normal distribution is never 0 anywhere. So, for example, say you are very good and play somebody with a known Elo rating of (say) 200, and you beat him in 1,000,000 games and he wins 1 game. Then you can find the rating that gives 1 million-to-1 odds. That'd be your Elo.

mgx9600

I don't have a calculator nearby, so I can't give you the normal-based Elo for the above example. But it isn't hard to find anyway.
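For anyone who wants the number, here is a rough sketch of that calculation in Python. It makes two assumptions that are not stated in the thread: the normal model uses a standard deviation of 200 points per player (so 200·√2 for the difference between two players), and the logistic model is the usual 400-point formula. The function names are made up for illustration.

```python
import math
from statistics import NormalDist


def logistic_gap(wins: int, losses: int) -> float:
    """Rating gap implied by a win:loss ratio under the logistic Elo formula."""
    return 400 * math.log10(wins / losses)


def normal_gap(wins: int, losses: int, sigma: float = 200.0) -> float:
    """Rating gap under a normal model; sigma=200 per player is an assumption."""
    p = wins / (wins + losses)
    z = NormalDist().inv_cdf(p)        # standard-normal quantile of the score
    return z * sigma * math.sqrt(2)    # std. dev. of the difference is sigma*sqrt(2)


if __name__ == "__main__":
    opponent = 200
    print("logistic:", opponent + logistic_gap(1_000_000, 1))
    print("normal:  ", opponent + normal_gap(1_000_000, 1))
```

Under these assumptions the logistic formula puts you 2400 points above the opponent (a rating of about 2600), while the normal model gives a much smaller gap of roughly 1,345 points, because the normal distribution's tails fall off faster.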

sammy_boi
mgx9600 wrote:

Assume Elo is based on expected outcomes under a normal distribution (which, if I remember correctly, is how the original Elo system was defined). Then you can get an Elo rating even by playing very weak players, because the normal distribution is never 0 anywhere. So, for example, say you are very good and play somebody with a known Elo rating of (say) 200, and you beat him in 1,000,000 games and he wins 1 game. Then you can find the rating that gives 1 million-to-1 odds. That'd be your Elo.

The formula has been shown to be inaccurate at the extremes, though, even for gaps of "only" 400-600 points. As I recall, in reality the lower-rated players consistently overperform.

There was some debate among mathematicians on ChessBase a few years ago about how they might correct for that, as I recall.

breakingbad12

Are you sure, sammy_boi? Just because something keeps increasing doesn't mean it goes to infinity after infinite time. If you sum every 1/(2^n) for n = 1, 2, 3, ... you get 1, for instance.

sammy_boi
breakingbad12 wrote:

Are you sure, sammy_boi? Just because something keeps increasing doesn't mean it goes to infinity after infinite time. If you sum every 1/(2^n) for n = 1, 2, 3, ... you get 1, for instance.

Yeah, the infinite sum is divergent, I've messed with the formula before.
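A quick way to check this numerically is to simulate an unbroken winning streak. The sketch below assumes the logistic formula, a K-factor of 10, both players starting at the same rating, and an opponent whose rating never changes (think of a steady supply of equally rated opponents). All of those are assumptions for illustration, and the function name is made up.

```python
def gap_after_wins(games: int, k: float = 10.0) -> float:
    """Rating gap over a fixed-rating opponent after `games` straight wins.

    Assumes both players start level and only the winner's rating is updated.
    """
    gap = 0.0
    for _ in range(games):
        # k * (1 - expected score) simplifies to k / (1 + 10^(gap/400))
        gap += k / (1.0 + 10 ** (gap / 400))
    return gap


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(n, round(gap_after_wins(n)))
```

If I have the math right, the gap grows roughly like 400·log10(n): every tenfold increase in the number of games adds about 400 points. So the gains do shrink after every game, but only fast enough to make the growth logarithmic, not fast enough to make it level off the way 1/(2^n) does.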

mgx9600
sammy_boi wrote:
mgx9600 wrote:

Assume Elo is based on expected outcomes under a normal distribution (which, if I remember correctly, is how the original Elo system was defined). Then you can get an Elo rating even by playing very weak players, because the normal distribution is never 0 anywhere. So, for example, say you are very good and play somebody with a known Elo rating of (say) 200, and you beat him in 1,000,000 games and he wins 1 game. Then you can find the rating that gives 1 million-to-1 odds. That'd be your Elo.

The formula has been shown to be inaccurate at the extremes, though, even for gaps of "only" 400-600 points. As I recall, in reality the lower-rated players consistently overperform.

There was some debate among mathematicians on ChessBase a few years ago about how they might correct for that, as I recall.

Whether an Elo rating is a good indication of skill level is definitely debatable. But if you just want the Elo rating, it is certainly possible to derive it from my example.

sammy_boi

Yeah, I'm just saying.

Some people think it can help estimate the chances in huge mismatches, like when Carlsen played that amateur recently, but Elo himself didn't intend for it to be accurate in extreme cases, and in practice it hasn't been.

ChessianHorse
@sammy_boi
Just because a small number is added after every win does not mean that the rating after infinitely many games will be infinite, at least if the ratings are adjusted after every game (meaning the rating increase after each game becomes smaller).
The way the rating system is constructed, I suspect you can only become a couple hundred to maybe 1000 points above your opponent if you play the opponent infinitely many times.