Game Report Calculations

chesslover0003

I like the game report feature of chess.com.  I have some questions about how some of the calculations are made.

1. How is a move flagged as each of the following?  What is the engine calculation?

  • Brilliant - This move not only was the best move but was difficult to find, even for the engine! Good job! 
  • Best Move - The best move, according to the engine!
  • Excellent - A great move, but not quite the best!
  • Good - This move is okay, but could be better!
  • Book -  An established opening move
  • Inaccuracy - This is a weak move that could be much better
  • Mistake - A bad move that immediately worsens your position
  • Blunder - A very bad move that could lose material or lose the game
  • Missed win - A move was missed that would have won material, or won the game

2. How is accuracy calculated?

3. How are key moments determined?

BondageFair

Several factors go into the accuracy measurement, but I'll only mention the most relevant one: the difference in move quality between the two players.

  • If your opponent makes bad moves and you make good ones, your accuracy goes up a lot.
  • If your opponent makes good moves and you respond with good moves, you will both probably end up with nearly the same accuracy percentage.
  • If your opponent makes good moves but you find better responses (normally the ones an engine would suggest), your accuracy also goes up a lot.
  • If you both play very well for the most part and one of you makes more mistakes than the other, your accuracy may rise or may stay fairly low.

It all depends, because engines can also produce false positives and false negatives: your high accuracy won't always be as genuine as you think, and your blunders/mistakes won't always be as genuine as you think. (That's why the platform has you retry moves while you go through a report, so it can try to correct any false reports.) Your accuracy can rise or fall sharply for the same reason, even when your moves were good. Essentially, accuracy is only an estimate and nothing to take too seriously.

Key moments are simply noteworthy moments: the engine has gathered and calculated all that data, and the report walks you through it.

chesslover0003

@bondagefair Thank you.  I was actually looking for a formula :)  If it's proprietary then I get it.  Perhaps accuracy = (brilliant + best + excellent + good moves)/(total moves - book moves).
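To make that guess concrete, here is a minimal Python sketch of it. This is only my guess written out as code, not Chess.com's actual formula; the function name and the grouping of labels into "good" are assumptions.

```python
from collections import Counter

# Labels counted as "accurate" in the guess above (an assumption, not Chess.com's rule).
GOOD_LABELS = {"brilliant", "best", "excellent", "good"}


def guessed_accuracy(labels):
    """Return the guessed accuracy as a percentage, ignoring book moves."""
    counts = Counter(labels)
    scored_moves = len(labels) - counts["book"]
    if scored_moves == 0:
        return None  # nothing left to score once book moves are removed
    good_moves = sum(counts[label] for label in GOOD_LABELS)
    return 100.0 * good_moves / scored_moves


example = ["book", "book", "best", "excellent", "good",
           "inaccuracy", "best", "mistake", "best", "blunder"]
print(guessed_accuracy(example))  # 5 good moves out of 8 non-book moves -> 62.5
```

Note that a count-based formula like this treats an inaccuracy and a blunder the same, which is one reason it is probably too simple.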

It makes sense this depends on the strength (and depth used) of the engine.  That said, it doesn't seem possible an engine could find a better move than best (i.e. brilliant).  It makes more sense that non-best scores/ratings are derived from best in some way.

Chess.com is using a 9-point score/rating (brilliant, best, excellent, good, book, inaccuracy, mistake, blunder, missed win) vs a more traditional chess 6-point score/rating (!! very good, ! good, !? interesting, ?! dubious, ? bad, ?? very bad).

PicoChess uses a PicoTutor function based on the traditional chess 6-point score/rating.  I'm still browsing the source to confirm the thresholds and criteria used for each.

 

 

 

BondageFair

Sorry for the late response. Try analyzing a game with blunders and a game without blunders, and compare both against an engine. If I recall correctly, chess.com uses Stockfish at depth >= 16 for its analysis, so run the games through Stockfish and watch the evaluations to see whether a particular pattern shows up at each blunder. You could then use that pattern to decide whether a move is a blunder or not.
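A rough sketch of that comparison in Python, assuming you have already exported the engine evaluation after every ply (in centipawns, from White's point of view). The thresholds below are made-up placeholders, since the real cut-offs are not public; the idea is to tune them until the labels match what the game report shows.

```python
# Hypothetical thresholds in centipawns; Chess.com's real cut-offs are not public.
INACCURACY_CP = 50
MISTAKE_CP = 100
BLUNDER_CP = 300


def classify_drops(evals_cp):
    """Label each ply by how much evaluation the side that moved gave up.

    evals_cp[0] is the evaluation before the first move and evals_cp[i] the
    evaluation after ply i, always from White's point of view. White is
    assumed to make the first move in the list.
    """
    results = []
    for ply in range(1, len(evals_cp)):
        white_moved = ply % 2 == 1
        change = evals_cp[ply] - evals_cp[ply - 1]
        # A falling evaluation hurts White; a rising one hurts Black.
        loss = -change if white_moved else change
        if loss >= BLUNDER_CP:
            label = "blunder"
        elif loss >= MISTAKE_CP:
            label = "mistake"
        elif loss >= INACCURACY_CP:
            label = "inaccuracy"
        else:
            label = "ok"
        results.append((ply, "White" if white_moved else "Black", loss, label))
    return results


evals = [20, 25, 30, -90, -80, -400, -390]
for row in classify_drops(evals):
    print(row)
```

In this made-up example White's second move (ply 3) loses 120 centipawns and is labelled a mistake, and White's third move (ply 5) loses 320 and is labelled a blunder.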

BondageFair

and then maybe try using the evaluation pattern to predict a particular accuracy.

BondageFair

By the way, it's better to use the method from comment #4 to determine blunders rather than your own definition; that way your blunder detection will be more accurate.

BondageFair

You should also play a few games and aim for particular accuracy percentages, some of which may be risky. To start, play one game completely normally. If your accuracy for that game is low (say <= 60 or <= 80), set up a custom game and replay your opponent's moves, but replace your own mistakes/blunders with different moves, and note how much the accuracy increases. Then rely heavily on an engine for the next custom game and compare that result with the others.

BondageFair

hm, half of my statement is being filtered out or maybe it's just appearing that way for me.

BondageFair

If you have telegram you can try contacting me: Aidsys (hopefully this isn't against the chess.com policy) but contact me if you can and then we can continue talking about it

JR8420
Guys, what do you think of this game (I am playing White)?
 

 

Made_in_Shoreditch

It may help to understand the Nunn Convention.

chesslover0003
Symbol  Meaning  Evaluation

!! Very Good

    # very good moves
    if best_deep_diff <= c.VERY_GOOD_MOVE_TH and deep_low_diff > c.VERY_GOOD_IMPROVE_TH:
        if (best_score == 999 and (best_mate == current_mate)) and legal_no <= 2:
            pass
        else:
            eval_string2 = '!!'

! Good

    # good move
    elif best_deep_diff <= c.GOOD_MOVE_TH and deep_low_diff > c.GOOD_IMPROVE_TH and legal_no > 1:
        eval_string2 = '!'

!? Interesting

    # interesting move
    elif best_deep_diff < c.INTERESTING_TH and abs(deep_low_diff) > c.UNCLEAR_DIFF and score_hist_diff < c.POS_DECREASE:
        eval_string2 = '!?'

?! Dubious

    # Dubious
    elif best_deep_diff > c.DUBIOUS_TH and abs(deep_low_diff) > c.UNCLEAR_DIFF and score_hist_diff > c.POS_INCREASE:
        eval_string = '?!'

? Bad

    # Mistake ?
    ##elif D1 > c.BAD_MOVE_TH and D2 > BAD_MOVE_TH:
    elif best_deep_diff > c.BAD_MOVE_TH: ## and  best_low_diff  > BAD_MOVE_TH:
        eval_string = '?'

?? Blunder

    # Blunder ??
    ##if D1 > c.VERY_c.BAD_MOVE_TH and D2 > c.VERY_BAD_MOVE_TH:
    if best_deep_diff > c.VERY_BAD_MOVE_TH and legal_no:
        eval_string = '??'

chesslover0003

This is from the PicoChess source.  I'm still researching.
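To see how the pieces might fit together, here is a sketch that stitches the excerpts into one runnable function. Several things are my assumptions rather than confirmed PicoChess behaviour: the threshold values in `c` are placeholders (I haven't confirmed the real constants yet), the function name `annotate` is made up, and the last line, where a negative symbol takes precedence over a positive one, is a guess. My reading of the names is that `best_deep_diff` is how far the played move is from the engine's best move at full depth, and `deep_low_diff` is how much the move's score changes between a shallow and a deep search.

```python
from types import SimpleNamespace

# Placeholder thresholds (in pawns). These are NOT the values from the PicoChess
# source; they only exist so the sketch runs.
c = SimpleNamespace(
    VERY_BAD_MOVE_TH=3.0, BAD_MOVE_TH=1.5, DUBIOUS_TH=0.3,
    VERY_GOOD_MOVE_TH=0.01, VERY_GOOD_IMPROVE_TH=0.4,
    GOOD_MOVE_TH=0.05, GOOD_IMPROVE_TH=0.25,
    INTERESTING_TH=1.2, UNCLEAR_DIFF=0.7,
    POS_INCREASE=0.5, POS_DECREASE=-0.5,
)


def annotate(best_deep_diff, deep_low_diff, score_hist_diff, legal_no,
             best_score=0, best_mate=0, current_mate=0):
    """Hypothetical wrapper around the excerpted conditions."""
    eval_string = ''   # negative symbols: ??, ?, ?!
    eval_string2 = ''  # positive symbols: !!, !, !?

    # Negative chain, starting from the 'if' in the Blunder excerpt.
    if best_deep_diff > c.VERY_BAD_MOVE_TH and legal_no:
        eval_string = '??'
    elif best_deep_diff > c.BAD_MOVE_TH:
        eval_string = '?'
    elif (best_deep_diff > c.DUBIOUS_TH and abs(deep_low_diff) > c.UNCLEAR_DIFF
          and score_hist_diff > c.POS_INCREASE):
        eval_string = '?!'

    # Positive chain, starting from the 'if' in the Very Good excerpt.
    if best_deep_diff <= c.VERY_GOOD_MOVE_TH and deep_low_diff > c.VERY_GOOD_IMPROVE_TH:
        if (best_score == 999 and best_mate == current_mate) and legal_no <= 2:
            pass  # forced mate with at most two legal moves: no '!!'
        else:
            eval_string2 = '!!'
    elif (best_deep_diff <= c.GOOD_MOVE_TH and deep_low_diff > c.GOOD_IMPROVE_TH
          and legal_no > 1):
        eval_string2 = '!'
    elif (best_deep_diff < c.INTERESTING_TH and abs(deep_low_diff) > c.UNCLEAR_DIFF
          and score_hist_diff < c.POS_DECREASE):
        eval_string2 = '!?'

    # Assumption: if both chains fire, the negative symbol wins.
    return eval_string or eval_string2


print(annotate(4.0, 0.0, 0.0, legal_no=20))   # far from best -> '??'
print(annotate(0.0, 0.5, 0.0, legal_no=20))   # best move, only found at depth -> '!!'
```

What stands out is that "!!" here is not "better than best". It is the best move whose value only shows up in the deeper search.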

Wildekaart

You don't need an answer. Because the engine evaluation is stupid. Some obvious moves are 'brilliant'. Some attacking moves with a minor fallacy are 'mistakes'. And the accuracy stat is a joke really. Of what use is it to you to compare yourself to an engine, that can't possibly think like a human player?

It's good for catching obvious blunders but that's really where it ends. Some popular chess YouTubers are likely to have a video explaining why it is so poor if you can't understand it. Which seems to be the case for you as you're searching for an answer.

chesslover0003

@wildekaart I disagree.  I'm not suggesting the approach is perfect and I also understand there are some limitations to engines.

I believe an engine still provides a great basis for evaluating and comparing moves for the majority of players.  Likewise, I think it's useful to compare one's progress over time against an engine.

Thanks for the insight.

FYI, I wouldn't characterise an engine evaluation as stupid or useless.  Humans have been unable to beat computers for the past 15 years (I think Kramnik vs Deep Fritz in 2006 may have been the last major match, and Kramnik lost).  Computers have been handicapped since.  A mobile phone and even a Raspberry Pi have reached GM level.

As for Chess.com's accuracy rating... perhaps one of the issues is that the algorithm is not open.  We do not know what the accuracy algorithm does.  We do not know exactly what its pros and cons are.  Perhaps the issue is your/our understanding of it: what it is and what it isn't.  It's possible an open algorithm would provide more transparency and understanding about how to interpret it.

arvinleroux

Sometimes I find the evaluation of moves questionable. Here the engine says e4 is a good move but I don't see it. I also feel like I've had a few moves that sacrifice material for position evaluated as "mistakes" or worse despite leading to an advantage if not mate. I guess I take the engine feedback with a grain of salt. But I definitely like to replay games and enjoy learning from and using the feature.

chesslover0003

@arvinleroux yes, engine evaluations need to be taken with a grain of salt :)  I know there are some positions that are "anti-computer".

A monkey hitting keys at random on a typewriter for an infinite amount of time will eventually create the complete works of Shakespeare.

m1md

When you're losing badly, the engine might say that even moves that directly lose material are still good. Such a move isn't a mistake because any move will lead to a loss, so it doesn't matter (to the engine) whether you'll lose in 10 or 20 moves.

A sacrifice might be a mistake if it doesn't lead to an advantage against perfect play from the other side. The fact that your opponent at a similar rating won't see the best move doesn't change the fact that it's a mistake and would be punished in a higher-rated game.

It's true that sometimes a 'brilliant' move might not really be brilliant. A move might be considered brilliant if the engine doesn't consider it to be the best or excellent at a lower depth and only sees it at a high depth. However, it still might make sense. For example, maybe the move that was previously considered to be the best leads to a loss in 20 moves given perfect play. A human (especially one rated 1200) probably won't see it, but an engine with deep analysis will.

 

 

m1md

However, accuracy has its caveats. For example, a game with a lot of trades and a long endgame will have very high accuracy, because the moves are simple and obvious, and even if a trade isn't the best move it simplifies the position and leads to an easier (and more accurate) game.

On the other hand, a very decent game from a human perspective (one that ends in, e.g., 20 moves) might have lower accuracy because it had a few inaccuracies and mistakes. For example, in a 20-move game, two mistakes account for 10% of the moves and they significantly lower the accuracy. In an 80-move game with a long endgame (where almost all moves are best or excellent), one blunder accounts for only 1.25% of the moves.

I've noticed that sometimes the side that won has lower accuracy. That's because if you capitalized on a huge mistake from your opponent, you might, for example, trade a rook for a bishop and make other moves that make the game shorter and easier, even if those moves aren't great.
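The arithmetic behind that comparison, as a tiny Python sketch. It only computes the share of flagged moves; how Chess.com actually weights each mistake in the accuracy score is not known, and the function name is made up for illustration.

```python
def flagged_share(flagged_moves, total_moves):
    """Percentage of a player's moves that were flagged as mistakes or blunders."""
    return 100.0 * flagged_moves / total_moves


print(flagged_share(2, 20))  # two mistakes in a 20-move game -> 10.0
print(flagged_share(1, 80))  # one blunder in an 80-move game -> 1.25
```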