Move Evaluation Descriptions

justbefair

The move evaluation descriptions "Excellent" and "Good" are confusing. They are left over from when computer evaluations were largely based on gains or losses of material. Since the current move descriptions reflect changes in estimated winning chances, any deviation from the best move represents an increased chance of losing and should not be described as "Excellent" or "Good". They should be replaced.

I suggest "suboptimal" and "dubious", respectively, but perhaps others can come up with better adjectives.

The problem comes from the decision a year or two ago to change the move evaluations and their descriptions from reflecting a material loss measured in centipawns to the "Expected Points Model", which rates the change in each side's winning chances after each move. https://support.chess.com/article/2965-how-are-moves-classified-what-is-a-blunder-or-brilliant-and-etc


A "blunder" used to be a move that cost you 2 points of material (200 centipawns) or more. Many people pointed out that there were times when this description was not correct. In a sacrifice, a player gives up material that is not repaid until several moves later, so describing such moves as blunders was sometimes a mistake. So chess.com looked at it and decided that a "Blunder" should be a move that materially worsens your winning chances, and now that's what it means: a move that worsens your winning chances by anywhere from 20% to 100%.


I think this was an overall improvement in the definition of a blunder. 


At the other end, the classifications below "Best" no longer describe losses of a few centipawns.
All moves below "Best" worsen your winning chances under the "Expected Points Model."

 

Table I: Move Classifications with their corresponding change in expected points boundaries. If the expected points lost by a move is between a set of upper and lower limits, then the corresponding classification is used.

Classification    Lower Limit    Upper Limit
Best    0.00    0.00
Excellent    0.00    0.02
Good    0.02    0.05
Inaccuracy    0.05    0.10
Mistake    0.10    0.20
Blunder    0.20    1.00
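
To make the table concrete, here is a minimal Python sketch of the lookup it describes. The thresholds are copied from Table I; the function name `classify` and the choice to treat each upper limit as inclusive are my own assumptions, since the table does not say which side a boundary value falls on. This is not chess.com's actual code.

```python
# Upper limits of expected points lost, copied from Table I.
# Assumption: each upper limit is inclusive (the table does not specify).
THRESHOLDS = [
    (0.00, "Best"),
    (0.02, "Excellent"),
    (0.05, "Good"),
    (0.10, "Inaccuracy"),
    (0.20, "Mistake"),
    (1.00, "Blunder"),
]


def classify(expected_points_lost: float) -> str:
    """Return the Table I label for a given loss in expected points (0.0 to 1.0)."""
    if not 0.0 <= expected_points_lost <= 1.0:
        raise ValueError("expected points lost must be between 0.0 and 1.0")
    # Walk the rows in order and return the first one whose upper limit covers the loss.
    for upper_limit, label in THRESHOLDS:
        if expected_points_lost <= upper_limit:
            return label
    return "Blunder"  # not reachable after the range check; kept as a safe default


print(classify(0.015))  # a loss of 1.5 points out of 100 in expected score
print(classify(0.03))
```

Under these assumptions, any loss greater than zero and up to 0.02 comes back as "Excellent", which is exactly the labelling this post is questioning.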


"Excellent" and "Good" moves worsen your winning chances as well. An "Excellent" move can reflect a worsening of up to 2% in the Expected Points Model. A "Good" move can mean a 2% to 5% worsening. These are bigger changes than a few centipawns.
 
When you are already down by three queens, losing a bishop may not substantially worsen your chances but it is still confusing to describe such moves as "Excellent" or "Good" when you are losing a piece.  


If your doctor told you that the test results were "Excellent" and that your chances of dying this week only went up by 2%, you'd find a new doctor.

 
I suggest that English is nuanced enough to replace "Excellent" with something like "Suboptimal" and "Good" with good old "dubious" or "?!."

tygxc

@1

There are no excellent, suboptimal, or dubious moves,
there are only good moves, errors, and blunders.
An error (?) is a move that turns a drawn position into a lost position or a won position back to a drawn position.
A blunder (??) turns a won position to a lost position. All other moves are good moves.
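
As an illustration only, the three-way scheme above can be sketched in Python. It assumes the true game-theoretic result of a position (win, draw, or loss for the player moving) is already known, for example from a tablebase; the names `VALUE` and `classify_by_game_state` are hypothetical.

```python
# Game-theoretic results from the point of view of the player making the move.
VALUE = {"loss": 0, "draw": 1, "win": 2}


def classify_by_game_state(before: str, after: str) -> str:
    """Label a move by how far it lowers the true result of the position."""
    drop = VALUE[before] - VALUE[after]
    if drop < 0:
        # With perfect knowledge, a player's own move cannot improve the result.
        raise ValueError("a move cannot raise the game-theoretic value")
    # drop == 0: result unchanged; 1: win->draw or draw->loss; 2: win->loss.
    return ["good move", "error (?)", "blunder (??)"][drop]


print(classify_by_game_state("win", "draw"))
print(classify_by_game_state("win", "loss"))
print(classify_by_game_state("draw", "draw"))
```

Note that this scheme needs the exact result of each position, which an engine evaluation only estimates.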

Martin_Stahl

Pretty sure that the chart given has nothing to do with move evaluations or material count, but winning expectancy. So an excellent move changes the expected outcome, based on the previous calculation by less than 2%. For good, between 2-3%.

justbefair
Martin_Stahl wrote:

Pretty sure that the chart given has nothing to do with move evaluations or material count, but winning expectancy. So an excellent move changes the expected outcome, based on the previous calculation by less than 2%. For good, between 2-3%.

I agree.  I think I said they moved away from using material loss as the sole basis for evaluation. However, I am just saying that describing any move that materially worsens the winning chances as "Excellent" or "Good" is confusing to many players.

Martin_Stahl

If a player goes from a 70% win expectancy to a 68.5%, I can't see why that would be considered suboptimal or dubious. 

 

From a regular evaluation standpoint, both of those are probably less than a 0.25 difference. It's close enough to the best that good or excellent are probably fine.

tygxc

@5
Inaccurate moves do not exist.
Either the move changes the game state from draw to loss, or from won back to draw, then it is a mistake, or it does not change the game state and then it is not inaccurate.

justbefair
Praveen_bhat97 wrote:
justbefair wrote:
Martin_Stahl wrote:

Pretty sure that the chart given has nothing to do with move evaluations or material count, but winning expectancy. So an excellent move changes the expected outcome, based on the previous calculation by less than 2%. For good, between 2-3%.

I agree.  I think I said they moved away from using material loss as the sole basis for evaluation. However, I am just saying that describing any move that materially worsens the winning chances as "Excellent" or "Good" is confusing to many players.

I often get confused with inaccurate moves and mistake! How do you define them? Or difference between them?

It's not easy to figure out the difference without being a computer yourself. They run their model, making an estimate of who is going to win after every move. A difference of several points in the model is measurable and therefore reportable.

However,  making sense of that difference as a human can be difficult.

Martin_Stahl
Praveen_bhat97 wrote:

I often get confused with inaccurate moves and mistake! How do you define them? Or difference between them?

 

The linked chart shows the definitions the site is using, based on the engine's determination of changes in winning expectancy.

 

A lot of people have different definitions for any move classifications. Common evaluations are:

  • Is winning (decisive)
  • Is winning (clear advantage)
  • Is better (slight advantage)
  • Even
  • Is worse (slight disadvantage)
  • Is losing (clear disadvantage)
  • Is losing (decisive)

An inaccuracy may drop the evaluation some but not as much as a mistake; this is fairly subjective. A mistake likely changes the evaluation by a full category. A blunder changes it by at least two categories, for example from winning to even, or switches it completely from winning to losing.
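
A rough Python sketch of that category-based rule of thumb follows. It is one possible reading of the description above, not a standard: the category names are taken from the list, while `CATEGORIES` and `classify_by_category_drop` are hypothetical names, and an inaccuracy cannot be detected from categories alone, so it is only flagged as a possibility.

```python
# The seven categories listed above, ordered from best to worst for the player moving.
CATEGORIES = [
    "winning (decisive)",
    "winning (clear advantage)",
    "better (slight advantage)",
    "even",
    "worse (slight disadvantage)",
    "losing (clear disadvantage)",
    "losing (decisive)",
]


def classify_by_category_drop(before: str, after: str) -> str:
    """Label a move by how many categories the evaluation fell."""
    drop = CATEGORIES.index(after) - CATEGORIES.index(before)
    if drop >= 2:
        return "blunder"
    if drop == 1:
        return "mistake"
    # Same category (or better): whether it is an inaccuracy is a judgment call.
    return "not a mistake (at most an inaccuracy)"


print(classify_by_category_drop("winning (decisive)", "even"))
print(classify_by_category_drop("better (slight advantage)", "even"))
```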

 

Martin_Stahl
tygxc wrote:

@5
Inaccurate moves do not exist.
Either the move changes the game state from draw to loss, or from won back to draw, then it is a mistake, or it does not change the game state and then it is not inaccurate.

 

Well, that is subjective. A move that makes the win harder, over a cleaner option, could be considered an inaccuracy.

 

Say, instead of a forced mate, one that allows counterplay but doesn't majorly impact the evaluation could be considered an inaccuracy.

 

magipi
justbefair wrote:

They are left over from when computer evaluations were largely based on gains or losses of material. Since the current evaluations reflect estimated winning chances

I'm sure that both of those sentences are factually wrong.

First of all, the script that assigns those dumb adjectives is not some ancient thing, it was probably written 5-10 years ago by some chess.com programmer.

Second, computer evaluation has nothing to do with winning chances. (How could it?!) It still evaluates future positions based on many factors, and one of the most important ones is material.

Martin_Stahl
magipi wrote:
justbefair wrote:

They are left over from when computer evaluations were largely based on gains or losses of material. Since the current evaluations reflect estimated winning chances

I'm sure that both of those sentences are factually wrong.

First of all, the script that assigns those dumb adjectives is not some ancient thing, it was probably written 5-10 years ago by some chess.com programmer.

Second, computer evaluation has nothing to do with winning chances. (How could it?!) It still evaluates future positions based on many factors, and one of the most important ones is material.

 

Winning expectancies (chances) are a feature of NNUE-style engines, which may or may not also take material into account exactly. Those types of engines look at positions and future positions and assign a winning expectancy value based on how their nets were built.

 

But @justbefair isn't really that far off. A lot of people using engines mapped certain evaluation losses to a particular term, such as blunder. That doesn't necessarily only mean material, since other things are accounted for in the evaluation. But that type of mapping has been around quite a while and isn't solely the creation of a chess.com programmer. The exact mappings are designed or decided upon experimentally, along with some work on finding edge cases where they may not make sense.

justbefair
magipi wrote:
justbefair wrote:

They are left over from when computer evaluations were largely based on gains or losses of material. Since the current evaluations reflect estimated winning chances

I'm sure that both of those sentences are factually wrong.

First of all, the script that assigns those dumb adjectives is not some ancient thing, it was probably written 5-10 years ago by some chess.com programmer.

Second, computer evaluation has nothing to do with winning chances. (How could it?!) It still evaluates future positions based on many factors, and one of the most important ones is material.

I see your point that the numeric computer evaluation is still largely based on material evaluation.

However, the move descriptions "Blunder", "Best", "Mistake" and so on are now based on the estimated winning chances using the "Expected Points Model." (That's what the linked Help page says.)

My simple point is that every move that is not "Best" reflects a lowered chance of winning.  Therefore, it can be confusing to describe such moves as "Excellent" or "Good".

xor_eax_eax05

 Chess.com should drop these gimmicks and just display the evaluation values and that's it. They could also offer tools to measure centipawn loss between moves, average centipawn loss in a game, etc. 

 Better yet, the site could offer players "lessons" on how to analyse games with engines, how to set up stockfish, how to analyse a position, how to perform a deep position analysis with trees of moves, etc. 

  All titled chess streamers fail to do this as well - they never explain to their audience how to use tools to learn for themselves, how to use databases, etc., and I really don't understand why.

 

  On this site I've read so many threads of players asking "why did the engine say this move is bad?" ... they don't even know that the engine is showing them the line right there, because they probably don't even know what those lines are there for and don't even realise it's showing them what the engine is "thinking" about.

awesome1184

please make this readable in dark mode

xor_eax_eax05
awesome1184 wrote:

please make this readable in dark mode

+1

Martin_Stahl
xor_eax_eax05 wrote:

 Chess.com should drop these gimmicks and just display the evaluation values and that's it. They could also offer tools to measure centipawn loss between moves, average centipawn loss in a game, etc. 

 

 

That exists. You can see the move evals on the eval bar and score graph or by stepping through the game on the Analysis tab.

 

I also believe the Avg Diff value is the average centipawn loss.
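
For anyone unfamiliar with the term, average centipawn loss is usually computed roughly as in this Python sketch. This is a generic illustration and not necessarily how the site computes Avg Diff; the function name and the sample numbers are made up.

```python
def average_centipawn_loss(best_evals, played_evals):
    """Average centipawn loss for one player.

    best_evals[i]   -- engine evaluation (centipawns) after the engine's best move
    played_evals[i] -- engine evaluation (centipawns) after the move actually played
    Both are from the point of view of the player being measured.
    """
    if len(best_evals) != len(played_evals) or not best_evals:
        raise ValueError("need two non-empty lists of equal length")
    # A played move cannot score better than the best move, so clamp losses at 0.
    losses = [max(0, best - played) for best, played in zip(best_evals, played_evals)]
    return sum(losses) / len(losses)


# Hypothetical three-move sample: losses of 0, 30 and 20 centipawns.
print(average_centipawn_loss([30, 50, 120], [30, 20, 100]))
```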

tygxc

@12
"Say, instead of a forced mate, one that allows counterplay but doesn't majorly impact the evaluation could be considered an inaccuracy."
++ No, a win is a win.
There may be a forced checkmate involving some sacrifice but some players may refrain from it fearing some miscalculation and instead prefer to simplify to a won endgame a pawn up.
A poor endgame player may refrain from simplifying to a won endgame and choose to sacrifice even if he cannot calculate all its ramifications.
One is not more accurate than the other.

justbefair
awesome1184 wrote:

please make this readable in dark mode

I have reposted the original. 

I hope it is now readable.

 

awesome1184
justbefair wrote:
awesome1184 wrote:

please make this readable in dark mode

I have reposted the original. 

I hope it is now readable.

 

tysm

Martin_Stahl
tygxc wrote:

@12
"Say, instead of a forced mate, one that allows counterplay but doesn't majorly impact the evaluation could be considered an inaccuracy."
++ No, a win is a win.
There may be a forced checkmate involving some sacrifice but some players may refrain from it fearing some miscalculation and instead prefer to simplify to a won endgame a pawn up.
A poor endgame player may refrain from simplifying to a won endgame and choose to sacrifice even if he cannot calculate all its ramifications.
One is not more accurate than the other.

 

Again, the determination of an inaccuracy is subjective. I gave an example of what someone could consider an inaccuracy.