Computer evaluations are not exactly right

Tactrix

Ok so here's my thinking. When Stockfish (or whichever engine chess.com uses) evaluates your games in analysis it has the right idea, but not all the time. Now if you were playing a game strictly as a computer for best moves, it would be 100% correct in its evaluation. However, for a human its evaluations aren't correct in the sense that it always finds the best move based on other best moves. Which is not how humans play. We TRY to get best moves, but most of the time we don't, we get just a decent move or a pretty good move. But more importantly overall we're aiming for a very specific strategy as humans, and the computer isn't aiming for that strategy. The computer is always trying to find the best mate in the best line, and we as humans try to find the specific mate we're trying to get, which is usually not the same one.

So it's not exactly that the computer's wrong, it's just not always correct in its evaluations, because it assumes stuff like "if you take this pawn you're up material" instead of "if you leave this pawn it will block the path for 2 moves allowing me to put my rook in another position to eventually get my checkmate".

Martin_Stahl
Tactrix wrote:

Ok so here's my thinking. When Stockfish (or whichever engine chess.com uses) evaluates your games in analysis it has the right idea, but not all the time. Now if you were playing a game strictly as a computer for best moves, it would be 100% correct in its evaluation. However, for a human its evaluations aren't correct in the sense that it always finds the best move based on other best moves. Which is not how humans play. We TRY to get best moves, but most of the time we don't, we get just a decent move or a pretty good move. But more importantly overall we're aiming for a very specific strategy as humans, and the computer isn't aiming for that strategy. The computer is always trying to find the best mate in the best line, and we as humans try to find the specific mate we're trying to get, which is usually not the same one.

So it's not exactly that the computer's wrong, it's just not always correct in its evaluations, because it assumes stuff like "if you take this pawn you're up material" instead of "if you leave this pawn it will block the path for 2 moves allowing me to put my rook in another position to eventually get my checkmate".

 

Within inherent limitations of engines, the best move is objective. You can play something and hope your opponent doesn't find the best line, but there's still a chance they will.

 

If your move isn't best but is still good, from the engine standpoint, that's a different story. If it evaluates as bad, it really is, though you might get lucky and your opponent won't be able to navigate the complexities.

x-8590175396
Ya
Tactrix
Martin_Stahl wrote:

Within inherent limitations of engines, the best move is objective. You can play something and hope your opponent doesn't find the best line, but there's still a chance they will.

 

If your move isn't best but is still good, from the engine standpoint, that's a different story. If it evaluates as bad, it really is, though you might get lucky and your opponent won't be able to navigate the complexities.

Exactly, see, you get it. It's not that it evaluates it as bad, it's just that it sometimes evaluates it as like a missed move, or maybe not the best move, but in the specific line that I'm going for it is the best move, because if I did the move that the computer suggested then it would hurt my overall strategy. Because as I'm playing I don't know what the computer's strategy is, and likewise it doesn't know what my strategy is, so the strategy it's basing its evaluations on is what's best overall for the fastest outcome, but at my level my thinking is nowhere near the computer's, so it's kind of a miss on both ends.

zone_chess

Just to knock in the door - computer engine play is always better than human thinking. Since 1997 that's common knowledge. Except in some endgame situations.

But the thought that engines don't use strategy is incorrect. We just cannot see the strategy because it's algorithmic - on a much higher level than us organic humans ordinarily think. This is very multilayered and multiparametrical. I suggest reading the paper that explains how AlphaZero works. It's basically collaborative pathfinding for the quickest route to checkmate in a given position. And once you start to grasp its internal 'strategies' (paraphrased since it's a projection of the human mind - we tend to post-rationalize the human concept of 'strategy' onto an algorithm, not yet intuiting that any anthropomorphization creates a distractive discrepancy) it can be profound, alien at first. It raises the bar for our mental capacities.

Tactrix
zone_chess wrote:

Just to knock in the door - computer engine play is always better than human thinking. Since 1997 that's common knowledge. Except in some endgame situations.

But the thought that engines don't use strategy is incorrect. We just cannot see the strategy because it's algorithmic - on a much higher level than us organic humans ordinarily think. This is very multilayered and multiparametrical. I suggest reading the paper that explains how AlphaZero works. It's basically collaborative pathfinding for the quickest route to checkmate in a given position. And once you start to grasp its internal 'strategies' (paraphrased since it's a projection of the human mind - we tend to post-rationalize the human concept of 'strategy' onto an algorithm, not yet intuiting that any anthropomorphization creates a distractive discrepancy) it can be profound, alien at first. It raises the bar for our mental capacities.

That's why I'm saying, it's not that it's wrong per se, it's just on such a completely different train of logic that it would be ludicrous to follow its evaluations after the game with the thinking of "oh if I had done this it would have been better." Because sometimes it would have been, but most of the time, playing the way we play, we would never find those moves.

magipi
Tactrix wrote:

So it's not exactly that the computer's wrong, it's just not always correct in its evaluations, because it assumes stuff like "if you take this pawn you're up material" instead of "if you leave this pawn it will block the path for 2 moves allowing me to put my rook in another position to eventually get my checkmate".

If your strategy works, the engine probably also sees it, and gives it a high plus evaluation (checkmate is better than winning a pawn). If the engine says that it's bad, it's probably because it is bad. It is worth exploring your line and trying to figure out why the engine thinks that it does not work.

JamesColeman

You’re correct in the sense that sometimes it wouldn’t make sense to play the computer’s top choice (even if you knew what it was) if that demanded a high level of follow-up precision and in that case a nearly as good but much simpler option could be a lot more practical.

 

But it’s never the case that ‘the engine doesn’t like my attack because it isn’t aiming for that strategy’. It would already have considered all approaches and all possible defences to each approach and if there’s an idea it evaluates as much inferior to something else, then the idea is simply flawed- regardless of whether your opponent is good enough to see the flaw or not. 
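
To make the point above concrete, here is a minimal sketch of the "assume the best defence" idea in Python. It is a toy minimax over a hand-made tree, not how Stockfish is actually implemented; the move names and the numbers are invented purely for illustration.

```python
# Toy illustration of minimax: a move is only worth what it yields
# against the opponent's BEST reply. All values below are made up.

def minimax(node, maximizing):
    """Return the value of `node` assuming both sides choose their best option."""
    if isinstance(node, (int, float)):  # leaf: a static evaluation
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Hypothetical candidate moves, each with the evaluations (from the
# attacker's point of view) after each possible reply by the defender.
candidates = {
    "attack": [9, 9, -3],     # crushing, unless the defender finds one precise reply
    "grab_pawn": [1, 1, 1],   # modest edge whatever the defender does
}

# After our move it is the defender's turn, so the defender minimizes.
scores = {name: minimax(replies, maximizing=False)
          for name, replies in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy tree the attack scores -3 because one defence refutes it, so the pawn grab comes out on top even though the attack wins against every other reply.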

Rocky64
Tactrix wrote:

Now if you were playing a game strictly as a computer for best moves, it would be 100% correct in its evaluation. 

Not true at all.

10 Positions Chess Engines Just Don't Understand

Tactrix
Rocky64 wrote:
Tactrix wrote:

Now if you were playing a game strictly as a computer for best moves, it would be 100% correct in its evaluation. 

Not true at all.

10 Positions Chess Engines Just Don't Understand

That's even better news, so not only are they not getting the evals perfectly for my games but they are also not understanding a bunch of other ones.

Tactrix
JamesColeman wrote:

You’re correct in the sense that sometimes it wouldn’t make sense to play the computer’s top choice (even if you knew what it was) if that demanded a high level of follow-up precision and in that case a nearly as good but much simpler option could be a lot more practical.

 

But it’s never the case that ‘the engine doesn’t like my attack because it isn’t aiming for that strategy’. It would already have considered all approaches and all possible defences to each approach and if there’s an idea it evaluates as much inferior to something else, then the idea is simply flawed- regardless of whether your opponent is good enough to see the flaw or not. 

It's just that in some games I've noticed that the computer, upon evaluation, would tell me that the move I made wasn't the best, not based on that move, but based on something that if I took would 100% invalidate the strategy I was trying to do. Like it would block my position. Now in retrospect I'm sure that the computer found a way around this issue, but I didn't.

I think the point I'm trying to get across is that while the computer might be right more often than not, it's not exactly designed to compute Elo rating into its equations. Because the best move for a 1500 is grossly different from the best move for a 900, since the 900 will never see it.
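
One way to picture this rating-dependence is a toy model in Python where the opponent finds the best reply only with some probability. This is a hypothetical sketch, not something any engine or chess.com analysis is known to do; `p_best`, the move names and the numbers are all assumptions made for illustration.

```python
# Toy "practical value" model: the opponent finds the most damaging reply
# with probability p_best, otherwise picks evenly among the other replies.
# p_best is a made-up stand-in for opponent strength.

def practical_value(replies, p_best):
    worst_for_us = min(replies)          # the refutation, if there is one
    others = list(replies)
    others.remove(worst_for_us)
    if not others:                       # only one legal reply
        return worst_for_us
    avg_other = sum(others) / len(others)
    return p_best * worst_for_us + (1 - p_best) * avg_other

def pick(candidates, p_best):
    """Return the candidate move with the highest practical value."""
    return max(candidates, key=lambda m: practical_value(candidates[m], p_best))

# Hypothetical evaluations after each possible reply (our point of view).
candidates = {
    "attack": [9, 9, -3],     # refuted by exactly one precise defence
    "grab_pawn": [1, 1, 1],   # small, safe edge
}

print(pick(candidates, p_best=0.9))   # strong opponent
print(pick(candidates, p_best=0.2))   # weak opponent
```

Under these made-up numbers the model prefers the pawn grab against an opponent who usually finds the defence and the attack against one who usually doesn't. The objective evaluation of the attack stays the same in both cases; only the practical bet changes.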

brianchesscake

The difference is that engines have NO PROBLEM analyzing super complicated, tactical, and/or tricky lines (they also don't get tired, make dumb mistakes, etc.) - so if they have the option of going for positions where you have a 10% chance of losing but 25% chance of winning, compared to a position where you have a 0% chance of losing but 14% chance of winning, the computer will go for the former choice all the time, whereas humans will naturally try the lower-risk approach.
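
The two options in that example can be compared with a quick expected-score calculation. The sketch below assumes standard scoring (win = 1, draw = 0.5, loss = 0) and assumes that whatever probability is left over is a draw; neither assumption is stated in the post.

```python
# Expected score under standard chess scoring, assuming every game that is
# neither won nor lost is drawn.

def expected_score(p_win, p_loss):
    p_draw = 1.0 - p_win - p_loss
    return p_win * 1.0 + p_draw * 0.5

risky = expected_score(p_win=0.25, p_loss=0.10)  # 0.25 + 0.5 * 0.65
safe = expected_score(p_win=0.14, p_loss=0.00)   # 0.14 + 0.5 * 0.86

print(round(risky, 3), round(safe, 3))
```

With these particular numbers the risky line comes out at about 0.575 and the safe line at about 0.57, so on expected score alone the risky line is only marginally ahead. The human preference for the safe line costs very little here.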

mog926

Yeah, it depends on many factors like engine depth, the actual position and how complicated it is. Let's also think about openings, where there are so many options: many openings and moves are of equal strength, so the choice is optional. The engine only shows the few best moves it has found or is still calculating. As a human, you should always validate your own moves, and if you have ideas or plans behind the moves then they are good enough at the time. Even a bad plan is better than a random move or no plan. If you can understand the engine move, though, and you know why it's better than your move, then you can learn from it, but sometimes engine moves can be quite subtle and hard to understand. Even though I have a 3000 tactics rating and always analyze my games, there are some occasions where the engine's plans or moves confuse me and I don't agree with them (and they crush me if I'm playing a strong engine). I'd also rather go with some lines I learned from streamers or stuff I built for myself over the years. Even if it's not the best, it's the best for me sometimes. An FM once told me "You can't always trust engines" and he mostly just used his brain to analyze his own games.

mog926
Rocky64 wrote:
Tactrix wrote:

Now if you were playing a game strictly as a computer for best moves, it would be 100% correct in its evaluation. 

Not true at all.

10 Positions Chess Engines Just Don't Understand

Well, those kinds of positions are almost impossible positions to begin with. I've never been in those positions in 20 years (off and on) of chess with 10k games.
It's rare that you're going to get the engine stumped in a real chess game, as opposed to a constructed problem.

MARattigan
mog926 wrote:

Well, those kinds of positions are almost impossible positions to begin with. I've never been in those positions in 20 years (off and on) of chess with 10k games.
It's rare that you're going to get the engine stumped in a real chess game, as opposed to a constructed problem.

Just pick a random KNNvKP position. SF's are out of their depth in the great majority.

It can only get worse with more pieces.

sndeww
MARattigan wrote:

Just pick a random KNNvKP position. SF's are out of their depth in the great majority.

It can only get worse with more pieces.

and you get those positions how often?

MARattigan
B1ZMARK wrote:

and you get those positions how often?

Never had KNNvKP against anybody. Closest was two moves.

But probably get them all the time with different material. Just that I'm even further out of my depth than Stockfish with more than five men on the board. (Only talking about closely matched positions of course.)

sndeww

Usually positions that computers don't "understand" are positions you will never get or get very rarely. It is safe to assume that you will not be getting said positions, and that generally speaking the engine is more correct than you are.

MARattigan
B1ZMARK wrote:

Usually positions that computers don't "understand" are positions you will never get or get very rarely. It is safe to assume that you will not be getting said positions, and that generally speaking the engine is more correct than you are.

I'm sure that SF (which is what I normally play) is generally more correct than I am, otherwise I wouldn't keep losing.

But in closely matched tablebase positions SF gets worse as the number of men increases. I don't see why it should suddenly get better after the tablebases run out. I would expect it to carry on getting worse. No reason at all to assume these positions are rare.

Would you count this as a particularly rare kind of position? It's played by Rybka with the Nalimov tablebases, so is not only perfect but perfectly accurate. It's evaluated by SF without tablebase assistance.

SF thinks there are 3 inaccuracies, 2 mistakes, 4 blunders and a missed win (in 34 moves).
 

All positions are endgames; we just stop thinking of them as endgames when we can't fully comprehend the position. Neither can the engines, but they don't fail to comprehend them as badly as we do.

And they're not bad at what you'd generally call endgames. They're only bad at the ones you've had a good look at. They're sh*t hot at the ones you haven't.