Computing the value of fairy pieces beyond one move?

pds314

First off, I realize "run thousands of full games between hundreds of masters who understand the piece perfectly and have spent years memorizing a variant-specific opening book and endgames for multiple variants with that piece and see how it plays" is obviously the "best" option for figuring out piece values. Right, the price is what the market will bear.

That being said, this is for a fairy engine's evaluation function that should be able to competently evaluate a piece to within maybe plus or minus a few percent, in a second or less. In order to play thousands of expert games with a piece, you must first have an engine that can play even one game somewhat competently with that piece, so it won't do something weird and ridiculous like trade 2 knights and 2 pawns for a rook in the opening, or throw away its queen because it can't move in the starting position.

Currently what I'm doing is checking the mobility of the piece from every square on the board and averaging them, decreasing the mobility value of a square the higher the probability that something might be in the way. There are 2 obvious weaknesses here.

1. This doesn't account for colorbound pieces. A knight and an alibaba (alfil-dabbaba composite, jumps 2 squares orthogonally or diagonally) have the exact same average (5.25) and maximum (8) single move mobility. But obviously a piece that can only move to 16 squares ever and one that can move to as many as 35 within 2 moves or less on an 8x8 board are not equally valuable.

2. This doesn't account for piece speed. An archbishop (knight-bishop combo) in the middle of the board, even on a pretty cluttered board, has a high chance to be able to attack any square it wants in one move. A wazir (moves 1 step orthogonally) takes 14 moves to cross the board diagonally unobstructed. If you can think 27 plies ahead, you can probably also think of something better to do than waste that many tempi marching an anemic piece across the board for some tactic.
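To make the first weakness concrete, here is a minimal Python sketch that reproduces the numbers above on an empty 8x8 board. The function names (`average_mobility`, `reachable`) are just illustrative, and blocking is ignored entirely, so this is only the unblocked baseline, not the probabilistic version I actually use.

```python
from collections import deque

SIZE = 8
KNIGHT = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
ALIBABA = [(2, 0), (-2, 0), (0, 2), (0, -2), (2, 2), (2, -2), (-2, 2), (-2, -2)]

def on_board(x, y):
    return 0 <= x < SIZE and 0 <= y < SIZE

def mobility(leaps, x, y):
    """Number of on-board target squares from (x, y) on an empty board."""
    return sum(on_board(x + dx, y + dy) for dx, dy in leaps)

def average_mobility(leaps):
    """Single-move mobility averaged over all 64 starting squares."""
    total = sum(mobility(leaps, x, y) for x in range(SIZE) for y in range(SIZE))
    return total / SIZE**2

def reachable(leaps, start, max_moves=None):
    """Squares reachable from start in at most max_moves moves (start included)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if max_moves is not None and dist[(x, y)] == max_moves:
            continue  # do not expand past the move limit
        for dx, dy in leaps:
            nxt = (x + dx, y + dy)
            if on_board(*nxt) and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return set(dist)

print(average_mobility(KNIGHT), average_mobility(ALIBABA))  # identical averages
print(len(reachable(ALIBABA, (3, 3))))     # squares the alibaba can ever visit
print(len(reachable(KNIGHT, (3, 3), 2)))   # knight within two moves from d4
```

Both counts include the starting square itself, which is how the 16 and 35 figures come out.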

In reality these weaknesses are sort of the same, right, never being able to reach a square because of your move geometry, being unlikely to reach a square quickly on a cluttered board, and taking an absolute age to get there even on an open board are all kind of variants of the same problem. Poor multi-move mobility.

The issue is, I really don't know what a good way is to value multiple-move mobility accurately. Like, a Wazir on an infinite empty board can reach 4 new squares in 1 move, 8 new squares in 2 moves, 12 new squares in 3 moves, 16 in 4, 20 in 5, etc. So can a Ferz. Though the pattern expands much faster because it only hits every other square.

A king or "mann" can reach 8 new squares in one move, 16 in 2, 24 in 3, etc. It can reach as many squares as the sum of its parts, although distributed differently.

However, a knight does not act in the same manner. A clockwise-only "half knight" is effectively colorbound to one fifth of the board and has a move tree that looks like 4, 8, 12, 16, etc new squares it can reach.

A counterclockwise half-knight has the exact same move tree mirrored.

BUT a knight, which simply combines the move set of the half knights, has a move tree that looks like 8, 32, 68, 96, 120, 148, etc., and is completely space-filling.

That is, it is greatly more than the sum of its parts but only in multiple move mobility. Not primary mobility.
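These move trees are easy to check with a breadth-first expansion on an unbounded empty board. The sketch below is only that, a sketch: the helper name `new_squares_per_move` and the particular choice of which four knight leaps count as the "clockwise" half are my own assumptions for illustration.

```python
def new_squares_per_move(leaps, moves):
    """Count the squares first reached on move 1, 2, ..., `moves`
    by a leaper on an unbounded empty board, starting from (0, 0)."""
    seen = {(0, 0)}
    frontier = {(0, 0)}
    counts = []
    for _ in range(moves):
        nxt = set()
        for x, y in frontier:
            for dx, dy in leaps:
                sq = (x + dx, y + dy)
                if sq not in seen:
                    nxt.add(sq)
        seen |= nxt
        frontier = nxt
        counts.append(len(nxt))
    return counts

WAZIR = [(1, 0), (-1, 0), (0, 1), (0, -1)]
FERZ = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
KING = WAZIR + FERZ
# One rotational half of the knight: each leap is the previous one turned by 90 degrees.
HALF_KNIGHT = [(1, 2), (2, -1), (-1, -2), (-2, 1)]
OTHER_HALF = [(2, 1), (1, -2), (-2, -1), (-1, 2)]
KNIGHT = HALF_KNIGHT + OTHER_HALF

for name, leaps in [("wazir", WAZIR), ("ferz", FERZ), ("king", KING),
                    ("half knight", HALF_KNIGHT), ("knight", KNIGHT)]:
    print(name, new_squares_per_move(leaps, 6))
```

The half knight grows exactly like a wazir on a tilted, sparser grid, while the full knight's tree grows far faster than the two halves added together.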

The question is, how should this be accounted for in evaluation of a piece?

1. One idea is "take every square's probability of being reached without being blocked and divide it by the moves to get there." This greatly understates the importance of first-move mobility and overestimates the value of, e.g., 6th-move mobility. For example, it suggests a knight in the middle of the board is worth substantially more than a bishop in the endgame. It also suggests that the "clockwise half knight" type pieces would be worth noticeably less than 1/3rd of a knight. And it suggests that the less dramatically colorbound but definitely very awkward to use "diagonal half knight" pieces would be noticeably stronger, not just because of an advantage on their third move (which, sure, matters in a few-centipawns kind of way) but also because they can wiggle their way almost into the corner over the course of 6 moves, as if this very weak extra mobility 4-6 moves out were worth more than an extra square or two reachable in one move right now. No. This is tapering much too slowly. What a piece can do in 6 moves is likely of sub-centipawn value unless it can promote or checkmate.

2. One simple option is to switch to an exponential system, e.g. (probabilistic) mobility on move 1 is worth 40 centipawns per square, on move 2 it is worth 8, on move 3 it is worth 1.6, on move 4 it is worth 0.32, etc., quickly fading to irrelevancy. That would put a knight on an infinite empty board at 320 + 256 + 108.8 + 30.72... etc. Obviously that's a >7 point knight, which is preposterous, but keep in mind that the real evaluation is on a finite board with blocked mobility and an equal probability that the piece will be on any square. So everything but the middle 4 squares in reality drags down a knight's usefulness. This knight is on an infinite board with unblocked mobility.

Incidentally, the "half knights" with this infinite empty model get 160, 64, 19.2, etc. So knights are assumed to be worth something like 2.8 half knights on an infinite board, which feels about right based on playtesting. Keep in mind that a real, cluttered board will not put as much value on secondary mobility.
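As a rough check of the exponential scheme, the following Python sketch sums the weighted move tree on an unbounded empty board. The 40-centipawn first-move weight and the factor of 5 come from the description above; the function names and the cutoff at 12 moves are assumptions made only for this illustration.

```python
def new_squares_per_move(leaps, moves):
    """Squares first reached on each move on an unbounded empty board."""
    seen = {(0, 0)}
    frontier = {(0, 0)}
    counts = []
    for _ in range(moves):
        nxt = {(x + dx, y + dy) for x, y in frontier for dx, dy in leaps} - seen
        seen |= nxt
        frontier = nxt
        counts.append(len(nxt))
    return counts

def weighted_terms(counts, first_move_cp=40.0, decay=5.0):
    """Centipawn contribution of each move: nth-move squares are worth
    first_move_cp / decay**(n - 1) each."""
    return [c * first_move_cp / decay**i for i, c in enumerate(counts)]

HALF_KNIGHT = [(1, 2), (2, -1), (-1, -2), (-2, 1)]
KNIGHT = HALF_KNIGHT + [(2, 1), (1, -2), (-2, -1), (-1, 2)]

knight_terms = weighted_terms(new_squares_per_move(KNIGHT, 12))
half_terms = weighted_terms(new_squares_per_move(HALF_KNIGHT, 12))

print(knight_terms[:4])                      # first four knight terms
print(half_terms[:3])                        # first three half-knight terms
print(sum(knight_terms) / sum(half_terms))   # knight expressed in half knights
```

Under these assumptions the ratio comes out a little under 3, in the same ballpark as the playtesting impression. Changing `decay` is the quickest way to see how sensitive the result is to the "5^n" choice.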

The alibaba on a real, cluttered, 8x8 board has a 5/6 chance of getting to 8 squares if it starts in the middle 16, a 5/6 chance of getting to 5 squares if it starts in 32 squares near the edge, and a 5/6 chance of getting to 3 squares if it starts in 16 squares near the corner.

In the 5-square case, it then has 6 new squares it can reach on its second move: one by 3 different routes, each with a 25/36 chance of being open, 4 by 2 routes, and one by 1 route. That means the chances of FAILING to reach those squares are 11/36 = 30.56% for a single route, that squared = 9.3% for two routes, and that cubed = 2.85% for three. In short, it has about 5.3 second move mobility if you don't count going back to the initial square (and maybe 3.4 tertiary?).

So that's about 168 + 42 + 5 = 215 centipawns? I suspect a knight would have more like triple the expected second move mobility of its first, and higher third move mobility than its second, but to get really good numbers I would need to run the calculations on a computer program designed for this purpose.
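The arithmetic for the 5-square alibaba case can be spelled out in a few lines of Python. This is only a restatement of the estimate above: the 5/6 per-leap probability, the route counts, and the 3.4 third-move guess are taken from the text, and treating the routes to a square as independent is a simplifying assumption.

```python
from fractions import Fraction

p_leap = Fraction(5, 6)      # chance that a single leap is available
p_route = p_leap ** 2        # 25/36: both leaps of a two-move route work
q_route = 1 - p_route        # 11/36: a given two-move route fails

# (number of squares, number of two-move routes leading to each of them)
second_move_squares = [(1, 3), (4, 2), (1, 1)]

first = 5 * p_leap
second = sum(n * (1 - q_route ** routes) for n, routes in second_move_squares)
third = 3.4                  # rough guess from the text, not computed

value_cp = 40 * float(first) + 8 * float(second) + 1.6 * third
print(float(first), float(second), round(value_cp, 1))
```

The second-move figure lands near 5.3 and the total near 215 centipawns, matching the hand calculation.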

Still, it does feel like this magic number "nth move mobility is worth 5^n times less" is pretty arbitrary. Is there a good way to calculate this that has a bit more reasoning to it? Or one that is just known to work?

I also don't really know what this will produce for a valuation with something like an archbishop. Which is an important test case as it's known to be worth dramatically more than the sum of its parts.

HGMuller

You show a deep understanding of what factors affect piece value.

In the AI of Interactive Diagram I use a simplified method to assign piece values: I count the average mobility on a board that is 25% filled, tapering the white/black ratio from 100% to 0% across the board. (Such that forward captures weigh heavier than backward captures.) I calculate both the average mobility and its standard deviation, and then add one standard deviation to account for the fact that in games the players will place the pieces preferably on good squares, and will avoid bad squares. (Who cares whether a Knight has only 2 moves in a corner; no one will put it there!)
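A heavily simplified Python sketch of the "mean plus one standard deviation" idea might look like this. It is not the Interactive Diagram code: it assumes a uniform 25% fill, counts a slider target whenever all squares before it are empty, and leaves out the white/black tapering and the heavier weighting of forward captures. All names are illustrative.

```python
import statistics

SIZE = 8
ROOK_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
BISHOP_DIRS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
KNIGHT = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def on_board(x, y):
    return 0 <= x < SIZE and 0 <= y < SIZE

def expected_mobility(x, y, leaps=(), slides=(), fill=0.25):
    """Expected number of targets from (x, y) when each square is
    independently occupied with probability `fill`."""
    total = 0.0
    for dx, dy in leaps:
        if on_board(x + dx, y + dy):
            total += 1.0              # leaps cannot be blocked
    for dx, dy in slides:
        p_open = 1.0                  # chance that the ray is clear so far
        nx, ny = x + dx, y + dy
        while on_board(nx, ny):
            total += p_open
            p_open *= 1.0 - fill      # the next square needs this one empty
            nx, ny = nx + dx, ny + dy
    return total

def value_estimate(leaps=(), slides=(), fill=0.25):
    """Return (mean, standard deviation, mean + standard deviation) over the board."""
    mob = [expected_mobility(x, y, leaps, slides, fill)
           for x in range(SIZE) for y in range(SIZE)]
    mean = statistics.fmean(mob)
    std = statistics.pstdev(mob)
    return mean, std, mean + std

print(value_estimate(leaps=KNIGHT))
print(value_estimate(slides=ROOK_DIRS))
print(value_estimate(slides=BISHOP_DIRS))
```

Adding the standard deviation rewards pieces whose mobility varies a lot between good and bad squares, on the reasoning that players will steer them to the good ones.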

I did not explicitly determine 'future mobility', but it is anticipated a bit by including a term proportional to the square of the mobility in the piece value. This accounts for the fact that when you combine two move patterns, the 2-move mobility also includes combinations of one move of each pattern. It doesn't account for color binding, however, so it would indeed assign equal value to Alibaba and Knight. For the reasons you mention this is probably wrong. Although pieces bound to 50% of the board, such as Bishops, seem to hardly suffer from this binding, as long as you have one on each color. More severely bound pieces do not occur often in chess variants, though. I never measured the empirical value of an Alibaba in games. The algorithm also underestimates the Archbishop. It might be possible to cure this by giving an extra bonus for attacking orthogonally adjacent squares. (Which would also explain why a Rook is worth more than a Bishop even on a cylinder board.) For the purpose of the Diagram (to act as a sparring partner for a novice in any variant) the values appear to be good enough, though.

In my engine CrazyWa I did use the future mobility explicitly, for deriving location-dependent piece values in Kyoto Shogi. In this variant moving a piece changes its type (i.e. its move pattern); a Rook becomes a (shogi) Pawn and vice versa. I did account for blocking of sliding moves. (Since Kyoto Shogi is a game with piece drops the board tends to retain its population density.) And then I supposed the value was proportional to the immediate mobility, plus a factor (< 1) times the value after the move, averaged over all moves. I iterated that (starting with value = 0 everywhere) until the values stabilized. This seemed to work pretty well.
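The iteration described here can be sketched in Python as a fixed-point computation. This is not the CrazyWa code: it assumes a single piece type that does not change on moving, an empty 8x8 board with no blocking, and an arbitrary discount factor of 0.5; the names are made up for the example.

```python
SIZE = 8
KNIGHT = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
# A rook on an empty board, written as a leaper so the same code handles it.
ROOK_EMPTY = [(k * dx, k * dy) for k in range(1, SIZE)
              for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]]

def iterate_values(leaps, discount=0.5, tol=1e-9, max_iter=1000):
    """Solve value(s) = mobility(s) + discount * mean(value(t) for targets t of s)
    by repeated substitution, starting from value = 0 everywhere."""
    squares = [(x, y) for x in range(SIZE) for y in range(SIZE)]
    targets = {
        (x, y): [(x + dx, y + dy) for dx, dy in leaps
                 if 0 <= x + dx < SIZE and 0 <= y + dy < SIZE]
        for x, y in squares
    }
    value = {s: 0.0 for s in squares}
    for _ in range(max_iter):
        new = {}
        for s in squares:
            ts = targets[s]
            future = sum(value[t] for t in ts) / len(ts) if ts else 0.0
            new[s] = len(ts) + discount * future
        delta = max(abs(new[s] - value[s]) for s in squares)
        value = new
        if delta < tol:
            break
    return value

knight_values = iterate_values(KNIGHT)
print(round(knight_values[(0, 0)], 2), round(knight_values[(3, 3)], 2))
```

Because the discount is below 1 the iteration contracts and settles quickly. For a piece with the same mobility m on every square the result is simply m / (1 - discount), which is a handy sanity check.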

BTW, the level of play for determination of a piece value in games does not have to be very high. It must not be completely foolish, but it certainly does not have to be GM quality.

Note that the entire concept of a piece value is an approximation; in real life the value of an army is not just the sum of values of individual pieces, but is also affected by how well the pieces cooperate, (e.g. a Bishop pair), and by what opponent they face. That an army of 3 Queens is totally crushed by an army of 7 Knights cannot be explained by any reasonable value for Q and N. So assigning values more precise than about a fifth of a Pawn could be a pointless endeavor.

jaakezzz
HGMuller wrote:

Note that the entire concept of a piece value is an approximation; in real life the value of an army is not just the sum of values of individual pieces, but is also affected by how well the pieces cooperate, (e.g. a Bishop pair), and by what opponent they face. That an army of 3 Queens is totally crushed by an army of 7 Knights cannot be explained by any reasonable value for Q and N. So assigning values more precise than about a fifth of a Pawn could be a pointless endeavor.


I was also thinking of this point. The roster of pieces on the board, and potential pieces that can appear on the board, are going to weigh into the value of an individual piece. The values are simply references to compare pieces with; there is no zero point, only a reference to the other pieces in the game.

It makes me wonder how accurate the estimated values for this variant are:
https://www.chess.com/join/dragonanga

HGMuller

Not very accurate. Embedded in a FIDE context the Xiangqi Knight is worth almost exactly half an orthodox Knight. (Which, according to Kaufman's analysis, would be worth 3.25 Pawns on a scale where Rook = 5.00.) Two Ferzes on opposite shades are worth slightly less than a Knight (3.00, so 1.5 apiece, but part of that probably comes as a pair bonus). A Wazir is worth slightly less than a Ferz (say 1.3), but, like all orthogonally moving pieces, it loses about 0.25 Pawns in value when trapped behind its own Pawn line. Since it can result only from promotion that doesn't seem an issue here.

Common Shatranj wisdom has it that the Alfil is about worth as much as a Pawn. Because Pawns promote more easily here, these could be worth slightly more than an Alfil.

The Dragon Bishop is hard to estimate. The compound of Bishop and orthodox Knight is worth nearly as much as a Queen; the difference is less than a Pawn (8.75 vs 9.5). There is an enormous synergy between these moves. The easy blocking of the Knight moves will hurt a lot, though. I still would expect it to be worth more than a Rook. Perhaps even 6 Pawns.

jaakezzz
HGMuller wrote:

Not very accurate.

While all of that is well and good analysis... I'd like to instantly dismiss all of it.

Reasoning: this is not your typical game. The piece roster will greatly affect the value of the pieces in a game. The fact that there are no rooks, no queens, no regular bishops, etc, makes all the piece values extremely different.

The fact that pawns cross centre to promote already changes the ENTIRE system's value, since (1) is actually no longer (1).

jaakezzz

I would strongly suggest playing at least a hundred games of Dragonanga before even contemplating the piece values in the game.

HGMuller

In general the context does not very much affect piece values, as long as you play against a varied mix of pieces. If that were not the case piece values would be pretty meaningless even in orthodox Chess, as games there will progress to a wide variety of remaining material.

That Pawns promote early could have an effect. But in this variant they promote to pieces that themselves are hardly worth more than a Pawn. Which makes it more like Pawns do not really promote at all. (This is already the case in Shatranj.)

PraseodymiumSpike

Fabian Fichter has made a trainer for efficiently updatable neural networks that can train them for many different types of chaturanga-derived games, though unfortunately, it does not support Dragonanga in its entirety. Still, you could use it to find the values for many pieces in many variants. It's here on GitHub: https://github.com/fairy-stockfish/variant-nnue-pytorch.

PraseodymiumSpike
HGMuller wrote:

In general the context does not very much affect piece values, as long as you play against a varied mix of pieces. If that were not the case piece values would be pretty meaningless even in orthodox Chess, as games there will progress to a wide variety of remaining material.

That Pawns promote early could have an effect. But in this variant they promote to pieces that themselves are hardly worth more than a Pawn. Which makes it more like Pawns do not really promote at all. (This is already the case in Shatranj.)

I think you underestimate the impact of the royal piece only being able to move as a ferz.

jaakezzz
PraseodymiumSpike wrote:

Fabian Fichter has made a trainer for efficiently updatable neural networks that can train them for many different types of chaturanga-derived games, though unfortunately, it does not support Dragonanga in its entirety. Still, you could use it to find the values for many pieces in many variants. It's here on GitHub: https://github.com/fairy-stockfish/variant-nnue-pytorch.


Yes, I've been meaning to download and dig into Fairy-Stockfish!

HGMuller
PraseodymiumSpike wrote:

I think you underestimate the impact of the royal piece only being able to move as a ferz.

Why should that make a difference? Both armies have many pieces that all cooperate to defend their royal. The army that has the best tactical capabilities will have the best chance to overcome the opponent's defenses. But the royal in any case contributes very little to the tactical power, so whether that is somewhat more or less is hardly noticeable compared to the total.

jaakezzz
HGMuller wrote:
PraseodymiumSpike wrote:

I think you underestimate the impact of the royal piece only being able to move as a ferz.

Why should that make a difference? Both armies have many pieces that all cooperate to defend their royal. The army that has the best tactical capabilities will have the best chance to overcome the opponent's defenses. But the royal in any case contributes very little to the tactical power, so whether that is somewhat more or less is hardly noticeable compared to the total.


Can we go back to your own point:
"Note that the entire concept of a piece value is an approximation; in real life the value of an army is not just the sum of values of individual pieces, but is also affected by how well the pieces cooperate, (e.g. a Bishop pair), and by what opponent they face. That an army of 3 Queens is totally crushed by an army of 7 Knights cannot be explained by any reasonable value for Q and N. So assigning values more precise than about a fifth of a Pawn could be a pointless endeavor."

The pieces that are present or not present on the board ultimately make the largest difference. A piece by itself is worth nothing; it needs a royal to defend and an enemy to defeat. Both sides have armies, and how the pieces in said army work together is going to change how valuable each individual piece's contribution is. This is not just logical in chess but also in any team oriented endeavor. In hockey if you put a dangler on the hitting line he probably won't do as well as if he was on the playmaking line. It's all relevance.

PraseodymiumSpike
HGMuller wrote:
PraseodymiumSpike wrote:

I think you underestimate the impact of the royal piece only being able to move as a ferz.

Why should that make a difference? Both armies have many pieces that all cooperate to defend their royal. The army that has the best tactical capabilities will have the best chance to overcome the opponent's defenses. But the royal in any case contributes very little to the tactical power, so whether that is somewhat more or less is hardly noticeable compared to the total.

I think it does make a difference because the value of a long-range diagonal piece like the Dragon Bishop is somewhat inflated by its usefulness in mating a Ferz-type piece. Another example of a piece whose value changes greatly depending on the situation is the cannon from janggi. If there are only a few pieces on the board, which is a situation that doesn't really occur too often in janggi itself as far as I know but can often happen in other variants like Daniel Lee's Synochess (https://www.pychess.org/variants/synochess), its value drops drastically.

HGMuller

"...  and how the pieces in said army work together is going to change how valuable each individual piece's contribution is."

It changes how valuable the army is. As soon as it matters how well the pieces work together, it is no longer possible to assign values to the individual pieces. Such contributions typically only amount to some tenths of a Pawn. The 3Q-7N situation is extreme, because of the huge value difference between the pieces (which makes it impossible to relax a tight position by trading).

"I think it does make a difference because the value of a long-range diagonal piece like the Dragon Bishop is somewhat inflated because of its usefulness in mating a Ferz-type Piece."

Mating potential usually hardly contributes anything to piece value, in most chess variants. Because in the overwhelming majority of end-games that are reached there are sufficiently many Pawns left to provide it. Here the situation might be different because of the weak promotion. But playing King + piece vs King in the presence of some Pawns or Wazirs will still lead to elimination of all enemy Pawns and Wazirs, after which you can promote a Pawn of your own to assist in checkmating.

Note that a Wazir can checkmate a Royal Ferz all by itself! (But to force it it would need help of its own royal; if the royals are on the same shade, Royal Ferz + Wazir versus Royal Ferz is a nearly 100% won end-game. Unfortunately the royals start on different shade in this variant. Which I think is a mistake, as it makes the game much more drawish: royals can basically not participate in forcing checkmate. Of course you could make up for this by promoting to a non-royal Ferz on the shade of the opponent royal. Knight + Wazir is also enough to force checkmate, no matter on what shade the Royal Ferzes would be.)

And indeed, for hoppers like the Cannon it is completely different. Although there it depends more on how many pieces there are still around than on exactly what pieces that are. But no hoppers participate in Dragonanga, so this is beside the point.

jaakezzz
HGMuller wrote:

But playing King + piece vs King in the presence of some Pawns or Wazirs will still lead to elimination of all enemy Pawns and Wazirs, after which you can promote a Pawn of your own to assist in checkmating.

Note that a Wazir can checkmate a Royal Ferz all by itself! (But to force it it would need help of its own royal; if the royals are on the same shade, Royal Ferz + Wazir versus Royal Ferz is a nearly 100% won end-game. Unfortunately the royals start on different shade in this variant. Which I think is a mistake, as it makes the game much more drawish: royals can basically not participate in forcing checkmate. Of course you could make up for this by promoting to a non-royal Ferz on the shade of the opponent royal. Knight + Wazir is also enough to force checkmate, no matter on what shade the Royal Ferzes would be.)

You need to look a little closer at the rules:
- stalemated players win
- bare piece players forfeit
you do not need to checkmate an isolated royal in Chaturanga based variants

So far you haven't provided any evidence of accuracy in the values of Dragonanga pieces. Again I recommend playing it a lot before making hypotheses.

HGMuller

If bare King loses, mating potential does not matter at all, and thus won't affect piece values in any way.

What I state here are not so much hypotheses, but rather a general summary of observations on how context affected piece values in tens of thousands of games with dozens of different piece types. By far the major factor is the number of moves the piece will have on average (with a higher weight on forward moves and captures), in the regime where assigning values to individual pieces works at all.

The values of Ferz, Wazir, Xiangqi Horse in not-very-different contexts are well established. So the 'burden of proof' is actually on Dragonanga: evidence must be supplied that values are deviating.

Playing a few games will not tell you much about piece values. That would require at least a few thousand games, and only if these already start from positions with a material imbalance. Otherwise you would need more like a million games.
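A quick way to see why sample size matters so much is the statistical noise in a match score. The Python sketch below uses only the generic bound that a per-game result between 0 and 1 has a standard deviation of at most 0.5; it assumes independent games, and it deliberately says nothing about how a score difference converts into Pawns. The function name is illustrative.

```python
import math

def worst_case_standard_error(games):
    """Upper bound on the standard error of the average score (0..1 per game)
    over `games` independent games; draws only make the real figure smaller."""
    return 0.5 / math.sqrt(games)

for n in (100, 1000, 10000, 1000000):
    print(n, f"{100 * worst_case_standard_error(n):.2f}%")
```

With a hundred games the score is uncertain by several percentage points, so only large value differences show up; halving the noise always costs four times as many games.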

jaakezzz

I can tell you already that I've played hundreds of games, and I don't need as large of a sample size as your average analyst. Since you are the one here with zero Dragonanga experience, perhaps you should get on my level first. Then perhaps we could continue the discussion. Maybe beat me at Dragonanga and then you will have a say of some type.

Out of all of my hundreds of games, the suggested Dragonanga piece values have proven perfectly accurate. Unless you're an IM or GM and have well rounded Dragonanga theory, or have fairy Stockfish actively analyzing Dragonanga as we speak, then unfortunately you're the one that lacks any proof.

HGMuller

Sorry, but I don't play chess myself anymore. Why would I want to do something my computer would do much better and faster? Stockfish is not the only chess engine that is far stronger than a GM, you know...

So I trust the judgement of my engine, with the additional advantage that it has played tens-of-thousands of games, rather than some measly hundreds.

jaakezzz

Has your engine played Dragonanga? If not then how could you know?

PraseodymiumSpike
HGMuller wrote:

"...  and how the pieces in said army work together is going to change how valuable each individual piece's contribution is."

It changes how valuable the army is. As soon as it matters how well the pieces work together, it is no longer possible to assign values to the individual pieces. Such contributions typically only amount to some tenths of a Pawn. The 3Q-7N situation is extreme, because of the huge value difference between the pieces (which makes it impossible to relax a tight position by trading).

"I think it does make a difference because the value of a long-range diagonal piece like the Dragon Bishop is somewhat inflated because of its usefulness in mating a Ferz-type Piece."

Mating potential usually hardly contributes anything to piece value, in most chess variants. Because in the overwhelming majority of end-games that are reached there are sufficiently many Pawns left to provide it. Here the situation might be different because of the weak promotion. But playing King + piece vs King in the presence of some Pawns or Wazirs will still lead to elimination of all enemy Pawns and Wazirs, after which you can promote a Pawn of your own to assist in checkmating.

Note that a Wazir can checkmate a Royal Ferz all by itself! (But to force it it would need help of its own royal; if the royals are on the same shade, Royal Ferz + Wazir versus Royal Ferz is a nearly 100% won end-game. Unfortunately the royals start on different shade in this variant. Which I think is a mistake, as it makes the game much more drawish: royals can basically not participate in forcing checkmate. Of course you could make up for this by promoting to a non-royal Ferz on the shade of the opponent royal. Knight + Wazir is also enough to force checkmate, no matter on what shade the Royal Ferzes would be.)

And indeed, for hoppers like the Cannon it is completely different. Although there it depends more on how many pieces there are still around than on exactly what pieces that are. But no hoppers participate in Dragonanga, so this is beside the point.

Mating potential does matter a lot for beginners who aren't always going to play the best move and might fall into traps or hang mate. Stockfish isn't a good indicator of power for actual players. Also, engines have different biases than human players. This is usually completely overshadowed by their being superhumanly good at almost every aspect of the game, but it does matter when we're talking about relative piece value. Pieces in hand seem to be more relatively powerful in the hands of experienced human players than in the hands of computers for example.