A Statistical Approach to the Point Value of Pieces in FFA

hest1805

Since I started playing 4 player chess (4pc), I have always been curious about the impact of points in FFA. Each type of piece (the pawn/1-point queen, the knight, the bishop, the rook, the queen and the king) has its own point value. It is important to find the most accurate value for each piece to make the game balanced.

How have these numbers been reached? For a chess player, it is clear most of them come from 2 player chess (2pc). The pawn is the unit piece, so the piece values are measured in pawns. At the same time, the pawn is one of the most complicated pieces. The knight, rook and queen all have values corresponding to 2pc. However, it is not necessarily true that the values of the pieces are the same in 2pc and in 4pc. For example, the bishop is worth 5 points instead of 3, which is generally how much the bishop is worth in 2pc. Why? The qualitative explanation is that the board is bigger, so the bishop should be worth more than the knight. Translating the king value to 4pc is tricky because its value is infinite in 2pc.

A more fundamental question: which factors decide the value of a piece? How powerful it is, you could say, which, translated into chess terms, means how many squares it generally controls. Controlling squares is all good, but it doesn’t win you any games on its own. You need to get points. So, the strength of a piece can be measured by its ability to grab points. With this in mind, we can go into the game itself to find out how much each piece should be worth. Let us for a moment not think about 4pc as a battle between 4 players, but rather as a battle between 6 types of pieces.

I would like to find an exact way to derive the correct piece values, based on mathematics instead of intuition. Perhaps the current piece values are based on some algorithm, but if so, I have never heard of it. So I have tried to do it myself, using statistical methods. I have picked 30 high-level FFA rapid/blitz games with a variety of players as the basis for my results.

How can the value of a piece be described statistically? For example, why is the rook worth 5 points? Let us define the variable Expected Point Gain for each type of piece. I have decided to count the pawn and the 1-point queen as the same piece. Sticking to the same example, the rook being worth 5 points means that a rook is on average expected to capture 5 points. But that is not all. In FFA, a piece can turn grey instead of being captured, meaning that it does not give away any points. So the definition also needs the condition that the piece becomes captured. In total, that becomes: the value of a piece is the expected amount of points that the piece will capture, granted that it becomes captured.

Using this as a basis, I have made the following indicator:

EPG (Expected Point Gain) = Expected Points Won - Expected Points Given Away

That is, for every one of those 30 games, I have gone through every single capture to see how many points each type of piece has captured and how many it has given away by being captured. Then I have taken the average over all the games and divided by the number of pieces of that type in the starting position (32 pawns, 8 knights etc.) to get the EPG.

In general, if the EPG is close to 0, it implies the piece is worth its value and balanced. If the EPG is strongly positive, the piece takes more than it gives away, so the piece could be worth more. If the EPG is negative, the piece is probably worth too much.
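As a rough sketch of the bookkeeping described above (not the spreadsheet actually used for the 30 games), the calculation could look like this in Python. The function name, the `(capturer, captured)` record format and the toy game are assumptions made for illustration; the point values and starting counts are the standard FFA ones, with the pawn also standing in for the 1-point queen. Kings that give away points without being captured are not handled here.

```python
from collections import defaultdict

# Standard FFA point values; "pawn" also covers the 1-point queen.
VALUE = {"pawn": 1, "knight": 3, "bishop": 5, "rook": 5, "queen": 9, "king": 20}
# Number of pieces of each type in the 4pc starting position.
START_COUNT = {"pawn": 32, "knight": 8, "bishop": 8, "rook": 8, "queen": 4, "king": 4}


def expected_point_gain(games):
    """Return the EPG per piece type.

    games: list of games, each a list of (capturer, captured) piece-type pairs.
    This record format is a hypothetical one chosen for the sketch.
    """
    won = defaultdict(int)    # points captured by each piece type
    given = defaultdict(int)  # points given away by each piece type
    for captures in games:
        for capturer, captured in captures:
            won[capturer] += VALUE[captured]
            given[captured] += VALUE[captured]
    n_games = len(games)
    # Average over games, then divide by the number of pieces of that type.
    return {p: (won[p] - given[p]) / n_games / START_COUNT[p] for p in VALUE}


# Made-up toy game: a knight takes a bishop, then a queen takes that knight.
toy_games = [[("knight", "bishop"), ("queen", "knight")]]
epg = expected_point_gain(toy_games)
print(epg["knight"])  # 0.25, i.e. (5 won - 3 given away) / 1 game / 8 knights
```

With real data, each of the 30 games would contribute one list of captures.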

That is the general theory. Here is an example to make things clearer: let us say the EPG of the rook is 2. That means that on average, the rook takes 2 more points than it gives away, which implies that the rook should be worth more than it is currently. It does not necessarily mean that the rook needs to be worth exactly 2 points more, i.e. 7. Why? There are two reasons. The first is that this is a complex system consisting of 6 different pieces with one EPG each, and they depend on each other. Changing the value of one piece will affect the EPG of the other pieces, so any adjustment is a compromise. The second reason is that the value of pieces affects how we play with them. Changing the value of the rook from 5 to 7 would, for instance, make a rook for bishop trade less tempting. The numbers found are indicators, not exact corrections.

To clarify, here is one of the 30 games:

So, we can see that there are 3 numbers for each type of piece. “Won” means how many points that type of piece has captured, and “Given Away” is the number of times that type of piece has been captured multiplied by its value. The “Gain” is simply the difference. Feel free to go to the game and see if I have counted correctly. 😉

A couple of things to note: the king has a very negative score, and that is completely normal. How often do you see a king capturing 20 points' worth of material? Anyhow, that is beside the point: the king is supposed to be a bonus piece, and you get a bonus reward for “capturing” it. Other things that catch the eye are that the queen scores very well and the rook scores very badly, but it is only one example, so we cannot draw any conclusions from it. You might wonder why the total score is -40 and not 0. The reason is that very often, at the end of a game, there are one or several kings left that have not been checkmated, so these kings give away 20 points each without being captured. Another element that could lead to strange total scores is double checks, but over the course of the 30 games double checks have been quite rare.
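To see why the total lands on a multiple of -20, here is a minimal sketch of the accounting, assuming that every ordinary capture adds the same amount to "Won" and to "Given Away", while a king left on the board at the end only adds 20 to "Given Away". The function name and the list of capture values are made up for illustration.

```python
KING_VALUE = 20  # points for a king in standard FFA


def total_gain(capture_values, unclaimed_kings):
    """Sum of Gain over all piece types for one game.

    capture_values: point value of each captured piece, one entry per capture.
    unclaimed_kings: kings whose 20 points were given away with no capturing piece.
    """
    won = sum(capture_values)  # every capture credits some capturing piece
    given = sum(capture_values) + KING_VALUE * unclaimed_kings
    return won - given


# Hypothetical game: four captures, two kings never checkmated.
print(total_gain([5, 3, 9, 1], unclaimed_kings=2))  # -40
```

Under this assumption, a total of -40 corresponds to two kings whose points were handed out without any piece capturing them.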

If you have read everything so far, you are probably curious to see what my experiment has yielded. The knowledgeable reader might have an idea or two about the results already. Here are the EPGs I have found:

A bunch of different values have been calculated; these are the ones I consider the most important. I decided to add the number of checkmates for reasons you will see later. Let’s go through each piece separately.

Knight

With an EPG of 0.1, it seems like the knight is doing fine at 3 points.

Bishop

Despite its ability to develop fast and point straight at an enemy flank player, with a score of -0.9 it is clear that the bishop is not worth 5 points. Setting it down a point or two should be considered.

Rook

The rook scores close to 0 and seems balanced currently.

King

The king is by far the piece with the worst score, which is understandable as the king is not designed to be worth its value. Whether or not that is a good idea is something I will not discuss here. What this analysis can show is how the bonus points that you get from checkmates/capturing kings are distributed among the other pieces.

Queen

4.1 is much farther away from 0 than anything seen so far. It implies that the queen should be worth several points more than it currently is, which is strange considering that 9 points is balanced for the queen in 2pc. Having gone through the games and looked at what happens, I can see that the queen's high score is caused by the value of the king. In the games, the queen delivers 54 % of all checkmates, and 30 % are dealt by pawns/1-point queens (read: queens). That leaves only 16 % of all checkmates for the non-queen piece types and contributes to making the queen look like an unbalanced piece.

Pawn

Scoring 0.7 for the pawn might look innocent, but it is not! Remember, this number applies to every single one of the 32 pawns on the board. Every pawn is expected to capture 0.7 points more than it gives away. The main reason is how powerful 1-point queens are and how often pawns get promoted in FFA. But I believe there is reason to think that pawns by themselves are more powerful in 4pc than in 2pc. In 2pc there is less space for pawns to move, and you easily get blocked pawn structures where the pawns are immobile. In 4pc pawns are very mobile and thus much easier to promote. Blocked pawn structures are usually only seen on the flanks. The impact is devastating: as the pawn is defined as the unit value, instead of increasing its value we would have to divide the value of all other pieces by 1.7. Rounded to whole points, knights would then be 2 points, bishops and rooks 3 and queens 5. And the king? Who knows.
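The rescaling can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the pawn's new worth is 1 + 0.7 = 1.7 of the current points and that the results are rounded to the nearest whole point. The names `CURRENT_VALUES` and `rescale` are made up for the example.

```python
# Current FFA values of the non-pawn, non-king pieces.
CURRENT_VALUES = {"knight": 3, "bishop": 5, "rook": 5, "queen": 9}
PAWN_EPG = 0.7  # pawn EPG reported in the post


def rescale(values, pawn_epg):
    """Express each value in units of a pawn worth (1 + pawn_epg) current points."""
    factor = 1 + pawn_epg
    return {piece: round(value / factor) for piece, value in values.items()}


print(rescale(CURRENT_VALUES, PAWN_EPG))
# {'knight': 2, 'bishop': 3, 'rook': 3, 'queen': 5}
```

Before rounding, the quotients are roughly 1.76, 2.94 and 5.29, so the knight is the piece that moves the most when rounded.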

 

I would also like to share the results of an identical analysis of a format I have made myself, the chaturaji hyper fiesta. The only difference is that I have used 15 games instead of 30. The game looks like this:

The rules are Capture the King, 3 points for kings, promotion into a (5-point) rook on the 8th rank and a ¼|0 hyper bullet time control.

The Expected Point Gains of the pieces are as follows:

We can see that these numbers are all looking close to 0, at least compared to normal FFA.

The pawn looks balanced.

The bishop underperforms the most, which I think is understandable. It is worth 5 points, the same as the rook. But with a normal 8x8 chess board, one would think that the piece values should be the same as in 2pc. That’s why I would like to ask our developers to add the possibility of adjusting piece values for different variants.

Knights do well, maybe because they often get to trade with 5 point bishops.

Rooks do ok, underperforming a bit. Could be because it's hyperbullet.

Kings do much better than in normal FFA, helped by their reduced value and the ability to sacrifice themselves for another piece. They still underperform a little bit, which I believe is due to the fact that the last man standing gets the points for all remaining kings.

 

To conclude, I think there are several reasons to believe that the current point system in FFA is inaccurate. Queens, whether they are worth 9 points or 1 point, are the supreme rulers of FFA. Pawns are powerful. Kings and bishops are point donators. Rooks and knights are the only pieces that seem to be worth their price. For variants with different board sizes, I think we should have the option of adjusting the value of each piece.

Questions and comments are much appreciated. Is it easy to understand how the experiment has been conducted, and why it has been conducted in this particular way? Do you think this method is useful for evaluating the strength of the pieces? Is there something you would have done differently? Is there any step in the process that is unclear? Do you agree or disagree with my interpretation of the results? 

Sigma_1984

This is one hell of an analysis. You should apply for statistical analyst at chess.com, they could certainly use someone like you! Could you explain 'Given Away' with an example though? I don't exactly understand what you mean by it.

pjfoster13

great article hest. How does your analysis handle an even trade? Like if 2 players exchange light-squared bishops, technically the LSB that initiates the trade is +5/-5, whereas the LSB that gets captured is 0/-5 and whatever piece did the recapture (probably the queen but sometimes a rook/knight) gets the +5. Does your analysis make any adjustments for that?

hest1805

@goldenwriter I've been working on it off and on for a few weeks. All the data is in a spreadsheet.

@Sigma_1984 "Given Away" means points "lost" by a piece being captured, but I didn't want to write "lost" because it sounds like someone is losing points. 

@pjfoster13 that's a good question. If there is a trade of bishops, there are 4 things happening in this analysis: bishop captures 5 points, bishop loses 5 points, *capturing piece* captures 5 points, bishop loses 5 points. In total that is +5 for the other capturing piece and -5 for the bishop. While that might sound unfair for the bishop, I think on average this evens out as whenever there's another piece making a trade, there might be a bishop benefiting from it etc. So I don't think you need to compensate for that. 
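The bishop trade can be written out as a small ledger. This is a sketch only: it assumes the recapturing piece is a queen, and the function name and the `(capturer, captured)` format are made up for the example.

```python
from collections import Counter


def net_by_piece(captures, value):
    """Net points per piece type: points captured minus points given away."""
    net = Counter()
    for capturer, captured in captures:
        net[capturer] += value[captured]  # points won by the capturing piece
        net[captured] -= value[captured]  # points given away by the captured piece
    return dict(net)


# A bishop takes a bishop, then (as an assumed example) a queen recaptures.
trade = [("bishop", "bishop"), ("queen", "bishop")]
print(net_by_piece(trade, {"bishop": 5}))  # {'bishop': -5, 'queen': 5}
```

The bishop type ends at +5 - 5 - 5 = -5 and the recapturing piece at +5, matching the totals described above.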

 

JCrossover08

My goodness llama man, the smartness and nerdiness of this report made my brain explode after reading one paragraph of this like wtf Jesus this is some smart sht

Cha_ChaRealSmooth

I was also out at one paragraph but good work hest

pjfoster13

it really gets me thinking about the relative importance of bishops and rooks. Bishops are more easily deployed early, and more likely to engage in early swaps for rooks especially when there are tricks when the rooks get caught in the corner. Rooks are more likely to be lost prematurely before their value manifests itself.

However, from a practical perspective rooks have much more in-game positional value due to their ability to wall off kings, push pawns, stop promotions, form batteries, etc. They are powerful pieces that belie their observed capture value. Imo their value should probably be 6.

Queens are extremely OP due to their mobility and ability to control the entire center and deliver mate. Their value is probably more like 12-15 (33%-67% higher than in 2pc).

It also makes me wonder if the idea of 1-point queen is fundamentally wrong as compared to the face-value promotion system (optional QRBK). There's so much strategy with how to trade, how not to trade, how to attack with & defend against multi-queens, and how to value the full-point queen vs. the 1-point queen. My guess is that most of us are probably playing 1-pt-queen FFA completely wrong from a game-theory-optimal strategic perspective

JkCheeseTheIdotNoob

Great article. This data can show people how valuable their pieces are to them and why they shouldn't be careless with them. I'm pretty sure all we need to know is on this page, but would you mind sharing the spreadsheet with us? I would like to look through it sometime. It's fine if you don't.

MayimChayim

btw kings can be 3 points

MayimChayim

if they are not the "royal" piece

Sigma_1984

You mean for example : I capture a rook with my knight and lose my knight and queen. My 'given away' would be 9+3=12?

MayimChayim

kings are 3 pts in war for throne

MayimChayim

and in all other variants if they are not the ones getting mated

BabYagun

It is an amazing article.

Did you also consider that a piece's value should reflect another thing: how easy it is to capture such a piece?

GustavKlimtPaints

this is really interesting! 

Brother_Communist

Very nice!

grable

Firstly, I should say (as others have already) this is all very impressive. It's clear that this project was a big undertaking, and the results show a lot of hard work and thought. In the interest of discussion though, I'll bring up a few points with my commentary.


hest1805 wrote:

In total, that becomes The value of a piece is the expected amount of points that the piece will capture, granted that it becomes captured... In general, if the EPG is close to 0, it implies the piece is worth its value and balanced.

 

I'm not so sure this is a 100% reliable method of determining a piece's worth. It seems like there's probably a substantial correlation, but there are many other factors at play. This ignores positional play and exchange sacrifices, to name just two. In the example game you listed, the first capture was 6.Nxn6, after 5.Nl5, forking the bishop and rook. Based on your numbers, are you suggesting that the right play by red would have been to take the rook instead (bishops are overrated by 0.9 EPG)? I don't have much experience in standard 4pc, but I think most players would take the bishop, right? As mentioned, I'm sure this analysis points in the right direction much of the time, but is that enough to start changing piece values? I'm not sure.

 

hest1805 wrote:

Feel free to go to the game and see if I have counted correctly. 😉

 

I did =] All good!

 

hest1805 wrote:

Scoring 0.7 for the pawn might look innocent, but it is not! Remember, this number applies to every single one of the 32 pawns on the board. Every pawn is expected to capture 0.7 points more than it gives away. The main reason is how powerful 1-point queens are and how often pawns get promoted in FFA.

 

I wonder what the results would look like if you consider the 1 point queen as a separate piece. I know it's a little odd, because there are none at the beginning of the game to be captured, but maybe results would show that promoted queens should be worth 3 or 4 points, instead of one, and that would resolve the discrepancy with pawns altogether.

 

hest1805 wrote:

I would also like to share the results of an identical analysis of a format I have made myself, the chaturaji hyper fiesta... Rooks do ok, underperforming a bit. Could be because it's hyperbullet.

 

I think the reliability of this analysis scales with time control. In hyperbullet, the moves fly out so fast that it's not so much about calculation as it is about pressure, premoves, and anticipation. I think any results coming from hyperbullet analysis are doubtful at best.


 

Again, I'm not meaning to discredit any of what you've done. It's commendable stuff, and I'm sure your findings show some great mathematical truth behind the game of 4pc. I just thought I'd add my two cents. Take it for what it's worth!

ChampionOfChess100

hell this is complex! you should send this to chess.com support 

ChampionOfChess100

however in hyper-bullet, it's all about your ping. just flag everyone

JonasRath

I think it's perfectly reasonable for the Q to be worth much more in 4PC than in 2PC - it gains a lot more squares than any other piece on the 4PC board.