How to automatically extract the FEN codes of a large number of games?

SimonOrellana17

Hi everyone,

I asked this question in the general chess.com forum and people referred me to this community.

I have a problem. I want to run some statistical analyses on the final positions of a large number of games. For this purpose, I need to know which pieces were left in each final position and which squares they were on.

Due to the large number of games (around 1000), I need an automatic process or something similar.

I know that the FEN code gives me that information, but I don't know how to extract the FEN code of the final position for this many games.

Does anyone have an idea on how to do this? Is there any software that allows me to do that?

All suggestions are welcome.

Juzkus

Hi Simon,

One thing you could try is the Chess.com Published-Data API.
https://www.chess.com/news/view/published-data-api#pubapi-endpoint-games-archive

The "Complete Monthly Archives" API will allow you to retrieve game data for a specified year and month. You could call the endpoint with different usernames and month/year combinations to retrieve the FEN data you're interested in.

The endpoint is in this format (you can put it in your web browser URL or write a script):
https://api.chess.com/pub/player/{username}/games/{YYYY}/{MM}

For example, this link returns your games in June of 2020:
https://api.chess.com/pub/player/simonorellana17/games/2020/06

The data is organized as an array of games. Each game object includes a "fen" property with the value you're after. 
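As a rough sketch of the scripted route, the Python below downloads one monthly archive and collects the "fen" value of each game. It assumes the response is a JSON object whose "games" key holds the array described above; the function names and the User-Agent value are made up for illustration, and the API may have usage requirements not covered here.

```python
import json
import urllib.request

API_TEMPLATE = "https://api.chess.com/pub/player/{username}/games/{year}/{month:02d}"


def extract_fens(archive):
    """Return the final-position FEN of every game in a parsed monthly archive."""
    # Assumption: the archive is a dict with a "games" list and each game
    # carries a "fen" string; games without one are skipped.
    return [game["fen"] for game in archive.get("games", []) if "fen" in game]


def fetch_month(username, year, month):
    """Download and parse one monthly archive (performs a network request)."""
    url = API_TEMPLATE.format(username=username.lower(), year=year, month=month)
    # Assumption: a descriptive User-Agent is sent; this value is a placeholder.
    request = urllib.request.Request(url, headers={"User-Agent": "fen-extract-example"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For the example link above, the call would look like extract_fens(fetch_month("simonorellana17", 2020, 6)).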

stephen_33

SimonOrellana17, it might help to know something about your experience of coding?

SimonOrellana17

Thank you for showing me this powerful tool. The problem is that the games I need to analyze are Bobby Fischer's. Is it possible to use this tool anyway? I ask because the Bobby Fischer database that is available on chess.com does not show the month, only the year, so I don't know how to fill in the format.

By the way, I have a Bobby Fischer database in the ChessBase software (I mention this in case it makes things easier).

SimonOrellana17

I really don't have much coding experience. I only use Matlab, and the code I've written has not been very complex.

Juzkus

Hey Simon,

Thanks for sharing more context. I don't think that this API can be used to retrieve the Bobby Fischer games. Bobby Fischer's games can be downloaded in PGN format here: https://www.chess.com/games/bobby-fischer

You can check the box in the top right of the table (next to the Year column) to select all games for that page. The download button is right above that. There are 48 pages, though. This provides the PGN format.

You mention having a database already. Does your database have the PGN data? If so, you might want to try using PGN to FEN converter software.

I have seen some converters do this online for individual games, but if you want to automate it then it might be worth looking at something more like this:
https://github.com/SindreSvendby/pgnToFen

What format is your database? Is it a plaintext file, or sqlite perhaps? The repository I linked provides code that would transform PGN into FEN, but you still need code to read your database and run it through this conversion. If you can provide more information about the format of your data, I could try to show how that might work in code.

SimonOrellana17

Yes sir, I have the games in a .pgn database (so I assume that all games are in that format).

stephen_33

If it's in standard .pgn format, is it fair to assume that the file is plain text? This would be a trivial problem to solve in a language like Python, which is my preference, but SimonOrellana17 says that he's familiar only with Matlab.

That's not one I know & from what I've read on Wiki, it looks like a maths-based language? Anyone know if it can manipulate strings?

* Of course this isn't strictly relevant to the activities of this club because the data required can't be downloaded from any of the chess.com endpoints available on the API. Still be nice to help with a solution though.  😊

Juzkus

Hi Simon,

I wrote a script that you can use to do the batch PGN to FEN conversion:
https://github.com/Juzkus/PGN2FENBatch

To use this, you will need Python 3 installed on your machine. Your Python installation will include a program called "pip" which is a package manager. My script has a dependency on the "python-chess" library. You can install this from a console/terminal by running this command:

pip install python-chess


You can get my script by cloning the repository or downloading the latest version as a zip file:
https://github.com/Juzkus/PGN2FENBatch/archive/master.zip


You can run the script like this:

python batch.py 
--pgn 'c:\users\your_user\desktop\BobbyFischer.pgn'
--out 'c:\users\your_user\desktop\BobbyFischer.fen'


Note - the command should be one line, I used multiple lines in this post for formatting purposes. The full description is in the Github repository.

The output file will contain a list of FENs, one line per game, each representing that game's end position. They will be in the same order as the games in the PGN file.
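For anyone curious how such a conversion works, here is a minimal sketch using the python-chess library. It is not necessarily how batch.py in the repository is written; the function names final_fens and convert are made up for illustration, and the file encoding is an assumption.

```python
import io

import chess.pgn


def final_fens(pgn_handle):
    """Yield the FEN of the final position of each game in an open PGN file."""
    while True:
        game = chess.pgn.read_game(pgn_handle)
        if game is None:  # read_game returns None at the end of the input
            break
        # end() follows the main line to its last node; board() replays the moves.
        yield game.end().board().fen()


def convert(pgn_path, out_path):
    """Write one final-position FEN per line, in the order of the games."""
    # Assumption: the PGN is UTF-8 text; undecodable bytes are replaced.
    with open(pgn_path, encoding="utf-8", errors="replace") as pgn, \
            open(out_path, "w", encoding="utf-8") as out:
        for fen in final_fens(pgn):
            out.write(fen + "\n")
```

Calling convert("BobbyFischer.pgn", "BobbyFischer.fen") would produce the same kind of output file as described above.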

I hope this can be useful. Please let me know if there are any specific changes that would help for your project. Feel free to use the script or make any changes. You can also create an issue on my Github project if you'd like further changes to the project.
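For the statistics Simon described at the start of the thread, each FEN line still has to be turned into "which piece is on which square". A possible sketch, using only the first field of the FEN (the piece placement, listed from rank 8 down to rank 1); the function name pieces_on_squares is made up for illustration:

```python
def pieces_on_squares(fen):
    """Map each occupied square (e.g. "e4") to its piece letter from a FEN string."""
    placement = fen.split()[0]  # first FEN field: piece placement
    squares = {}
    for rank_index, row in enumerate(placement.split("/")):
        rank = 8 - rank_index  # rows run from rank 8 down to rank 1
        file_index = 0
        for char in row:
            if char.isdigit():
                file_index += int(char)  # a digit means that many empty squares
            else:
                # Uppercase letters are White pieces, lowercase are Black.
                squares["abcdefgh"[file_index] + str(rank)] = char
                file_index += 1
    return squares
```

Applied to every line of the output file, this gives one dictionary per game that can be counted or tabulated.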