A Turing test for chess engines

skorj

Not so long ago the idea of building a chess-playing computer that could seriously compete with a competent human was a leading-edge challenge in the world of artificial intelligence, with many perfectly reasonable people doubting it could be done. As I sit typing these words on a simple laptop that, by virtue of free-to-download software, could also beat a super GM, it seems odd to imagine the idea was ever all that controversial. But while that fundamental question was answered in the affirmative some time ago and the focus of the AI community at large has moved on to other things, the drive to push artificial playing strength ever higher seems to have continued unabated. As popularly conceived, a good chess engine has come to mean no more and no less than a strong chess engine.


Unless you're into engine tournaments, though, or can find your name in the FIDE top ratings list without having to scroll down a bit, I question whether having this year's 3320 Elo engine instead of last year's 3275 Elo engine is about anything other than keeping up with the Joneses. Used as an analysis tool, any decent engine can handily find the kinds of tactical blunders I make. That's easy. What's hard is finding a chess engine I'd actually want to play against, and as raw playing strength increases year by year the situation only seems to get worse. Playing a dumbed-down analysis engine combines the dissatisfying experience of playing against a calculator with the depressing sense of being up against a much stronger opponent who you know is purposely blundering from time to time in order to give you a chance. The best solution I've come up with lies in my little collection of weak (by current standards at least) engines that I can play at full strength without having to accept odds of success that lie somewhere between a lottery win and being struck by a meteor. This is better, but I still can't help but be keenly aware of the futility of wondering what my opponent must be thinking. There's no thought there, just numbers being crunched. As a result I've probably invested more time in tracking down these little gems of brilliantly sub-optimal code than I have playing against them.

 

Now that the days of wondering whether computers will ever be able to play a decent game of chess evoke the sort of nostalgia folks of my generation feel for big hair and leg-warmers, I think it's fair to ask if we could do better on this front. I'm not thinking simply in terms of trying to adapt the playing styles of today's powerful engines to make them more human-like so much as designing software from the ground up with the sole mission of creating a machine that will play chess in a manner indistinguishable from human play, where Turing, not tourney, is the real measure of success.

 

Given the easy access to human-like opponents that online play has made possible via, you know, actual humans, one could reasonably ask what the point of this would be. With glad acknowledgment to all the goodness online chess has given the world, consider the value of an opponent that isn't just ready and willing whenever you need, but will play at any time control, with either colour and from any opening position you choose without complaint, will adjourn whenever you ask, be available whenever you want to resume and will never troll you in chat no matter how badly you blunder. For myself, while I have always enjoyed slower time controls, I have all but given up on them when it comes to live online play. Once the difficulty of finding a willing opponent in the first place is overcome, there's that nagging fear in the back of your mind that two and a half hours invested in play could be flushed away by a WiFi glitch or temporary service outage. Then there's the dual fear that an unexpected ring of the doorbell will necessitate an adjournment request your opponent might not accept, coupled with the possibility of having to decide what to do with an adjournment request that could be an unexpected ring of your opponent's doorbell, or could be a sense that their position is starting to go a bit pear-shaped bringing on a sudden urge to conveniently disappear. Lastly, well, let's just say that if I'm playing against an engine I at least want to be aware of that fact before the game starts. There are plenty of advantages to having chess engines as opponents, but they're largely undone by the simple fact that the experience usually isn't that satisfying. The point is, what if it were?

 

Programming a computer to play a good game of chess isn't just about creating a marketable product though. Once it was the very symbol of our progress towards creating machines that could in some way be said to think. But even as developers continue to push chess engine playing strength higher into the stratosphere level by level, the artificial intelligence community at large has moved on to other things. There's a world of difference between understanding how to get a machine to perform a thinking task that a human can do and learning to do it the way a human would, making decisions that betray a thought process, exhibit some semblance of understanding, even make the same sort of mistakes for the same reasons. This strikes me as an even more profound undertaking than simply beating a Grandmaster.


The measure of progress here could be a literal Turing test, where players in an online game don't know whether their opponent's moves are being generated by a computer or being played by another human. The more difficulty players have determining when they are playing against an engine rather than another person, the more successful the chess engine would be considered, regardless of playing strength (which I'd argue should be about equal to an average club player). Similar methods might also be possible, such as presenting game scores to experienced players and asking them to guess which player was the machine. I imagine chess coaches would be particularly difficult to fool here.

 

To me, this challenge could define a whole new frontier for the chess AI community, one that is much more relevant to the wider field of AI today. In the end I think we stand to gain as much understanding or more than was learned along the path that led to Deep Blue, Fritz and Komodo.

 

I don't imagine I'm the first to have floated a suggestion of this kind, but it obviously hasn't gained the kind of attention I think it deserves or I'd have heard it being discussed before. If someone has proposed this before me, then consider this my hearty second of their motion, and if they haven't, well, you're all welcome to name it after me.

 

I eagerly welcome your thoughts, criticisms, witticisms, tidbits of knowledge and pearls of wisdom.

Martin_Stahl

Interesting read but needs spaces between paragraphs.

skorj

Agreed Martin. It seemed like the spaces were added automatically when I was composing but they didn't come through. I've gone back and fixed it now, but I'll be sure to preview before I post next time.

final_wars

I made a game that engines can't play :)

due to the:

1. empty game board at the start of the game

2. the 2 billion random game queues on the side boards

3. the depth of the game queue = 18 at start of the game

Engines can play the game when the game queues are fully deployed or almost fully deployed (say depth of game queue <= 6)

The problem is pattern recognition. In chess you have patterns via one initial setup and an opening book; this is the crux of the matter re a true Turing machine and AI.

But what do you do if the game board is empty?

How does the machine know what pattern to get to?

What do you do if the number of transposition tables is actually crazy?

Anyway, my big problem is that my board is 9x9 and the usual 64 bit bitboards (normal and rotated) that have been used in chess (for decades) go out the window.

My other big problem is that my current code has a lot of loops, so I have been looking at source code written in C, to try to reduce the loops.

I have responded to this post because I have been looking at game engine and game server code for the last week, for chess but also for games not on 8x8, like Shogi, Capablanca, etc

Various approaches to get around the problem.

This approach for the Bonanza Shogi (9x9) engine was quite clever

typedef struct {
    unsigned int p[3];
} bitboard_t;

White uses p[0] and p[1] From MSB with << shift

Black uses p[1] and p[2] From LSB with >> shift

White plays up, Black plays down :)

No overlap, you carry the extra var around so you can bit fiddle, but black must shift from p[1] and p[2] to p[0] and p[1] with extra shift to move to MSB before comparing to white.
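For anyone who wants to see the general idea of a three-word bitboard in code, here is a minimal sketch. It assumes a simple layout of 27 squares (three ranks of nine files) per 32-bit word; that layout and the helper names (square_index, bb_set, bb_clear, bb_test, bb_count) are made up for illustration and are not taken from the Bonanza source or from the White/Black shifting scheme described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout (an assumption, not Bonanza's actual code):
   81 squares split over three words, 27 squares per word. */
typedef struct {
    uint32_t p[3];
} bitboard_t;

enum { FILES = 9, RANKS = 9, SQUARES = 81, BITS_PER_WORD = 27 };

/* Map (rank, file), both in 0..8, to a square index in 0..80. */
int square_index(int rank, int file) {
    assert(rank >= 0 && rank < RANKS && file >= 0 && file < FILES);
    return rank * FILES + file;
}

/* Mark a square as occupied. */
void bb_set(bitboard_t *bb, int sq) {
    assert(sq >= 0 && sq < SQUARES);
    bb->p[sq / BITS_PER_WORD] |= (uint32_t)1 << (sq % BITS_PER_WORD);
}

/* Mark a square as empty. */
void bb_clear(bitboard_t *bb, int sq) {
    assert(sq >= 0 && sq < SQUARES);
    bb->p[sq / BITS_PER_WORD] &= ~((uint32_t)1 << (sq % BITS_PER_WORD));
}

/* Return 1 if the square is occupied, 0 otherwise. */
int bb_test(const bitboard_t *bb, int sq) {
    assert(sq >= 0 && sq < SQUARES);
    return (int)((bb->p[sq / BITS_PER_WORD] >> (sq % BITS_PER_WORD)) & 1u);
}

/* Count occupied squares by repeatedly clearing the lowest set bit. */
int bb_count(bitboard_t bb) {
    int n = 0;
    for (int i = 0; i < 3; i++) {
        uint32_t w = bb.p[i];
        while (w) {
            w &= w - 1;
            n++;
        }
    }
    return n;
}
```

The top five bits of each word stay unused in this layout, which is the price paid for keeping whole ranks inside a single word.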

Anyway

I figured out my own way, uses one UInt64 only on a 9x9 board

Looked at quite a lot of code over the years, nothing intelligent about it, just humans refining an idea that somebody had in the 70's, usual story, not even close to AI.

Martin_Stahl
AdamovYuri wrote:

whats this nonsense about? there are already engines which can beat Magnus Carlsen 300:0 without even trying too hard...

no need to waste time on programing turing machineries

 

I think the premise is to develop an engine that plays in a manner indistinguishable from a human player.

thegreat_patzer

It's actually a fascinating idea and a very scary one.

I DON'T want to wander into a forbidden subject, but as I see it, a chess engine that passes the Turing test (a very real and interesting "artificial intelligence" challenge) would make it utterly impossible for online chess sites like this one to enforce reasonably cheat-proof chess games between patzers.

It's like a nuclear weapon to chess servers. I hope it never happens!

thegreat_patzer
AdamovYuri wrote:

whats this nonsense about? there are already engines which can beat Magnus Carlsen 300:0 without even trying too hard...

no need to waste time on programing turing machineries

you know you really ought to google things you don't know much about.

DavidPeters2

My humble suggestion: a program with a giant database of all games played online. You select what rating you want to play against. It looks up all games played by anyone within e.g. 100 points either side of that rating and plays the most popular move. This continues until the game reaches a position that has never been reached before. Ummm, then I guess you still need a human-like engine, but at least you played a human opening and maybe middlegame too!

The program would also need to somehow randomise openings a bit, otherwise the system I proposed would produce the same moves over and over.
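The lookup-and-randomise idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the database query is replaced by a small in-memory array, and the names move_stat_t, in_rating_band, most_popular and pick_weighted are hypothetical. Choosing a move with probability proportional to how often it was played is one possible way to "randomise a bit"; the random draw is passed in by the caller so the behaviour is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One candidate move in the current position (hypothetical record). */
typedef struct {
    const char *move;   /* move in algebraic notation, e.g. "Nf3" */
    unsigned    count;  /* times played by players inside the rating band */
} move_stat_t;

/* Return 1 if a player's rating lies within +/- width of the target. */
int in_rating_band(int rating, int target, int width) {
    assert(width >= 0);
    return abs(rating - target) <= width;
}

/* Index of the most played move, or -1 if no move has been played. */
int most_popular(const move_stat_t *stats, size_t n) {
    int best = -1;
    unsigned best_count = 0;
    for (size_t i = 0; i < n; i++) {
        if (stats[i].count > best_count) {
            best_count = stats[i].count;
            best = (int)i;
        }
    }
    return best;
}

/* Pick a move with probability proportional to its count.
   draw is any non-negative random number supplied by the caller.
   Returns -1 when the position was never reached, which is the point
   where a human-like engine would have to take over. */
int pick_weighted(const move_stat_t *stats, size_t n, unsigned draw) {
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += stats[i].count;
    }
    if (total == 0) {
        return -1;
    }
    unsigned long target = draw % total;
    for (size_t i = 0; i < n; i++) {
        if (target < stats[i].count) {
            return (int)i;
        }
        target -= stats[i].count;
    }
    return -1; /* not reached: target is always below total */
}
```

With counts of 60, 30 and 10 for three moves, a draw of 0 to 59 (modulo 100) selects the first, 60 to 89 the second and 90 to 99 the third. Taking the draw modulo the total introduces a slight bias for very large totals, which seems acceptable for a sketch like this.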

final_wars

@DavidPeters2

1. Nf3, d5

2. Ng1 !!

:))

u0110001101101000
skorj wrote:

 To me, this challenge could define a whole new frontier for the chess AI community

It was thought of long ago, and some pursued it, but it was much harder. Fewer results meant less funding. You can hardly even call today's engines AI at all.

final_wars

@AdamovYuri

It is not about using machines to play better chess.

It is about using chess to make better machines.

These better machines will then make your life better.

When machines were very, very basic they tried to program chess.

So basic in fact that an 8x8 board was too big; the computer was primitive.

People were learning about machines by using chess.

The process continues...

u0110001101101000
AdamovYuri wrote:

there wont be enough food nor air on the planet soon

<3

final_wars

I remember when I was young.

I thought that I knew everything.

I now know that I was actually just young and stupid.

chantaduro

skorj:  I can relate to what you're saying about playing a lesser chess engine at its full strength.  An old DOS program called Bluebush Chess was the first program I beat.  It played surprisingly well for how tiny it was.  Beating it did wonders for confidence and motivation.  A daughter found similar satisfaction when she finally crushed Chess Titans.

Bellyache99

Here's a tip for writing threads:

Make them short. I have no doubt that what you wrote is very interesting- I am myself fascinated by the idea of artificial intelligence. However, I didn't even bother to read more than two lines, and I'd guess most people would be the same.

u0110001101101000
Harbinger50000 wrote:

Here's a tip for writing threads:

Make them short. I have no doubt that what you wrote is very interesting- I am myself fascinated by the idea of artificial intelligence. However, I didn't even bother to read more than two lines, and I'd guess most people would be the same.

I don't mind reading long posts, but this one was really verbose.

I think the same thing could have been said in 10 sentences or less.

EscherehcsE

I think the OP's idea is neato keeno. (Yorps are known to use 60s vernacular.)

 

I'd like to see chess servers offer "Turing Games". A player would sign up for a Turing game and be matched with an opponent, not knowing whether that opponent was a human or an engine. I imagine that chat would be disabled, as it would be too easy to use chat to get Turing tips. After the game, the human would have to vote on whether he had just played a human or an engine. The server could then give the player instant feedback on whether he guessed right or wrong. The server could then save the game metadata into its Turing database and use that information to publish "Turing lists" so people could see which engines did best on the Turing scale.
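The bookkeeping for such a "Turing list" could be as simple as the sketch below. Everything in it is an assumption for illustration: the record layout, the names turing_record_t, record_vote, guess_correct and fooling_rate, and the choice to rank engines by the fraction of games in which the human voted "human".

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OPP_HUMAN, OPP_ENGINE } opponent_t;

/* Per-engine tally kept by a hypothetical server. */
typedef struct {
    unsigned games;         /* Turing games this engine has played */
    unsigned judged_human;  /* games in which the human voted "human" */
} turing_record_t;

/* Instant feedback for the player: 1 if the vote matched reality. */
int guess_correct(opponent_t actual, opponent_t guess) {
    return actual == guess;
}

/* Store the post-game vote cast by the human who faced this engine. */
void record_vote(turing_record_t *rec, opponent_t guess) {
    assert(rec != NULL);
    rec->games++;
    if (guess == OPP_HUMAN) {
        rec->judged_human++;
    }
}

/* Fraction of games in which the engine was mistaken for a human.
   Returns 0.0 for an engine that has not played yet. */
double fooling_rate(const turing_record_t *rec) {
    assert(rec != NULL);
    if (rec->games == 0) {
        return 0.0;
    }
    return (double)rec->judged_human / (double)rec->games;
}
```

An engine judged human in 3 of its 4 games would get a fooling rate of 0.75. A real list would presumably also want the same rate for actual human players as a baseline, since people sometimes get mistaken for engines too.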

 

I do wish the OP had given more specific information regarding which dumbed-down engines he has played (and which of those he either liked or didn't like), and which full-strength weak engines he's tried and either liked or disliked.

 

Regarding dumbed-down engines, has the OP tried the commercial HIARCS or Delfi Trainer engines? Those two are my favorites.

skorj

EscherehcsE wrote:

I'd like to see chess servers offer "Turing Games"... 


A fantastic notion that. It would provide an easy vehicle for developers interested in making engines that mimic the human thought process to measure their own progress, as well as how their efforts compare to others, without the need to recruit special volunteers for formal testing, and should provide sample sizes large enough to obviate concerns with the statistical significance of the results.

It's my notion that, however it is done, an important design constraint on chess software designed for this purpose is that its playing strength should fall within a certain range, roughly equal to, say, the average human club player. Chess players at this level make mistakes on a regular basis but know enough that it's usually not hard to tell what sort of thinking led to the mistake. It would therefore present the greatest challenge for developers hoping to replicate that sort of thinking while providing a large pool of human testers in the same general Elo range.

EscherehcsE wrote:

Regarding dumbed-down engines, has the OP tried the commercial HIARCS or Delfi Trainer engines? Those two are my favorites.

 

I have not. Are these single engines that can have various constraints put on their performance? If so, how do you have any sense of how their play compares to inherently weaker engines that play at roughly equivalent strength?

EscherehcsE

I assume that it would be best for the server to offer rating bands, say bands of 400-500 elo, 500-600 elo, 600-700 elo, etc, as high as they want to go with the bands.

 

Yeah, the Delfi Trainer engine can be set to an elo between 1000 and 2600. The HIARCS engine can be set to something like 750 elo to over 3000 elo. (The version I have, ver 13.2, is a bit strong on the low end; The 750 elo setting plays more like 1000 to 1100 elo.) I think the newest HIARCS engine is only available with the HIARCS Chess Explorer GUI. I have the single-core version of the engine. A multi-core version is also available for almost double the price.

 

Regarding free/demo versions of these two engines:

HIARCS doesn't really issue demo versions of its engines; The only available free version is an ancient DOS one from 1991. (You could run it using DosBox. Using the D-Fend Reloaded frontend for DosBox makes it easier to use. If you download D-Fend Reloaded, it also includes the DosBox program.)

http://www.hiarcs.com/freechess.htm

http://dfendreloaded.sourceforge.net/

 

There is a free demo version of the Delfi Trainer engine, but it only works at either full strength or at the 1000 elo setting. This is just the engine; You'll have to install it into your own GUI. Download the "Free Delfi" near the bottom of the page:

http://www.msbsoftware.it/delfi/

One trick that's available - You can also get an even older version of Delfi from the Wayback Machine. It's Delfi 4.6, which was before it went commercial. You can set it from 1250 to 1700 elo, but it's a Winboard-only engine. (Also, I'm not sure how accurate the elo settings are for this version.)

https://web.archive.org/web/20060506112927/http://www.msbsoftware.it/delfi/

skorj

I may even have an old copy of Delfi somewhere but I was never aware it had an adjustable Elo feature. In my experience with any system involving a strong chess engine with adjustable levels, the resulting play usually involved the computer making two or three powerful moves, countering my most subtle threats and exposing weaknesses in my own position I didn't know existed, followed by an unbelievably stupid blunder. This is why I prefer the naturally weak engines. At least they are consistent.

I found most of the engines I collected through the WBEC Ridderkerk site. Interestingly there's a statement on the site explaining that engine tournaments they used to run were stopped because, in essence, developers were copying too much code from each other. There's another sign that chess software developers need a new challenge.