Can you trust the stats in openings explorer? - Chess Forums - Page 2

NM ozzie_c_cobblepot

Feb 8, 2012

#21

I think it looks great, what do you mean?

Vease

Feb 8, 2012

#22

ozzie_c_cobblepot wrote:

I think it looks great, what do you mean?

Absolutely, I edited my previous post to make myself look smarter...

NM ozzie_c_cobblepot

Feb 8, 2012

#23

:-)

Michael-G

Feb 8, 2012

#24

Pawnpusher3 wrote:

Only if there is a good number of trials. If an opening was played once and has a 100% win rate, it might be terrible. But if it was played 7000 times with a 60% win rate, it's excellent.

How well a line or an opening scores in a database means nothing.

First, you must all realise that these are OTB tournament databases, which have very little usefulness in correspondence chess. An opening or a line that may be a very good surprise weapon OTB, and may score well there, may be useless in correspondence because the element of surprise doesn't exist.

Another important fact is that some openings or lines score very well only if you know all the lines very well and understand them perfectly. So the numbers you are seeing come from Master games (both opponents rated above 2200), and they actually mean nothing to most of you.

For example, the Sicilian Dragon scores 21.7% wins (for Black) in the Masters Games database, but the same line scores 33.7% in the Big database. If you remove the master games and keep only the games of dedicated Sicilian Dragon players, the percentage reaches an impressive 42.4%.

Why keep only the games of dedicated Sicilian Dragon players? Simple: those who play the Sicilian Dragon as a surprise weapon have a very low percentage (I didn't make exact calculations, but it's around 18%) for obvious reasons (it is a difficult opening that needs deep understanding).

What is the conclusion? If you know the Sicilian Dragon, it is a good line.

But isn't this true for every opening or line?
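A small Python sketch of how a pooled percentage can hide two very different groups. The two win rates are the ones quoted above for dedicated and "surprise weapon" Dragon players; the game counts are made up for illustration, since the thread does not say how many games each group played.

```python
def pooled_win_rate(groups):
    """Return the overall win rate for a list of (games, win_rate) groups."""
    total_games = sum(games for games, _ in groups)
    total_wins = sum(games * rate for games, rate in groups)
    return total_wins / total_games


# Win rates are taken from the post; the game counts are HYPOTHETICAL.
dedicated = (600, 0.424)  # dedicated Dragon players
surprise = (400, 0.18)    # occasional "surprise weapon" players

overall = pooled_win_rate([dedicated, surprise])
print(f"pooled win rate: {overall:.1%}")
```

Whatever counts you plug in, the pooled figure always lands somewhere between the two group rates, so on its own it tells you nothing about which group you belong to.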

P.S. There are other things that many of you don't know and that can also affect the numbers in a database. For example, the first and the last rounds of a tournament are different from the other rounds.

In the first round, players usually don't have time to prepare because they don't know their opponent, so it is not unusual for them to avoid their usual opening or line. In the last round, very few play for a win because very few have a reason to. Most chess players prefer a peaceful draw after an exhausting tournament: some because they are disappointed with their performance, others because they only care about packing and resting a bit before travelling to the next tournament.

It is very difficult to calculate how this affects the databases, as there are players who play every game for a win and don't care whether it is the first or the last round. But in my club, players with an overall draw rate of around 15-20% have around 30-35% draws in the last round. You do the math.

Pawnpusher3

Feb 8, 2012

#25

I'm not a stats expert, I'm just giving my opinion to a reasonable degree of certainty about the stats in the database.

Michael-G

Feb 8, 2012

#26

Don't take it personally, it's a common mistake that everyone makes. You try to interpret numbers without understanding the facts that create them, and that leads to wrong conclusions. There is no "reasonable degree of certainty". The only certainty is that:

"Numbers lie"

You can trust them ONLY if you know how and when they lie.

Think about this: 10 cars have an accident on the same day, on the same road. Can you conclude that the road is dangerous? I think not, because you don't know whether it was one accident involving 10 cars or 10 separate accidents. So how the numbers are produced can be more important than the numbers themselves.

Pawnpusher3

Feb 8, 2012

#27

That's similar to what I said. I said you need multiple trials before the stats are accurate :) you sound like an expert in stats, so I'll take your word on this though :D great posts btw, very informative.

NM shepi13

Feb 8, 2012

#28

I know what's wrong: it's transpositions. People who play Nbd5 from your position prior to that always win, but people who reach it through other move orders don't win as frequently. Trust the latter stat more, as it covers all of the games from that position.

NM ozzie_c_cobblepot

Feb 8, 2012

#29

@shepi13 Nope, that used to be a problem, but apparently chess.com fixed it. This was different. The OP just misread the stats.

Pawnpusher3

Feb 8, 2012

#30

I see, NM Ozzie.

Michael-G

Feb 10, 2012

#31

Pawnpusher3 wrote:

That's similar to what I said. I said you need multiple trials before the stats are accurate :) you sound like an expert in stats, so I'll take your word on this though :D great posts btw, very informative.

I took a stats class at university and had an excellent professor.

My teacher's favorite quote was:

"The best way to lie is with numbers"

Kingpatzer

Feb 10, 2012

#32

The big thing missing from the opening explorer stats is an idea of how players of various levels perform using the opening.

If players rated above 2600 play up rating points in the opening, but players rated under 2400 tend to play down rating points in the opening, that would suggest it's not a good choice for lower rated players even if the overall win percentage is good.
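A rough sketch of how that could be measured, assuming you had the games exported as (player rating, opponent rating, result) tuples. The explorer does not offer this, and the sample games below are hypothetical. It compares each result with the standard Elo expected score and averages the difference per rating band.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def over_performance_by_band(games, band_width=200):
    """Average (actual - expected) score per rating band.

    `games` holds (player_rating, opponent_rating, score) tuples, with the
    score given from the point of view of the player using the opening.
    """
    totals = {}
    for rating, opponent, score in games:
        band = (rating // band_width) * band_width
        diff_sum, count = totals.get(band, (0.0, 0))
        diff = score - expected_score(rating, opponent)
        totals[band] = (diff_sum + diff, count + 1)
    return {band: s / n for band, (s, n) in sorted(totals.items())}


# HYPOTHETICAL sample games, not taken from any real database.
games = [
    (2650, 2650, 1.0),
    (2650, 2650, 0.5),
    (2350, 2350, 0.0),
    (2350, 2350, 0.5),
]

result = over_performance_by_band(games)
for band, diff in result.items():
    print(f"{band}-{band + band_width_hint - 1}: {diff:+.2f} per game vs. Elo expectation"
          if (band_width_hint := 200) else "")
```

A positive number for a band means players in that band score better with the opening than their ratings predict; a negative number means they score worse.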

Michael-G

Feb 10, 2012

#33

Kingpatzer wrote:

The big thing missing from the opening explorer stats is an idea of how players of various levels perform using the opening.

If players rated above 2600 play up rating points in the opening, but players rated under 2400 tend to play down rating points in the opening, that would suggest it's not a good choice for lower rated players even if the overall win percentage is good.

You are right, but there is a lot more missing. If you want to get an idea of how effective an opening is, shouldn't you be able to see only the games of the experts in that opening?

There are openings in which, if you exclude all the non-experts, the numbers change dramatically. That is because there is a factor many don't consider: how easy a position is to handle. An opening position may be equal, but that says nothing about how easy it is to handle the middlegame that follows. One of the main lines in the Catalan (1. d4 d5 2. c4 e6 3. Nf3 Nf6 4. g3 Be7 5. Bg2 c6) scores an impressive 55% wins for White (212 games) in the Big Database and only 30% in the Masters database (played only 10 times!!!). According to the books, Black equalises easily, and in various ways, in that line.

So who tells the truth and who lies?

How can a line that gives Black easy equality score 55% for White?
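Part of the answer may simply be sample size. As a rough sketch, here is a Wilson score interval for the two win percentages quoted above. It assumes the games are independent and treats each game as win / not-win for White; the win counts (117 of 212, 3 of 10) are back-calculated from the rounded percentages, so they are approximations.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win proportion (z=1.96 is about 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half


# Win counts are APPROXIMATE, derived from the rounded percentages in the post.
big = wilson_interval(117, 212)    # about 55% of 212 games
masters = wilson_interval(3, 10)   # 30% of 10 games

print(f"Big database:     {big[0]:.1%} to {big[1]:.1%}")
print(f"Masters database: {masters[0]:.1%} to {masters[1]:.1%}")
```

With only 10 games the Masters interval comes out very wide (roughly 11% to 60%) and still contains the 55% seen in the Big Database, so on these numbers alone the two databases are not necessarily contradicting each other.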

Another very nice thing my stats teacher used to say is:

"Stats is like a bikini: it reveals a lot but hides the most important part"