I've been looking at the description of Glicko-2 at http://glicko.net/glicko/glicko2.pdf
Before calculating anyone’s rating, rating deviation (RD), or volatility (sigma), a value tau needs to be chosen, which “constrains the change in volatility over time.” Different abstract games may call for different values of tau.
The Little Golem server features lots of abstracts and has attracted many strong players of these games, relatively speaking. For an obscure game, 100 players is a lot. The Elo system is used there, which IMO is probably the best choice with under 1000 players, because of greater transparency without losing much if any accuracy.
I’m just curious about how this value tau should be arrived at. LG has a large database of game outcomes, and everyone’s rating at the start of a tournament is listed. Also each player has their own graph of rating over time. I get that such data could be used to see what effect different values of tau would have on the overall predictive accuracy of a Glicko-2 rating system. But I’m unclear on how accuracy should be measured. Maybe just browsing a sample of ratings graphs would provide a reasonable estimate of tau for a specific game, but I have at best a vague idea of how to do that.
Thanks for any clue.
Heh, maybe I should ask Professor Glickman?
Doesn't the linked paper suggest values for tau? (See Step 1.)
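Step 1 of that paper says reasonable choices for tau are between 0.3 and 1.2, and that the system should be tested to see which value gives the greatest predictive accuracy. On how to measure accuracy: one common option (my suggestion, not something the paper prescribes) is the average log loss of the pre-game win probabilities against the actual results, where lower is better. Here is a rough Python sketch of that. The functions g and expected_score follow the formulas in Step 3 of the paper, and to_glicko2 is the Step 2 scale conversion. Everything else is hypothetical: log_loss, best_tau, and especially the evaluate callable, which stands in for "replay the LG game database through a Glicko-2 implementation with this tau and return the log loss". Using the Step 3 expected score as the prediction is also an assumption on my part.

```python
import math

GLICKO2_SCALE = 173.7178  # conversion constant from Step 2 of the paper


def to_glicko2(rating, rd):
    """Convert a rating and RD from the familiar 1500-centred scale to (mu, phi)."""
    return (rating - 1500.0) / GLICKO2_SCALE, rd / GLICKO2_SCALE


def g(phi):
    """Step 3: discounts a rating difference by the opponent's uncertainty."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_score(mu, mu_j, phi_j):
    """Step 3: expected score of a player (mu) against opponent (mu_j, phi_j)."""
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))


def log_loss(predictions):
    """Average log loss over (p, s) pairs.

    p is the predicted score before the game, s the actual result
    (1 = win, 0.5 = draw, 0 = loss). Lower means better predictions.
    """
    eps = 1e-12  # keeps log() finite if a prediction is exactly 0 or 1
    total = 0.0
    count = 0
    for p, s in predictions:
        p = min(max(p, eps), 1.0 - eps)
        total -= s * math.log(p) + (1.0 - s) * math.log(1.0 - p)
        count += 1
    return total / count


def best_tau(candidates, evaluate):
    """Return the candidate tau with the lowest loss.

    evaluate is a hypothetical, user-supplied function: it should rerun the
    rating history with the given tau and return the resulting log loss.
    """
    return min(candidates, key=evaluate)
```

The idea would be to try a handful of values, say 0.3, 0.5, 0.75, 1.0 and 1.2, and for each one walk through the games in date order, recording expected_score from the ratings as they stood before each game and only then updating them. Whichever tau gives the lowest log_loss on games the ratings had not yet seen is the one to prefer for that game. Eyeballing rating graphs probably won't tell you much by comparison.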