Are players with the same rating statistically alike? Not necessarily: there can be huge differences. One player may be very good at beating weaker opposition, while another frequently loses to much lower-rated opponents. Likewise, one player may rarely score against higher-rated players, while another frequently scalps them.
In the 70s some ratings had an m (many), ? (unreliable) or ?? (unreliable) attached to them, based on the number of games played. The shortcoming of that scheme was that inconsistency is not directly connected to the number of games played.
My idea is to publish a Variability score for each player, which is a non-negative number from 0 to potentially several hundred. (It will be not applicable if no games have been played.) It will be based on a player's PERFORMANCE rating and results for the rating period. A low number indicates consistent, predictable results, while a high number indicates widely ranging and variable results.
The very simple (but effective) algorithm is:
var = (total of diffs)/number of opponents
where diff is the difference in rating if the lower-rated player wins, or half the difference in rating if the game is a draw. (If the higher-rated player wins, diff is zero.)
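As a rough sketch of the calculation above, here is how it might look in Python. The function names, the (opponent rating, score) layout of each game, and the choice to return None when no games have been played are my own assumptions for illustration, not part of any existing rating software.

```python
def upset_diff(player_rating, opponent_rating, score):
    """Return diff for one game.

    score is from the player's point of view: 1.0 win, 0.5 draw, 0.0 loss
    (an assumed encoding).
    """
    gap = abs(player_rating - opponent_rating)
    if score == 0.5:
        return gap / 2  # a draw counts for half the rating difference
    player_is_lower = player_rating < opponent_rating
    player_won = score == 1.0
    # The lower-rated side won if the player is lower and won,
    # or the player is higher and lost.
    if gap and player_is_lower == player_won:
        return float(gap)
    return 0.0  # the higher-rated side won, or the ratings were equal


def variability_score(player_rating, games):
    """var = (total of diffs) / number of opponents; None if no games."""
    if not games:
        return None  # "not applicable"
    total = sum(upset_diff(player_rating, opp, s) for opp, s in games)
    return total / len(games)


# A 1500 player beats a 1700, draws a 1400, beats a 1300, loses to a 1600.
games = [(1700, 1.0), (1400, 0.5), (1300, 1.0), (1600, 0.0)]
print(variability_score(1500, games))  # (200 + 50 + 0 + 0) / 4 = 62.5
```

Only the first two games are upsets here, so only they contribute to the total.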
One option is to calculate two scores, one for results against higher-rated opponents and another for results against lower-rated opponents. The scores will tend to be similar, but perhaps not identical because of bonuses and other specific features of the rating system.
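A minimal sketch of the two-score option, in Python, might look like this. It assumes each score is divided by the number of opponents in its own group (the text above does not say which denominator to use), that games are given as (opponent rating, score) pairs with 1.0 for a win, 0.5 for a draw and 0.0 for a loss, and that opponents with exactly the player's rating are left out of both groups. The function name is hypothetical.

```python
def split_variability(player_rating, games):
    """Return (score vs higher-rated, score vs lower-rated) opponents.

    Assumption: each score is averaged over its own group of opponents.
    A group with no games yields None (not applicable).
    """
    higher = [(opp, s) for opp, s in games if opp > player_rating]
    lower = [(opp, s) for opp, s in games if opp < player_rating]

    # Against higher-rated opponents the player's own score is the upset:
    # a win counts the full gap, a draw half of it, a loss nothing.
    up_total = sum((opp - player_rating) * s for opp, s in higher)
    # Against lower-rated opponents the opponent's score is the upset.
    down_total = sum((player_rating - opp) * (1 - s) for opp, s in lower)

    vs_higher = up_total / len(higher) if higher else None
    vs_lower = down_total / len(lower) if lower else None
    return vs_higher, vs_lower


games = [
    (1800, 1.0), (1600, 0.5), (1700, 0.0), (1600, 0.0),  # higher-rated
    (1400, 0.5), (1300, 0.0), (1400, 1.0), (1200, 1.0),  # lower-rated
]
print(split_variability(1500, games))  # (87.5, 62.5)
```

Here the player's results against stronger opposition vary a little more (350/4) than the results against weaker opposition (250/4).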
A small or large score is neither good nor bad, but gives some information about that player. I chose "variability" because some of the other choices (reliability, consistency, stability, predictability, etc.) have negative connotations. (Another idea is to call it a variance rather than a variability score.)
An example: Player A's PR is 1500 and he plays 20 games in the rating period. In 15 games the higher-rated player wins, but he draws with a 1400 and a 1600, beats an 1800, and loses to a 1300 and a 1400. His variability score is (50+50+300+200+100)/20 = 35. If two of the other 15 games had instead been a win over a 2000 and a loss to a 1000, his var would be 1700/20 = 85. I suggest that scores over 100 indicate very wide variability, while scores under 10 indicate very low variability.
It might be interesting for related information to be printed, e.g. the lowest, highest and average variability for a given chess player population (by state, age, rating, etc.). Another idea is to print the "range" of each player (the lowest-rated and highest-rated opponents against whom an upset occurred). In fact, stats on upsets could be interesting too.
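The "range" idea could be sketched in Python as follows. This assumes a draw counts as an upset (since a draw contributes half the rating difference to the variability score), and again takes games as (opponent rating, score) pairs from the player's point of view. The function name is made up for this illustration.

```python
def upset_range(player_rating, games):
    """Return (lowest, highest) opponent rating involved in an upset.

    Assumption: a draw counts as an upset. Returns None if the player
    had no upsets in the period.
    """
    upset_opponents = [
        opp
        for opp, score in games
        # scored against a higher-rated opponent ...
        if (opp > player_rating and score > 0.0)
        # ... or dropped points against a lower-rated one
        or (opp < player_rating and score < 1.0)
    ]
    if not upset_opponents:
        return None
    return min(upset_opponents), max(upset_opponents)


# A 1500 player: draws with a 1400 and a 1600, beats an 1800,
# loses to a 1300 and a 1400, plus two expected results.
games = [
    (1400, 0.5), (1600, 0.5), (1800, 1.0), (1300, 0.0), (1400, 0.0),
    (1700, 0.0), (1200, 1.0),
]
print(upset_range(1500, games))  # (1300, 1800)
```

The expected loss to the 1700 and the expected win over the 1200 do not affect the range.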
A table could be printed whereby players could predict the probability of an upset based on ratings and variability scores. Predictions based on Elo differences alone are not that good. There are far more upsets when the difference is more than 300 points than the 1% or so predicted by the Elo system.
This score could provide useful feedback to a player, to be interpreted in their own way. (Is my wide var score due to my rapid improvement, or am I prone to blunders? Is my low score due to a solid style, or have I a psychological weakness against higher-rated opponents?) It may help to predict the results of individual games, and people may find some other uses.
It's an easy idea to implement, so I suggest that the ACF consider adopting it.