On a post-tournament high, I decided to peruse my favorite roller derby stats robot, The All-Knowing Derbytron, to see how my Minnesota RollerGirls were stacking up after almost defeating North Central's first-ranked Windy City in the championship game of Mayhem.
I was totally surprised to see that Derbytron’s algorithm had actually dropped Minnesota down one point below Detroit in his post-Mayhem rankings, even though Detroit came in fourth and Minnesota placed second in the tournament.
Math is hard.
I’ll give Derbytron a break. Let’s just take a look at my second favorite roller derby stats site, FlatTrackStats.com. Except FTS is currently ranking Minnesota over Windy City Rollers, despite the thrilling but statistically expected Minnesota loss to them.
So who’s right?
Let’s look at who did the best in predicting the outcome of the Monumental Mayhem. We’re going to consider both Derbytron’s and Flat Track Stats’ algorithms, as well as WFTDA’s member-voted system and whatever DNN uses to make their power rankings.
For each team, we take the absolute difference between its predicted rank and its final tournament rank. If one method predicted team X to come in 4th place but team X actually ended up in 8th, the difference would be 4. We do that for all the teams in the tournament and add up the differences. A method with a lower score does better, and a score of 0 means that method predicted the final standings perfectly. This is how they ranked.
- Derbytron 
- FTS 
- DNN [8-12*]
- WFTDA 
DNN gets a variable score because their Power Rankings don’t go low enough to rank 3 of the teams in the NC tournament. Best case scenario they would get a score of 8 and worst case a score of 12.
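For anyone who wants to play along at home, the scoring described above can be sketched in a few lines of Python. This is only a rough illustration: the function name `rank_error` and the four placeholder teams are made up for the example and are not the actual Mayhem standings or anyone's real predictions.

```python
def rank_error(predicted, actual):
    """Sum of absolute differences between predicted and final placement.

    Both arguments map team name -> rank (1 = best). A result of 0 means
    the prediction matched the final standings exactly.
    """
    return sum(abs(predicted[team] - actual[team]) for team in actual)


# Hypothetical data for illustration only, not real tournament results.
actual = {"Team A": 1, "Team B": 2, "Team C": 3, "Team D": 4}
predicted = {"Team A": 2, "Team B": 1, "Team C": 3, "Team D": 4}

# Team A and Team B are each off by one place, the others are exact.
print(rank_error(predicted, actual))  # 2

# The example from the text: predicted 4th, finished 8th.
print(rank_error({"Team X": 4}, {"Team X": 8}))  # 4
```

Lower is better, so a method that nails every placement scores 0.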
OK, so Derbytron did the best of the four at predicting the tournament, but the question becomes: can we do better? When meteorologists forecast the weather they use different models, and the most accurate forecast is usually the average of all the competing models rather than any single model. What if we take the same approach with derby rankings?
If we remove DNN from the math, leave WFTDA in as our human element, and average the remaining three methods, we end up with a score of 11, which isn’t great. I removed DNN because their rankings lacked the 8th, 9th, and 10th place teams. I plan on running these numbers again with Championals, since I’ll be able to leave DNN in and remove the WFTDA rankings.
What if we look only at the nerds? If we combine Derbytron and FTS, we end up with a score of 3, which is incredible! We can refer to this average of the two methods as ‘Derby-Track-Stats’ or DTS.
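Here is one way the averaging could be done, as a minimal sketch. It assumes each method gives every team a rank, averages those ranks per team, and then re-ranks the teams by that average, with ties broken alphabetically. The tie-break rule, the function name `combine_rankings`, and the sample data are all my own assumptions for illustration, not the real Derbytron or FTS numbers.

```python
def combine_rankings(*rankings):
    """Average each team's rank across methods, then re-rank by the average.

    Each ranking maps team name -> rank (1 = best). Ties in the averaged
    rank are broken alphabetically, which is an arbitrary choice.
    """
    teams = list(rankings[0])
    average = {
        team: sum(r[team] for r in rankings) / len(rankings) for team in teams
    }
    ordered = sorted(teams, key=lambda team: (average[team], team))
    return {team: place for place, team in enumerate(ordered, start=1)}


# Hypothetical predictions from two methods, for illustration only.
method_one = {"Team A": 1, "Team B": 3, "Team C": 2, "Team D": 4}
method_two = {"Team A": 1, "Team B": 2, "Team C": 4, "Team D": 3}

combined = combine_rankings(method_one, method_two)
print(combined)
# {'Team A': 1, 'Team B': 2, 'Team C': 3, 'Team D': 4}
```

The combined ranking can then be scored against the final standings the same way as any single method.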
The main conclusion we can draw from this is that WFTDA needs to ditch the voted ranking system. Voting was the best solution in the very early days of modern roller derby, but with more bouts and consistent reffing it becomes obvious that some teams who should have been invited to the regional tournaments were not.
I’ll be using the DTS system to complete my Championals brackets in a couple of weeks and I’m excited to win the Bracket Bonanza.