Jam Expectancy Study: Introduction
The essence of roller derby is 5 skaters on each team: 1 jammer who must get through a pack of blockers, and 4 blockers who must both aid their own jammer and impede the opponent’s. At the beginning of every bout, these 10 skaters line up on the track. The odds are even, and each team has to account for the same number of opposing skaters. This is always true in the first jam, but it’s not really how roller derby is played throughout the bout. Eventually a skater is sent to the penalty box, and suddenly one team has a numbers advantage while the other is playing shorthanded. Roller derby is, as much as anything else, a game of varying pack situations. To be successful, a team has to learn how to manage the penalty box and perform effectively with smaller packs.
Imagine for a minute that the jam begins with one team holding a 4vs3 pack advantage over the opposition. This team gets their jammer out of the pack first with lead, and ultimately turns it into a 20 jam win. Let’s take a moment to evaluate that result. Do you think the team performed well in this jam? Do not forget about the advantage in skaters on the track. Did they translate that pack advantage into a satisfactory outcome? What basis are you using to make that judgement? What do you think one team can reasonably expect to get out of a pack advantage like this?
I’d like to introduce you to a concept I call “jam expectancy.” Basically, it’s the expected result of a jam based on the situation: a baseline of what the average outcome of a jam will be. For example, a 4vs4 jam with neither jammer in the box has a simple, zero expectancy. Since the odds are even with even packs, the average performance for a team in that situation is +0 points. The 4vs3 situation I gave you above is more complicated than that. The team in question has an advantage, so they have a positive expectancy. They’re expected to win the jam in that situation, and there is an average number of points they are expected to win it by.
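To make the idea concrete, here is a minimal sketch of how an expectancy figure could be computed: group jams by pack situation and average the point differential within each group. The record layout (`own_pack`, `opp_pack`, `point_diff`) and the function name `jam_expectancy` are assumptions made for this illustration, and the sample jams are made up. They are not results from the study.

```python
from collections import defaultdict


def jam_expectancy(jams):
    """Return the average point differential for each pack situation.

    `jams` is an iterable of (own_pack, opp_pack, point_diff) tuples,
    a hypothetical record layout chosen for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # situation -> [sum of diffs, jam count]
    for own_pack, opp_pack, point_diff in jams:
        entry = totals[(own_pack, opp_pack)]
        entry[0] += point_diff
        entry[1] += 1
    return {
        situation: total / count
        for situation, (total, count) in totals.items()
    }


# Made-up jams for illustration only; NOT data from the study.
sample_jams = [
    (4, 3, 4), (4, 3, 0), (4, 3, 5),  # three jams with a 4vs3 advantage
    (4, 4, 3), (4, 4, -3),            # two jams with even packs
]

expectancy = jam_expectancy(sample_jams)
for situation, value in sorted(expectancy.items()):
    print(f"{situation[0]}vs{situation[1]}: {value:+.2f} points")
```

With these invented numbers, the 4vs3 group averages out positive and the 4vs4 group averages out to zero, which is the shape of result the concept describes.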
Did I lose you? Let me backtrack a minute. There is a baseball statistic called “run expectancy” which is the inspiration here. Statisticians have taken the thousands and thousands of samples of each situation that can come up in a baseball game (such as a runner on second with 1 out), and they calculated an average performance for the rest of the inning based on each situation. The result is a chart you can use to get a pretty accurate idea of the chances of scoring in any situation. Beyond that, you can use such a chart to evaluate a strategy like a sacrifice bunt (would it increase or decrease the chances of scoring?). If you want to see what a run expectancy chart looks like, you can find one here. It’s a brilliant tool for analyzing the sport because it assigns a numerical value to each situation.
Well, I had the idea of doing an in-depth statistical study and gathering some expectancy data for roller derby.
This is the point where I should probably tell you that the data from this study and the resulting analysis are the subject of a series of articles, of which this is the first. We’ll be taking a look at all of the objective data and seeing what conclusions we can draw from it. That includes examining things such as how pack situation affects average jam results, the impact of power jams, etc. From that, we’ll be able to weigh conventional derby assumptions and strategies against objective data.
To begin, let me explain the parameters of this study. Roller derby competition isn’t structured like professional baseball. MLB teams play 162 games in a defined season, and the level of skill is comparable between all 30 teams. Neither of those statements is true of roller derby. Roller derby has no defined season with each team playing a set number of bouts. There’s a much smaller number of games played each year. A lot of those games are also blowouts, featuring 2 teams with mismatched skill levels. In order to come up with some useful expectancy data, we would need a set of scheduled bouts between similarly skilled teams.
There were 12 teams that made the WFTDA 2011 Championships in Denver, Colorado: Gotham, Philly, Charm City, Oly, Rocky Mountain, Rose City, Texas, Kansas City, Nashville, Windy City, Minnesota, and Naptown. You can argue that there were comparable teams that did not make the cut, but this criterion satisfies the need for an objective cutoff point. In the WFTDA Big 5 tournament structure, these 12 teams were matched up against one another 20 times: there were the 12 bouts in Denver, plus 2 in each of the four regional playoff tournaments (each regional championship + the semifinal which featured the eventual third place finisher). 20 bouts is a decent enough sample size that we might be able to get reasonably accurate averages. I collected the data for this study using the WFTDA Big 5 video archive.
For tracking pack situations, I’m using a stat I created called Initial Pack Strength, or iPS for short. It’s simply the number of pack skaters each team had on the track. The beginning of the jam is generally when lead jammer is decided, so it marks the point where pack situation has the greatest impact on the outcome of a jam. Please note: I tracked the pack strength for each team as of the moment the jammers were released, not on the first whistle. This is an important distinction because I essentially adjusted for poodling and for slow starts that released blockers from the penalty box. I wanted the most accurate numbers possible. If one team poodles a skater, and the other slow-starts to get a blocker back in, suddenly a 3vs3 pack situation becomes a 2vs4. I chose to use the data as of the moment the jammers are released because it’s easy to identify, and it’s the actual situation each jammer faced.
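The adjustment described above is simple arithmetic, sketched below. The function name `initial_pack_strength` and its parameters are hypothetical, chosen only to illustrate the bookkeeping between the first whistle and the jammer release. This is not the tracking sheet actually used for the study.

```python
def initial_pack_strength(at_first_whistle, sent_to_box=0, returned_from_box=0):
    """Pack skaters on the track at the moment the jammers are released.

    All three parameters are assumptions for this sketch:
    at_first_whistle  -- pack skaters on the track at the first whistle
    sent_to_box       -- skaters who left for the box before jammer release
                         (for example, by poodling)
    returned_from_box -- blockers released from the box before jammer release
                         (for example, during a slow start)
    """
    return at_first_whistle - sent_to_box + returned_from_box


# The example from the text: a 3vs3 start where one team poodles a skater
# and the other slow-starts long enough to get a blocker back.
team_a = initial_pack_strength(3, sent_to_box=1)
team_b = initial_pack_strength(3, returned_from_box=1)
print(f"{team_a}vs{team_b}")
```

Recording the situation at jammer release rather than at the first whistle means the 3vs3 start in this example is logged as the 2vs4 the jammers actually faced.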
Before we wrap up this introduction, I’ll give you some data to whet your stats appetite. In order to understand individual jam expectancy and how it applies over a bout, let’s examine some basic period/bout numbers. Take a look at how scoring and jam length can change as the game unfolds:
           | Avg Number of Jams | Avg Points Scored Per Team | Avg Team Points Per Jam
Period 1   | 21.75              | 59.80                      | 2.75
Period 2   | 20.60              | 73.18                      | 3.58
BOUT TOTAL | 42.35              | 133.58                     | 3.15

We can see here that the scoring increases in the second period, but the average number of jams decreases. This is undoubtedly the result of thinner packs. As the bout goes on, packs are thinned by penalties and scoring increases. This is partially due to ghost points, but it is also a result of mismatched pack strengths. When one team has an extended pack advantage, they often stymie the opposing jammer. Since jams are usually called off to prevent the opposing team from scoring, having one jammer stuck in the pack more often leads to longer jams and more points being scored. The same thing goes for power jams, which would logically increase in the second half as penalties mount. Another factor here could be large point spreads. If one team has a significant lead in the second period, they might choose to simply bleed the clock and let longer jams play out. The longer the jams run, the fewer jams can fit into a period or a bout.
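To put a size on that shift, the short sketch below computes the percentage change from Period 1 to Period 2 for each column, using only the averages quoted in the table. The dictionary keys and the `pct_change` helper are names invented for this illustration.

```python
# Averages copied from the period table; the key names are assumptions.
period_1 = {"jams": 21.75, "points_per_team": 59.80, "points_per_jam": 2.75}
period_2 = {"jams": 20.60, "points_per_team": 73.18, "points_per_jam": 3.58}


def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


changes = {
    stat: pct_change(period_1[stat], period_2[stat])
    for stat in period_1
}

for stat, change in changes.items():
    print(f"{stat}: {change:+.1f}%")
```

On these figures, the second period has roughly 5% fewer jams while team points per jam rise by about 30%.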
In the next article, we’ll start getting into the meat of the data. You can expect a breakdown of individual pack situations, how often they come up, and the expected results from them. This is going to be fun.