Every year in March tens of millions of people predict the future. They are participants in the March Madness (Big Dance) forecasting ritual: filling out the National Collegiate Athletic Association (NCAA) Men’s Basketball Tournament bracket. They do it for fun, fame and fortune. Often there are monetary rewards for good predictions, but bragging rights for winning the office or family pool may offer more personal satisfaction than the money.
Two students at Ball State University, Tim Hoblin and Cody Kocher, decided to combine their passion for basketball with their desire to learn more about predictive modeling. As students in the honors program they were required to produce an honors thesis. Several months before the start of the 2017 NCAA Basketball Tournament, they asked me, their thesis advisor, if using predictive modeling techniques to build better brackets would be an acceptable thesis topic. My answer was immediate and affirmative; pursuing a passion and having fun can enhance the learning process.
As with many sports, there is an abundance of publicly available information about college basketball. There are statistics about an individual team’s performance during the regular season prior to the tournament, such as points per game, margin of victory (or loss), shooting percentage, turnovers and rebounds. There are also measures, such as strength of schedule and tournament seed, that are determined through algorithms and judgment. Kocher and Hoblin used data from 2006 through 2016 and the results of the NCAA Tournaments in each of those years to build their predictive models.
Building predictive models for the Big Dance has been going on for years. A search of the internet will turn up many models. Kaggle, a company that sponsors predictive modeling and analytics competitions, has held NCAA Tournament predictive modeling competitions.
The NCAA Men’s Basketball Tournament starts with 68 teams from Division I, generally the colleges and universities in the U.S. with the strongest athletic programs. In the First Four play-in games, eight of the lower-ranked teams are paired off against one another, and the four winners advance to the First Round. The First Round (the Round of 64) consists of 32 games, with each of the 64 teams competing against another for a spot in the Second Round. The Second Round (the Round of 32) has 16 games, which again eliminate half of the teams. Each subsequent round eliminates half of the remaining teams until the champion emerges in the National Championship final game.
What is a bracket? It is a series of predictions for the winning teams in each game throughout the entire tournament, from the First Round until the National Championship. So, a bracket has 32 + 16 + 8 + 4 + 2 + 1 = 63 predictions.
How do you score your success with your bracket? What is the objective function that you want to maximize? A commonly used scoring method assigns weighted points for each correct prediction. For each correct game prediction in the First Round you get one point. For each correct prediction in the Second Round you get two points. In general, if N is the round, then you get 2^(N-1) points for each correct game prediction in that round. Because the number of games halves each round while the points per game double, the total number of points possible in each round is a constant 32. If you build a perfect bracket (good luck with that!), then you would attain the maximum possible 6 x 32 = 192 points. ESPN uses this system but multiplies the values by 10.
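The arithmetic behind this scoring rule can be checked with a short sketch. It reads the per-game value for round N as 2 to the power N-1, which matches one point in the First Round and two in the Second. The function names and the data layout (a dictionary mapping each round number to the teams picked or observed to win in that round) are illustrative assumptions, not the format used by ESPN or by Kocher and Hoblin.

```python
ROUNDS = 6               # First Round through National Championship
FIRST_ROUND_GAMES = 32   # games in the Round of 64


def points_per_game(round_number):
    """Points for one correct pick in the given round: 1, 2, 4, 8, 16, 32."""
    return 2 ** (round_number - 1)


def games_in_round(round_number):
    """Number of games in the given round: 32, 16, 8, 4, 2, 1."""
    return FIRST_ROUND_GAMES // 2 ** (round_number - 1)


def score_bracket(picks, results):
    """Score a bracket under the weighted system.

    Both arguments are hypothetical structures: dicts mapping a round
    number to the collection of teams picked (or observed) to win a game
    in that round. A pick earns points when the team won in that round.
    """
    total = 0
    for round_number in range(1, ROUNDS + 1):
        correct = set(picks.get(round_number, ())) & set(results.get(round_number, ()))
        total += points_per_game(round_number) * len(correct)
    return total


total_games = sum(games_in_round(r) for r in range(1, ROUNDS + 1))
max_per_round = [games_in_round(r) * points_per_game(r) for r in range(1, ROUNDS + 1)]
max_score = sum(max_per_round)

print(total_games)      # 63 predictions in a bracket
print(max_per_round)    # 32 points available in every round
print(max_score)        # 192, or 1920 with the ESPN multiplier of 10
```

Scoring by the set of teams that win in each round is a simplification, but it gives the same total as game-by-game scoring because a team can win at most one game per round.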
Kocher and Hoblin analyzed the data and built their predictive models using generalized linear models (GLMs) and random forests in R; Excel was also used extensively in the project. As good modelers do, they constructed models using subsets of the data and then validated and scored them with other data subsets. Because some predictive data was highly correlated with other data, they applied principal component analysis to create new predictors. They also had to deal with basketball rule changes relevant to their modeling. For example, in 2009 the three-point line was moved back a foot, lowering the three-point shooting percentage, and in 2016 the shot clock was reduced from 35 seconds to 30 seconds, which increased the pace of play and points scored per game.
After their models were built and tested, they entered 24 brackets on ESPN.com: eight GLM-modeled brackets, 12 random forest brackets, one bracket from a computer model they designed (the beginning of a stochastic model) and three control brackets. The ESPN system scored their brackets and ranked their success as the tournament unfolded against the other 18.8 million brackets that had been submitted. Two of the control brackets were those filled out by Kocher and Hoblin individually relying on their own intuitions. The third control was a bracket constructed by choosing the higher seeded team in each game.
After the first day of the tournament, with 16 games completed, their brackets were doing extremely well. Five of their 24 brackets, or 20.8 percent, had correctly predicted the winners of all 16 games. Among the 18.8 million submissions to ESPN, only 0.8 percent were perfect after the first 16 games. When their average score for all 24 brackets was compared against the 58,000 other group entries that included a similar number of brackets, they were ranked an impressive 7th out of 58,000.
Seeing the results after the first day, I congratulated them but they immediately replied that it would not last. As they discovered in their modeling, it becomes increasingly difficult to make good predictions in later rounds of the tournament. As stronger teams defeat weaker teams earlier in the tournament, the winners face increasingly stronger teams as the tournament progresses. Predictions become less reliable.
Their standing took a big hit with the defeat of Villanova by the University of Wisconsin in the second round. Sixteen of their brackets had Villanova as the national champion. None of their brackets had predicted that the University of North Carolina would become national champion.
Removing the three control brackets, 17 out of their 21 submissions finished above the 50th percentile among the 18.8 million submissions when the tournament ended. If there is a 50 percent chance that any one submission would finish above the 50th percentile, then the probability of getting 17 or more out of 21 submissions above the 50th percentile is only 0.36 percent. The two controls based on their individual intuitions did not fare well, ending up below the 50th percentile.
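The 0.36 percent figure can be reproduced as the upper tail of a binomial distribution. The sketch below assumes, as the calculation above implicitly does, that each of the 21 brackets independently has a 50 percent chance of finishing above the 50th percentile. That independence is a simplification, since brackets produced by related models are likely to be correlated. The function name is illustrative.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 17 or more of 21 brackets above the 50th percentile, each with p = 0.5
prob = binomial_upper_tail(21, 17)
print(f"{prob:.2%}")  # 0.36%
```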
The control bracket constructed by choosing the higher seeded team in each game finished at a respectable 72nd percentile among the 18.8 million submissions. In their modeling and testing, Kocher and Hoblin had observed that tournament seed was a powerful predictive variable and decided to make this simple model a benchmark to compare against. If you are filling out a bracket, you probably will beat the majority of your friends and colleagues with this simple algorithm.
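The higher-seed rule is simple enough to write down directly. The sketch below applies it to a single 16-team region using the standard first-round seed pairings, where the "higher" seed is the one with the lower seed number. The function and variable names are illustrative, not taken from the thesis.

```python
# First-round seed pairings within one 16-team region, in bracket order:
# 1 v 16, 8 v 9, 5 v 12, 4 v 13, 6 v 11, 3 v 14, 7 v 10, 2 v 15.
REGION_SEEDS = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def pick_higher_seed(seed_a, seed_b):
    """Return the higher-seeded team, i.e. the lower seed number."""
    return min(seed_a, seed_b)


def higher_seed_region(seeds):
    """Play out one region, always advancing the higher seed.

    Returns a list with one entry per round, each entry being the list
    of seeds predicted to win in that round.
    """
    rounds = []
    current = list(seeds)
    while len(current) > 1:
        current = [
            pick_higher_seed(current[i], current[i + 1])
            for i in range(0, len(current), 2)
        ]
        rounds.append(current)
    return rounds


for round_number, winners in enumerate(higher_seed_region(REGION_SEEDS), start=1):
    print(round_number, winners)
# 1 [1, 8, 5, 4, 6, 3, 7, 2]
# 2 [1, 4, 3, 2]
# 3 [1, 2]
# 4 [1]
```

Under this rule all four regions send their No. 1 seed to the Final Four, so the last three games pit equal seeds against each other and need a tiebreaker. The article does not say how those games were decided in the control bracket, so the sketch stops at the regional final.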
Kocher and Hoblin graduated in May and have started their actuarial careers. Kocher joined Nyhart and Hoblin will be at Allstate. Their thesis is online at cardinalscholar.bsu.edu.
Curtis Gary Dean, FCAS, is the Distinguished Professor of Actuarial Science at Ball State University in Muncie, Indiana.