Analytics have long been a source of useful information across various industries, including sports. While it may not be obvious to the casual football fan, the National Football League (NFL) started using league-wide analytics about five years ago. Recognizing a need to grow its analytics and data collection process, the NFL created Next Gen Stats (NGS), which include location, speed, acceleration and velocity for all players on the field, and started using NGS on Thursday Night Football in 2015. In 2016, the league began providing the data to individual teams and distributing it more widely to fans.


Since then, the league has further revolutionized the use of analytics in a collaboration with Amazon Web Services (AWS), which has led to the NFL’s annual Big Data Bowl, a football analytics competition that is now part of the NFL Scouting Combine held annually in Indianapolis. The competition affords college students and professionals the opportunity to utilize historical data sets of the same player tracking data used by teams and suggest innovations about how football is played and coached.

The creation of the event also provided project opportunities for graduate assistants and students in the stats and computer science realm, including three University of New Mexico graduate students in statistics, Brandon DeFlon, Kellin Rumsey and Zach Stuart, who teamed up to compete in this year’s event.

The focus of this year’s competition among collegiate participants in the Big Data Bowl was to predict the outcome of rushing plays using data from the 2017 season. Participants were provided with the NFL's Next Gen Stats, including speed, direction and location information for all 22 players on the field at the moment a ball carrier receives the pigskin. The participants were then tasked with predicting how many yards the ball carrier would gain and presented their statistical methodology at the Big Data Bowl on a stage in a huge ballroom at the JW Marriott Indianapolis in front of all 32 teams.

Six collegiate teams, including DeFlon, Rumsey and Stuart, presented their work to NFL club analytics staff at the NFL Scouting Combine as part of the competition. UNM’s team was one of three that finished as finalists behind a Harvard team that was named grand finalist in the competition. Three additional teams earned Honorable Mention.


As part of the competition, participants are given the position of every player at the moment the ball is handed off. The original goal was to predict how many yards the play would gain, and most of the collegiate teams did so using neural networks, which produced good results. While those teams were able to predict well, the downside was that they didn’t really learn what caused a ball carrier to gain or lose the yards he did.

The UNM team argued that advanced sabermetrics, a system for applying statistical analysis, failed to capture the importance of the run game in the modern NFL, and explained that a need exists for mathematical metrics that describe the dynamics of the rushing attack quantitatively.

The UNM team thought outside the box and decided to take a different approach. They talked with former UNM offensive line coach and run game coordinator Saga Tuitele. They showed Tuitele examples of the data and asked him to break it down for them and describe what he saw. The team then input those ideas into a computer to try to figure out not only how many yards the ball carrier gained, but also why.

“Every time we started talking about a play, he (Tuitele) would talk about leverage, which was the very first thing he talked about on every play,” said Rumsey, who is working on his master’s degree in computer science and a Ph.D. in statistics at UNM. “So we thought if that’s the thing we want to do, then we need our computer to understand the concept of leverage.”

UNM students Kellin Rumsey, Zach Stuart and Brandon DeFlon at the NFL's Big Data Bowl. UNM's team was one of three finalists behind a Harvard team that was named grand finalist in the competition.

Rumsey, DeFlon and Stuart came up with some very simple geometric definitions of leverage and constructed preliminary metrics to capture this well-known concept: the battle between blocker and defender and their positions on the field relative to the ball carrier. In their paper, they defined offensive and defensive leverage, and studied the statistical properties of these metrics.

The team defined blocker leverage as a quantitative measure of the blocker’s strength of position with respect to the defender and the future position of the running back, and defined defender leverage as a quantitative measure of a defender’s strength of position with respect to the location of a blocker and the location, speed and direction of the running back.
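
The team's actual formulas are not reproduced here, so the following Python sketch is only a toy illustration of how an angle-based leverage score of this kind might be computed. It is not the UNM team's metric. The function names, the 0.5-second look-ahead, the coordinate convention (yards, with direction in degrees counterclockwise from the x-axis) and the cosine scoring are all assumptions made for this example.

```python
import math


def angle_between(u, v):
    """Angle in radians between 2D vectors u and v (0.0 if either has zero length)."""
    norm_u = math.hypot(u[0], u[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_u == 0 or norm_v == 0:
        return 0.0
    cos_theta = (u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def projected_position(pos, speed, direction_deg, dt=0.5):
    """Straight-line projection of a player's position dt seconds ahead.

    Assumption: speed is in yards per second and direction_deg is measured
    counterclockwise from the +x axis. Real tracking data may use another
    convention.
    """
    theta = math.radians(direction_deg)
    return (pos[0] + speed * dt * math.cos(theta),
            pos[1] + speed * dt * math.sin(theta))


def toy_blocker_leverage(blocker, defender, carrier_future):
    """Hypothetical leverage score in [-1, 1].

    Measures the angle, seen from the defender, between the blocker and the
    ball carrier's projected position, then scores it with a cosine:
    about 1 when the blocker is squarely in the defender's path,
    about -1 when the blocker is directly behind the defender.
    """
    to_blocker = (blocker[0] - defender[0], blocker[1] - defender[1])
    to_target = (carrier_future[0] - defender[0], carrier_future[1] - defender[1])
    return math.cos(angle_between(to_blocker, to_target))


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    future = projected_position((0.0, 0.0), speed=4.0, direction_deg=0.0)
    print(toy_blocker_leverage((4.0, 0.0), (6.0, 0.0), future))
```

In this toy version, a score near 1 means the blocker sits directly between the defender and the spot the runner is heading for, while a score near -1 means the defender has a clear path to that spot.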

“We showed that defenders who are doing a good job of generating leverage are more likely to make a lot of tackles,” said Rumsey. “Where we really differed from the other finalists, again they used neural networks, which is great, but it’s really hard to learn from them, whereas we wanted to take expert knowledge from Saga and code it into our analytics.”

“Leverage is everything. In pee wee football the term leverage is used. In high school, college and in the pros, the NFL – the top level of play, this term still comes up,” explained DeFlon, who is working on his master’s in statistics. “That is a very common term that is used. I think the beauty of this project is taking this simple concept of leverage and being able to explain it by denoting it through angles.

“We have defender leverage and blocker leverage, and if you look at these leverages in action, you see an actual play being ran that the NFL gives us with the Next Gen Data. At that point a blocker and defender are in contact with each other to a certain amount, which we were able to determine when we were coding, then we could start generating leverages for each play.”

Using the statistical properties of leverage they identified, the UNM team plotted the running plays from the first six weeks of the 2017 NFL season, a total of 4,164 plays from all 32 NFL teams, and illustrated the leverage calculations throughout each play to achieve their results. They found that Green Bay’s Blake Martinez was among the league’s best at generating defensive leverage. Martinez finished the 2017 season with the third-most solo tackles in the league with 96.

According to NFL execs running the event, one of the key factors behind the UNM team’s high finish in the competition was its creativity in utilizing expert knowledge from Tuitele.

“Taking input from Coach Saga was one of the best things about our project. We did not feel like we really knew how to approach this problem so we spoke with football personnel to determine how to answer the question,” said Rumsey. “I think that really resonated with the NFL community. They want people who know the game of football and who are willing to talk X’s and O’s in their departments. They don’t necessarily want a guy who’s really good at analysis, but can’t speak football.

“That’s one thing our project really spoke to – we play football, we know football, we understand football, which is why we took the initiative to reach out to The University of New Mexico’s football team. It’s something we all really enjoyed and it’s a specific part of our project that differentiated ours from everyone else’s.”

Not only does the competition give the NFL insights into huge amounts of data, but it also allows the league to provide an opportunity for these statisticians to enter the field of professional sports. In turn, the competition opens the door to sports for students who are building these tools and techniques in college.

Stuart, who is earning his Ph.D. in statistics at UNM and was a main contributor to the research, is currently working for the New York Jets.

“The NFL is really revolutionizing sports through the Big Data Bowl,” said DeFlon, who has interviewed with the NFL’s Arizona Cardinals and San Francisco 49ers. “They realize that there are a lot of up-and-coming statisticians and computer scientists and are trying to figure out how to get these people into the sport. The exposure you get from these events is great and allows you to build a portfolio that you could possibly take into other sports.”

For more information on the event, visit Big Data Bowl.