An Interactive (DATA) dump: Using the NBA to Explore Python's Visualization Capabilities

Recent advances in data acquisition in the NBA allow us to gain insight into the sport and its players, and we can use this data to play with Python's visualization abilities. In this post, data from the current season is used to see just how dominant the Golden State Warriors are; FiveThirtyEight's ELO rankings help to make an interactive graph (not usable on a mobile device); shot chart data helps to build a 3D visualization of a player's shots; and we look for any changes in Stephen Curry's play that may explain his explosive 2015-2016 season.

James Harden's Shot Chart. In 3D!!!!!!

James Harden takes a lot of shots. Savvas Tjortjoglou's blog post is the jumping-off point for the 3D shot bar chart. The post details how to grab the (x,y) coordinates for all of a player's shots during the 2014-2015 season through an API, and we use Python to create the chart. His stunning visualization shows us the density of James Harden's field goal attempts (FGA) during the 2014-2015 season.

[Image: Harden_2D_Shot_Chart]

This is a beautiful picture, but we can dig deeper. Another way to visualize this data is in 3D. A 3D histogram in Python gives us another way to view his shots.

[Image: Harden_3D_Shot_Chart1]

While they both give us the same information, the 3D chart makes his tendencies a bit more obvious. He finishes most often on the left, his strong side, at the rim, but he spreads his three-point shooting out to all sides. It does seem that he prefers the left corner to the right corner.
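For readers who want to try the 3D histogram idea, here is a minimal sketch. It is not the code behind the chart above: the function names, the bin count, and the court coordinate range (tenths of a foot, roughly -250 to 250 across and -50 to 420 deep) are assumptions made for illustration. Shots are binned into a grid with NumPy, and one bar is drawn per cell with matplotlib's `bar3d`.

```python
import numpy as np


def bin_shots(x, y, bins=10, court_range=((-250, 250), (-50, 420))):
    """Count shots in each cell of a grid laid over the court.

    court_range is an assumed coordinate range, not taken from the post.
    """
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins, range=court_range)
    return counts, x_edges, y_edges


def plot_shots_3d(counts, x_edges, y_edges):
    """Draw one 3D bar per court cell; bar height is the number of shots."""
    # Imported here so the binning step works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    # Lower-left corner of every cell; "ij" indexing matches the counts layout.
    xpos, ypos = np.meshgrid(x_edges[:-1], y_edges[:-1], indexing="ij")
    dx = x_edges[1] - x_edges[0]
    dy = y_edges[1] - y_edges[0]
    ax.bar3d(xpos.ravel(), ypos.ravel(), np.zeros(counts.size),
             dx, dy, counts.ravel())
    ax.set_xlabel("x (court width)")
    ax.set_ylabel("y (distance from baseline)")
    ax.set_zlabel("FGA")
    return fig


# Made-up coordinates, only to show the shape of the input.
sample_x = [0, 0, 100]
sample_y = [10, 10, 200]
counts, x_edges, y_edges = bin_shots(sample_x, sample_y)
```

Calling `plot_shots_3d(counts, x_edges, y_edges)` and then `matplotlib.pyplot.show()` displays the chart. Fewer, larger bins give a smoother picture; more bins show finer detail but leave many empty cells.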

The 2015-2016 NBA season using Interactive Python

When making visualizations I often struggle with how to make the graph easy to understand. At times the complexity of the data makes this very difficult. The Python package mpld3 outputs HTML that can use JavaScript, so the user can interact with the graph, which makes my life a bit easier.

The current NBA season has an anomaly of a team, the Golden State Warriors, that may break the all-time record for wins in a season, and features Stephen Curry, a player who is on his way to posting the greatest Player Efficiency Rating (PER) of all time. One reason for this is their three-point shooting. Take a look at the following graph:

Use your mouse cursor to find out which teams are represented by which bubbles. I have annotated the Warriors so it is obvious what they are doing. They are shooting a lot of threes and they are making a higher percentage than any other team. The Houston Rockets, by design, also take a lot of threes but, as the graph shows, they shoot a percentage under the league average, possibly contributing to them firing a coach and being an average team so far this season.
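A hover-to-identify bubble chart of this kind can be sketched with mpld3's `PointLabelTooltip` plugin. This is only an illustration, not the code behind the graph above: the axis choices, the bubble sizes, the helper names, and the placeholder teams and numbers are all assumptions.

```python
def tooltip_labels(teams, attempts, pct):
    """Build one hover label per team from its attempts and percentage."""
    return [f"{t}: {a:.1f} 3PA, {p:.1f}%" for t, a, p in zip(teams, attempts, pct)]


def interactive_bubbles(teams, attempts, pct, sizes):
    """Return an HTML snippet containing a scatter plot with hover labels."""
    # Imported here so the label helper works without the plotting stack.
    import matplotlib.pyplot as plt
    import mpld3
    from mpld3 import plugins

    fig, ax = plt.subplots()
    points = ax.scatter(attempts, pct, s=sizes, alpha=0.5)
    # Unweighted mean of team percentages: a rough stand-in for league average.
    ax.axhline(sum(pct) / len(pct), color="black", linewidth=1)
    ax.set_xlabel("Three-point attempts per game")
    ax.set_ylabel("Three-point percentage")
    tooltip = plugins.PointLabelTooltip(
        points, labels=tooltip_labels(teams, attempts, pct))
    plugins.connect(fig, tooltip)
    return mpld3.fig_to_html(fig)


# Placeholder values, not real team statistics.
teams = ["Team A", "Team B", "Team C"]
attempts = [30.0, 31.5, 18.2]
pct = [42.0, 34.5, 35.1]
labels = tooltip_labels(teams, attempts, pct)
```

The string returned by `interactive_bubbles(teams, attempts, pct, sizes=[300, 300, 300])` can be pasted into a web page, and the hover labels then appear in the browser.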

The next chart hammers this point home. The Warriors are shooting the second-most three-pointers and are sixth-lowest in two-point attempts.

Oddly enough, the Cleveland Cavaliers take the fewest two-point shots in the league.

ELO Rankings

FiveThirtyEight's ELO rankings attempt to rank teams. The chart below graphs the game-by-game ELO ranking of all 30 NBA teams over the last two regular seasons. Without using mpld3 I would have to produce something like this:

[Image: ELO_2Seasons]

This works, but the following is much more fun!

It seems to work best when your cursor is at the rightmost point on the graph. The black line at 1500 is the league average.

Stephen Curry's Explosion

The graph below shows where Stephen Curry is getting his points. The dark blue are 2-point baskets, the grey are 3-point buckets and the light blue are free throws.

[Image: Curry_Raw]

It's hard to see anything at first glance, except maybe that he is scoring more this year. This year's statistics begin at game 102. The next graph looks at these same shots but as a percentage of his points. For example, in game 1 of the 2014-2015 season Curry scored 24 points and hit two 3-point shots, so 6 of his 24 points (25%) came from 3-pointers.

[Image: Steph_Curry_Raw_Percent]

Looking for trends is difficult, but we can try a simple linear regression to figure out what is going on. A naive approach is to fit a statistic against time (game number) with a line and look at the slope. For example, if we look at his points total, beginning with the 2014-2015 season and running up to today's data, we find the linear fit has a slope of 0.068. In other words, his overall scoring output is trending up at approximately 6.8 points per 100 games. This is commensurate with what is happening this year: his scoring average is up a little over 7 points per game from last year, so this jibes. When we fit the other data we get what we expect:

  • 3P: slope = 0.016
  • 3PA: slope = 0.0289
  • FG%: slope = 0
  • FGA: slope = 0.033
  • FTA: slope = 0.019
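The naive fit described above can be sketched in a few lines. This is an illustration under assumptions, not the original analysis: `np.polyfit` stands in for whatever fitting routine was actually used, the helper name is made up, and the input is synthetic data built to have a slope of 0.068 rather than Curry's real game log.

```python
import numpy as np


def trend_slope(values):
    """Fit value = slope * game_number + intercept; return the slope."""
    games = np.arange(1, len(values) + 1)
    slope, _intercept = np.polyfit(games, values, deg=1)
    return slope


# Synthetic per-game points that rise by exactly 0.068 per game.
synthetic_points = 20 + 0.068 * np.arange(1, 131)
slope = trend_slope(synthetic_points)
per_100_games = slope * 100  # the "points per 100 games" reading of the slope
```

Passing a real game-by-game series (points, 3PA, FTA, and so on) in place of `synthetic_points` gives one slope per statistic, as in the list above. A straight line is a blunt tool for noisy game logs, so the slopes are best read as rough trends.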

Basically, he is taking more shots and is therefore scoring more. He is taking and making two more free throws a game this year, along with making about 1.5 more 3-pointers a game (2 + 4.5 = 6.5 points). His two-point field goal attempts are up a bit as well, which accounts for the roughly 7 points per game increase we see this year.
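The arithmetic in this section is easy to check by hand, but a small sketch makes the bookkeeping explicit. The function names are hypothetical; the inputs are the figures quoted in the text.

```python
def added_points(extra_free_throws_made, extra_threes_made):
    """Points gained per game from extra made free throws and 3-pointers."""
    return extra_free_throws_made * 1 + extra_threes_made * 3


def share_of_points(points_from_source, total_points):
    """Percentage of a game's points that came from one type of shot."""
    return 100.0 * points_from_source / total_points


# Two more free throws and about 1.5 more 3-pointers per game.
print(added_points(2, 1.5))        # 6.5
# Game 1 of 2014-2015: two 3-pointers out of 24 total points.
print(share_of_points(2 * 3, 24))  # 25.0
```

The remaining half point or so of the 7 points per game increase is left to the small rise in two-point attempts.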

Was he as good last year as this year? It seems that way. He is being more assertive this year and the results are absurd. His team is 28-1 at the time of writing this.
