Friday, August 13, 2010

The Anatomy of a Block: Repeatable Skill? (Part 4)

In case you missed it, check out Parts 1, 2, and 3 in this series on "The Anatomy of a Block": Introduction, By Shot Location, and By Shot Type.

In this post, I'll take a look at whether or not the value of blocks, as measured by shot location and by shot type, is a repeatable skill. The premise is that if a skill in sports is measured effectively, then the statistic measuring that skill should stay reasonably consistent from year to year, which is what makes the skill repeatable. One test of a statistic's value in objective evaluation is how much it varies over time for each player. Essentially, if a set of numbers fluctuates from year to year for not just one but most players, that is evidence that a certain skill (such as blocking certain shots based on location or type) may not be repeatable. This type of analysis, which looks at the correlation between what a player does one year and what he does the next, is used in preliminary studies on the existence of the hot hand, clutch hitting in baseball, and how much control a pitcher has over the number of hits he allows.
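For anyone who wants to try the year-to-year check themselves, here is a minimal sketch of the idea in Python. It is not the exact method behind the tables in this post. The function names and the player values are hypothetical placeholders I made up for illustration, and I am assuming a plain Pearson correlation between each season's value and the following season's value.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)

def year_to_year_pairs(seasons_by_player):
    """Pair each player's value in one season with his value in the next season."""
    this_year, next_year = [], []
    for values in seasons_by_player.values():
        for current, following in zip(values, values[1:]):
            this_year.append(current)
            next_year.append(following)
    return this_year, next_year

# HYPOTHETICAL points-saved-per-block values for 2007-2010, NOT real data.
example = {
    "Player A": [1.05, 1.04, 1.06, 1.05],
    "Player B": [0.90, 0.91, 0.89, 0.90],
    "Player C": [0.98, 0.97, 0.99, 0.98],
}

x, y = year_to_year_pairs(example)
r = pearson_r(x, y)
print(f"year-to-year r = {r:.3f}")
```

A value of r close to 1 would mean players tend to keep their relative standing from one season to the next, which is what a repeatable skill should look like. A value near 0 would point toward noise.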

Here's a look at the block value by shot location of the top 25 shot-blockers in total blocks since 2007 along with their season-by-season values to see if they fluctuate:

The color scale of the cells should give you an idea of how much points saved per block by shot location varies from season to season for these 25 players. For instance, if you scan from the '2007' column to the '2010' column, left to right, you can see whether the colors of the cells change or stay the same, and they largely stay the same for players with high-value blocks such as Andrew Bogut, Emeka Okafor, and Josh Smith as well as players with low-value blocks such as Andris Biedrins and Brendan Haywood. You can also notice some players with slightly more fluctuation in their season-by-season value per block numbers, such as Ronny Turiaf. The standard deviation column on the far right measures how spread out these numbers are, and captures a sense of how much they fluctuate. A further discussion on the validity of this measurement gets into a statistical discourse covering other stats terms that I won't get into right now, but for the most part, these standard deviation values are relatively small. That suggests the data is consistent from season to season (making it more likely that blocking shots based on location is a repeatable skill) rather than volatile (which would make it more likely that the numbers are driven by random fluctuations).
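Here is a rough sketch of how a standard deviation column like that can be computed. I am assuming the sample standard deviation (dividing by n - 1) here, which may or may not match the formula behind the table, and the two players and their numbers are hypothetical placeholders rather than values from the table.

```python
from statistics import mean, stdev

def season_spread(values_by_season):
    """Return (mean, sample standard deviation) of one player's season values."""
    if len(values_by_season) < 2:
        raise ValueError("need at least two seasons to measure spread")
    return mean(values_by_season), stdev(values_by_season)

# HYPOTHETICAL points-saved-per-block values for 2007-2010, NOT real data.
example = {
    "Steady Player": [1.02, 1.03, 1.02, 1.03],
    "Streaky Player": [0.90, 1.10, 0.95, 1.05],
}

# Keep only the standard deviation for each player.
spreads = {name: season_spread(vals)[1] for name, vals in example.items()}

# List players from most consistent to least consistent.
for name in sorted(spreads, key=spreads.get):
    print(f"{name}: stdev = {spreads[name]:.3f}")
```

The smaller the standard deviation, the more tightly a player's season values cluster around his own average.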

Now here's a look at the block value by shot type of the same top 25 shot-blockers in total blocks since 2007 with their season-by-season values:

Again, the color scale of 2007, 2008, 2009, and 2010 shows mostly consistency from left to right, save for a few players. This time, Jermaine O'Neal's values fluctuate the most by season, ranging from 0.903 PPS in 2009 to 1.102 PPS in 2007. Other than O'Neal, however, it seems to me that shot-blockers with high value per block tend to have high values in all four seasons, and shot-blockers with low value per block tend to stay low. Brendan Haywood did attain a 1.011 PPS by shot type in 2007, but followed that with some of the lowest PPS numbers: 0.839 in 2008 and 0.847 in 2009.

All of this brings me to another question: is block value by shot location correlated with block value by shot type? Let's look at a scatter plot of value by shot location against value by shot type using all 122 players I sampled:

The R² value of 0.3952 is decent. R² measures how strongly the two sets of data are related, on a scale from 0 to 1, where 0 means completely unrelated. However, shot location and shot type may speak to different types of blocks and different skills needed to get those blocks, even if some shot types are defined precisely by where they are taken on the court (all dunks and layups are at the rim and 3-pointers are at a distance, for instance). Also, I believe a more in-depth evaluation of the interaction effects between types of shots and where they are taken on the court would be needed in order to properly assess how the two block value models affect one another, and to combine them.
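For reference, here is a small sketch of how an R² like this can be computed, assuming it comes from a simple least-squares trend line through the scatter plot (that is my assumption for this example). The function name and the four data points are hypothetical and chosen only so the arithmetic is easy to follow. For a single straight-line fit, R² is the square of the correlation coefficient, so 0.3952 would correspond to a correlation of roughly 0.63.

```python
def r_squared(xs, ys):
    """R^2 of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("R^2 is undefined when one sequence is constant")
    # Fit the line, then compare leftover error to total variation in y.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / ss_tot

# HYPOTHETICAL value-per-block numbers for four players, NOT real data.
by_location = [0.90, 0.95, 1.00, 1.05]
by_type = [0.90, 1.00, 0.95, 1.05]

print(f"R^2 = {r_squared(by_location, by_type):.4f}")
```

An R² of 1 would mean one measure is perfectly predicted by a straight line through the other, and anything in between says how much of the variation in one is accounted for by the other.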

I'm sure many of you think a lot of this is just statistical blabber jabber, and I do too, so I'll just leave it at that for part 4. In my next and last post on this rather long series, I will be concluding with a summary (in words, not tables and charts and numbers) on what I found and detailing an exhaustive list I have compiled on the limitations of the study on blocks as well as possible future improvements and extensions. I hope to be getting back to doing more posts on what I've been working on with shot location heat maps, baseball PITCHf/x, and football play-by-play data, so in a sense, I'm rushing along to get this series done ;).


  1. How do you choose your sample of players?

  2. Hey Chirag, thanks for commenting. I just took the top 100+ shot-blockers in total blocks since 2007, and added 20 more players I was interested in (some of the rookies and second-years).