Thursday, January 20, 2011
MLB "Stat Guru" Phil Birnbaum Explains Why "Advanced Basketball Statistics" Don't Work
You know all those player evaluation statistics in basketball, like "Wins Produced," "Player Efficiency Rating," and so forth? I don't think they work. I've been thinking about it, and I don't think I trust any of them enough to put much faith in their results.
That's the opposite of how I feel about baseball. For baseball, if the sportswriter consensus is that player A is an excellent offensive player, but it turns out his OPS is a mediocre .700, I'm going to trust OPS. But, for basketball, if the sportswriters say a guy's good, but his "Wins Produced" is just average, I might be inclined to trust the sportswriters.
I don't think the stats work well enough to be useful.
I'm willing to be proven wrong. A lot of basketball analysts, all of whom know a lot more about basketball than I do (and many of whom are a lot smarter than I am), will disagree. I know they'll disagree because they do, in fact, use the stats. So, there are probably arguments I haven't considered. Let me know what those are, and let me know if you think my own logic is flawed.
------
The most obvious problem is rebounds, which I've posted about many times (including these posts over the last couple of weeks). The problem is that a large proportion of rebounds are "taken" from teammates, in the sense that if the player credited with the rebound hadn't got it, another teammate would have.
We don't know the exact numbers, but maybe 70% of defensive and 50% of offensive rebounds are taken from a teammate's total.
More importantly, it's not random, and it's not the same for all players. Some rebounders will cover much more of other players' territory than others. So when player X has a huge rebounding total, we don't know whether he's just good at rebounds, whether he's just taking them from teammates, or whether it's some combination of the two.
So, even if we decide to take 70% of every defensive rebound, and assign it to teammates, we don't know that's the right number for the particular team and rebounder. This would lead to potentially large errors in player evaluations.
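To see how big those errors could get, here is a rough sketch. The 70% figure is the guess from above; the 10 rebounds per game and the alternative "true" shares of 50% and 90% are made up purely for illustration, and `net_rebounds` is a hypothetical helper, not part of any published stat.

```python
def net_rebounds(raw, taken_share):
    """Rebounds left to the player after handing `taken_share` back to teammates."""
    return raw * (1 - taken_share)

raw = 10.0                            # hypothetical defensive rebounds per game
credited = net_rebounds(raw, 0.70)    # what a flat 70% rule would credit

# Compare against what the player "really" deserves under other true shares.
for true_share in (0.50, 0.70, 0.90):
    deserved = net_rebounds(raw, true_share)
    print(true_share, round(deserved, 1), round(deserved - credited, 1))
```

Under these made-up numbers, the same raw total could deserve anywhere from one to five rebounds of credit, while the flat rule always hands out three.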
The bottom line: we know exactly what a rebound is worth for a team, but we don't know which players are responsible, in what proportion, for the team's overall performance.
------
Now, that's just rebounds. If that were all there were, we could just leave that out of the statistic, and go with what we have.
But there's a similar problem with shooting accuracy.
I ran the same test for shooting that I ran for rebounds.
For the 2008-09 season, I ran a regression for each of the five positions. Each row of the regression was a single team for that year, and I checked how each position's shooting (measured by eFG%) affected the average of the other four positions (the simple average, not weighted by attempts).
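For anyone who wants to try this themselves, here is a minimal sketch of that kind of regression. It assumes the teammates' simple average is regressed on the one position's eFG% (the direction isn't spelled out above), and the three teams in it are invented numbers, not the 2008-09 data. The function names are hypothetical.

```python
from statistics import mean

def slope_and_correlation(xs, ys):
    """Least-squares slope of ys on xs, plus the Pearson correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

def teammate_regression(teams, position):
    """One row per team: the position's eFG% vs. the simple average of the other four."""
    xs = [team[position] for team in teams]
    ys = [mean(v for pos, v in team.items() if pos != position) for team in teams]
    return slope_and_correlation(xs, ys)

# Invented eFG% values for three hypothetical teams (NOT real data).
teams = [
    {"PG": 0.48, "SG": 0.49, "SF": 0.47, "PF": 0.50, "C": 0.52},
    {"PG": 0.50, "SG": 0.50, "SF": 0.49, "PF": 0.51, "C": 0.54},
    {"PG": 0.52, "SG": 0.51, "SF": 0.50, "PF": 0.52, "C": 0.55},
]
slope, r = teammate_regression(teams, "PG")
print(round(slope, 3), round(r, 3))
```

With real data you'd have one dictionary per team and would call `teammate_regression` once for each of the five positions.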
It turns out that there is a strong positive correlation in shooting percentage among teammates. If one teammate shoots accurately, the rest of the team gets carried along.
Here are the numbers (updated, see end of post):
PG: slope 0.30, correlation 0.63
SG: slope 0.40, correlation 0.62
SF: slope 0.26, correlation 0.27
PF: slope 0.28, correlation 0.27
C: slope 0.27, correlation 0.43
To read one line off the chart: for every one percentage point increase in shooting percentage by the SF (say, from 47% to 48%), you saw an increase of 0.26 percentage points in each of his teammates (say, from 47% to 47.26%).
The coefficients are a lot more important than they look at first glance, because they represent a change in the average of all four teammates. Suppose all five teammates took the same number of shots (which they don't, but never mind right now). That means that when the SF makes one extra field goal, each teammate also makes an extra 0.26, for a total of 1.04 extra field goals from his four teammates.
That's a huge effect.
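The same back-of-the-envelope arithmetic can be run for every position. This is only a sketch: it takes the slopes from the table at face value and keeps the equal-shots assumption, which isn't true in practice.

```python
# Slopes from the table above, one per position.
slopes = {"PG": 0.30, "SG": 0.40, "SF": 0.26, "PF": 0.28, "C": 0.27}
TEAMMATES = 4  # assumption: all five players take the same number of shots

# Extra field goals by the four teammates per one extra make by this position.
extra = {pos: round(s * TEAMMATES, 2) for pos, s in slopes.items()}
for pos, value in extra.items():
    print(pos, value)
```

Under those assumptions, every position's extra make comes with more than one additional make spread among the other four players.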
And, it makes sense, if my logic is right (correct me if I'm wrong).
Suppose you have a team where everyone has a talent of .450, but then you get a new guy on the team (player X) with a talent of .550. You're going to want him to shoot more often than the other players. For instance, if X and another guy (player Y) are equally open for a roughly equal shot, you're going to want to give the ball to X. Even if Y is a little more open than X, you'll figure that X will still outshoot Y -- maybe not .550 to .450, but, in this situation, maybe .500 to .450. So X gets the ball more often.
But, then, the defense will concentrate a little more on X, and a little less on the .450 guys. That means X might see his percentage drop from .550 to .500, say. But the extra attention to X creates more open shots for the .450 guys, and they improve to (say) .480 each.
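Here is a quick sketch of what that example does to the box score. It uses the .450, .500 and .480 figures from the paragraphs above, and assumes (for simplicity, and contrary to the point that X would shoot more) that all five players take an equal share of the shots. `team_pct` is a hypothetical helper.

```python
def team_pct(pcts, shares):
    """Shot-weighted team shooting percentage."""
    return sum(p * s for p, s in zip(pcts, shares))

shares = [0.2] * 5                                # assumption: equal shot shares
before = team_pct([0.450] * 5, shares)            # five .450 shooters
after = team_pct([0.500] + [0.480] * 4, shares)   # X at .500, the others lifted to .480

gain = after - before                             # team-level improvement
own_part = (0.500 - 0.450) * shares[0]            # part visible in X's own line
teammate_part = gain - own_part                   # part that lands in teammates' lines
print(round(gain, 3), round(teammate_part / gain, 2))
```

In this toy setup, roughly 70% of the improvement X brings shows up in his teammates' shooting percentages rather than his own.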
Most of the new statistics simply treat FG% as if it's solely the achievement of the player taking the shot, when, it seems, it is very significantly influenced by his teammates.
------
Some of that, of course, might be that teams with good players tend to have other good players; that is, it's all correlation, and not causation. But there's evidence that's not the case, as illustrated by a recent debate on the value of Carmelo Anthony.
Last week, Nate Silver showed that if you look at Carmelo Anthony's teammates' performance, and then look at that performance when Anthony wasn't on their team, you see a difference of .038 in shooting percentage. That's huge -- about 15 wins a season.
Dave Berri responded with three criticisms. First, that Silver weighted by player instead of by game; second, that Silver hadn't considered the age of the teammates (since very young players improve anyway as they get older); and, third, that if you control for age and a bunch of other things, the results aren't statistically significantly different from zero. (However, Berri didn't post the full regression results, and did not claim that his estimate was different from .038.)
Finally, over at Basketball Prospectus, Kevin Pelton ran a similar analysis, but within games instead of between seasons (which eliminates the age problem, and a bunch of other possible confounding variables). He found a difference of .028. Not quite as high as Silver's, but still pretty impressive. Furthermore, a similar analysis of all of Anthony's career shows similar improvements in team performance, which suggests the effect is real.
To be clear, this kind of analysis is the kind that, I'd argue, works great -- comparing the team's performance with the player and without him.
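In its simplest form, a with/without comparison might look like the sketch below. This is not Silver's or Pelton's actual method; it just pools teammates' makes and attempts, and the rows are invented numbers for two hypothetical teammates. A real version would also have to deal with the weighting and age issues Berri raised.

```python
def pooled_pct(rows):
    """Shooting percentage pooled over all makes and attempts in `rows`."""
    made = sum(r["made"] for r in rows)
    att = sum(r["att"] for r in rows)
    return made / att

def with_without_diff(rows):
    """Teammates' pooled percentage with the star minus without him."""
    with_rows = [r for r in rows if r["with_star"]]
    without_rows = [r for r in rows if not r["with_star"]]
    return pooled_pct(with_rows) - pooled_pct(without_rows)

# Invented shooting lines (NOT real data).
rows = [
    {"player": "A", "with_star": True, "made": 240, "att": 500},
    {"player": "A", "with_star": False, "made": 225, "att": 500},
    {"player": "B", "with_star": True, "made": 150, "att": 300},
    {"player": "B", "with_star": False, "made": 141, "att": 300},
]
diff = with_without_diff(rows)
print(round(diff, 3))
```

Pooling by attempts is one choice among several; averaging per player or per game would give different weights, which is exactly the kind of thing the Silver/Berri disagreement turns on.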
What I think *doesn't* work is just using the raw shooting percentages. Because how do you know what those percentages mean? Suppose one team is all at .460, and another team is all at .490. The .490 means that you have more players on the team above average than below average. But, the above average players are lifting the percentages of the below average players, and the below-average players are reducing the percentages of the above-average players. But which are which? We have no way of telling.
Here's a hockey example. Of Luc Robitaille's eight highest-scoring NHL seasons, six of them came while he was a teammate of Wayne Gretzky. In 1990-91, Robitaille finished with 101 points. How much of the credit for those points do you give to Robitaille, and how much of the credit do you give to Gretzky? There's no way to tell from the single season raw totals, is there? You have to know something about Robitaille, and Gretzky, and the rest of their careers, before you can give a decent estimate. And your estimate will be that Gretzky should get some of the credit for some of Robitaille's performance.
Similarly, when Carmelo Anthony increases all his teammates' shooting percentages by 30 points, *and it's the teammates that get most of that credit* ... that's a serious problem with the stat, isn't it?
------
So far, we've only found problems with two components of player performance -- rebounds and shooting percentage. However, those are the two biggest factors that go into a player's evaluation. And, additionally, you could argue that the same thing applies to some of the other stats.
For instance, blocked shots: those are primarily a function of opportunity, aren't they? Some players take a lot more shots than others, so the guy who defends against Allen Iverson is going to block a lot more shots than his teammates, all else being equal.
------