Stats, like any other tool, are no better than those that wield them. As we’ve entered the Big Data age we’ve witnessed more malpractice. But nothing we’ve seen is as egregious as the attempts to pass off Real Plus-Minus and Adjusted Plus-Minus as good data.
Luckily we don’t write for ESPN nor are we trying to hitch a ride to Bristol so we have no reason not to call out this statistical streaker for what it is, a dumpy overweight dude that spends too much time in the basement, has a disturbing mole in the middle of its back, and doesn’t have anything all that useful to say.
That, my friends, is RPM.
For those unfamiliar with it, it’s been foisted on viewers by ESPN as a kind of be-all-end-all product, which maybe has some use for idiots arguing in a bar. So ESPN has this idiot-bar-stat, but suddenly you start seeing smart people you know start quoting it unaware it’s made with old women’s nylons and beer pull tabs. (Okay, we exaggerate a little.)
Plus-Minus is sort of an old-fashioned stat adopted from hockey. The concept is simple – by how much did a player’s team outscore (or get outscored by) the other team while he was on the court, compared with when he was off it. Of course, what may work for hockey isn’t necessarily suited to basketball.
For one thing, hockey lines play roughly a third of the game, and they play in shorter spurts, whereas starters on a basketball team may play as much as 90% of the time. Why would that be a problem?
Basing Your Grade On Your Study Group’s Performance
Because ESPN is using RPM as a way of evaluating individual players. But it’s not actually an individual “stat” per se, and that’s the real problem. It’s very hard to separate the contributions of four other guys to the fifth guy’s stats.
One of the problems you run into is multicollinearity. If 80% of the time when Draymond Green is on the floor, Stephen Curry is on the floor, there’s a strong suspicion that Curry’s greatness will make Green look good simply because he was on the floor when Curry was draining 35-foot jumpers.
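Here's a toy sketch of that overlap problem, using invented stints rather than real Warriors data. Two hypothetical players, A and B, share the floor for 80% of A's stints. Nearly all of each player's raw plus-minus piles up during those shared stints, so the number by itself can't say who earned it.

```python
# Hypothetical stints: (players on the floor, team point margin for the stint).
# Every number here is invented for illustration.
STINTS = [
    ({"A", "B"}, 6),
    ({"A", "B"}, 4),
    ({"A", "B"}, 5),
    ({"A", "B"}, 7),
    ({"A"}, 1),
    ({"B"}, 2),
    (set(), -3),
]


def raw_plus_minus(player, stints):
    """Sum the team margin over every stint the player was on the floor."""
    return sum(margin for on_floor, margin in stints if player in on_floor)


def overlap_share(player, teammate, stints):
    """Fraction of the player's stints that the teammate also played."""
    own = [on_floor for on_floor, _ in stints if player in on_floor]
    return sum(1 for on_floor in own if teammate in on_floor) / len(own)


def shared_margin(player, teammate, stints):
    """Margin accumulated while both players were on the floor together."""
    return sum(m for on_floor, m in stints if {player, teammate} <= on_floor)


if __name__ == "__main__":
    print("A's stints shared with B:", overlap_share("A", "B", STINTS))
    print("A raw +/-:", raw_plus_minus("A", STINTS))
    print("B raw +/-:", raw_plus_minus("B", STINTS))
    print("Margin from shared stints:", shared_margin("A", "B", STINTS))
```

In this made-up log, all but a point or two of each player's total comes from minutes they played together, which is exactly the situation where an individual rating is hardest to trust.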
Want a good example of how that works? Think back to last year’s Finals. Andre Iguodala won the Finals MVP because he was +62 during the six-game series (while Curry was +55 and averaged 42.5 mpg to Iggy’s 37 mpg).
That beat out LeBron James’ 35.8 ppg, 13.3 rebounds and 8.8 assists, double what Iggy produced in each, the first time in Finals history one player has led both teams in all three categories.
Iggy played a great series and was instrumental in the Warriors’ win, but there is no way in Hell he should win the MVP over the best single series performance in history. Might as well change the trophy title to “Most Valuable Guy on Winning Team” because that’s what the trophy means now. Thanks, Real Plus Minus!
The other issue is that RPM doesn’t care how those scores are accumulated, only who’s on the floor when it happens. Euclid-based scout Bob Chaikin helped pioneer the use of statistical simulations to aid in the evaluations of players and teams, and currently scouts for the Miami Heat. Chaikin hates RPM, because it’s “not real data.”
“If you have two sets of data from two different seasons where the data is completely identical for every player involved, they would say it was different because of what happened during the other 12 minutes of the game [so called, off-court time which sort of becomes a “control”],” Chaikin complains. “That makes absolutely no sense.”
He breaks out another example: “You have a guy play 36 minutes, shoot 0-8 from the field and make five turnovers. Those are his only stats, but his team does well… Another guy shoots 8-8, has five steals and no turnovers but his team does poorly. RPM will say that the first guy did better. Does that make any sense?”
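Chaikin's two players can be put into a few lines of code. This is a sketch of plain, unadjusted plus-minus only (RPM layers adjustments on top that aren't shown here), and the on-court team scores are numbers we made up to stand in for "team does well" and "team does poorly." Notice that the rating function never looks at a single box score field.

```python
from dataclasses import dataclass


@dataclass
class GameLine:
    name: str
    fg_made: int
    fg_att: int
    steals: int
    turnovers: int
    team_pts_on: int  # hypothetical: team points while this player was on the floor
    opp_pts_on: int   # hypothetical: opponent points over the same minutes


def plus_minus(line):
    """Raw plus-minus: scoreboard margin while on the floor, nothing else."""
    return line.team_pts_on - line.opp_pts_on


# Box score lines follow Chaikin's example; the on-court scores are invented.
cold = GameLine("Player 1", fg_made=0, fg_att=8, steals=0, turnovers=5,
                team_pts_on=80, opp_pts_on=68)
hot = GameLine("Player 2", fg_made=8, fg_att=8, steals=5, turnovers=0,
               team_pts_on=70, opp_pts_on=79)

ranked = sorted([cold, hot], key=plus_minus, reverse=True)

if __name__ == "__main__":
    for line in ranked:
        print(line.name, f"{line.fg_made}-{line.fg_att} FG,",
              f"{line.turnovers} TO, +/- {plus_minus(line):+d}")
```

The 0-for-8 guy tops the list because the only inputs that matter are the two scoreboard totals.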
We’re willing to concede that when confined to the same team it can provide some useful lineup data, but even that’s noisier than sharing a wall with Tara Reid’s hotel room. The question is why would you look at this over-aggregated “number” as anything but spurious shit?
Why Soylent Green Instead of a Four Course Meal?
The GDP is good data if you’re looking for the nation’s output, but it doesn’t really tell you a good goddamn about what’s going on in Dubuque, IA. You could probably try to tease something out based on models of rural economic growth in the Midwest over the last couple decades, but why would you?
Why not just look at better discrete stats, like retail sales in Dubuque, local exports, trade balance and net population and job growth?
Similarly, Real Plus Minus is a stat that tells you something but you really don’t know what exactly it’s really telling you. It gives no indication what’s good or bad about the player. As we noted earlier, you can’t be sure it’s not simply telling you that he lives and works in a better neighborhood than those poor suckers in Sacramento or Philadelphia.
“Real plus minus is what I would call ancillary data. It’s not true data,” Chaikin says. “Say you know one guy shoots 50% and second guy also shoots 50%. A guy could shoot 50% and only take shots near the basket and a guy could shoot 50% and only take deep two’s. You know that’s different because guys generally shoot 60% close in, and 40% on deep twos.”
“You can sit there and say yes it is the same but there are conditions why they’re different,” Chaikin says. “It doesn’t mean that one is better than the other, but you can understand that they’re different, and that’s real data.”
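Chaikin's shooting example works out as simple arithmetic. The sketch below uses his round numbers (60% close in, 40% on deep twos) as the baseline and two hypothetical shot logs of 100 attempts each. Both players shoot 50%, but one is ten points under what's typical for his shots and the other is ten points over.

```python
# Chaikin's round numbers from the quote; actual league averages vary by season.
TYPICAL_FG = {"close": 0.60, "deep_two": 0.40}

# Hypothetical shot logs: zone -> (makes, attempts).
RIM_RUNNER = {"close": (50, 100)}
JUMP_SHOOTER = {"deep_two": (50, 100)}


def fg_pct(shots):
    """Plain field goal percentage across all zones."""
    makes = sum(m for m, _ in shots.values())
    attempts = sum(a for _, a in shots.values())
    return makes / attempts


def expected_fg_pct(shots, typical=TYPICAL_FG):
    """FG% a typical shooter would post on this exact mix of attempts."""
    attempts = sum(a for _, a in shots.values())
    return sum(typical[zone] * a for zone, (_, a) in shots.items()) / attempts


def fg_vs_expected(shots):
    """Actual FG% minus the typical FG% for the same shot mix."""
    return fg_pct(shots) - expected_fg_pct(shots)


if __name__ == "__main__":
    for name, shots in [("Rim runner", RIM_RUNNER), ("Jump shooter", JUMP_SHOOTER)]:
        print(f"{name}: FG% {fg_pct(shots):.0%}, "
              f"vs. typical {fg_vs_expected(shots):+.0%}")
```

Same 50%, different conditions, and you can see why without a black box.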
Similarly, it makes a lot more sense to examine the more precise stats available from Synergy and SportVU on the NBA.com site, which can tell you some very specific things. How well do they defend the area around the rim? How do they shoot in the midrange, or how effective are they in the final two minutes of the game?
“[Real Plus-Minus] is good information when you want to look at it in conjunction with real data,” Chaikin is willing to concede. “But if you try to say that what it’s telling you is different from what the real data is telling you…”
Garbage In, Garbage Out
RPM’s younger brother, Real Adjusted Plus Minus, takes RPM and adjusts it in a way that’s even less transparent than football’s QBR. It winds up with a qualitative rating that’s been adjusted to represent what the player can be expected to do next year.
This, of course, makes it much more marketable to the NBA teams, even if that’s more salesmanship than anything else. That’s something that gets under David Berri’s skin from the start.
They’ve essentially regressed out the contributions of the four other players like reducing soup stock to a demi-glace. It’s a bit inscrutable and “trust us,” is hardly a great selling point for a statistic.
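Since the real recipe isn't public, the best we can offer is a generic sketch of what "regressing out" teammates usually means: treat each stint's margin as the sum of a rating for every player on the floor, then solve for the ratings with a penalty that shrinks them toward zero (ridge regression). The players, stints, penalty value and function names below are all our own assumptions, not ESPN's method.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_ratings(stints, players, lam):
    """Fit margin ~ sum of on-floor player ratings, with ridge penalty lam."""
    k = len(players)
    rows = [[1.0 if p in on_floor else 0.0 for p in players] for on_floor, _ in stints]
    margins = [margin for _, margin in stints]
    # Normal equations: (X^T X + lam * I) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in rows) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(rows, margins)) for i in range(k)]
    return dict(zip(players, solve(xtx, xty)))


PLAYERS = ["A", "B", "C"]

# Invented stints where the three players rotate, so each can be separated.
MIXED = [({"A", "B"}, 5), ({"A", "C"}, 2), ({"B", "C"}, -1)]

# Invented stints where A and B never play apart.
JOINED = [({"A", "B"}, 5), ({"A", "B"}, 5), ({"C"}, -2)]

if __name__ == "__main__":
    # With varied lineups and no penalty: roughly A=4, B=1, C=-2.
    print(ridge_ratings(MIXED, PLAYERS, lam=0.0))
    # With A and B glued together, the penalty just splits the credit evenly.
    print(ridge_ratings(JOINED, PLAYERS, lam=1.0))
```

When the lineups mix, the regression can pull the players apart. When two guys always play together, there is no unique answer without the penalty, and with it the model simply hands them identical ratings. Which is to say the output depends heavily on choices the reader never gets to see.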
“What they try and do is optimize it so that it will predict the next season which is a little ridiculous, because what you’re saying is I don’t care how well you performed in this game that I’m watching right now I’m worried how your performance in this game right now predict how you’re going to do next year,” says Dr. David Berri.
He’s a sports economist and professor of economics at Southern Utah University and former president of the North American Association of Sports Economists. In 2006 he co-authored The Wages of Wins, a sort of statistical myth-buster book that earned the endorsement of The New Yorker’s Malcolm Gladwell.
In debunking RAPM, Berri notes that they’ve got things backwards with regards to typical scientific method. You look first to explain what happened. Then if you have a theory that explains what happened, you examine the model’s predictive power.
“The first step in any kind of model is to explain what you’re looking at. What are you trying to explain? You've got to explain that first and then comes the issue of predictive power,” Berri says. “You first deal with what’s going on in your sample. There has to be some kind of sample predictive power. You've got to be able to explain what your sample is like, and they don’t seem concerned about that. That’s adjusted plus minus.”
The mad thing – beyond the fact that it’s done in a black box like a perpetual motion machine – is that they’ve turned to boxscore stats (which in some sense +/- are seen as encompassing) to enhance the predictive power, which is sort of like using a telescope to examine where in space your finger is pointing. There’s something tremendously backwards about it.
“What they've done is they’ve tried to regress the boxscore statistics on this stuff and use this measure to figure out the value of the boxscore,” Berri says. “So they’re not using the boxscore statistics to explain wins. They’re using the boxscore stats to explain their measure. But their measure doesn’t explain anything! They're using box score statistics to explain something that doesn’t explain anything, which doesn’t make any sense either.
“Even if what they were doing was valid and made sense, their underlying argument is that their plus minus measure captures everything in the game,” he continues. “They argue that boxscore stats don’t capture everything in the game. So you’re regressing something that explains everything with something that doesn’t explain everything?”
Berri has developed his own basketball metrics and spoken out against the various player “efficiency” metrics that value scoring over efficiency. Berri suggests it’s really more philosophy than science, which suits ESPN’s embrace of stats heavily driven by usage, not efficiency. People love to see scorers. Didn’t you watch the Kobe farewell tour?
“It’s very clear that wins in basketball are determined by your team’s ability to get possession of the ball without the other team scoring,” says Berri, explaining the basics of basketball to demonstrate his point. “You have to take the ball down to the other end of the court without giving it up a turnover and you’ve got to convert the possession into points, which means shooting efficiently and getting rebounds. That’s wins.
“So if you’re going to evaluate a player, players are good if they shoot efficiently, they rebound and they avoid turnovers and get the other team to commit turnovers,” Berri says. “If that’s all true then Carmelo Anthony isn’t actually a very good basketball player. And then when you say that to people, they say you must be wrong.
“‘No, you must’ve done something wrong,’” Berri chuckles. “You cannot get a job in the NBA telling them that Carmelo Anthony is not a good basketball player. Because they know that can’t be true. Until Carmelo Anthony, Allen Iverson and Kobe Bryant are great players, when you got that, now you’ve got a measure.”
Voila, Real “Adjusted” Plus Minus. And you wonder why ESPN loves it? It dovetails nicely with their role as a broadcaster. Stats are just another thing to market their product with. If the stats don’t “say” what the Disney-run company wants, they can just look for a stat that does.
“What they’ve done is they’ve sold this to people who don’t know anything about statistics,” Berri says. “They say this is an advanced measure. What does that mean? Advanced compared to what? How did you decide that? It all ends up being tremendously silly.”
It’s really a load of bull, if you ask Chaikin.
“Right now Hassan Whiteside is rebounding and blocking shots in a combination that no one has ever done before. He’s rebounding in over 20 rebounds per 48 minutes, blocking shots at almost 10%, which is outrageous in combination,” says Chaikin. “On top of that he scores about 20 points per 40 minutes and shoots over 60% on twos. So why would this guy be rated as the 31st best center?
“When you talk to people who do plus/minus and adjusted, they say that’s not what the rating’s really telling you,” he continues. “And I’ll say fine, what is it really telling you?”
The thing that got us so frustrated by plus/minus is that it’s frequently used by sports journalists and others to evaluate defense. While plus/minus certainly reflects something defensive, we’re very skeptical that you can effectively represent what it really says. Plus/minus is a pretty noisy stat to use to examine a pretty noisy subject. Defense is determined by how all five players work as a team. Trying to pare that down to an individual stat is perilous at best.
Certainly steals and blocks are useful, but even those only offer a window into part of a player’s defense. Perhaps he’s overplaying the passing lanes. Or maybe he goes for blocks and gives up position, resulting in layups. Or maybe he goes too easily for head fakes. But maybe the guards are putting the bigs in that position.
That’s why we like to look at specific things. Look at rim protection. Look at defensive FG%. You can also look at how often they contest shots as well. But even these are subject to the type of defense the team’s playing. For this reason Berri believes basketball defense is ultimately as difficult to keenly evaluate as football plays.
“The problem with that is that even with all the stuff we’ve got, it’s still very hard to ultimately assign credit to these things. Because players are switching all of the time. Teams play defense as a unit,” he says.
“In order to evaluate what happened you have to know what they were trying to do, and that’s a problem when you’re looking at basketball defense. What exactly are you trying to do out there? Are you trying to stop Stephen Curry or are you trying to stop his teammates? What’s the focus of your defense?” Berri asks. “If you say hey you didn’t challenge that guy, he could say I wasn’t supposed to do that…. Oh okay, well that’s just fine, I didn’t know that. I don’t know what your intention is because the coaches aren't telling us.”
That leaves us much where Bob Chaikin left us, comparing different stats to come up with a holistic view from a variety of disparate perspectives, from defending the pick and roll and isolation (via Synergy) to DFG%, rim protection, contests and other variants (via SportVU).
That makes a lot more sense to us than trying to use one big over-generalized stat like RPM or RAPM, which we’re not even sure what (if anything) it’s measuring.
“You want to figure out the stats you have before you go out and get new numbers,” says Berri.
The Cavaliers match up with the Hawks tonight in the big rubber match. The Hawks’ big players Kyle Korver, Paul Millsap and Al Horford had poor games, and will look to bounce back, but the Cavs’ plan is to make other guys beat them.
The Hawks will try new stuff to free Korver. As we showed in yesterday's column, J.R. Smith and LeBron James were switching off the ball to effectively thwart everything the Hawks were doing to get Korver open.
He’s a big part of their offense. Also look for Teague and Schroder to spend more time on the court together. The Hawks didn’t get as much penetration as they’d probably like, and will likely try to attack the rim even more in Game 2. They took more threes than the Cavs, and we don’t see how they can win doing so.
Meanwhile the Cavs will look to reestablish themselves on the boards after getting beaten in Game 1 on the offensive boards 15-10. The Hawks definitely will be looking to steal homecourt, but we expect the Cavs to be better prepared than they were after eight days off.
We’ll be at the Q for the game, posting video, analysis and snark. You can follow along on Twitter @CRS_1ne, and read our postgame column here in the Scene & Heard section Thursday morning.