If you don’t have anything more than a passing interest in baseball statistics, this post will bore you even more than it will if you do have more than a passing interest in baseball statistics. You have been warned.
We talk a lot about wins above replacement level in baseball now. Wins Above Replacement, WAR as it’s often abbreviated, is the hilariously complicated statistic that expresses a simple idea: how many wins was a player worth above his Triple-A replacement? And, because it’s expressing a simple idea, WAR has slowly worked its way into the baseball mainstream. The always-interesting Joe Sheehan has a column in nearly every issue of Sports Illustrated, and he uses WAR a good amount; cover stories in the magazine about players almost always mention their wins above replacement total. It’s not on TV broadcasts, not yet anyway, but if it’s in Sports Illustrated, I think it’s safe to consider it mainstream. So I feel okay saying WAR has made it to the big time.
So here’s what I’ve been thinking about: Why do other, equally useful and maybe even more useful, baseball stats fail to make it so far? There have been dozens of new baseball stats over the past few years. Why do some fail where others succeed? What makes a successful stat?
And here comes the rant. I’m going to pick on wOBA first. wOBA, although it expresses something simple (how good a hitter is at producing runs), is never ever going to work its way into the mainstream. It’s an excellent statistic, maybe the best at describing a player’s offensive contribution. It says a single is worth this much, a walk this much, a double this much, a triple this much, etc., then adds it all together and averages it out. That’s wOBA. But it’s not going to be widely used, and I think there are two reasons:
Reason 1a: It has a bad name. It’s a lower case letter in front of three capital letters, wOBA. If I’m saying it out loud, it seems like I should whisper the “w” and then yell the “OBA.” And it’s awkward to begin a sentence with. xFIP – see how weird that looks? – is doomed for similar reasons. I’m not trying to say that people can’t understand acronyms with lowercase letters in them because people are book-burning hobgoblins. But I am saying that putting a lowercase letter in front is like calling it “Dungeons-and-Dragons-OBA.” Rightly or wrongly, it’s going to be off-putting to some because it’s going to look needlessly complex.
Reason 1b: It’s a number based on the on-base percentage scale. The OBA in wOBA stands for on-base average, and the number is supposed to look like on-base average/percentage. So if a hitter’s wOBA number would be a good on-base percentage, he’s a good hitter.
The problem is that, five years ago as a more casual fan, I had no idea what a good on-base percentage was. I knew what on-base percentage was and how it was calculated, but I had no meaningful scale for the number like I did for batting average and ERA. I didn’t know what the single-season record was for OBP, who the all-time career leader was, what a terrible OBP looked like, because it wasn’t something talked about that often.
That’s a problem. Because taking a statistic and scaling it to on-base average is like making a baseball stat and then scaling it to the weight of apples. I have an idea how much an apple weighs, because I’ve held them and eaten them and thrown them at my siblings. But I have no idea exactly how much an apple weighs or how much a big apple weighs compared to a small apple. So if one hitter is eight ounces on the apple scale and another hitter is nine ounces, I have no idea if they’re both above-average hitters, both below-average, or if one is good and the other is bad. I have to go on Wikipedia and read all about apples first.
If you want casual fans to pick up wOBA without much effort, they have to understand the scale for OBA already. (And know that OBA is the same as OBP.) And I don’t know how many people know the normal range for on-base percentage off the top of their heads, because it’s not something most baseball fans grow up thinking about. I know I didn’t.
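For anyone curious what “a single is worth this much, a walk this much” looks like in practice, here’s a minimal sketch of the weighted-average idea in Python. The weights below are made-up round numbers for illustration only, not the real wOBA coefficients (those get recalculated every season), and dividing by plate appearances is a simplification of the actual denominator. The function and variable names are mine.

```python
from typing import Mapping

# ASSUMPTION: illustrative round-number weights, NOT the official wOBA values.
ILLUSTRATIVE_WEIGHTS = {
    "walk": 0.7,
    "single": 0.9,
    "double": 1.25,
    "triple": 1.6,
    "home_run": 2.0,
}


def weighted_on_base(
    events: Mapping[str, int],
    plate_appearances: int,
    weights: Mapping[str, float] = ILLUSTRATIVE_WEIGHTS,
) -> float:
    """Multiply each event count by its weight, add it up, divide by chances.

    ASSUMPTION: plate appearances stand in for the real denominator.
    """
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    total = sum(weights[name] * count for name, count in events.items())
    return total / plate_appearances


# A hypothetical hitter's season line (invented numbers).
season = {"walk": 50, "single": 100, "double": 30, "triple": 5, "home_run": 20}
print(round(weighted_on_base(season, 600), 3))  # 0.351
```

The answer comes out looking like an on-base percentage, which is exactly the scale problem: you still have to know whether .351 is good.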
There might be an argument that OPS, On-base Plus Slugging, is an offensive statistic on a strange scale that has been accepted into general use. OPS is used on ESPN broadcasts and it shows up on a decent number of scoreboards. So maybe there is hope for wOBA. After all, OPS is on a crazy scale too, one that ranges from .500 for terrible hitters to 1.100 for Barry Bonds-like players, and it’s accepted anyway. My best guess why OPS works for a general audience is that:
A. Enough baseball fans know what on-base percentage is, enough know what slugging percentage is, so adding them together is easy enough to explain. I think. I’m just guessing here.
B. OPS happens, entirely by accident, to have a scale similar to how we grade things in school. For grades in school, a 90 and above is an A, 80-89 is a B, 70-79 a C, and so on. Just as a 90 on a test is an A, an OPS of .900 happens to be an A-grade OPS for a batter. Hitters with an OPS in the .800s are grade B, in the .700s grade C, then Ds in the .600s and those failing at hitting in the .500s or below. The great hitters, the ones earning extra credit, come in above 1.000. Carlos Beltran has a .904 OPS with the Mets this season, clearly earning an A-grade on offense. David Wright has an .801 mark, a B, and Jason Bay comes in with a .704, earning himself a C. Jason Pridie earns a D with his .634 OPS. Brad Emaus failed baseball with a .424 OPS in the majors this year, and the Mets let him go for remedial work in the minors.
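The report-card idea is simple enough to write down as a few lines of Python. This is just a sketch of the cutoffs described above; the function name is mine, and calling the above-1.000 tier “A+” is my own label for the extra-credit group.

```python
def ops_grade(ops: float) -> str:
    """Map an OPS to a school-style letter grade using the cutoffs above."""
    if ops > 1.000:
        return "A+"  # ASSUMPTION: my label for the "extra credit" hitters
    if ops >= 0.900:
        return "A"
    if ops >= 0.800:
        return "B"
    if ops >= 0.700:
        return "C"
    if ops >= 0.600:
        return "D"
    return "F"  # the .500s or below


# The Mets hitters mentioned above, with the OPS figures from this post.
mets = {
    "Carlos Beltran": 0.904,
    "David Wright": 0.801,
    "Jason Bay": 0.704,
    "Jason Pridie": 0.634,
    "Brad Emaus": 0.424,
}
for name, ops in mets.items():
    print(f"{name}: {ops:.3f} -> {ops_grade(ops)}")
```

Try writing the same function for wOBA and you have to look up the cutoffs first, which is kind of the whole point.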
OPS isn’t as accurate as wOBA for identifying the best hitters; wOBA is easily the better of the two. But “OPS, Oh-Pee-Es” rolls off the tongue better than “wOBA, Whoa-Bah,” and it has a scale that’s familiar enough to most. At least that’s my theory why we use one and not the other.
On a similar note, you can argue about the merits of Ultimate Zone Rating (UZR) against Defensive Runs Saved (DRS), the two most popular defensive metrics. But, in my opinion, DRS has two huge advantages over UZR that have nothing to do with the actual numbers: The people who produce defensive runs saved round it off to a whole number, and they gave it a self-explanatory name. Look, which sentence is easier to understand without further explanation:
Nick Evans has three defensive runs saved at first base this season.
Nick Evans has a 1.9 ultimate zone rating at first base this season.
Maybe I’m wrong, but I have to go with the first option here. And if I’m a writer concerned with brevity and making things understandable for the reader, as I sometimes pretend to be, I’m going to use the first option every time. Because if I didn’t know anything about defensive statistics, I could understand what the first sentence meant. I’d have to spend a while on Google to understand the second.
Plus, the “Ultimate” in “Ultimate Zone Rating” gives a Warhammer 40,000 feel to UZR.
Maybe I’m just shouting into the void here. But I sometimes wonder whether, if the new baseball stats had different names, they might be all over TV broadcasts already. I suspect part of the reason football accepts passer rating, sometimes known as QB rating, a statistic that has a complicated formula and is weird, is because it’s called passer rating and not “weighted ultimate completion percentage.” I tend to believe the name matters.
So it seems to me that Wins Above Replacement is making its way into the mainstream because it’s struck the balance between usefulness and ease of use. Its acronym is WAR — all capital letters *and* spells an actual word we can say out loud — and it’s scaled to wins, which most fans have a feel for already. It’s still a little off-putting because it’s almost impossible to explain and it has decimal points . . . decimal points in a stat can be tough. We all have a feel for seven wins because it’s a real thing. We don’t all have a feel for 7.4 wins, because .4 wins is an abstract concept. You can’t win half a game. But WAR is a tough stat to round, because the differences between players are often so small that you need the .4 thrown in. It’s a necessary evil.
I have no idea if any of this is interesting to anyone besides me. Or if I’m even right, because most of this isn’t based on hard fact, but my subjective opinion. I don’t actually know why people picked up OPS and not wOBA, and I’m just guessing. I’m probably wrong on plenty of stuff, if not all stuff, here.
But if you’re a sabermagician cooking up a new stat, it seems to me that it would probably be worth the extra time to come up with a good name and scale that are self-explanatory. Because if you want people to use something, it seems that setting up fewer hoops to jump through would be a good idea.