OPS is Tired, Cubs Need New Stat to Define ‘Production Over Talent’ Era

If one were to pick a single statistical measure most key to Theo Epstein’s success in Boston and Chicago, OPS would instantly leap off the spreadsheet. From scooping up David Ortiz and his .839 OPS after the Twins released him to tanking in Chicago to hit draft pay dirt with a Kris Bryant, Epstein has focused relentlessly on maximizing his offenses’ combination of on-base percentage and slugging.

But given Epstein’s proclamation at his end-of-season news conference that evaluation of Cubs hitters must now be “more about production than talent,” the natural question is what measurement should supplant OPS. After all, OPS does not directly measure production. Rather, it tracks several precursors that, under the right ancillary conditions, can lead to run production.

This follows a major tenet of sabermetrics, which says you should measure outcomes directly controlled by a player and not those influenced by luck or other players’ contributions. Therein lies the rub for Epstein. To fairly evaluate his hitters based on production, he must partially turn his back on the statistical revolution that launched his career and those of so many other new-generation baseball executives.

Falling on your saber(metrics)

Infusing some old-school production stats back into player evaluation actually wouldn’t be so bad at this point. The pendulum has swung so far toward sabermetrics that blind adherence to them has in some cases actually lowered win probability. I covered one example of this last month in analyzing the misapplication of the third-time-through-the-lineup theory for playoff starting pitchers.

It’s understandable Epstein might be reluctant to make a change. At the start of his executive career, he was one of the first to effectively leverage OPS for a competitive advantage. His insight that high OPS could more than offset a high K-rate helped the 2004 Red Sox become the first team to win a title while leading its league in whiffs.

But like most competitive advantages, the benefits narrow to nil as competitors copy the technique. In the case of OPS, many hitters now dutifully take their walks and tilt their swings to pad slugging percentages with more launch-angled doubles and homers. As strikeouts continue to soar, however, overall OPSs are no longer increasing to offset them. Or in saber parlance, the delta has turned negative.

Take the 2004 and 2018 Red Sox championships. The former offense struck out a league-leading 1,189 times while leading the majors with a towering .832 team OPS. Today that same whiff total would have tied for lowest in the majors. And while this year’s World Series winners once again led all of MLB in OPS, their .792 mark was 40 points lower than 14 years prior.

Analytically retentive

Evidence of the diminishing returns in OPS was even there in the Cubs’ 2016 World Series campaign. That team featured the second-highest OPS in the NL (.772, behind the mile-high Rockies), but its hitters never quite achieved the same offensive heights or consistency as Epstein’s Boston teams.

Although superior starting pitching and defense carried the team, the 2016 Cubs hitters were shut out four times in the postseason. By comparison, no other World Series winner since the playoffs expanded to three rounds (1995) suffered more than one shutout. This offensive inconsistency only worsened for the Cubs in the 2017 NLCS against the Dodgers, then more infamously “broke” in the second half of 2018.

So if Epstein appears to vacillate between saying he needs to fix the offense and later saying all the team’s answers can be found internally, it may just be his soul shuddering at the notion of embracing old-school measurements of production. Kind of like those Progressive commercials where once-young adults start behaving like their parents.

Get off my premises!

The limits of OPS can be found in its two most basic premises. The first is that as long as an offense consistently takes its walks, gets on base at a high rate and slugs at a high rate, runs will always come. The second is that, setting aside a player’s run-prevention abilities on defense, a higher OPS is generally preferred.

But once teams over-subscribed to OPS and the launch-angle revolution, several foundational cracks emerged. Notice in the first premise the word “consistently.” It’s an article of faith in the sabermetrics community that hitting proficiency should be largely consistent across game situations. The idea that one player might be particularly good or bad in the clutch is largely seen as a chimera of small sample size.

But just as free markets are not always rational, hitting is not always consistent across game situations. And once you grant this, it follows that not all similar OPS marks are equal. Take the 2018 hitting results of Kyle Schwarber and Ben Zobrist, who had very comparable plate appearances (510 to 520).

Although Schwarber edged Zobrist in OPS, this masks significant differences in their relative productivity. First, Schwarber was horrible with runners in scoring position (RISP). In fact, his most successful at-bats came with no runners on, including 20 of his 26 homers. In other words, Schwarber’s OPS was not consistent across game situations, nor was it in 2017 (.144 RISP average) or over his career (.184).

Man was not made to produce alone

Second, compared with Zobrist, a larger share of Schwarber’s OBP came from walks rather than hits. In 2018, Schwarber safely reached base 181 times, but 43 percent of those came from walks. Zobrist reached safely 196 times, but only 28 percent came via walk. Since hits are more productive than walks, Zobrist’s hits advantage translated into greater productivity and run-scoring probability for the team.
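To put rough numbers on that gap, here is a minimal Python sketch of the arithmetic. It uses only the times-on-base totals and walk percentages quoted above; the function name is hypothetical, and the "other" bucket lumps hits together with anything else that isn't a walk (such as hit-by-pitches), since those aren't broken out here.

```python
def split_times_on_base(times_on_base, walk_share):
    """Split times on base into (walks, everything else).

    Assumption: "everything else" is mostly hits but may include
    hit-by-pitches, which the quoted totals do not separate out.
    """
    walks = round(times_on_base * walk_share)
    return walks, times_on_base - walks


# Figures quoted in the text for 2018
schwarber_walks, schwarber_other = split_times_on_base(181, 0.43)
zobrist_walks, zobrist_other = split_times_on_base(196, 0.28)

print("Schwarber:", schwarber_walks, "walks,", schwarber_other, "other")
print("Zobrist:  ", zobrist_walks, "walks,", zobrist_other, "other")
print("Zobrist's non-walk edge:", zobrist_other - schwarber_other)
```

By this rough split, Zobrist reached base without a walk roughly 38 more times than Schwarber in about the same number of plate appearances.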

If only Schwarber could just shift a fraction of his hits – especially the homers – to RISP opportunities, then he would easily close this productivity gap with Zobrist without any change in his slash numbers. Of course, this is just another way of wishing hitting could be more consistent across game situations.

Now let’s consider the Cubs’ trio of 2018 centerfielders, who also logged a roughly similar number of plate appearances ranging from 462 to 489.

Based solely on OPS, Joe Maddon obviously should have given Ian Happ as much playing time in center as possible, right? Of course not. As we all know by now, OPS does not measure production, and Happ was by far the least productive of these three in 2018. (Interestingly, all three posted the exact same oWAR of 1.1, which underscores the challenges of that sabermetric stat as well.)

Give the guy a production credit

So how can Epstein best value production in this statistical brave new world where everything old could become new again? I’m sure someone could come up with an extremely complicated equation with lots of weighted factors, but my general rule is if you can’t explain how to compute a statistic, you probably shouldn’t use it. Thus I’m quite partial to Total Runs Created (Runs + RBIs – HRs).

To equalize for playing time, I convert totals into plate appearances per run created. Here is how 2018 Cubs with more than 100 plate appearances ranked according to this measure.

Two names probably jumped out. First, David Bote’s high ranking largely reflects a one-month period after his promotion on July 26 when he had a superior Runs Created rate of one every 3.7 PAs. Once scouting reports caught up with him, however, his rate dived to a more pedestrian 5.5.
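For anyone who wants to run the numbers themselves, the calculation can be sketched in a few lines of Python. The formula (Runs + RBIs – HRs) and the 100-plate-appearance cutoff are as described above; the class name, function names and the four stat lines are made up purely for illustration and are not real 2018 Cubs figures.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    name: str
    plate_appearances: int
    runs: int
    rbis: int
    home_runs: int


def total_runs_created(line):
    # Runs + RBIs - HRs: a homer credits the batter with both a run and
    # an RBI for the same run, so it is subtracted once.
    return line.runs + line.rbis - line.home_runs


def pa_per_run_created(line):
    # Lower is better: fewer plate appearances needed per run created.
    created = total_runs_created(line)
    if created <= 0:
        return float("inf")
    return line.plate_appearances / created


# Hypothetical stat lines for illustration only
hitters = [
    BattingLine("Hitter A", 500, 60, 65, 25),  # 100 created, 5.0 PA each
    BattingLine("Hitter B", 480, 70, 60, 10),  # 120 created, 4.0 PA each
    BattingLine("Hitter C", 120, 12, 10, 2),   # 20 created, 6.0 PA each
    BattingLine("Hitter D", 80, 10, 12, 2),    # under the PA cutoff
]

# Keep hitters with more than 100 PAs, most productive first
ranked = sorted(
    (h for h in hitters if h.plate_appearances > 100),
    key=pa_per_run_created,
)

for h in ranked:
    print(f"{h.name}: {pa_per_run_created(h):.1f} PA per run created")
```

Note that Hitter D would rank near the top on rate alone, which is the reason for a playing-time cutoff in the first place.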

The second name, Jason Heyward, does not require any qualifiers. Though it doesn’t conform to most fans’ prejudices, Heyward was highly productive in 2018 when not on the DL. In fact, not only was Heyward the Cubs’ best hitter with RISP (.324), he was the team’s second-best hitter against power arms (.317) after Zobrist (.330).

This is not to say Total Runs Created should be the new be-all, end-all stat to replace OPS or wOBA or WAR. It just provides an extra tool for comparing relative production. But to compare players on different teams, the numbers should be adjusted for differences in teams’ offensive potency. This would involve multiplying by simple factors representing each team’s deviation from the league average.
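One way such a team factor might work is sketched below in Python. This is only an assumption about the form of the adjustment: it uses team runs per game relative to the league average as the "deviation," and the function name and sample figures are hypothetical.

```python
def team_adjusted_pa_per_rc(pa_per_rc, team_runs_per_game, league_runs_per_game):
    """Scale a hitter's PA-per-run-created rate by his team's offense.

    Assumption: a hitter on an above-average offense gets extra chances
    to score and drive in runs, so his raw rate looks better (lower) than
    it should. Multiplying by team/league runs per game pushes it back up;
    the reverse happens for hitters on weak offenses.
    """
    if team_runs_per_game <= 0 or league_runs_per_game <= 0:
        raise ValueError("runs per game must be positive")
    team_factor = team_runs_per_game / league_runs_per_game
    return pa_per_rc * team_factor


# Hypothetical: a 4.0 rate on a team scoring 5.0 runs per game
# in a league averaging 4.0 runs per game
adjusted = team_adjusted_pa_per_rc(4.0, 5.0, 4.0)
print(f"{adjusted:.2f} PA per run created after team adjustment")
```

Other baselines, such as team on-base percentage, could serve as the factor just as easily; runs per game is used here only because it is the simplest to explain.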

By incorporating these measurements, the Cubs may be better able to quantify their hitters’ actual production rather than just hoping their talent will shine through.

6 Comments

  1. As with a lot of your analysis, I find myself going, “So?” I don’t see why combining two highly flawed stats reveals something greater than other holistic offensive stats like wRC+ just like I don’t see the Cubs performance against bucketed pitchers with no proof as to its repeatability or validity as a pressing need.

  2. Love it, Jeff. You have once again poked at and broken down something that has been gnawing at me. Theo needs to look at the new metric of WWROB – Whiffs With Runners On Base. While I don’t have the quantification prowess that you do, I perceived all season that in far too many games, WWROB stifled or killed the Cubs offense. Even if it’s just a man on first, strikeouts put the absolute least amount of pressure on the other team’s defense.

    1. Too kind, Gator. My best guess, and it’s only a guess, is the organization does analyze the metric you brought up and the productivity metrics I noted. The gap might be that they use this to help with development of their young players, but not to determine which to keep and which to part with. This I guess is just another way of saying they grow too attached to their own players and value them higher than the rest of the market. This proved good in their holding onto Javier Baez, but not so good in holding onto players past their peak trade value like Starlin Castro, Carlos Marmol, and now perhaps Kyle Schwarber.

  3. I wish there was a way to rebalance BA OBP and SLG into a more weighted way. Walks are valuable, but don’t drive in a guy on base. SLG can be misleading if a guy is putting up a .200 campaign with a few dingers.
