Not all wins are created equal: When the (sabermetric) data is mightier than the sword.

By: S. Christopher Michaels

How do you describe the joy of winning? Image from mysanantonio.com

 

Scoring more runs than your opponent wins baseball games (checks math). Within that context, run differential (Rdiff) has long been considered key to unlocking wins and losses.

Many have tried to model that relationship, with varying degrees of success. My sabermetric idol, Bill James, is credited with crafting the first attempt at modeling wins based on Rdiff. He developed a formula called Pythagorean Wins, named for its similarity to the geometric theorem attributed to the ancient Greek mathematician Pythagoras. Following James’s groundbreaking work, others came and went. A simple web search for “baseball run differential correlation wins” reveals dozens of entries from would-be statistical titans, if only their work could best that of Sir G.W. James of Holton, Kansas (okay, I added the title “Sir”).
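For readers who have not seen it written out, here is a minimal sketch of the Pythagorean idea in Python. It assumes the classic exponent of 2 and a 162-game schedule; the function names and the run totals in the example are illustrative only and are not taken from any team's record.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and runs allowed.

    Uses the classic form RS^x / (RS^x + RA^x) with x = 2 by default.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def pythagorean_wins(runs_scored: float, runs_allowed: float,
                     games: int = 162, exponent: float = 2.0) -> float:
    """Expected win total over a schedule of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical club: 800 runs scored, 700 allowed (made-up numbers).
    print(f"{pythagorean_win_pct(800, 700):.3f}")
    print(f"{pythagorean_wins(800, 700):.1f}")
```

A team that scores exactly as many runs as it allows comes out at .500, which is the sanity check worth remembering.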

Niceties aside, run differential wasn’t really an interest of mine. I’ve generally steered clear of it, offering the cursory smile and wave. The smattering of available theories that weren’t buttressed with an explanation of how Rdiff occurs kept me off that path. However, like so many times before, I was passing time with the stat I love (OPS) only to find myself knee-deep in a social media exchange defending the virtues of on-base-plus-slugging.

On-base-plus-slugging, and its various derivatives, is the Leatherman tool of baseball statistics.

We have known for 40 years that OPS describes offensive productivity. In recent years, we’ve examined how a pitcher’s OPS (how well hitters produced against him) provides an excellent measure of effectiveness. We can compare batters across seasons using the adjusted-OPS statistic. Earlier this year, I introduced the same adjusted comparison for pitchers. Yet, the most important use for on-base-plus-slugging may be its ability to explain wins.

A generation ago, Dr. Cyril Morong shared remarkable data describing something he referred to as the OPS Differential (OPSdiff). This statistic measured the gap—or difference—between a team’s on-base-plus-slugging and their opponent’s OPS. For successful teams, this differential resulted in a net positive. For struggling teams, the opposite was true. Dr. Morong found that OPSdiff explained 87% of the variance in win totals for MLB teams from 2001 to 2003 (0.933 correlation).

I was wrestling with a similar dilemma a few weeks ago while on one of my data excavations. Unknowingly, I stumbled onto the same ground once covered by Dr. Morong. I shared my thoughts with my buddy, Arch Stanton of AbsoluteDegeneracy.com, and he directed me to Dr. Morong’s work. I was hooked, but I knew I wanted to dig deeper. It was evident that OPSdiff is the untapped gold standard. More importantly, I wanted to evaluate its connection to other metrics in my ongoing effort to map out the contours of baseball’s landscape. Specifically, I kept seeing the same two issues crop up in Kansas City Royals games: a lack of offensive productivity and a lack of run-scoring efficiency.

Game after game, it felt like the Royals were simply getting outmuscled. I keep daily box scores to run them through my Outcomes Runs algorithms. From May fourth through the twelfth, the Royals lost the OPS battle. They went two and five during that stretch. Their average OPSdiff (average of sum totals) was -244 points. In baseball parlance, their OPS was 0.244 below their opponents’ during this particularly rough stretch. With that in mind, I worked backward to track their OPSdiff for each game this season. As of this writing, they are at minus-98 points on the season. Oof…
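The arithmetic behind OPSdiff is simple enough to sketch in a few lines of Python. The function names below are hypothetical helpers, not part of any published tool; the only conventions assumed are the for-minus-against subtraction and the habit of quoting the result in "points" (thousandths).

```python
def ops(obp: float, slg: float) -> float:
    """On-base-plus-slugging: on-base percentage plus slugging percentage."""
    return obp + slg


def ops_diff(team_ops: float, opponent_ops: float) -> float:
    """OPS differential: the team's OPS minus its opponent's OPS."""
    return team_ops - opponent_ops


def to_points(value: float) -> int:
    """Express a differential in points, where one point is 0.001."""
    return round(value * 1000)


if __name__ == "__main__":
    # The rough May stretch: an OPS 0.244 below the opposition.
    print(to_points(-0.244))  # -244
```

A positive number means the team out-produced its opponent; a negative number means it was out-produced.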

But the Royals aren’t a very good club. (I unabashedly love them, but I’m no fool to what they are.) What about the rest of the league?

I looked at every season from 1993 to 2021. Over that period, OPSdiff explained an average of just over 80% of the variance in win totals (coefficient of determination, or R-squared). It’s huge. There isn’t another in-game statistic that comes close to this. The lesser-known component parts of Dr. Morong’s brainchild are on-base percentage differential (OBPdiff) and slugging differential (SLGdiff). These are measured using the same for-minus-against formula. They both correlate well with win totals, though not nearly as closely as their synergistic creation.

And it was this process of breaking down the internal mechanisms of OPSdiff that jarred my brain. I realized that the same sort of super-stat, like Pete Palmer’s genesis of on-base-plus-slugging in 1979, probably existed to understand run differential. I could see the eye-popping data for OPSdiff across the 29 seasons I studied. That described the productivity side of my idea. Nevertheless, I still needed a better understanding of the efficiency piece.

I noticed the Royals were losing another in-game battle most nights. In this case, it was runs-per-total-base (R/TB). If you’re not familiar with this metric, don’t worry. I discovered last year that it played a role in describing the component parts of run-scoring. This measurement may have been available to us the entire time, but nobody was using it. On its own, it has a decent correlation (-0.491) with run-scoring. When it is calculated as a differential between a team and its opponents (R/TBdiff) and compared with win totals, the correlation jumps above 0.800.
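As a rough sketch, R/TB and its differential can be computed from box-score totals like this. Two things here are assumptions rather than anything stated above: that the differential uses the same for-minus-against subtraction as OPSdiff, and that a side with zero total bases is scored as 0.0. The function names are illustrative.

```python
def total_bases(singles: int, doubles: int, triples: int, home_runs: int) -> int:
    """Total bases: one per single, two per double, three per triple, four per homer."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def runs_per_total_base(runs: int, bases: int) -> float:
    """Run-scoring efficiency: runs scored per total base.

    Assumption: a side with zero total bases is treated as 0.0
    instead of raising a division error.
    """
    if bases == 0:
        return 0.0
    return runs / bases


def r_tb_diff(team_runs: int, team_bases: int,
              opp_runs: int, opp_bases: int) -> float:
    """R/TB differential, assumed here to be team R/TB minus opponent R/TB."""
    return (runs_per_total_base(team_runs, team_bases)
            - runs_per_total_base(opp_runs, opp_bases))
```

A team that turns fewer bases into more runs than its opponent posts a positive R/TBdiff, even on a night when it was out-hit.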

With my mind in a fog, I still hadn’t connected OPSdiff and R/TBdiff. I had a lackadaisical, ‘oh that’s nice,’ attitude treating the two statistics as parallel measurements rather than the complementary parts of a whole they were meant to be. Even in preparing for the original draft of this article, I hadn’t wholly pieced it together yet (I promise, that’s why I had to push this column).

More digging in the proverbial dirt of baseball’s data map led me to a moment of clarity. Yes, OPSdiff is a remarkable indicator of game-to-game productivity for a ball club. It’s also tremendously valuable at predicting win totals in its own right. But when combined with the efficiency measure R/TBdiff, the Frankenstein outcome has frightening accuracy.

Adding OPSdiff and R/TBdiff unlocks the formula that describes WHY run differential explains win totals.

And it really was that simple. I was circling the target without realizing it. Like most of my other discoveries, this will impact dozens of people following baseball (I kid, of course). Dating back to 2011, run differential explains an average of 87% of the variance in season win totals for all teams. The strength of the relationship between Rdiff and wins illustrates how keen Bill James’s insight was into developing Pythagorean Wins. We knew scoring more runs led to more wins, but there was something left unsaid. Aside from some cheeky comment like, ‘well, hit more home runs,’ the precise balance between productivity and efficiency hadn’t been put into words—until now.

Not only does the newly-minted productivity and efficiency differential metric (P+Ediff) align almost perfectly with Rdiff (0.9915 average correlation from 2011 to 2021), it explains wins and losses with essentially the same strength (0.932 average correlation, or 86.9% explained variance). These factors tell us that run differential combines productivity and efficiency. It isn’t shocking to learn this. At the same time, we now have the means to monitor a team from game to game or across an entire season. We can make more meaningful projections, especially as teams head into the playoffs.
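For anyone keeping score at home, the correlation and explained-variance figures in this column are tied together by a square: for a single predictor, the coefficient of determination (R-squared) is the correlation coefficient squared. A quick Python check using the two correlations quoted in this column (the function name is just a label for the sketch):

```python
def explained_variance(correlation: float) -> float:
    """Share of variance explained by one predictor: the correlation squared."""
    return correlation ** 2


if __name__ == "__main__":
    quoted = [
        ("OPSdiff vs. wins, 2001-2003 (Morong)", 0.933),
        ("P+Ediff vs. wins, 2011-2021", 0.932),
    ]
    for label, r in quoted:
        print(f"{label}: r = {r:.3f}, R-squared = {explained_variance(r):.1%}")
    # OPSdiff vs. wins, 2001-2003 (Morong): r = 0.933, R-squared = 87.0%
    # P+Ediff vs. wins, 2011-2021: r = 0.932, R-squared = 86.9%
```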

With newfound awareness, I expect we can all imagine games that fit better under the productivity or efficiency umbrellas. On April nineteenth, the Royals edged the Minnesota Twins by one run. The Kings of Kauffman won the OPSdiff battle handily by +265 points (+0.265). They were incredibly productive that day. They just weren’t efficient, struggling to score runners with a negative R/TBdiff (-14 points, or -0.014). This game was a textbook example of one team muscling out an inefficient victory (P+Ediff of +251, or +0.251).

By contrast, the Royals’ first game of the season on April eighth yielded an efficient yet fairly-unproductive win. Playing the Cleveland Guardians, Kansas City was underwater in OPSdiff (-126 points, or -0.126) while taking advantage of run-scoring opportunities when they came. KC had a R/TBdiff of +375 points (+0.375). Their final P+Ediff was +249 points (+0.249).
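The two games above can be re-added in a few lines of Python. This is only a sketch of the addition described in this column, working in points; the function name and the "carried by" label are illustrative, and the label simply reports which of the two components was larger.

```python
def p_plus_e_diff(ops_diff_points: int, r_tb_diff_points: int) -> int:
    """Productivity-plus-efficiency differential: OPSdiff plus R/TBdiff, in points."""
    return ops_diff_points + r_tb_diff_points


# (OPSdiff, R/TBdiff) in points, as quoted for the two Royals wins.
games = {
    "April 19 vs. Twins": (265, -14),
    "April 8 vs. Guardians": (-126, 375),
}

if __name__ == "__main__":
    for label, (productivity, efficiency) in games.items():
        total = p_plus_e_diff(productivity, efficiency)
        driver = "productivity" if productivity > efficiency else "efficiency"
        print(f"{label}: P+Ediff = {total:+d} points "
              f"({total / 1000:+.3f}), carried by {driver}")
    # April 19 vs. Twins: P+Ediff = +251 points (+0.251), carried by productivity
    # April 8 vs. Guardians: P+Ediff = +249 points (+0.249), carried by efficiency
```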

Each contest was close. Kansas City was behind in both games and did not lead until the sixth inning or later. By traditional standards, these wins might have been called lucky. In truth, luck had nothing to do with it. The Royals won because they outperformed their opponents in crucial categories. In fact, the P+Ediff formula has correctly identified the winner of every Royals game this season. It’s 42 and 0.

Wins and losses stop being luck when the algorithm is right day after day and season after season.

As challenging as the Royals’ season has been, they’ve still found different ways to win games. They are generally more productive than they are efficient. While each win may count the same in the standings, we can dissect them to examine their component parts. And it’s because of this that we can look at run differential differently.

Please review the data sets and Productivity+Efficiency Differential Calculator attached below.

I hope you’ve enjoyed this column. I want to challenge your thinking about baseball statistics. Someday, my own research on the game will become outdated. Please feel free to spar with me about the ideas I’ve presented here. I enjoy the discussion because it challenges my thinking. I can be reached here on Baseball Almanac, via email at christopher.s.michaels@gmail.com, and on social media (Facebook, Twitter). As always, this has been the World According to Chris. Thanks for tuning in.

 
