I’m still working on using the Pythagorean Expectation model for soccer in the background, but in the foreground I decided to apply the PE model to its original sport: baseball. I decided to take a look at the Frontier League, an independent baseball league that has several teams near where I grew up. It also feels like a little less pressure than applying it to MLB.
Ideally, I would have many seasons of league data to work from. The best I could find was this past 2021 season. What I’m looking for is each team’s Wins and Losses, their Runs Scored and Runs Allowed, and Winning Percentage. Having this information would allow me to calculate every team’s Run Differential (Runs Scored minus Runs Allowed) and their Pythagorean Expectation Winning Percentage. This would allow me to see which teams over- or under-performed over the course of the season by comparing their actual Winning Percentage and Wins against what the PE model spits out for Winning Percentage and Projected Wins.
Getting the PE Winning Percentage for each team is relatively straightforward. Bill James used ‘2’ as his exponent when he first came up with this model, but other people have since settled on values between 1.83 and 1.9 as a better fit. I use 1.86 for this analysis, as it seemed like a solid middle ground. There’s a bunch of maths I could do to derive an exponent from the data, but that’s for another time. Here’s what the basics of the PE Winning Percentage model look like:
PE Winning Percentage = RS^1.86 / (RS^1.86 + RA^1.86)
RS = Runs Scored
RA = Runs Allowed
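To make the formula concrete, here is a minimal Python sketch of it. This is not the code from the analysis itself (which appears to have been done in R); the function name `pe_winning_percentage` and its parameter names are made up for illustration, and the default exponent of 1.86 is simply the value chosen above.

```python
def pe_winning_percentage(runs_scored: float,
                          runs_allowed: float,
                          exponent: float = 1.86) -> float:
    """Pythagorean Expectation winning percentage: RS^x / (RS^x + RA^x).

    The function and parameter names are illustrative assumptions;
    1.86 is the exponent picked for this analysis.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("runs must be non-negative")
    rs_term = runs_scored ** exponent
    ra_term = runs_allowed ** exponent
    if rs_term + ra_term == 0:
        raise ValueError("at least one of runs_scored/runs_allowed must be positive")
    return rs_term / (rs_term + ra_term)
```

A team that scores exactly as many runs as it allows comes out at .500, and passing `exponent=2` reproduces Bill James’s original version of the formula.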
Since the PE model uses Runs Scored and Runs Allowed as part of the formula, I want to visualize the relationship between a team’s winning percentage and their run differential before tackling what their expected winning percentage is.

It’s clear that there’s a positive relationship between a team’s run differential and winning percentage. FYI, the empty box between Schaumburg and Southern Illinois is Equipe Quebec. R doesn’t like accents on its variables!
So now we know that, typically, if your team has a positive run differential, you will have a higher winning percentage. But did these 2021 Frontier League teams over-perform or under-perform? That’s what we want to find out.
A cool thing about the PE Model is that it can be used to project a team’s number of wins. It’s simple:
Projected Wins = PE Winning Percentage * (W + L)
W = Wins
L = Losses
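As a rough illustration, the projection can be sketched in a few lines of Python. The function name `projected_wins` and its arguments are hypothetical, and this is a standalone sketch rather than the code used for the tables in this post.

```python
def projected_wins(runs_scored: float,
                   runs_allowed: float,
                   wins: int,
                   losses: int,
                   exponent: float = 1.86) -> float:
    """Projected Wins = PE Winning Percentage * (W + L).

    Names are illustrative; 1.86 is the exponent used in this analysis.
    """
    rs_term = runs_scored ** exponent
    ra_term = runs_allowed ** exponent
    pe_pct = rs_term / (rs_term + ra_term)  # PE Winning Percentage
    games_played = wins + losses
    return pe_pct * games_played
```

The result is usually not a whole number, so it makes sense to round it only when displaying it in a standings table.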
Let’s see how the Frontier League stacks up for 2021. Here are the actual standings, ranked by Winning Percentage (ignore the “NA” at the bottom):

Also, the team that’s 5 rows down with the funny characters…that’s Equipe Quebec. So based on actual winning percentage, the Florence Y’alls were the best team in the Frontier League this season, followed by the Washington Wild Things and the Evansville Otters.
Now let’s see how the teams stack up when comparing their PE Winning Percentages. Keep an eye on the projected wins for each team, too:

Things get shaken up a little bit. The Florence Y’alls fall all the way down to 6th place while Washington and Evansville move up. The first 5 teams in these updated standings all have PE Winning Percentages and Projected Wins higher than what they actually achieved during the season, meaning they under-performed relative to their runs scored and allowed. Florence, however, greatly outperformed its projection. One likely explanation is that they won the majority of their one- and two-run games.
Other teams that outperformed expectations are the Tri-City ValleyCats, the Joliet Slammers and the Gateway Grizzlies. While it might not feel good to come in near the bottom of the standings, maybe these teams can take solace in the fact that they outperformed expectations over the course of the season!
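For anyone who wants to run the same over- versus under-performance comparison, here is a small Python sketch. The `TeamSeason` type, the `wins_above_projection` helper, and the three teams with their numbers are all made up for illustration; they are not the 2021 Frontier League figures.

```python
from typing import NamedTuple


class TeamSeason(NamedTuple):
    """One team's season line (hypothetical structure for this sketch)."""
    name: str
    wins: int
    losses: int
    runs_scored: int
    runs_allowed: int


def wins_above_projection(team: TeamSeason, exponent: float = 1.86) -> float:
    """Actual wins minus PE Projected Wins; positive means over-performed."""
    rs_term = team.runs_scored ** exponent
    ra_term = team.runs_allowed ** exponent
    projected = rs_term / (rs_term + ra_term) * (team.wins + team.losses)
    return team.wins - projected


# Made-up numbers, NOT real Frontier League data.
teams = [
    TeamSeason("Team A", 55, 41, 480, 470),
    TeamSeason("Team B", 50, 46, 520, 430),
    TeamSeason("Team C", 48, 48, 450, 450),
]

# Biggest over-performers first.
ranked = sorted(teams, key=wins_above_projection, reverse=True)
for team in ranked:
    print(f"{team.name}: {wins_above_projection(team):+.1f} wins vs. projection")
```

In this toy example, a team with a good record but a thin run differential lands at the top of the list, which is the same pattern Florence shows in the real standings.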
Below are some of the sources I used and a link to the code I wrote to come up with this stuff.
https://github.com/firstpitchstrike/Baseball_Analytics/tree/main