The Value of Base Stealing

Jack T.
5 min read · Jun 19, 2021

I’m working through the excellent “Analyzing Baseball Data with R” book and would like to begin posting the progress I’m making as I continue to complete the exercises. I’m learning how to create a lot of the visualizations and charts, but I also want to learn the best way to synthesize and convey my analysis. Charts and pictures might look pretty, but if I can’t explain them in words and in a way that people can understand, then they’re just pretty and that’s it.

The following analysis is from Ch. 5, but I’ll be going back to my work from previous chapters to post here. For now, I’m going to look at the expected run value of stolen bases in the 2011 season. I know that’s 10 years ago and baseball has changed a lot since then, but in 2011 I was fortunate enough to be living in Milwaukee and working for the Brewers as a 50/50 Raffle Ticket seller. This meant that even though I had to navigate the rowdy Brewers tailgate parties at 10 in the morning, I could sell my tickets in the stadium for the first four innings and then watch the rest of the game! I saw a lot of Brewers games that year.

I’m using the Lahman database of baseball stats, along with Retrosheet play-by-play data for 2011, to examine the value of stolen base attempts in 2011. There are, of course, only two outcomes of an attempted stolen base: the runner is either safe or out. The event codes “4” and “6” in the play-by-play data mark a stolen base (runner safe) and a caught stealing (runner out), respectively. Creating a data frame of only the plays where a stolen base was attempted allows us to examine the frequencies of the outcomes as well as the expected run value generated by stolen bases.

Creating stealing df and grouping by EVENT_CD
Resulting tibble of SB/CS freqs
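For readers who want to try this step, here is a minimal sketch of the filter-and-count, assuming dplyr and a play-by-play table with an `EVENT_CD` column. The `plays` rows are made-up toy data, not the 2011 Retrosheet file, and the names `plays`, `stealing`, and `sb_freq` are only illustrative.

```r
library(dplyr)

# Hypothetical toy play-by-play rows; the real input would be the
# 2011 Retrosheet event file.
plays <- tibble(
  EVENT_CD = c(4, 4, 4, 6, 20, 3, 4, 6)
)

# Keep only stolen bases (4) and caught stealing (6).
stealing <- plays %>%
  filter(EVENT_CD %in% c(4, 6))

# Count each outcome and convert the counts to proportions.
sb_freq <- stealing %>%
  group_by(EVENT_CD) %>%
  summarize(N = n()) %>%
  mutate(pct = N / sum(N))

print(sb_freq)
```

On the real data, the `pct` column is where the success and caught-stealing rates would come from.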

There were 3,727 stolen base attempts in 2011; close to 77% were successful, while 23% of runners were thrown out attempting to steal. It seems players are fairly good at stealing bases and getting into scoring position. In which situations are stolen bases attempted most often? To find out, I’ll make another frequency table showing the runners on base and the number of outs in the half-inning when the stolen base was attempted.

Stolen base attempts by situation
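A frequency table by situation could be sketched like this, assuming the attempts carry a `STATE` column that encodes the runners on first, second, and third as 0/1 digits followed by the number of outs (for example ‘100 1’). The column name and the toy rows are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy attempts. STATE is an assumed column: runners on first,
# second, and third as 0/1 digits, then the number of outs.
stealing <- tibble(
  EVENT_CD = c(4, 4, 6, 4, 6, 4, 4),
  STATE    = c("100 0", "100 1", "100 1", "100 2", "110 1", "100 1", "001 2")
)

# Count attempts per situation, most common first.
state_freq <- stealing %>%
  count(STATE, name = "N") %>%
  arrange(desc(N))

print(state_freq)
```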

The vast majority of stolen base attempts in 2011 came with only a runner on first (‘100’) and 0, 1, or 2 outs (‘100 0’, ‘100 1’, ‘100 2’). The three digits stand for first, second, and third base (1 means occupied), and the last number is the count of outs. But the tibble shows a wide range of situations in which stolen bases were attempted. There were even eleven attempted stolen bases with a runner on third!

Every play in baseball has some run value attached to it, and stolen bases are no exception. The run value reflects the outcome of the attempt and the situation in which a runner attempted to steal a base. A histogram will show the runs created for all stolen base attempts in the 2011 season. The different colors of the bars represent the outcome.

Creating the histogram plot
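A plot along these lines can be drawn with ggplot2. This is only a sketch on made-up numbers: `run_value` is an assumed column holding each play's run value, and the `outcome` labels are my own naming.

```r
library(ggplot2)

# Hypothetical toy attempts; run_value is an assumed column with the run
# value of each play.
stealing <- data.frame(
  EVENT_CD  = c(4, 4, 4, 4, 6, 6, 4, 6),
  run_value = c(0.15, 0.22, 0.18, -0.05, -0.45, -0.62, 0.30, -0.25)
)

# Turn the event code into a labelled factor so the legend is readable.
stealing$outcome <- factor(
  stealing$EVENT_CD,
  levels = c(4, 6),
  labels = c("Stolen base", "Caught stealing")
)

# Histogram of run values, with bars filled by outcome.
sb_hist <- ggplot(stealing, aes(x = run_value, fill = outcome)) +
  geom_histogram(binwidth = 0.1) +
  labs(x = "Run value", y = "Attempts", fill = "Outcome")

print(sb_hist)
```

The bin width of 0.1 is an arbitrary choice for the toy data; on a full season a narrower width may show the spikes more clearly.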

The histogram shows that most successful stolen bases have a positive run value, but it is small: most of the values are between 0 and 0.5. As expected, a runner caught stealing produces a negative run value. The three spikes in negative run value occur when a runner is caught stealing from first with 0, 1, or 2 outs. What’s interesting are the plays where the stolen base was successful but still led to a negative run value.

Obviously, teams need to have a runner on base in order to attempt a stolen base. I’m interested in the benefits of stolen bases in particular situations; let’s take a stolen base attempt with a runner on first and one out (‘100 1’).

With a runner on first and one out, runners successfully stole a base 74% of the time. We can also examine the outcomes of an SB attempt, seeing what happened during the actual play itself. Was the runner out, did the runner stay at second, or was there a bad throw by the catcher that allowed the runner to take third base?

In 52 occurrences of an SB attempt with a runner on first and one out, the runner was able to advance to third base.
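The two questions above (how often the runner was safe, and where the play ended up) can both be answered with a filter and a count. The sketch below assumes `STATE` and `NEW.STATE` columns describing the situation before and after the play; the column names and the rows are illustrative, not the real 2011 data.

```r
library(dplyr)

# Hypothetical toy attempts. STATE and NEW.STATE are assumed columns giving
# the runners-and-outs situation before and after the play.
stealing <- tibble(
  EVENT_CD  = c(4, 4, 6, 4, 4, 6, 4),
  STATE     = c("100 1", "100 1", "100 1", "100 1", "100 0", "100 2", "110 1"),
  NEW.STATE = c("010 1", "001 1", "000 2", "010 1", "010 0", "000 3", "101 1")
)

# Restrict to a runner on first with one out.
stealing_1001 <- stealing %>%
  filter(STATE == "100 1")

# Success rate in this situation.
outcome_1001 <- stealing_1001 %>%
  count(EVENT_CD, name = "N") %>%
  mutate(pct = N / sum(N))

# Where the play ended up: "010 1" = runner on second, "001 1" = runner
# reached third, "000 2" = runner thrown out.
result_1001 <- stealing_1001 %>%
  count(NEW.STATE, name = "N") %>%
  mutate(pct = N / sum(N))

print(outcome_1001)
print(result_1001)
```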

Last thing on stolen bases — what’s the actual value of a stolen base in any situation?

While stolen bases are valuable, they aren’t really that valuable. In fact, averaged over every attempt, successful or not, a stolen base attempt is worth only about 0.03 runs.
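A per-attempt figure like this is just the mean run value over all attempts. Here is a rough sketch with made-up numbers (`run_value` is again an assumed column name) showing how a few costly caught-stealing plays can offset many small gains.

```r
library(dplyr)

# Hypothetical toy attempts: three modest gains and one costly out.
stealing <- tibble(
  EVENT_CD  = c(4, 4, 4, 6),
  run_value = c(0.20, 0.20, 0.20, -0.40)
)

# Average run value over every attempt, safe or out.
overall <- stealing %>%
  summarize(N = n(), mean_run_value = mean(run_value))

# Average run value split by outcome, for comparison.
by_outcome <- stealing %>%
  group_by(EVENT_CD) %>%
  summarize(mean_run_value = mean(run_value))

print(overall)
print(by_outcome)
```

In this toy case each steal gains 0.20 runs, yet the average over all four attempts is only 0.05, because the single out costs twice what a success gains.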

So, this was my first crack at demonstrating what I’ve been learning over the last couple of weeks as I work through the book. Yes, I know that this exercise followed the book, but this is where I’m starting. I plan to use these types of analyses as templates for my own projects as well. Hope you enjoy, and if you have feedback, let me hear it!
