Baseball Beat
February 20, 2006
A New Way to Measure Strikeout Proficiency
By Rich Lederer

Strikeouts. The out of choice. You don't have to be a stathead or a scout to know that pitchers who record a lot of strikeouts are generally preferred over those who don't. But the $64,000 question is: How do we best measure strikeout proficiency?

Once upon a time, we simply looked at the number of strikeouts. This method is certainly simple, and it has proven to be a good indicator of pitching prowess over the years.

Johan Santana led the majors in strikeouts last year with 238. Jake Peavy led the National League with 216. In the 1960s, mugshots of pitchers like Santana and Peavy would appear on Topps baseball cards honoring the league leaders in Ks.

TOP TEN LEADERS IN STRIKEOUTS

Johan Santana     Min     238
Jake Peavy         SD     216
Chris Carpenter   StL     213
Randy Johnson     NYY     211
Doug Davis        Mil     208
Pedro Martinez    NYM     208
Brett Myers       Phi     208
Carlos Zambrano   ChC     202
John Lackey       LAA     199
A.J. Burnett      Tor     198

With the proliferation of computers, we began to crunch numbers and value the rate of strikeouts in addition to the sheer quantity. The stat of choice soon became strikeouts per nine innings (or K/9).

TOP TEN K/9

Mark Prior        ChC    10.15   
Jake Peavy         SD     9.58
Johan Santana     Min     9.25
Brett Myers       Phi     8.69
Pedro Martinez    NYM     8.63
Jason Schmidt      SF     8.63
John Lackey       LAA     8.57
A.J. Burnett      Tor     8.53
Randy Johnson     NYY     8.42
Scott Kazmir       TB     8.42

Mark Prior, who ranked 12th with 188 Ks in 166.2 IP, led MLB with 10.15 K/9. Interestingly, Prior, Peavy, and Santana were the only pitchers who punched out more than one batter per inning. (For perspective, among pitchers with 162 or more innings, the average number of K/9 was 6.21 last year.)
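The K/9 figure can be reproduced from raw totals. What follows is a minimal sketch, assuming the usual box-score convention that the digit after the decimal point in innings pitched counts thirds of an inning (so 166.2 means 166 2/3 innings). The function names are illustrative and do not come from any library.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings (e.g. "166.2" = 166 2/3) to outs recorded."""
    whole, _, thirds = ip.partition(".")
    return int(whole) * 3 + int(thirds or 0)


def k_per_9(strikeouts: int, ip: str) -> float:
    """Strikeouts per nine innings; nine innings contain 27 outs."""
    return strikeouts * 27 / innings_to_outs(ip)


# Mark Prior, 2005: 188 strikeouts in 166.2 innings
print(round(k_per_9(188, "166.2"), 2))  # 10.15
```

Working in outs rather than decimal innings avoids treating 166.2 as 166.2 innings, which would understate the rate slightly.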

Strikeouts per nine innings is very effective but strikeouts as a percentage of batters faced is even better. Why? Well, K/9 favors pitchers who face more batters and penalizes those who don't allow a lot of hits, walks, and hit by pitches. To wit, if a hurler strikes out the side but allows a couple of hits and a walk along the way, is he as effective in whiffing batters as someone who strikes out the side in order? The answer is clearly no.
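The two innings described above can be put side by side. This is a small sketch with made-up single-inning lines (the helper names are hypothetical): both pitchers strike out the side, but one faces three batters and the other faces six.

```python
def k_per_9(strikeouts: int, outs: int) -> float:
    """Strikeouts per nine innings, computed from outs recorded."""
    return strikeouts * 27 / outs


def k_per_bf(strikeouts: int, batters_faced: int) -> float:
    """Strikeouts as a share of batters faced."""
    return strikeouts / batters_faced


# Inning A: strikes out the side in order (3 batters faced)
print(k_per_9(3, 3), k_per_bf(3, 3))  # 27.0 1.0

# Inning B: strikes out the side around two hits and a walk (6 batters faced)
print(k_per_9(3, 3), k_per_bf(3, 6))  # 27.0 0.5
```

K/9 cannot tell the two innings apart, while K/BF cuts the second pitcher's rate in half.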

Not surprisingly, strikeouts per batter faced (also known as K/BF, K/TBF or K/BFP) has become an increasingly popular metric among performance analysts the past few years.

TOP TEN K/BATTERS FACED

Mark Prior        ChC     .268
Jake Peavy         SD     .266
Johan Santana     Min     .262
Pedro Martinez    NYM     .247
Brett Myers       Phi     .230
Randy Johnson     NYY     .229
Josh Beckett      Bos     .228
A.J. Burnett      Tor     .227
John Patterson    Was     .226
Chris Carpenter   StL     .224

Prior, Peavy, and Santana--the only starters who averaged more than a strikeout per inning--also whiffed over 25% of the total batters faced. Prior led the majors in K/BF, striking out almost 27% of the hitters. However, his margin over Peavy narrows considerably because the latter, by not allowing as many hits and walks per 9, faced fewer batters per out than Prior. (Among pitchers with 162 or more innings, the average number of K/BF was .163 last year.)

Chris Carpenter climbs from 16th in K/9 to 10th in K/BF. The Cy Young Award winner allowed only 1.9 BB/9 last year and was 6th in baserunners per 9. Josh Beckett (13th in K/9) and John Patterson (12th in K/9) also make the top ten in K/BF because they had considerably better BR/9 than those they replaced (Jason Schmidt, John Lackey, and Scott Kazmir).

Now, just as K/BF is a better gauge than K/9, strikeouts per total pitches is even better yet. In fact, it is the best one of 'em all. Yes, strikeouts divided by total pitches is the single greatest Defense Independent Pitching Stat out there. It measures dominance and efficiency.

Just as striking out the side in order is preferred over getting all three outs via the K regardless of the number of batters faced, a pitcher who strikes out hitters on three pitches is more effective than those who take five or six to get the job done. By definition, he is missing bats a higher percentage of the time and is also more likely to pitch deeper into games and record a greater number of outs than his counterparts.

TOP TEN K/PITCHES

Johan Santana     Min    .0714
Jake Peavy         SD    .0684
Pedro Martinez    NYM    .0683
Mark Prior        ChC    .0665
Chris Carpenter   StL    .0627
Randy Johnson     NYY    .0616
A.J. Burnett      Tor    .0600
Brett Myers       Phi    .0599
Josh Beckett      Bos    .0592
John Patterson    Was    .0582

Although the top ten in K/pitches (or K/#PIT) is the same as K/BF, the order is slightly different. Santana moves up from 3rd to 1st and Prior drops from 1st to 4th because Johan (3.66) averaged roughly 9% fewer pitches per plate appearance (P/PA) than Mark (4.03).
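The reshuffling follows from simple arithmetic: strikeouts per pitch equals strikeouts per batter faced divided by pitches per plate appearance. Here is a rough sketch using the rounded figures quoted in this article. It treats batters faced and plate appearances as the same count, which is an assumption, and the function name is illustrative.

```python
def k_per_pitch(k_per_bf: float, pitches_per_pa: float) -> float:
    """K/P = (K/BF) / (P/PA), treating BF and PA as the same count."""
    return k_per_bf / pitches_per_pa


santana = k_per_pitch(0.262, 3.66)
prior = k_per_pitch(0.268, 4.03)

print(f"{santana:.4f}")  # 0.0716
print(f"{prior:.4f}")    # 0.0665
```

Santana's result lands a hair above the .0714 in the table because the inputs here are already rounded to three digits. Prior's matches the table.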

What does .0714 K/#PIT really mean? That's a good question. In and of itself, that figure is rather awkward. However, the decimal comes to life if we multiply it by 100. You see, Santana struck out 7.14 batters per 100 pitches last year. Not only do we now get a real number out of this exercise but the standard of measurement is almost exactly the average number of pitches per start during recent years.

The only difference in the list below vs. the one above is that the number shown represents how many strikeouts per 100 pitches. In an era of pitch counts, it may be more instructive to measure starters by the number of K/100 pitches than K/9 IP.

(For context, among those who qualified for the ERA title, the average starter last year threw approximately 98 pitches and completed 6 1/3 innings. The average number of K/100 pitches was 4.44.)
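To see why 100 pitches is a convenient yardstick, the rate can be scaled to a typical outing. This is an illustrative sketch only: it assumes a pitcher throws the average 98 pitches cited above and keeps his season-long strikeout rate in a given start. The function name is hypothetical.

```python
def expected_k_per_start(k_per_100: float, pitches_per_start: float = 98) -> float:
    """Scale a K/100 rate to an outing of the given length in pitches."""
    return k_per_100 * pitches_per_start / 100


# Johan Santana (7.14 K/100) versus the average qualifier (4.44 K/100)
print(round(expected_k_per_start(7.14), 1))  # 7.0
print(round(expected_k_per_start(4.44), 1))  # 4.4
```

Because the typical start runs close to 100 pitches, K/100 reads almost directly as strikeouts per start.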

TOP TEN K/100 PITCHES

Johan Santana     Min     7.14
Jake Peavy         SD     6.84
Pedro Martinez    NYM     6.83
Mark Prior        ChC     6.65
Chris Carpenter   StL     6.27
Randy Johnson     NYY     6.16
A.J. Burnett      Tor     6.00
Brett Myers       Phi     5.99
Josh Beckett      Bos     5.92
John Patterson    Was     5.82

I thought it might also be fun to take a look at the worst pitchers in terms of K/100.

BOTTOM TEN K/100 PITCHES

Horacio Ramirez   Atl     2.62
Jose Lima         NYM     2.79
Kenny Rogers      Det     2.89
Kyle Lohse        Min     2.99
Bronson Arroyo    Bos     3.03
Jason Marquis     StL     3.09
Carlos Silva      Min     3.10
Jason Johnson     Cle     3.10
Jamie Moyer       Sea     3.12
Josh Fogg         Pit     3.14

All of the pitchers on the above list are more renowned for throwing strikes than getting outs via strikes. Carlos Silva is the best example. The man who led the majors in fewest pitches per plate appearance (3.06) was successful because he walked an MLB-low 0.43 batters per 9 IP last year.

Pitchers who strike out a lot of batters tend to be much more effective than those who don't because they allow fewer balls in play (BIP). As a general rule, the more BIP, the more hits and errors. Hits and errors lead to runs, and runs lead to losses.

We have known for some time that strikeouts are the out of choice. The more Ks, the better. We also know that the fewer pitches, the better. Combining high strikeout and low pitch totals is a recipe for success. The best way to measure such effectiveness is via K/100 pitches. This stat can be improved upon by adjusting for ballpark effects.

Unfortunately, I don't have pitch totals for home and road splits. When this information becomes more readily available, we could rank pitchers by ballpark-adjusted K/100 (or K/100+). I believe this stat just might be the best way to measure pitcher dominance, if not overall performance.

[Additional reader comments and retorts at Baseball Primer and Scout.com.]

Comments

Interesting article. What is the correlation between Ks per inning and Ks per batter faced and Ks per pitch? Which pitchers move up or down the most in the rankings?

Your strikeout rankings are skewed, of course. NL pitchers have a tremendous advantage since they face their fellow pitchers who account for a high percentage of their strikeouts. NL pitchers generally suffer a 15 to 20% drop in K-rates when they switch to the tougher league. You might do some analytical research and factor that into your rankings. Randy Johnson for instance suffered a decline in K rates, or so it seems, but when you factor in the differential, his K ratio was just as high as last season's and he would rank ahead of every NL pitcher on your list. This needs fine-tuning and a bit more complexity added to serve as a comparison model.

On the flip side, what would K per pitches seen tell us about batting performance? (I'm asking, because I don't know.) I think it would tell us the same thing - good and bad batters.

What about pop-ups? An infield/IF foul fly is nearly as good as a K. If a pitcher gets three pop-ups on five pitches, isn't he as effective as someone who strikes out the side on 9 pitches?
I would expect a strong correlation between K's and infield pop-ups, but it would be interesting if there are "hidden" dominant pitchers who don't strike out that many batters but induce a lot of infield flies.

I would argue that K/BF is more a reactive than a predictive stat and therefore not necessarily better than K/IP. While the premise for it is sound, particularly as it relates to walks, looking at it in the past tense is going to raise pitchers who had an anomalously low BABIP up the list, and a low BABIP tends to regress to the mean and undo some of the predictive qualities of the stat.

I guess the argument for it is that if a BABIP is very low, the pitcher had a commensurately low K rate because he faced fewer hitters. I don't know if I think that will correlate directly though I have no evidence for this.

If I want a nice predictive stat, I might also look at K/(outs+walks+HR allowed). This essentially equalizes the BABIP variability.

Could you try to measure how Strike/Ball ratio affects K/BB ratio? It seems to me that it could give you a good correlation and let you see which pitchers were outliers in their K/BB rates.

That's some great stuff, but now my 'steals' of Scott Kazmir (top 10 K/9) and John Lackey (top 10 Ks, K/9) don't look quite so stealthy all of a sudden.

What is the correlation between Ks per inning and Ks per batter faced and Ks per pitch?

Among the pitchers with 162 or more innings in 2005, the correlation between K/9 IP and K/100 Pitches was .963. The correlation between K/BF and K/100 was .985.
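For readers who want to run this kind of check themselves, here is a minimal sketch of a Pearson correlation. It uses only the seven pitchers who appear in both the K/9 and K/100 top-ten tables in this article, so it is a toy sample and will not reproduce the full-population figures quoted above. The function name is illustrative.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Santana, Peavy, Martinez, Prior, Johnson, Burnett, Myers (from the tables)
k9 = [9.25, 9.58, 8.63, 10.15, 8.42, 8.53, 8.69]
k100 = [7.14, 6.84, 6.83, 6.65, 6.16, 6.00, 5.99]

print(round(pearson(k9, k100), 3))
```

A sample restricted to the league leaders compresses the range of both stats, so the result should be expected to come in lower than a correlation taken over every pitcher with 162 or more innings.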

Your strikeout rankings are skewed, of course.

My strikeout rankings are no more skewed than any other strikeout measure (including total K, K/9, and K/BF) that combines the two leagues.

The purpose of this introductory article wasn't to prove that a pitcher in one league is better than a pitcher from the other league. Adjustments to account for the effects of leagues can certainly be made.

but now my 'steals' of Scott Kazmir (top 10 K/9) and John Lackey (top 10 Ks, K/9) don't look quite so stealthy all of a sudden

Do not despair. Lackey was 13th and Kazmir 18th in K/100. They were 3rd and 5th, respectively, in the AL.

I'm not sure that it would necessarily matter, but it occurred to me that another related metric would be how many pitches ended up getting third strikes in two-strike situations. One potential issue I see with the K/100P metric is that not all pitches are created equal; the first two strikes cannot be strikeouts.

One potential issue I see with the K/100P metric is that not all pitches are created equal; the first two strikes cannot be strikeouts.

Yes, but you need the first two strikes in order to get the third.

Your idea about the number or percentage of pitches ending up as third strikes in two-strikes situations is worthy of exploration.

There are differences between the first two strikes and the third, though. Foul balls for instance...

Strikeouts are important, but a pitcher can be quite successful without a high K-rate as long as he is consistently ahead in the count. Only the very best hitters hit well when they have 2-strikes on them. Each PA can be thought of as a race between pitcher and batter. If the pitcher can get 2 strikes on the batter before giving him a good pitch to hit or falling behind in the count he will succeed the vast majority of the time, with or without a strikeout. So the really key stat to monitor for pitchers (and batters) is the % of PA's which end with either an 0-2, 1-2 or 2-2 count. (I exclude 3-2 counts as they are essentially neutral.)

Rob's point is interesting to me, in that I've been suspecting that the A's look at something like that already. It seems to me (as an observation, not with any statistical evidence) that the A's always try to acquire young players with good K/9 numbers, but then when they actually have such a player in hand, they turn around and try to get them to pitch to contact--i.e. get the out as early in the count as possible.

So the really key stat to monitor for pitchers (and batters) is the % of PA's which end with either an 0-2, 1-2 or 2-2 count. (I exclude 3-2 counts as they are essentially neutral.)

Are we in some kind of time warp here? I thought this blog and others like it had moved beyond such observation arguments. Where's the data? Sounds like something right out of Joe Morgan.

Rich has put together an interesting find that advances the DIPS theories. You don't have to subscribe to its predictive value, but the data is sound and fits in with other DIPS-related discussions. If you value K/9, this indicates that K/P is even more worthy. That's all.

Example:

Pitcher A:
Pitches: 90
Strike-outs: 10
BF: 30
HR Allowed: 4
Walks Allowed: 0

Pitcher B:
Pitches: 90
Strike-outs: 8
BF: 30
HR Allowed: 0
Walks Allowed: 0

Which line would you rather have? A home run allowed (park adjusted) should also be a part of any "Defensive independent stat".

vr, Xeifrank

I have to ask the question that any self-respecting Mariners fan would ask -- how would Felix have fared in comparison, if he were not eliminated based on IP?

I really enjoyed this blog. Have you considered an out-to-pitch ratio? Granted, the first two pitches have a lower out percentage than the remaining pitches. I think this stat could be used to evaluate a pitcher when he gets traded, with a modifier for a better or worse defense and different leagues and ballparks. When Randy Johnson went to the Yanks, you plug in his out-to-pitch ratio and then modify it according to the new league, ballpark, and defense. It was hinted at earlier that the ability to throw a 4 and 5 can also make the pitcher very effective.

If you're interested, I look at pitches per strikeout here.

I look at pitches per strikeout here.

For the record, pitches per strikeout are not the issue here. I'm measuring strikeouts per pitch, not the other way around.

Strikeouts per pitch (or K/P) takes into account every pitch thrown, including those on strikeouts, walks, hit by pitches, home runs, and balls in play.

David is only evaluating the number of pitches when the result is a strikeout. That is neither here nor there in my study and, unlike K/P, it has no correlation with run prevention.

Richard,

I've got a pretty significant question here, one which I feel isn't asked nearly often enough when people introduce new stats or measures.

What's the year to year correlation in this for players? As in, is this more representative of a skill than other strikeout measures? Does this really give us a better idea at their true talent levels?

If not, then what exactly can this be used for? Would inserting it into DIPS formula improve things? (I suspect it would, but I'd like to know.)

You've explained pretty well why this could be useful, but at the same time, it's all pretty speculative. What's the practical use of this knowledge? Either of the two setups for usefulness I suggested would do nicely.

I've now read Rich's piece four or five times. His thesis is that pitchers who strike out lots of batters and are efficient in their use of pitches are very effective pitchers. I don't dispute this. What I dispute is that efficiency comes from striking out batters on fewer pitches. From my read of the article, this appears to be what Rich is implying. The point of the above chart at my site is to demonstrate that the efficiency doesn't come from fewer pitches per strikeout, but from other places.

I think David Pinto has a point. The problem is this statement:

"Just as striking out the side in order is preferred over getting all three outs via the K regardless of the number of batters faced, a pitcher who strikes out hitters on three pitches is more effective than those who take five or six to get the job done. By definition, he is missing bats a higher percentage of the time and is also more likely to pitch deeper into games and record a greater number of outs than his counterparts."

Rich - You say later that "pitches per strikeout are not the issue here." That seems to be incongruous with the statement above.
1. If it is the issue, then I would say that pitchers who work counts and set up batters for strikeouts can be just as effective as pitchers who strike out batters on three pitches.
2. By defining pitchers who strike out batters on three pitches, all you are doing is isolating the "power" pitchers, not necessarily the most "effective" pitchers. I'm not sure it's fair to assume that such power pitchers would last longer in games or get more outs. In order to record a strikeout a pitcher must throw three pitches, which is less effective than the pitcher who gets a ground ball out on one pitch. Greg Maddux, in his prime, could throw 80 pitches over nine innings and complete a game in an hour and a half. He was the most effective and efficient pitcher I've ever seen, but he couldn't have cared less where he fit in on the charts above.

There are many ways to skin a cat. Up to that part where it breaks down K/Pitches, there was a lot of merit to the analysis. I would disagree, though, with the apparent conclusion that striking out batters on three pitches represents the ideal for pitching effectiveness - if that is what Rich is saying.

This is valuable and creative research. I suggest doing the same as regards walks, a much more important category than strike outs, considering that the linear weights value of a walk is minus 1/2 run to a pitcher; whereas a strike out has a LW value of just plus 1/10 run to a pitcher.

Also how about OUTS PER PITCH? Could be something significant there, also.

Robby Bonfire........phillies.mostvaluablenetwork.com