The Pablo Sandoval stat: Some thoughts about hitting metrics

We've entered the stage of the offseason where it's time to talk about evaluation. Some interesting ideas were brought up today that deserve some more attention.

There was a bit of a flurry around Baseball Internet today about this post by Ken Arneson, titled "10 things I believe about baseball without evidence". You should read it all, but the short synopsis is that it's a critique of modern sabermetrics for ignoring sequencing and assuming every event is independent, written from the perspective of a bitter A's fan who keeps watching the Giants win World Series. I don't agree with everything he says (I actually think it's plenty plausible that the A's have been the victims of random chance and the Giants have benefitted from it, and that there's no systematic explanation for their fortunes over the past four years), but he brought up a few things that spurred some thoughts, and I wanted to get some of those down. Actually, it spurred a lot of thoughts, but I'll stick to one in particular, about hitter evaluation, and maybe come back to a few others down the road (it's a long offseason).

I promise to get back around to the actual point, but we will have to start out a bit on the technical side. Let's think about measuring something unmeasurable (hitting skill) with an imperfect but all-inclusive measure (let's say weighted on-base average, or wOBA). In that framework, we think that as the sample grows, a player's wOBA gets closer and closer to his true hitting skill. In a real statistical framework, we could calculate a standard error for wOBA (which we could then use to calculate a 95% confidence interval). As the number of plate appearances rises, this standard error falls and we become more and more confident in our estimate of the player's actual ability. Nothing groundbreaking here.
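To put rough numbers on that, here is a minimal sketch of how the standard error shrinks with plate appearances. It treats every plate appearance as an independent draw from a fixed set of outcome probabilities, which is exactly the assumption under discussion. The linear weights are illustrative approximations (the real ones change every season), and the outcome probabilities describe a made-up hitter, not anyone real.

```python
import math

# Illustrative linear weights (assumption: approximate values, real wOBA
# weights are recalculated every season).
WEIGHTS = {"out": 0.0, "walk": 0.69, "single": 0.89,
           "double": 1.27, "triple": 1.62, "home_run": 2.10}

# Hypothetical per-PA outcome probabilities for a made-up hitter (sum to 1).
PROBS = {"out": 0.675, "walk": 0.080, "single": 0.155,
         "double": 0.050, "triple": 0.005, "home_run": 0.035}


def true_woba(probs, weights):
    """Expected linear-weight value of one plate appearance."""
    return sum(probs[k] * weights[k] for k in probs)


def per_pa_sd(probs, weights):
    """Standard deviation of the value of a single plate appearance."""
    mean = true_woba(probs, weights)
    variance = sum(probs[k] * (weights[k] - mean) ** 2 for k in probs)
    return math.sqrt(variance)


def standard_error(probs, weights, pa):
    """Standard error of observed wOBA over `pa` independent plate appearances."""
    return per_pa_sd(probs, weights) / math.sqrt(pa)


def confidence_interval(probs, weights, pa, z=1.96):
    """Approximate 95% interval around true wOBA (normal approximation)."""
    center = true_woba(probs, weights)
    half_width = z * standard_error(probs, weights, pa)
    return center - half_width, center + half_width


for pa in (50, 200, 600, 3000):
    low, high = confidence_interval(PROBS, WEIGHTS, pa)
    print(f"{pa:>5} PA: SE = {standard_error(PROBS, WEIGHTS, pa):.3f}, "
          f"95% interval = ({low:.3f}, {high:.3f})")
```

Because the standard error scales with one over the square root of plate appearances, quadrupling the sample only halves the uncertainty.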

Underlying that whole framework is our faith that with a small sample, wOBA is on average correct but randomly incorrect at the level of an individual player. With enough plate appearances, an individual player's wOBA approaches his skill. But what if wOBA doesn't actually approach a player's real hitting skill, because it's biased in one way or another toward one type of player?

From a measurement error perspective, we're generally okay with an imperfect measure for two reasons. First, we can in effect assume the errors balance out over a large enough sample. Second, we assume that the error is uniform across the distribution of players. The assumptions underlying this are pretty good ones: players aren't trying harder to get hits in some situations than in others, and, more explicitly, hitters uniformly perform worse against good pitchers than they do against bad pitchers. But what if that's not exactly true?
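The difference between random error and bias is easy to see in a toy simulation. This is only a sketch under loud assumptions: per-PA results are drawn from a normal distribution for simplicity (real outcomes are discrete), and the skill level, bias, and noise values are invented for illustration.

```python
import random
import statistics


def measured_woba(true_skill, bias, noise_sd, pa, rng):
    """Average of `pa` noisy per-PA observations centered on true_skill + bias.

    Assumption: Gaussian per-PA noise, chosen only to keep the sketch short.
    """
    return statistics.fmean(
        rng.gauss(true_skill + bias, noise_sd) for _ in range(pa)
    )


rng = random.Random(2014)   # fixed seed so the run is repeatable
TRUE_SKILL = 0.340          # hypothetical true talent level
BIAS = 0.030                # hypothetical systematic error in the measure
NOISE_SD = 0.5              # hypothetical per-PA spread

for pa in (50, 500, 5000, 50000):
    unbiased = measured_woba(TRUE_SKILL, 0.0, NOISE_SD, pa, rng)
    biased = measured_woba(TRUE_SKILL, BIAS, NOISE_SD, pa, rng)
    print(f"{pa:>6} PA: unbiased measure = {unbiased:.3f}, "
          f"biased measure = {biased:.3f}")
```

With more plate appearances the unbiased measure settles near the true skill, while the biased one settles just as confidently on the wrong number. A bigger sample fixes noise, not bias.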

That's what I think Arneson is getting at here:

To me, the biggest difference between the A's in the playoffs and the Giants in the playoffs is Pablo Sandoval. Because there may not be anyone in baseball right now better than Sandoval who does damage even when he does not get a good pitch to hit. He can turn pitches in the dirt, in his eyes, and/or six inches off the plate into a hit. He's almost immune to prediction state manipulation by opposing pitchers.

My interpretation of this is that he thinks a certain type of hitter can post a good regular-season wOBA year after year because he is particularly good at certain skills, such as guessing hanging pitches correctly and hitting them hard. But this type of player might not be well-equipped to handle the postseason, when pitchers are better overall. The theory behind standard hitter evaluation says that a player who is able to handle good pitchers should be even better against bad pitchers, by definition. But there may be a different type of batter, one who can hit tougher pitches hard yet has a lower wOBA overall because he doesn't fit the uniform rule that "an individual batter performs better against worse pitchers." That idea is very interesting, and it's something that can be investigated.

To sum up, in my view, we currently have two categories of hitting stats: context-dependent and context-neutral. The former measures how players perform in tight game situations. The latter measures how players perform without regard to game situations. None, to my knowledge, measures how a player performs with respect to the skill of the opposing pitcher. If Arneson is on to something here (and I'm not sure he is until I get some time to look into it), maybe we should think about making one that does.
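As a starting point, a stat like that could be as simple as splitting a hitter's wOBA by the quality of the pitcher he faced. The sketch below is hypothetical from top to bottom: the hitters, the "good"/"bad" pitcher tiers, and the handful of plate appearances are made-up toy data, and a real version would need a defensible way to grade pitchers and far larger samples.

```python
from collections import defaultdict

# Toy data (entirely made up): (hitter, opposing pitcher tier, linear-weight
# value of the plate appearance outcome).
TOY_PAS = [
    ("Hitter A", "good", 0.0), ("Hitter A", "good", 0.89),
    ("Hitter A", "good", 0.0), ("Hitter A", "good", 1.27),
    ("Hitter A", "bad", 0.0), ("Hitter A", "bad", 0.89),
    ("Hitter A", "bad", 0.0), ("Hitter A", "bad", 0.69),
    ("Hitter B", "good", 0.0), ("Hitter B", "good", 0.0),
    ("Hitter B", "good", 0.0), ("Hitter B", "good", 0.69),
    ("Hitter B", "bad", 2.10), ("Hitter B", "bad", 0.89),
    ("Hitter B", "bad", 0.0), ("Hitter B", "bad", 0.89),
]


def woba_by_pitcher_tier(pas):
    """Return {(hitter, tier): average linear-weight value per PA}."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum of values, PA count]
    for hitter, tier, value in pas:
        cell = totals[(hitter, tier)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def tier_gap(splits, hitter):
    """wOBA against bad pitchers minus wOBA against good pitchers."""
    return splits[(hitter, "bad")] - splits[(hitter, "good")]


splits = woba_by_pitcher_tier(TOY_PAS)
for hitter in ("Hitter A", "Hitter B"):
    print(f"{hitter}: vs good = {splits[(hitter, 'good')]:.3f}, "
          f"vs bad = {splits[(hitter, 'bad')]:.3f}, "
          f"gap = {tier_gap(splits, hitter):+.3f}")
```

A gap near zero or below it would flag the kind of hitter Arneson describes, one whose production doesn't fall off against better pitching. With samples this small the split is pure noise, of course, which is why the standard errors discussed earlier matter for any real version.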