Thursday, November 5, 2009

The Best Bandbox

Recently I was downloading some data for a project I have planned relating to park factors, when I had a moment of inspiration. It occurred to me that one could get a rough idea of how easy it is to hit a home run at a given stadium simply by finding the percentage of batted balls that were home runs. In other words, dividing the number of home runs by the number of at-bats where the batter did not strike out would yield a home run rate. The higher the rate, the easier it is to hit home runs. The equation is simple and looks like this:

R=HR/(AB-K)
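
As a rough illustration of the equation above, here is a minimal Python sketch. The function name and the sample totals are made up for the example; they are not taken from my data set.

```python
def home_run_rate(home_runs, at_bats, strikeouts):
    """R = HR / (AB - K): share of non-strikeout at-bats that end in a home run."""
    batted = at_bats - strikeouts
    if batted <= 0:
        raise ValueError("at-bats must exceed strikeouts")
    return home_runs / batted


# Hypothetical season totals for one ballpark (illustrative only).
rate = home_run_rate(home_runs=100, at_bats=2700, strikeouts=500)
print(f"{rate:.2%}")
```

Running the same function separately on the home team's and the visitors' totals gives the Team and Opponents columns used in the tables that follow.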

Before doing this, my belief was that Coors Field would not have the highest rate. I also had a suspicion that a certain stadium would have the highest rate. So I ran the numbers, looking at both the home team's and away teams' home run rate for each stadium in 2009. Here are the results:

Club  Stadium                        Team    Opponents  Total
NYY   Yankee Stadium III             5.96%   4.61%      5.30%
TEX   Rangers Ballpark in Arlington  5.80%   4.02%      4.87%
PHI   Citizens Bank Park             5.02%   4.42%      4.71%
MIL   Miller Park                    4.83%   4.55%      4.69%
CHA   Comiskey Park II               4.77%   4.03%      4.40%
TOR   SkyDome                        4.67%   4.07%      4.37%
CIN   Great American Ballpark        4.46%   4.20%      4.33%
TAM   Tropicana Field                4.97%   3.70%      4.31%
BOS   Fenway Park                    5.38%   3.26%      4.30%
BAL   Oriole Park at Camden Yards    4.15%   4.42%      4.29%
LAA   Angel Stadium of Anaheim       4.01%   4.53%      4.27%
DET   Comerica Park                  4.26%   3.98%      4.12%
MIN   Hubert H. Humphrey Metrodome   4.22%   3.87%      4.04%
ARI   Chase Field                    4.07%   3.82%      3.94%
COL   Coors Field                    4.59%   3.30%      3.93%
FLA   Dolphin Stadium                3.99%   3.74%      3.87%
MLB Average                          4.00%   3.63%      3.81%
CHN   Wrigley Field                  3.87%   3.74%      3.80%
HOU   Minute Maid Park               3.57%   3.84%      3.71%
WAS   Nationals Park                 3.57%   3.57%      3.57%
SEA   Safeco Field                   3.50%   3.51%      3.51%
OAK   Network Associates Coliseum    3.25%   3.07%      3.16%
PIT   PNC Park                       3.39%   2.92%      3.15%
SDG   PetCo Park                     3.05%   3.16%      3.11%
SFG   AT&T Park                      3.11%   3.07%      3.09%
CLE   Jacobs Field                   3.10%   3.05%      3.07%
KAN   Kauffman Stadium               2.87%   3.01%      2.94%
LAD   Dodger Stadium                 3.18%   2.68%      2.94%
NYM   Citi Field                     2.22%   3.61%      2.92%
ATL   Turner Field                   3.18%   2.55%      2.87%
STL   Busch Stadium II               3.06%   2.41%      2.73%

As you can see, the new Yankee Stadium comes out on top, with 5.3% of batted balls hit there turning into home runs. So that's it, the new Yankee Stadium is the easiest place to hit it out. Coors Field, as I guessed, was not really an easy place to hit home runs. Unfortunately, it's not that simple. These results may have more to do with each team's ability to hit home runs, and their pitching staff's inability to keep the ball in the yard. So a team with a lot of power and poor pitching is likely to score high on this list.

So in order to adjust for a team's ability, a new value must be found. The first step that I took was to recalculate the above table for each team while on the road. The following table shows these rates:

Club         Team    Opponents  Total
PHI          5.10%   4.09%      4.61%
CLE          4.27%   4.89%      4.58%
TAM          4.45%   4.62%      4.54%
BOS          4.25%   4.40%      4.32%
NYY          4.57%   3.86%      4.23%
TOR          4.30%   4.13%      4.22%
MIL          3.71%   4.69%      4.21%
DET          4.02%   4.27%      4.14%
TEX          4.70%   3.58%      4.14%
KAN          3.63%   4.65%      4.13%
SDG          3.56%   4.55%      4.05%
SEA          3.69%   4.23%      3.95%
COL          4.63%   3.24%      3.92%
ARI          4.04%   3.76%      3.89%
WAS          3.71%   3.90%      3.80%
BAL          2.79%   4.79%      3.78%
MLB Average  3.63%   3.83%      3.73%
MIN          3.28%   4.17%      3.72%
STL          4.15%   3.18%      3.67%
CHA          3.55%   3.66%      3.60%
CIN          2.88%   4.27%      3.57%
FLA          3.33%   3.72%      3.52%
LAA          3.57%   3.45%      3.51%
HOU          2.83%   4.21%      3.50%
CHN          3.62%   4.01%      3.39%
LAD          3.23%   3.45%      3.33%
OAK          2.72%   3.90%      3.29%
ATL          3.47%   3.02%      3.25%
PIT          2.42%   3.89%      3.17%
SFG          2.54%   3.83%      3.14%
NYM          1.98%   3.47%      2.71%

In this table it can be seen that the Phillies had the highest rate of batted balls becoming home runs. This was largely due to their ability to hit home runs at a high rate. While on the road, 5.1% of batted balls by the Phillies became home runs. While at home, only 5.02% of their batted balls were home runs. Their opponents did benefit by playing in Philly, with 4.42% of batted balls at the Bank becoming home runs and only 4.09% becoming home runs in Phillies' away games.

The next step is to divide the data in the two tables to determine the increase (or decrease) in the rate of home runs to batted balls when a team is in its home park. If there is an increase in these ratios when playing at home, then playing in that stadium is beneficial to hitting home runs. The following table shows the ratios:

Club  Stadium                        Team    Opponents  Total
NYY   Yankee Stadium III             1.3056  1.1962     1.2519
CHA   Comiskey Park II               1.3416  1.1034     1.2201
LAA   Angel Stadium of Anaheim       1.1240  1.3148     1.2180
CIN   Great American Ballpark        1.5495  0.9856     1.2118
TEX   Rangers Ballpark in Arlington  1.2330  1.1252     1.1769
BAL   Oriole Park at Camden Yards    1.4903  0.9230     1.1341
CHN   Wrigley Field                  1.0678  0.9334     1.1240
MIL   Miller Park                    1.3014  0.9704     1.1140
FLA   Dolphin Stadium                1.2000  1.0060     1.0988
MIN   Hubert H. Humphrey Metrodome   1.2859  0.9264     1.0865
NYM   Citi Field                     1.1203  1.0393     1.0776
HOU   Minute Maid Park               1.2611  0.9129     1.0592
TOR   SkyDome                        1.0866  0.9850     1.0360
PHI   Citizens Bank Park             0.9838  1.0789     1.0229
MLB Average                          1.1026  0.9469     1.0220
ARI   Chase Field                    1.0083  1.0166     1.0115
COL   Coors Field                    0.9919  1.0183     1.0022
PIT   PNC Park                       1.4016  0.7524     0.9945
BOS   Fenway Park                    1.2665  0.7404     0.9943
DET   Comerica Park                  1.0596  0.9325     0.9942
SFG   AT&T Park                      1.2255  0.8025     0.9834
OAK   Network Associates Coliseum    1.1973  0.7875     0.9600
TAM   Tropicana Field                1.1169  0.8009     0.9511
WAS   Nationals Park                 0.9643  0.9162     0.9393
SEA   Safeco Field                   0.9481  0.8310     0.8870
LAD   Dodger Stadium                 0.9851  0.7783     0.8815
ATL   Turner Field                   0.9157  0.8462     0.8812
SDG   PetCo Park                     0.8576  0.6942     0.7679
STL   Busch Stadium II               0.7373  0.7589     0.7429
KAN   Kauffman Stadium               0.7897  0.6460     0.7110
CLE   Jacobs Field                   0.7257  0.6239     0.6706
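
For anyone who wants to reproduce the division step, here is a small Python sketch. The function name is just for illustration, and the inputs are the rounded Yankees totals from the first two tables, so the result can differ from the ratio table in the third decimal place because those ratios were presumably computed from unrounded rates.

```python
def park_ratio(home_rate, road_rate):
    """Home-park home run rate divided by the same club's road rate.

    A value above 1.0 suggests the home park boosts home runs;
    a value below 1.0 suggests it suppresses them.
    """
    if road_rate <= 0:
        raise ValueError("road rate must be positive")
    return home_rate / road_rate


# Yankees "Total" rates from the tables above: 5.30% at home, 4.23% on the road.
ratio = park_ratio(0.0530, 0.0423)
print(f"{ratio:.2f}")
```

The same function applies to the Team and Opponents columns individually, which is how a park can help one side more than the other.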

As it turns out, new Yankee Stadium is the easiest stadium in the Major Leagues to hit a home run, with 25.19% more batted balls landing in the seats than in Yankee away games. Although Yankee Stadium provided the biggest overall increase in home run rate, the Yankees didn't benefit as much as some other teams. The rate of home runs was 55% higher at home for the Cincinnati Reds than when they were on the road. Their opponents actually hit home runs at a slightly lower rate when coming into Great American Ballpark. As for the Rockies, they hit home runs at a slightly lower rate at Coors in 2009 than on the road (a ratio of 0.9919). Their opponents did benefit slightly, but overall the rate was nearly the same as in away games.

This analysis provides a new look at whether or not a stadium really is a good home run park. Unlike park factor, which only considers the number of home runs per game, this method looks deeper and finds the number of home runs per batted ball. This is important since other factors may lead to an increased number of plate appearances per game in certain stadiums. The additional plate appearances add to the number of home runs, thus slightly inflating the home run park factor. Like park factor, this method still has a flaw, which I will discuss further in a future post. Until then, hopefully I have shed some light on which ballparks really are home run friendly and which are not.

Thursday, October 22, 2009

My Thoughts on Sabermetrics

Sabermetrics, if you've never heard of it, is the use of various statistical methods to evaluate baseball players and teams. It is actually somewhat controversial because many people in MLB like to stick to their traditions, and sabermetrics flies in the face of that. There are countless jokes about sabermetricians all living in their mom's basement and not actually watching the games. In spite of the resistance, the use of sabermetrics has actually become more common in recent years. Several teams, most notably the Oakland A's, Tampa Bay Rays, and Boston Red Sox, have used high-level analysis to help build their teams.

I decided to start blogging with the hopes I can add something to the science. If nothing else, I could provide a unique perspective to my hometown team, the Colorado Rockies. Before I go further with my work there are a few thoughts on the science I would like to share.

First of all, I actually believe that sabermetrics backs up a lot of what traditional baseball thinking has always taught. For example, the old saying "Don't make the first or third out at third base" can be supported using sabermetrics. Of course you shouldn't be willing to make any outs, but the first and third are especially damaging. I also think that things that happen on the field can be explained using sabermetrics. For instance, a player who seems to always find the hole may indeed have a high batting average on balls in play (BABIP).

I don't necessarily agree with everything that sabermetrics tends to support. For one thing, I don't believe there is a tell-all statistic. Every stat tells you something different about a player's ability. Even though on-base percentage (OBP) is more valuable than batting average (AVG), I don't think AVG is totally useless. I do think that it can show the likelihood of a batter driving in a runner in "scoring position." Of course a high slugging percentage (SLG) will indicate a batter is more likely to drive in a runner who is NOT in scoring position. There is also the notion that sabermetrics is only about walks and home runs. I would disagree and feel that it is also about singles, doubles, and the occasional triple. Another common belief among stat guys that I don't really share is that pitchers have no control over their BABIP.

I am doing this for the fun of it and to hopefully gain more insight into the game. But if any team's GM reads this and wants to hire me, please send me a message. I'll get back to you right away. I also welcome any constructive comments about my work. However, if you're going to drop a "momma's basement" joke on me, you can get lost.

There are a number of projects that I plan to work on as I write. I plan to look closer at park factors and at BABIP from the hitter's and the pitcher's perspective, and I would even like to do some work with Pitch F/X data. I am really excited about all of this. I only wish I had started doing this sooner. There are a lot of discoveries to be made, so I had better get to work.

Sunday, October 18, 2009

Franklin Morales, LOOGY

It's funny how a bad outing (or a streak of them) can affect people's perception of a player. Case in point: Franklin Morales, who had a breakout season in a relief role. He pitched well enough that when Huston Street was injured, he took over the closer role. He recorded saves in his first 6 opportunities, then began to struggle. His struggles, highlighted by a 5-run 7th inning in Los Angeles, left many fans feeling that he should have been left off the roster. There was one good reason he made it: he had been very tough against left-handed batters. With Philly's lineup full of left-handed hitters, the move to include Morales only made sense.

The move paid off, with Franklin pitching 2 and 1/3 perfect innings in the first three games of the series. All of that good work was quickly forgotten, however, thanks to his 3 walk (1 intentional) performance in game 4. Of course he did do one very important thing that inning - get Ryan Howard out. Many fans were again ready to show Morales the door, but I say hold on.

First off, I am never one to judge a player based on one stretch. It's much more telling to look at the player's body of work. One thing that pops out about Morales' career is how good he has been against lefty batters. I feel the Rockies should keep him since he has a great chance to be an outstanding left handed one out guy (a LOOGY).

Over his 3-year career, Morales has given up an AVG/SLG/OBP of .185/.276/.277 vs left handers (.175/.247/.275 in 2009). His numbers vs righties are .274/.373/.396 in his career and .277/.366/.405 in 2009. While facing righties he has clearly been hittable, although they have not hit him for much power. Against lefties he has been flat out dominant. That kind of arm is not easily replaced.

One of the big criticisms of Morales has been his tendency to walk too many batters. This is certainly justified, since he has walked 12.4% of the batters he has faced in his career. Interestingly, his splits again tell a deeper story. While walking 13.3% of right-handed batters, Morales has walked only 9.2% of lefties, a more acceptable rate.

Hopefully the Rockies can take a close look at his effectiveness vs left handers and keep him around. More importantly, I hope they know to use him vs lefty bats and limit his work vs righties. After all, a misused reliever can do as much damage as a bad one.

Wednesday, October 14, 2009

Rocktober is Over

The Rockies' season is over. After splitting two games in Philly, the Rocks came home and took two tough losses. I'm not going to write much specifically about the games. I will, however, tip my hat to Yorvit Torrealba, who got a big clutch double in game 4 and a home run in game 2. Although I still don't think he is as good as he played over the past month and a half, he did come through in those spots.

A few observations about the playoffs in general so far:

Three pretty good closers blew saves, while one lousy one managed to get two and blow zero. It just goes to show how anything can happen in a short series. Fair or not, Joe Nathan, Jonathan Papelbon, and Huston Street are all now rumored to be with new teams next year. Of course there was probably a chance all three would be gone anyway.

The umpiring was horrible around the league. While it's impossible to say that any team would have or wouldn't have won if the calls had been made correctly, it's reasonable to say that we shouldn't have to wonder. It should be up to the players to win or lose games, not umps. Now is the time to expand instant replay. Before that can happen, a few questions about the procedures must be answered:

Can replay happen quickly enough not to slow down games?
What types of plays can be reviewed?
How will the plays be reviewed? (E.g., a 5th ump in the booth)
How will the replays be initiated? (Personally, I think it should be up to the umps, although managers should have some way to ask for them. However, I do not think there should be any punishment for a failed review, like in football. I don't want to see anyone lose a game because they asked for a review and were forced to give up an out. I also would not want to see a play that should be reviewed go unreviewed because a manager is afraid of being punished. If you do that you are changing the game too much.)
What will be done if continuity of a play is interrupted and how can you prevent that from happening? (My suggestion is that an ump can give a signal on a close play to "play it out" and that it will be reviewed when the play is over. If the review shows a fair ball, for example, then everyone stays where they ended up. If you do not have this, then a foul call that is overturned will require judgement by the umps to figure out where each runner should be.)
Where does the burden of proof lie? (In football, if a play is inconclusive then the ruling on the field stands. That may be logical, but in some cases, I don't think that would be the best way to handle it. For example if an ump says a fielder came off the base, or a runner missed a base, I feel that a replay should have to clearly show that happened, regardless of what the play was called on the field.)

Hopefully the league will spend some time this winter figuring out these questions and implement replay for next season, before something like this or this happens again.

Thursday, October 8, 2009

Who Should Start?

The Rockies are down 1 game to none vs the Phillies. Cliff Lee was dominant throughout the game, only giving up one run in the ninth. As with any playoff game where you only get 1 run, fans are obviously calling for changes to be made. The Rockies' reputation as a team that hits left handers poorly only heightens the sense of urgency, with Cole Hamels starting game 2. Changes need to be made to the lineup, but which ones?

I decided to look at splits vs left handers for all of the Rockies hitters to see if an optimal lineup could be found. There are several positions which are considered to be up in the air: 2nd base, catcher, third base, and up to all three outfield spots. I had planned on comparing slugging and on-base percentages for each positional battle. If one player had a higher SLG but lower OBP, I would use Baseball Musings' Lineup Analysis Tool to determine which lineup would score the most runs per game: the one with player A (high OBP) or the one with player B (high SLG). It turned out not to be necessary, as in EVERY situation, the player with the higher OBP vs lefties also had the higher SLG. (There was a different order among the three outfielders; however, the top three in SLG were also the top three in OBP.)

The winners, with their OBP/SLG vs lefties:

C. Chris Iannetta .406/.580
1B. Todd Helton .369/.372
2B. Clint Barmes .314/.496
3B. Garrett Atkins .363/.428
SS. Troy Tulowitzki .382/.519
OF. Seth Smith .368/.500
OF. Dexter Fowler .377/.482
OF. Carlos Gonzalez .343/.466

It's unlikely that these are going to be the starting 8 position players. My guess is that Stewart gets the start over Atkins for his defense, and Spilly gets the start over Seth Smith. It should be pointed out that there may be some sample size issues, as these are 2009 numbers and not career numbers. However, I doubt the Rockies are really looking closely enough at the numbers to care.

Hopefully whichever lineup they go with does something. I don't want to be sitting out in the cold on Saturday being down 0-2.

Monday, October 5, 2009

The Case for Chris Iannetta

Even before this happened, I've been saying the same thing to anybody that would listen: Chris Iannetta should be the Rockies' starting catcher. Unfortunately a lot of people out there, most importantly Jim Tracy, don't seem to agree. With the playoffs starting soon, I think now is the time to lay out my case.

Iannetta's critics like to point out a number of reasons why he shouldn't be in there. His .228 batting average and high strikeout total have doomed Iannetta in the eyes of many. Of course the biggest roadblock for Iannetta has been that his replacement has been on fire.

Since taking over the starting job on Aug 29, Yorvit Torrealba has had an AVG of .319 with an OBP of .373. Both numbers are very good, but his SLG over that span is .404. If we take a closer look we can see that Torrealba's BABIP has been an even .400. The typical BABIP is around .300. Although I do feel there are certain reasons that a player can have a higher than normal BABIP (such as speed, or gap power), YT doesn't have any of those. Even if he did, a BABIP that high is way beyond what any hitter can produce over a long span. It is pretty clear that he has been very fortunate over that span. Even with a .325 BABIP over that stretch, YT would have posted a very light .255/.313/.340 line. So unless the balls keep finding holes at that incredible rate, Torrealba's productivity should be expected to drop.

Meanwhile Chris Iannetta's BABIP has been a very low .253. If he had a more normal BABIP of .300, he would add 9 hits to his total, bringing his AVG up to a more respectable .261. What about all those strikeouts? CI finished the year with 75 K's in only 289 at-bats. If his strikeout rate were reduced to the league average of 20 percent, he'd cut his total to 58 strikeouts. That is significant, but at his current BABIP it would only give him 5 additional base hits, and a .244 AVG. So while a reduction in strikeouts would add a small increase in productivity, having a few more balls fall in would be even better.
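
To make the strikeout what-if easier to follow, here is a rough Python sketch of that kind of calculation. It uses the standard BABIP definition, (H - HR) / (AB - K - HR + SF), and assumes every strikeout avoided becomes a ball in play that falls for a hit at the current BABIP. The function names are mine, and the simplification ignores home runs and sacrifice flies among the extra balls in play, so it lands in the neighborhood of the handful of hits quoted above rather than matching it exactly.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


def extra_hits_from_fewer_strikeouts(at_bats, strikeouts, target_k_rate, babip_value):
    """Expected added hits if strikeouts dropped to target_k_rate of at-bats.

    Simplifying assumption: each avoided strikeout becomes a ball in play
    that turns into a hit at the rate babip_value.
    """
    target_strikeouts = round(at_bats * target_k_rate)
    added_balls_in_play = strikeouts - target_strikeouts
    return added_balls_in_play * babip_value


# Figures quoted in the post: 289 AB, 75 K, 20% league K rate, .253 BABIP.
added = extra_hits_from_fewer_strikeouts(289, 75, 0.20, 0.253)
print(f"{added:.1f} extra hits")
```

The same idea works in the other direction: multiply the balls in play by the gap between a target BABIP and the actual one to estimate how many hits a run of bad luck has cost.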

Another advantage that Iannetta has, is the ability to draw walks. In 332 plate appearances, Chris has drawn 43 walks, a rate of 13%. Although his AVG is sitting at .228, his OBP is a respectable .344.

The most important thing to consider in this issue is Iannetta's platoon splits. He has struggled versus right handers this year, batting .202/.320/.413. Versus left-handers he has hit .296/.406/.580 with 4 homers and 9 doubles in 92 plate appearances. Of course his career numbers are only .265/.379/.530. With a team that has struggled to hit left handed pitching, you can't afford to leave that type of production on the bench.

Hopefully the Rockies' management will figure out the catching situation before it's too late. In the worst-case scenario, they need to set this up as a platoon with CI starting against lefties.

Tuesday, September 29, 2009

Up Next

Here I am, almost a year later, writing my 2nd post. What a difference a year makes. Matt Holliday is gone, which I am still saddened by, yet the Rockies are still leading the Wild-Card race with less than a week to go. How are they doing it? Pitching, probably. I am not really concerned about that right now; I'd rather look at the hitters. The offense has improved significantly over the past year. In 2008, the Rockies scored 747 runs. In 2009, they have scored 771 as of Sept 27. That's an increase of about a third of a run per game, which puts them on pace for 801 runs. They've done so after losing one of the best hitters in the NL in Holliday. What's different? Simple. A vastly improved top of the order.

This year's core group of Dexter Fowler, Carlos Gonzalez, and Seth Smith has put up a .266/.354/.429 line while leading off and .263/.324/.465 while batting 2nd. The numbers for the 2nd spot are weighed down by Clint Barmes getting most of those at-bats in the first half of the season. Regardless, those lines are much better than the .251/.311/.349 and .260/.311/.376 lines from a year ago.

So the Rockies may not have the power in the middle of the order like they used to. However if the table setters keep getting on, they will be a very dangerous lineup in the postseason.