Analytics, also called sabermetrics, rule baseball front offices and on-field decision making. To most fans, they’re a confusing or misunderstood topic. Some people think they ruin baseball or change the game too much, but that is just not the case. Analytics are used to find the true impact a player makes on a game or to help players make a bigger impact. To put it simply, analytics is a fancy term for information. Fans should be happy their favorite team utilizes as much information as possible.
Fans should know that no stat is perfect. Every stat in use today has flaws; some just have fewer than others. The closest stats to perfection come from Statcast data, but even they have flaws. Every stat is just a piece of the puzzle to understand what is truly happening on the field.
People should also know all the stats in use have been studied and researched by really smart people in the game. Those people have disagreements on things but not without good reasoning. That’s why there are different versions of WAR, which we will discuss later. All these stats have had a lot of time put into them. They aren’t just thrown together to try and ‘ruin the game.’
They can also get very complicated and confusing, especially if a person doesn’t have the time to study and memorize them. This guide should help you understand what most of the stats mean, how to interpret them, and why they are used. There are way too many stats to discuss all of them, so this guide will go over the most commonly used ones. There are also much better and deeper explanations of each one online if you find yourself interested.
The concept of analytics goes back to the beginning of baseball. Yes, you read that right. People have always wanted to separate players’ performances from their team’s performance. A pitcher’s win-loss record was an early attempt to make a stat to tell how a player performed. The concept made more sense when starters pitched an entire game, but it still had its flaws. And as we know now, the stat doesn’t work at all like it was intended to, but everything is learned through trial and error.
The first team to really put them to use was the Brooklyn Dodgers under Walter O’Malley and Branch Rickey. The team hired a statistician named Allan Roth in 1947.
O’Malley was so interested in baseball statistics and their analytics that the Dodgers hired Allan Roth to work on interpolating the numbers for the team.
“Allan works for us all the year round,” said O’Malley. “It isn’t just sets of dry statistics for press releases. His compilations aid us in making decisions. Even in making trades.”
Roth proceeded to make a big impression as he advanced statistical analysis to a whole new level, working during the season and in the off-season.
“Allan goes beyond the verbal word of our scouts and other observers,” said O’Malley. “Underlying causes are important.”
Rickey knew there was an untapped world of information. Roth confirmed Rickey’s idea that runs batted in only mattered if they were correlated with chances to drive them in. He also provided evidence to prove platoon advantages are real and that on-base percentage mattered more than batting average. Roth also kept track of player splits, spray charts, and pitch charts for the team. The Dodgers had the largest amount of information in baseball under Roth and Rickey.
Analytics became widely known because of the book Moneyball: The Art of Winning an Unfair Game, by Michael Lewis, which was later turned into a movie. The story tells how the 2002 Oakland Athletics, led by Billy Beane and Paul DePodesta, used concepts made by Bill James to replace the superstars who left in free agency with overlooked players no other team wanted. It resulted in 103 wins and a trip to the playoffs with the 6th smallest payroll in the league.
Now, every team in baseball has an analytics department and the best teams have made them a key focus in running their organization.
The Flaws Of Traditional Stats
| Stat | Flaw |
|---|---|
| Batting Average | AVG counts all hits as the same. We know a home run does not equal a single. It also ignores other ways of getting on base, like taking a walk. |
| On Base Percentage | OBP counts all hits and walks as equal to each other, which just isn’t true. |
| Slugging Percentage | SLG is not weighted correctly and it ignores walks. A home run is not worth 4 times what a single is worth, a triple is not worth 3 times what a single is worth, and so on. |
| On Base Plus Slugging | Like SLG, the stat is not weighted correctly and it gives a boost to power hitters. While power hitters are usually more productive, a homer is not worth 5 times what a single is worth. It is also flawed to add 2 stats together that have a different scale. A .500 SLG is good while a .350 OBP is good, so OPS undervalues high-OBP, low-slugging players. |
| Runs Batted In | RBI is largely dependent on opportunity, based on what the rest of the team does. A bad hitter who hits behind Mike Trout and Mookie Betts will still get a lot of RBI because he gets so many chances. |
| Runs | Like RBI, runs scored relies too much on what the rest of the team does to be an effective evaluator for 1 player. |
| Wins | A pitcher can throw 1 pitch and get a win. A pitcher can go 9 innings, allowing 1 run, and lose. And why do starters have to go 5 innings to qualify for a win when a reliever can go 0.1 and win? It’s also incredibly arbitrary and decided on by the scorer. |
| Earned Run Average | A team’s defense can have a large effect on ERA. It also ignores runs scored as the result of an error, and errors are truly subjective based on how the scorer feels at the moment. Hey, errors are a bad stat too. I guess this was a 2 for 1. |
| WHIP | Like ERA, it also relies on the team defense and factors a pitcher can’t control. It also ignores a pitcher hitting a batter for some reason. |
| Saves | The stat is incredibly context dependent and arbitrary. To quote Keith Law’s book Smart Baseball, “To be credited with a save under the current version of the rule, which has been in place since 1975, a pitcher must record the final out in a game that his team won, but one where he didn’t get the win, and the team didn’t win by too many runs because then he obviously contributed nothing at all.” |
The flaws of these stats can be explained in a deeper and more informative way, but I just want to give a simple and quick explanation.
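To make the weighting problem concrete, here is a rough sketch that computes AVG, OBP, SLG, and OPS using their standard formulas. The two hitters and all of their season totals are made up for illustration, and the `BattingLine` class and function names are just names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season totals for one hitter. Every number used below is invented."""
    ab: int        # at-bats
    singles: int
    doubles: int
    triples: int
    hr: int        # home runs
    bb: int        # walks
    hbp: int       # hit by pitch
    sf: int        # sacrifice flies

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.hr

    @property
    def total_bases(self) -> int:
        return self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.hr


def avg(b: BattingLine) -> float:
    # Every hit counts as 1, and walks are ignored entirely.
    return b.hits / b.ab


def obp(b: BattingLine) -> float:
    # Hits, walks, and hit by pitches all count as 1.
    return (b.hits + b.bb + b.hbp) / (b.ab + b.bb + b.hbp + b.sf)


def slg(b: BattingLine) -> float:
    # Hits are weighted 1/2/3/4 by bases, and walks are ignored.
    return b.total_bases / b.ab


def ops(b: BattingLine) -> float:
    # Adds two numbers that live on different scales.
    return obp(b) + slg(b)


slap_hitter = BattingLine(ab=500, singles=140, doubles=10, triples=0, hr=0, bb=30, hbp=0, sf=0)
slugger = BattingLine(ab=500, singles=80, doubles=30, triples=0, hr=40, bb=30, hbp=0, sf=0)

for name, b in (("slap hitter", slap_hitter), ("slugger", slugger)):
    print(f"{name}: AVG {avg(b):.3f}  OBP {obp(b):.3f}  SLG {slg(b):.3f}  OPS {ops(b):.3f}")
```

Both hitters come out with a .300 AVG and the same OBP even though one of them hit 40 home runs. Only SLG (.320 versus .600) separates them, and it does so with the 1/2/3/4 weights that overstate extra-base hits.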
Key Things To Know
| Symbol | What it means |
|---|---|
| When there is a w in the stat. | The stat is weighted, based on something called linear weights. This basically means not all hits are equal and it assigns a value to each hit. Unlike in batting average, a home run counts for more than a triple, a triple counts for more than a double, and so on. |
| When there is an x in the stat. | The stat is an expected stat. This means the stat is based on what was expected to happen based on data like launch angle, exit velocity, and so on. |
| When there is a + in the stat. | The stat is park and league adjusted. The purpose is to remove the effects of playing in a hitter-friendly stadium like Coors Field versus a pitcher-friendly one like AT&T Park, and to compare the player to the rest of the league. The higher the number above league average the better, and the lower the number below average the worse. For example, if the league average is 100, 120 is better than 70. |
| When there is a – in the stat. | This stat is park and league adjusted, but any number below league average is better and any number above league average is worse. For example, if the league average is 100, 70 is better than 120. |
Most of the hitters’ stats also work for what pitchers allowed. Having a basic understanding of these symbols can help you learn what other stats are saying, even if they aren’t discussed in this post.
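The "100 is league average" scale is easier to picture with a toy calculation. The sketch below only shows the scaling idea by dividing a player's number by the league's and multiplying by 100. It is not the published formula for any real + or – stat, since those also apply park adjustments, and the function name and inputs are made up for this example.

```python
def simple_index(value: float, league_average: float) -> int:
    """Scale a stat so that the league average equals 100.

    Simplifying assumption: no park adjustment at all. Real + and - stats
    apply park factors before scaling.
    """
    return round(100 * value / league_average)


# A "+" style reading: a .400 wOBA in a league that averages .320.
print(simple_index(0.400, 0.320))   # above 100, so better than average

# A "-" style reading: a 3.36 ERA in a league that averages 4.20.
print(simple_index(3.36, 4.20))     # below 100, which is the good direction for a "-" stat
```

The arithmetic is the same in both cases. The only thing that changes between a + stat and a – stat is which direction counts as good.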
The Stats To Know
Weighted on-base average (wOBA) is one of the simpler stats to understand and it is also incredibly effective. It is on-base percentage that also gives more points for extra-base hits and slightly fewer points for walks and hit by pitches. It should become the go-to stat for fans because it combines batting average, on-base percentage, and slugging percentage into 1 more accurate number. This stat can be used for hitters and pitchers (wOBA allowed). It is on the same scale as on-base percentage, so .320 is considered average, .400 and above is excellent, and .290 and below is awful.
Weighted runs created plus (wRC+) is very similar to wOBA except it is also park and league adjusted. It’s also on a different scale. Instead of using the same scale as on-base percentage, wRC+ uses a system where 100 is league average, anything over is better, and anything under is worse. Because it’s league adjusted, it is also adjusted for the era, so a 127 wRC+ in 2019 is just as valuable as a 127 wRC+ in 1936. wRC+ is a good stat for comparing offensive production. Its main flaw is that you can’t create perfect park effects, so unlike wOBA, there is some estimation in it.
Isolated Power (ISO) tells how often a player hits for extra bases. It is calculated as SLG-AVG. ISO tells you more about what kind of hitter a player is than how much value he produced. Its flaw is that, like SLG, it weights extra-base hits by the number of bases rather than by how many runs they are actually worth. FanGraphs has a good description of why it’s useful.
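For anyone who wants to see the weighting in action, here is a minimal sketch of a wOBA calculation. The structure (weighted events divided by AB + BB – IBB + SF + HBP) follows the formula FanGraphs publishes, but the weights change every season, so the ones in `WEIGHTS` are rounded, illustrative values and should be treated as assumptions. The function name and the two sample hitters are invented.

```python
# Illustrative linear weights. Real values are recalculated every season,
# so look up the actual numbers for the year you care about.
WEIGHTS = {
    "ubb": 0.69,     # unintentional walk
    "hbp": 0.72,     # hit by pitch
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "hr": 2.10,
}


def woba(ab, singles, doubles, triples, hr, bb, ibb=0, hbp=0, sf=0):
    """Weighted on-base average from season totals."""
    ubb = bb - ibb  # intentional walks are not credited to the hitter
    numerator = (
        WEIGHTS["ubb"] * ubb
        + WEIGHTS["hbp"] * hbp
        + WEIGHTS["single"] * singles
        + WEIGHTS["double"] * doubles
        + WEIGHTS["triple"] * triples
        + WEIGHTS["hr"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


# Two made-up hitters with identical .300 batting averages.
slap_hitter = woba(ab=500, singles=140, doubles=10, triples=0, hr=0, bb=30)
slugger = woba(ab=500, singles=80, doubles=30, triples=0, hr=40, bb=30)

print(f"slap hitter wOBA: {slap_hitter:.3f}")
print(f"slugger wOBA:     {slugger:.3f}")
```

With these assumed weights the slap hitter lands just under .300 and the slugger just over .400, which is the gap batting average hides.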
A .300 average with very few extra base hits is quite different from a .300 average with 40 home runs. The same is true of a .500 slugging percentage that is driven by many singles versus one driven by lots of doubles and home runs.
A .140 ISO is average, .240 and above is excellent, and .080 and below is awful.
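Since ISO is just SLG minus AVG, it can be worked out directly from a stat line. The quick sketch below uses made-up season totals and a hypothetical function name. It also shows that the subtraction is the same thing as counting only the extra bases on each hit.

```python
def iso(ab, singles, doubles, triples, hr):
    """Isolated Power computed as SLG - AVG."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    slg = total_bases / ab
    avg = hits / ab
    return slg - avg


def iso_from_extra_bases(ab, doubles, triples, hr):
    """Equivalent form: extra bases per at-bat (singles drop out completely)."""
    return (doubles + 2 * triples + 3 * hr) / ab


# Two invented .300 hitters.
print(f"slap hitter ISO: {iso(500, 140, 10, 0, 0):.3f}")
print(f"slugger ISO:     {iso(500, 80, 30, 0, 40):.3f}")
```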
Batting average on balls in play (BABIP) is exactly what it sounds like. It’s a player’s batting average when he puts the ball in play, which means strikeouts and home runs are removed from the calculation.
BABIP isn’t a stat you want to use on its own. The stat can help tell you if a player is unlucky or lucky but it is also influenced by speed and hard-hit ball numbers.
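Here is a short sketch of the usual BABIP formula, (H – HR) / (AB – K – HR + SF). The function name and the sample season are made up for illustration.

```python
def babip(hits, hr, ab, strikeouts, sf=0):
    """Batting average on balls in play.

    Home runs and strikeouts are removed because the defense never had a
    chance to make a play on them. Sacrifice flies are added back because
    the ball was put in play even though no at-bat was charged.
    """
    balls_in_play = ab - strikeouts - hr + sf
    return (hits - hr) / balls_in_play


# Invented season: 150 hits, 20 homers, 500 at-bats, 100 strikeouts, 5 sac flies.
print(f"BABIP: {babip(hits=150, hr=20, ab=500, strikeouts=100, sf=5):.3f}")
```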
K% and BB%
Strikeout and walk percentage are simply the percentage of a batter’s plate appearances that end in a strikeout or a walk. For hitters, a 20% K% and an 8% BB% are considered average. A 10% K% and a 15% BB% are excellent. A 27.5% K% and a 4% BB% are awful.
Both of these are also used for pitchers. For pitchers, an average K% is 20% and BB% is 7.7%. An excellent K% is 27% and an excellent BB% is 4.5%. A 13% K% and 9% BB% are awful for pitchers.
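Both rates are simple divisions by plate appearances (or batters faced, for a pitcher). A tiny sketch with made-up totals and hypothetical function names:

```python
def k_pct(strikeouts, plate_appearances):
    """Strikeout percentage: strikeouts per plate appearance, times 100."""
    return 100 * strikeouts / plate_appearances


def bb_pct(walks, plate_appearances):
    """Walk percentage: walks per plate appearance, times 100."""
    return 100 * walks / plate_appearances


# Invented hitter: 120 strikeouts and 48 walks in 600 plate appearances.
print(f"K%: {k_pct(120, 600):.1f}  BB%: {bb_pct(48, 600):.1f}")
```

Note that the denominator is plate appearances, not at-bats, because a walk does not count as an at-bat.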
Fielding independent pitching (FIP) is a version of ERA that removes the defense behind the pitcher from the equation. FanGraphs has a very nice summary of it:
A statistic that estimates their ERA based on their strikeouts, walks, hit batters, and home runs while assuming average luck on balls in play, defense, and sequencing is a better reflection of that pitcher’s performance over a given period of time. This is highly related to the reasons why we care so much about Batting Average on Balls in Play (BABIP), specifically the fact that pitchers have very little control over their BABIP allowed.
FIP is on the same scale as ERA so 4.20 is considered average.
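For the curious, the standard FIP formula is (13×HR + 3×(BB + HBP) – 2×K) / IP plus a league constant that puts the result on the ERA scale. The sketch below uses that formula. The constant changes every season, so the default of 3.10 is only an assumed, ballpark value, and the pitcher's stat line is invented.

```python
def fip(hr, bb, hbp, strikeouts, innings, constant=3.10):
    """Fielding independent pitching.

    `innings` is a true decimal (200 and one third innings is 200.333...),
    not the box-score notation where 200.1 means 200 and one third.
    `constant` is an assumed league constant and varies by season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * strikeouts) / innings + constant


# Invented season: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
print(f"FIP: {fip(hr=20, bb=50, hbp=5, strikeouts=200, innings=200.0):.2f}")
```

Notice that nothing about singles, doubles, or errors shows up in the inputs. Only the outcomes the defense can't touch are counted.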
Expected fielding independent pitching (xFIP) is similar to FIP, but it uses a league-average home run to fly ball rate instead of the pitcher’s actual home run to fly ball rate. It helps remove park effects on home runs. For example, a home run hit at Coors Field might not be a home run at Dodger Stadium even if everything else about the hit was the same. That doesn’t mean the pitcher actually made a worse pitch; it was just the stadium he was in at the time. The scale is the same as ERA.
Deserved run average (DRA) tries to estimate how many runs a pitcher should be credited for allowing. It tries to remove every factor that isn’t what the pitcher did.
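As a rough sketch, xFIP swaps the pitcher's actual home runs for the number he would be expected to allow given his fly balls and a league-average home run to fly ball rate. Both the 10% league rate and the 3.10 constant below are assumed, illustrative values that change by season, and the stat line is made up.

```python
def xfip(fly_balls, bb, hbp, strikeouts, innings,
         league_hr_per_fb=0.10, constant=3.10):
    """Expected FIP.

    Actual home runs never enter the formula. They are replaced by
    fly_balls * league_hr_per_fb. Both defaults are assumptions.
    """
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * strikeouts) / innings + constant


# Invented season: 200 fly balls, 50 BB, 5 HBP, 200 K in 200 innings.
print(f"xFIP: {xfip(fly_balls=200, bb=50, hbp=5, strikeouts=200, innings=200.0):.2f}")
```

Two pitchers with the same fly balls, walks, hit batters, and strikeouts get the same xFIP no matter how many balls actually left the yard.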
DRA is premised on the notion that while a pitcher is probably the player most responsible, on average, for what happens while he is on the mound, he is not responsible for everything. DRA therefore only assigns the runs a pitcher most likely deserved to be charged with.
According to Baseball Prospectus, it is the best estimator available to the public because it exceeds the performance of stats that try to do the same thing, like ERA.
DRA explained about 70 percent of pitcher runs allowed in each full season, even including pitchers with as few as one batter faced.
It is on the same scale as ERA, so the lower the number is, the better.
Defensive runs saved (DRS) is a stat that attempts to measure how many runs a player saves or costs his team while in the field. It’s measured on a scale where zero is an average defender who didn’t cost or save any runs. Anything above zero means the fielder saved that many runs, and anything below zero means the player cost his team that many runs.
Framing (Framing Runs)
Framing attempts to calculate how many extra strikes a catcher gets for his pitcher. The concept goes back to baseball’s beginnings but without pitch tracking data, it was next to impossible to calculate. It is calculated by finding the extra strikes a catcher gets, which is “the difference between actual and predicted strikes received by the catcher,” according to Baseball Prospectus.
Framing Runs was created by Baseball Prospectus to calculate how many runs a catcher is saving, or costing his team, with his ability. It is on the same scale as DRS, where zero is average. A framing leaderboard can be found here.
Exit Velo and Launch Angle
Exit velocity is simply how hard the batter hits the ball. Launch angle is the vertical angle at which the ball leaves the bat, which tells you whether it was a ground ball, line drive, or fly ball. A ball hit with an exit velo of 95+ mph is considered a hard-hit ball.
A barrel is a ball hit at 98 mph or harder within a certain launch angle range. At 98 mph, that range is 26-30 degrees, and it expands for every mph over 98.
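The 95 mph cutoff makes it easy to turn a list of exit velocities into a hard-hit rate. This is only a small sketch, and the function name and sample readings are made up.

```python
HARD_HIT_MPH = 95.0  # cutoff for a hard-hit ball


def hard_hit_rate(exit_velocities):
    """Share of batted balls hit at 95 mph or harder."""
    if not exit_velocities:
        return 0.0
    hard = sum(1 for ev in exit_velocities if ev >= HARD_HIT_MPH)
    return hard / len(exit_velocities)


# Invented exit velocities, in mph, for four batted balls.
print(f"hard-hit rate: {hard_hit_rate([90.0, 95.0, 101.3, 88.0]):.0%}")
```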
The Barrel classification is assigned to batted-ball events whose comparable hit types (in terms of exit velocity and launch angle) have led to a minimum .500 batting average and 1.500 slugging percentage since Statcast was implemented Major League wide in 2015.
But similar to how Quality Starts have generally yielded a mean ERA much lower than the baseline of 4.50, the average Barrel has produced a batting mark and a slugging percentage significantly higher than .500 and 1.500, respectively. During the 2016 regular season, balls assigned the Barreled classification had a batting average of .822 and a 2.386 slugging percentage.
Spin rate is how fast a pitch rotates, which creates break or the appearance the pitch is rising. It is measured in revolutions per minute (RPM). Higher spin rates mean the ball stays flat longer, while lower spin rates mean the ball breaks more. Generally, high spin rates lead to strikeouts and low spin rates lead to ground balls. An off-speed pitch with a high spin rate will move more than one with a low spin rate.
Sprint Speed is how fast a player runs at his top speed, measured in feet per second. It only includes sprints to first on “competitive runs,” meaning things like jogging into second or jogging out a ground ball are not included.
The Major League average on a competitive play is 27 ft/sec, and the competitive range is about 23 ft/sec to 30 ft/sec.
Pop Time is how fast the catcher gets the ball to second or third base when trying to catch a runner. It includes how fast he gets the ball from his glove to his hand (exchange) and his arm strength.
The Major League average Pop Time on steal attempts of second base is 2.01 seconds. Average times are calculated with the following ranges.
Pop Time to 2B: 1.6 sec to 2.5 sec
Pop Time to 3B: 1.2 sec to 2.5 sec
Exchange: .4 sec to 1.3 sec
Catch Probability and OAA
Catch Probability calculates the percent chance an outfielder would catch the ball. It is used to help measure outfield defense. MLB has a really good breakdown of it here:
Catch Probability represents the likelihood that a batted ball to the outfield will be caught, based on four important pieces of information tracked by Statcast. 1. How far did the fielder have to go? 2. How much time did he have to get there. 3. What direction did he need to go in? 4. Was proximity to the wall a factor?
It is broken into tiers of five percent probabilities; so 0 percent, 5 percent, 10 percent, and so on, up to 100 percent.
Outs Above Average (OAA) uses catch probability to find how many extra balls an outfielder gets to, or doesn’t get to, over the season. It is fairly simple to calculate once you know the catch probability. The formula for caught balls is 1.00 – catch probability = X. So if an outfielder catches a ball with a 25 percent catch probability, he adds .75 to his OAA total. If a player doesn’t make the play, you just subtract the catch probability. So if the catch probability is 65 percent, the player loses .65 points from his total. It is on a scale where zero is average, anything above zero is better, and anything below zero is worse.
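The bookkeeping described above can be sketched in a few lines. The function name and the sample plays are invented, and the catch probabilities are assumed to be given already, since working those out requires Statcast tracking data.

```python
def outs_above_average(plays):
    """Sum OAA credit over (catch_probability, was_caught) pairs.

    A catch earns 1 - catch_probability. A miss costs catch_probability.
    """
    total = 0.0
    for catch_probability, was_caught in plays:
        if was_caught:
            total += 1.0 - catch_probability
        else:
            total -= catch_probability
    return total


# The two examples from the text: a 25 percent ball that was caught
# and a 65 percent ball that was not.
plays = [(0.25, True), (0.65, False)]
print(f"OAA: {outs_above_average(plays):+.2f}")
```

A routine catch barely moves the total, since a 99 percent ball is only worth .01, while a tough catch or an easy miss moves it a lot.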
Wins Above Replacement
Wins Above Replacement, or WAR, is one of the most talked about stats and it is also one of the most controversial. What you should really take away from WAR is it’s not perfect, but it is a good estimator.
WAR attempts to calculate a player’s total value added over a replacement-level player. Part of that is confusing to people because they think a replacement player means whoever is called up to replace him. It really means the kind of freely available player, like a minor league call-up or a bench player, that a team could pick up at almost no cost. That kind of player is below league average.
A player who puts up a zero WAR is replacement level, and every number over it is one win added. That also doesn’t mean a 5 win player is going to add exactly five more wins to the team; it’s just an estimation. It is also important to know that since it’s just an estimation, there probably isn’t a big difference between a 5.4 win player and a 5.1 win player.
The scale is important to know too: 0 is a replacement level player, 3 is a starting level player, 5 is an all-star level player, 7 is an MVP candidate level player, and 9+ is Mike Trout level.
There are also different calculations for WAR, since it is just an estimation. Baseball Reference WAR can look entirely different from FanGraphs WAR.
WAR shouldn’t be used as an end-all perfect stat. But it is useful for getting a general idea of how much value a player is providing to the team.