By and large, the longtime posters here at Brewerfan.com are familiar with a wide variety of statistics used by sophisticated "Sabertoothed Mathematarians" (thank you, hoffy). Not surprisingly, some of the newer posters aren't as familiar with these acronym-laden terms. Odd as it may seem, there was a time not all that long ago (3 or so years) when I couldn't tell you what OPS was, let alone what a good OPS was for a middle IF vs. a corner OF. My aim today is to demystify some of these stats, like EQA, without having to spend 3 pages talking about linear regression and other such boring topics.
Recently Brian has started to put EQA, RARP, and MIEQA stats on the website, so I'll start with these. EQA stands for equivalent average. The stated purpose of this statistic, developed by Baseball Prospectus, is to translate offensive production into 1 metric that looks like batting average. That is, they want you to picture a .300 hitter, since you already have a sense in your mind of how good that is. A .300 EQA player should be that good. Now, a .300 hitter isn't necessarily a .300 EQA player. One of the larger points of sabermetrics is that all .300 hitters are not equal. So what are the components of EQA? In a rough sense the statistic combines OBP and SLG with stolen base data and transforms that into a percentage. To make for convenient comparison, the data is then adjusted for the home ballpark and the league. Keep in mind that a .260 EQA represents the league average hitter across all positions. For a more detailed position-by-position breakdown you can check out the EQA page at Baseball Prospectus.
That's the simple version. What really goes on is fairly complex, though not without merit. Most of the complexity comes from all of the adjusting to context and scaling. I'm going to outline the step-by-step process for generating EQA in a hopefully simple way.
Step 1) Generate the RawEQA, where DB is doubles and TP is triples:
RawEQA = (2*H + DB + 2*TP + 3*HR + 1.5*(BB+HBP) + SB) / (AB + BB + HBP + CS + SB/3)
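For anyone who would rather see Step 1 as something you can run, here is a minimal Python sketch of the RawEQA formula exactly as written above. The function name `raw_eqa` and the season line fed into it are made up for illustration; they are not from Baseball Prospectus.

```python
def raw_eqa(h, db, tp, hr, bb, hbp, sb, cs, ab):
    """Raw EQA from ordinary counting stats, per the Step 1 formula.

    h=hits, db=doubles, tp=triples, hr=home runs, bb=walks,
    hbp=hit by pitch, sb=stolen bases, cs=caught stealing, ab=at bats.
    """
    # Every hit counts 2, with extra credit for extra bases;
    # walks and HBP count 1.5, stolen bases count 1.
    numerator = 2 * h + db + 2 * tp + 3 * hr + 1.5 * (bb + hbp) + sb
    # Opportunities used: at bats, free passes, and running attempts.
    denominator = ab + bb + hbp + cs + sb / 3
    if denominator == 0:
        raise ValueError("player has no recorded opportunities")
    return numerator / denominator


# Hypothetical season line, invented purely for illustration.
example = raw_eqa(h=150, db=30, tp=5, hr=20, bb=60, hbp=5, sb=10, cs=5, ab=500)
print(round(example, 3))
```

Note that the raw number is not yet on a batting average scale. The later steps take care of the adjusting and scaling.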
Relax, it's not that bad. All of the numbers in use are familiar, easy stats. The important thing to note is that each event is weighted to reflect its actual value. Base hits are worth more than walks (a multiplier of 2 vs. 1.5), and stolen bases are worth less than walks (a multiplier of 1).
Step 2) This number is then converted to unadjusted equivalent runs. Without showing the equation, this is done by taking the league-wide number of runs and determining what fraction of league-wide offense the player is responsible for. It's essentially a ratio: you know the player's offensive contribution, the total runs scored, and the league-wide offense, and the number you get out is the number of runs the player is responsible for creating.
Step 3) This is an interesting step. Here you place the individual player into a team context so you can properly adjust for park effects. This gives you an interesting little number called Wins Above Average (WAA). It sounds complicated, but all you are doing is creating a theoretical team in this particular ballpark with a lineup of league average hitters. You then subtract out a fraction of playing time equal to that of the player you are measuring and add in his performance.
OK, deep breath, we're almost home. At this point we've taken a player's raw stats to generate a Raw EQA (step 1). In step 2 we attributed a share of the total league-wide runs scored to the player based on his contribution to the total raw numbers. As you may have guessed, this normalizes (a statistical term basically meaning equalizes) the player to the current run environment, so that you can fairly compare a player who faced Bob Gibson in 1967 to one from the offensive glut of recent times. In step 3 we take this a step further by accounting for park effects. To do this we created a team of average hitters and compared it to the same team with the player in question taking over part of the playing time.
This comparison leads to step 4, which is to generate the winning percentage that this team of average hitters plus your player would produce. This is done using a modified version of baseball's Pythagorean theorem (a simple formula for determining very accurately what a team's winning percentage should be given its runs scored and runs allowed totals).
Step 5) We now plug this winning percentage figure into a conversion that takes us back to equivalent runs (EQR). This time, though, the number is adjusted for park and league effects.
Step 6) The EQR is then converted into EQA! This simple conversion involves dividing EQR by the number of outs the player made and scaling it properly to make it look like batting average.
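To make steps 3 and 4 a little more concrete, here is a rough Python sketch of the idea: build an average team, swap in one player's share of the playing time, and run the result through the Pythagorean formula. Several things here are my assumptions and not the real EQA method: it uses the classic Pythagorean form with an exponent of 2 (EQA uses a modified version that isn't shown in this article), it assumes the team allows a league average number of runs, and every number and function name is invented for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean expectation: RS^x / (RS^x + RA^x).

    Assumption: exponent 2 is the textbook form, not the modified
    version that the EQA calculation actually uses.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def team_runs_with_player(avg_team_runs, player_runs, playing_time_share):
    """Remove one player's share of an average lineup, add his own runs."""
    return avg_team_runs * (1 - playing_time_share) + player_runs


# Hypothetical inputs, invented purely for illustration.
avg_team_runs = 750.0   # runs scored (and allowed) by an all-average team
share = 1 / 9           # one lineup spot's worth of playing time
player_runs = 100.0     # runs credited to the player being measured

team_runs = team_runs_with_player(avg_team_runs, player_runs, share)
win_pct = pythagorean_win_pct(team_runs, avg_team_runs)
print(round(team_runs, 1), round(win_pct, 3))
```

An all-average team comes out at exactly .500 by construction, so anything above .500 here is the player's doing, which is the whole point of the comparison.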
Well, I had intended to keep this article close to 750 words, but I'm going to have to run a bit longer to describe MIEQA. Quite simply, MIEQA is the EQA the minor leaguer in question would have been expected to put up in the major leagues based on his performance. It's not a projection. Instead, Clay Davenport did a lot of work to determine difficulty ratings for each minor league. He did this by comparing large amounts of data on how well players do in one league versus another when they are promoted or demoted within a single year. Based on this he is able to assign a number between 0 and 1 to each league. A 1 means the league is as difficult as the majors. You then just multiply the difficulty adjustment by the EQA of the player in his league to get MIEQA.
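Since MIEQA boils down to one multiplication, a tiny Python sketch shows the whole thing. The difficulty ratings below are placeholders I made up so the example runs; they are not Clay Davenport's actual numbers, and the names `mieqa` and `HYPOTHETICAL_DIFFICULTY` are mine.

```python
# Placeholder difficulty ratings, NOT Davenport's real values.
HYPOTHETICAL_DIFFICULTY = {"AAA": 0.90, "AA": 0.85, "A": 0.75}


def mieqa(league_eqa, difficulty):
    """Major-league-equivalent EQA: the player's EQA in his own league
    multiplied by that league's difficulty rating (1.0 = the majors)."""
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty rating must be between 0 and 1")
    return league_eqa * difficulty


# A made-up prospect posting a .300 EQA at Double-A.
prospect = mieqa(0.300, HYPOTHETICAL_DIFFICULTY["AA"])
print(round(prospect, 3))
```

With these placeholder ratings, the same .300 EQA is worth less at a lower level, which is why a prospect's raw line can hold steady or even dip after a promotion while his MIEQA goes up.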
It's a useful way to track the progress of a prospect as he moves up the chain of affiliates. His raw numbers might bounce around due to increased difficulty or a different home park, but his MIEQA gives a good indication of whether or not he's improving with the bat. Like I said, it's not a projection, because it does not account for how much a player may improve as he ages before he gets to the majors. For AAA vets it gives a good idea of how they would perform on the big league roster, but for other players it's most meaningful when used to measure career progress.
