The Ideal Rotisserie Scoring System
Part One - Background
There has always been a great deal of debate over the best method for scoring Rotisserie Baseball. The Founding Fathers (Okrent, Waggoner, et al.) came up with an eight-category scoring system that, while not perfect, has been tremendously successful and accurate. Not surprisingly, this "Original Eight" has become the standard in the Rotisserie world; nearly every book and magazine on Rotisserie bases its projections and calculations on this system. This is wise, since it is the most widely used. This standardization is important so that players in different leagues can speak intelligently about the relative values of players.
The Original Eight consists of Home Runs, Runs Batted In, Stolen Bases, Batting Average, Wins, Saves, Ratio (Hits plus Walks divided by Innings Pitched) and Earned Run Average.
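Of the eight, Ratio is the only one the text defines by formula, so here is a minimal sketch of it in Python. The function name and the season line are made up for illustration; only the formula itself comes from the definition above.

```python
def ratio(hits, walks, innings_pitched):
    """Ratio = (Hits + Walks) / Innings Pitched, as defined in the text."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (hits + walks) / innings_pitched


# Hypothetical season line, for illustration only.
print(round(ratio(hits=180, walks=50, innings_pitched=200), 3))  # 1.15
```

Lower is better here: the number is simply how many runners a pitcher allows per inning.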
Of course, this system has its flaws. There are certain inequities introduced by using these categories, which any alternative system attempts to correct. However, these alternative methods of scoring usually do little to correct the problems. Worse, they are almost without exception less accurate than the Original Eight.
Defense of the Original Eight
F.X. Flinn, one of the key persons in the publication of Rotisserie League Baseball, wrote an article for the 1993 Edition of that book. In it, he uses statistical methods to compare the Original Eight scoring system with three scoring systems that were submitted to the RLBA and/or were popular in the early 1990s. Flinn's article, "Categorical Imperatives Redux", provides a starting point for our discussions. We also have a link to an article on calculating the Spearman Coefficient of Correlation, which is the statistical method used to compare teams.
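For readers who want to see the mechanics, the following is a rough sketch of a Spearman rank correlation in Python. It is not Flinn's actual calculation, and the team totals are invented purely for illustration; the helper names (`average_ranks`, `spearman`) are my own. The idea is to rank the teams by a category, rank them by wins, and measure how closely the two rankings agree.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman coefficient: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical five-team league (made-up numbers, not real standings).
wins = [98, 90, 85, 74, 66]
runs = [820, 790, 800, 700, 690]
stolen_bases = [100, 150, 90, 140, 120]

print(round(spearman(wins, runs), 3))          # 0.9
print(round(spearman(wins, stolen_bases), 3))  # -0.1
```

A coefficient near 1 means the category ranks teams almost exactly as the standings do; a value near 0 means the category tells you little about who actually wins.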
Flinn's article is very successful in defending the RLBA's scoring system. However, in it he also mentions the imbalances that the system can produce. Basically, the overall team-based accuracy of the Original Eight is sound. However, it does not necessarily lead to realistic values for individual players. It grossly overvalues base stealers and closers, de-emphasizes lead-off hitters in favor of power guys, discounts a batter's ability to draw a walk, and does not reward players who belt out extra base hits.
Apples and Oranges
While substantial, Flinn's defense was not discussing the same things that the proponents of alternative scoring systems were discussing. Flinn focused on the team-based accuracy of the Original Eight. However, most critics of this system point out the inaccuracies it creates in the values of players. F.X. Flinn and David Gardner (who is mentioned in the Flinn article) are not looking at the same thing when they discuss "accuracy." Gardner and other critics are player-focused; they point out that each player is not necessarily worth in Rotisserie what he is worth in real baseball. Flinn, on the other hand, is team-focused; he sees that when you construct an entire team of players, a successful Rotisserie team's composite stats should replicate those of a successful major league team. For instance, good major league teams generally score more runs than lesser teams, so good Rotisserie teams should do likewise. However, both good and bad major league teams can have many stolen bases, so that measure is not as accurate in predicting success.
The BMCRBA's Goals
Since our league was scheduled to start play in 1994, we were constructing our rules in the Fall of 1993. Thus, I reviewed Flinn's 1993 article in depth, and followed his lead in using statistical databases in order to determine what categories our league should use. I had two goals guiding me in this endeavor:
First, I wanted to construct a system that evened out the inequities between players. Vince Coleman should not be the MVP of any league, not even a Rotisserie league. Also, does it not seem strange (when using the Original Eight system) that the top eight pitchers in the league (except for maybe one dominating starter) are all closers? This is not the way things work in real baseball. Closers are important of course, but starting pitchers are the name of the game. Ask any major league GM if he thinks the best eight pitchers in baseball all pitch only one inning. Or if he thinks the MVP stole 70 bases but only hit .260. I sought to make the relative value of a player to his Rotisserie league team more closely mirror his relative value to his real team.
Second, I wanted to accomplish this in a way that did not detract from the overall, team-based accuracy of the scoring system. In my mind, the central question in Flinn's article is "If you wanted to decide pennant races by a mix of categories instead of just Wins, which categories would produce standings most like the standings the major leagues get now?" This main premise is very sound. Actually, I would say it is more than that; it is necessary. Just balancing the individual players is not enough, because players do not win pennants, teams do. While improving the imbalances between types of players is important, it must never detract from the accuracy of the scoring system for the team as a whole.
Construction of the Scoring System
Since the "Original Eight" is a very strong group of categories, I started there. First, it was decided that we should use ten categories rather than eight. This on its own will reduce the impact of any one category; HR by themselves now account for 10% of the total points, rather than 12.5% in an eight-category system. Ten categories also allows for the tracking of a greater variety of statistics, without getting ridiculous (twelve categories would make things too complex).
Flinn's article, confirmed by my own research, identified four of the standard eight categories that had very high correlations with actual standings: Wins, RBI, ERA and Ratio (we replaced the cryptic term "Ratio" with RPI, or "Runners Per Inning"). These "core four" were left untouched.
Next, Runs scored was added as our fifth category. This measure has the highest positive correlation of any hitting category. This also served to balance one of the inequities; lead-off men were undervalued in comparison to clean-up hitters. Both are important to a team; table-setters have to be on base in order for the boppers to collect RBI.
I also decided to make Home Runs our sixth category, and not to modify it. As mentioned in Flinn's article, this category has a relatively low correlation. However, these days it is the first thing anyone looks at when examining a hitter's credentials. Not to keep HR as an independent category (without modification) would create too great a departure from what people expect when they examine a team. To many, any such action is downright un-American, and in certain parts of the country could get you branded as a communist. In the end of course, it stayed because I like to see Mark McGwire crush a ball 520' as much as the next guy.
Now it came time to work on the remaining three of the standard eight. All three (Saves, Stolen Bases and Batting Average) are not nearly as good as the "core four." Saves actually is not too bad, but we'll deal with pitching in a minute. For now, let's look at the last two hitting categories:
Stolen Bases is a bad category. Notice I did not qualify that statement at all; it is just bad. It is bad because it has the worst correlation with actual standings of any category I have tested (both hitting and pitching). It is also bad because it so distorts the value of the entire population of hitters. Speedsters are important, but come on. Do we really have to pay $40 for mediocre infielders simply because they can run? Remember, you can't steal first base. A way is needed to still keep SB in the mix, but reduce the amount of its impact. In looking into how to do this, it occurred to me that the Original Eight give no extra value to doubles (or triples) over singles. As far as moving around the basepaths with no one on base, a double is the same as a single plus a stolen base. That fact alone merits the inclusion of extra-base hits. Of course, if someone is on base, the double is even better than a single and a SB, since the men on base can generally advance farther.
However, in standard Rotisserie you would rather see a single and a SB, since in addition to the hit, it will give you a point in the steals category; doubles and triples are a waste unless they lead to an RBI. So in lieu of a pure speed category, I decided to water down SB, and replace it with a new seventh category, Speed and Power Bases. You get one S&P Base for each base you advance beyond first due either to your speed or your ability to hit balls into the gap. In other words, one point for a SB, one point for a double, and two points for a triple. Speedy players are still valuable, but guys who can hit 50 doubles also become valuable. Your speedsters will still lead this category, but there is not such an overwhelming disparity between them and everyone else in the league. In addition, the correlation with actual standings of S&P is better than that of SB.
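To make the S&P arithmetic concrete, here is a small Python sketch. The function name and both stat lines are invented for illustration; only the weights (one per SB, one per double, two per triple) come from the rule just described.

```python
def sp_bases(stolen_bases, doubles, triples):
    """Speed and Power Bases: 1 per stolen base, 1 per double, 2 per triple."""
    return stolen_bases + doubles + 2 * triples


# Two hypothetical hitters (made-up stat lines, not real players).
speedster = {"stolen_bases": 70, "doubles": 15, "triples": 5}
gap_hitter = {"stolen_bases": 5, "doubles": 50, "triples": 4}

print(sp_bases(**speedster))   # 95
print(sp_bases(**gap_hitter))  # 63
```

In a pure SB category the speedster would lead this pair 70 to 5. Under S&P he still leads, 95 to 63, but the gap hitter is no longer an afterthought.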
In keeping with the theme that each base advanced is valuable, it occurred to me that On-Base Percentage was a better measure of a hitter's ability than Batting Average. While the old saying "a walk is as good as a hit" might not be 100% accurate, there is quite a bit of truth to that statement. Drawing a walk is a skill. Sometimes bad hurlers just give them up, but many times they are earned by a good at-bat. A guy who fouls off a dozen pitches before finally getting ball four obviously deserves credit for that effort. Getting on base is the name of the game. While a single is usually more helpful with men on base (again for the opportunity to move the runners up more than one base), with no one on base, "a walk is as good as a hit." In addition, the correlation for OBP is significantly better than BA. (Note: We simplify the official major league baseball calculation of OBP to (H+BB)/(AB+BB). We disregard hit by pitch and sacrifices, mainly because these stats are not widely available.)
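As a quick illustration of how the simplified OBP can reorder hitters relative to Batting Average, consider the sketch below. The function names and both hitters are hypothetical; the OBP formula is the simplified (H+BB)/(AB+BB) version given in the note above, not the official major league calculation.

```python
def batting_average(hits, at_bats):
    """BA = H / AB."""
    return hits / at_bats


def simplified_obp(hits, walks, at_bats):
    """Simplified OBP = (H + BB) / (AB + BB); ignores HBP and sacrifices."""
    return (hits + walks) / (at_bats + walks)


# Two hypothetical hitters (made-up stat lines).
free_swinger = {"hits": 180, "walks": 30, "at_bats": 600}
patient_hitter = {"hits": 140, "walks": 100, "at_bats": 500}

for name, line in (("free swinger", free_swinger), ("patient hitter", patient_hitter)):
    ba = batting_average(line["hits"], line["at_bats"])
    obp = simplified_obp(line["hits"], line["walks"], line["at_bats"])
    print(f"{name}: BA {ba:.3f}, OBP {obp:.3f}")
```

The free swinger wins on BA (.300 to .280), but the patient hitter reaches base far more often (.400 to .333), which is exactly the skill the category is meant to reward.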
Finally, the ninth and tenth categories needed to be added, both of them pitching categories. I started out looking at Saves in much the same way I looked at SB. Saves turns above-average players into MVP candidates. While I wanted to keep Saves prominent, since it has a fairly good success correlation, I also wanted to shift more of the pitching responsibility back where it belongs--to the starters. The last two categories were designed to do just that.
The ninth category became Saves plus Complete Games, rather than just Saves. This rewards great performances by any pitcher, not just closers. Sure, the handful of top closers will account for most of the points in Sv+CG, but truly dominating starters who can go the route also can pick up points. How much would you bid on Greg Maddux, if you knew that in addition to his numbers as a starter, he would also get 10 points in the saves category? Exactly. This combination increases the values of the top starters in the league. Just as there can only be one save per game, there can only be one Sv+CG per game (you actually could have two if the losing pitcher throws a CG, but the winning team can only receive either a Sv or a CG). This provides what David Gardner was trying to accomplish with his Games Finished category, but in a much better way. You can still only have one finishing pitcher, but now an ineffective losing pitcher (or a nobody that finishes up a blowout) will not earn points. Also, since a team cannot get a Sv or CG without winning the game (except in rare instances), these numbers tend to increase for better teams. This results in an excellent positive correlation, and quite a bit better than Saves alone.
The tenth and final category was the hardest one on which to decide. I selected Strikeouts minus Walks for several reasons. One was tradition; all three triple crown categories for batters are represented (although BA was modified to OBP), so it seemed right to include all three triple crown categories for pitchers (again, with a modification to the weakest of the three). A second reason was a desire to further emphasize starters and de-emphasize closers. Guys who pitch a lot of innings have more opportunities to pick up K-BB than players with 65 IP a season. A third reason was to reward good control. Walks already hurt pitchers through the RPI (or Ratio) category. Now issuing a free pass also costs you in the K-BB category. This hopefully serves as an incentive to shy away from guys with poor control. Nothing causes ulcers in major league managers quicker than guys who cannot throw strikes.
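A short Python sketch shows why K-BB tilts toward pitchers who log a lot of innings. The function name and all three pitchers are hypothetical, with made-up season totals chosen only to illustrate the point.

```python
def k_minus_bb(strikeouts, walks):
    """Strikeouts minus Walks; a walk cancels out a strikeout."""
    return strikeouts - walks


# Hypothetical season totals (not real pitchers).
workhorse_starter = k_minus_bb(strikeouts=200, walks=50)
wild_starter = k_minus_bb(strikeouts=200, walks=110)
closer = k_minus_bb(strikeouts=70, walks=25)

print(workhorse_starter, wild_starter, closer)  # 150 90 45
```

The closer may strike out more men per inning, but with so few innings he simply cannot accumulate the total a healthy starter can. The two starters have identical strikeout totals, and the wild one gives back 60 points through walks alone.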
Continue on to Part 2 of this Article