We know that baseball teams have a tendency to win or lose in bunches. It only takes a glance at the "last 10" column of the MLB standings in your local newspaper to find proof of it. It's also typical to see significant differences in a team's winning percentage from month to month. As a result, when a new series is about to start, we are often warned about a particular opponent being hot or our favorite team being cold. Should we really adjust our expectations for the upcoming series or are streaks largely a function of obvious factors and random chance? Let's investigate.
There are a few factors that significantly contribute to the formation of good or bad stretches of games. A string of games against above- or below-average teams has an effect. Whether a team is playing at home or on the road can also play a role. Changes to the starting lineup or pitching staff contribute to uneven team play as well. One of the largest causes of winning and losing streaks, however, is simply the "coin flip factor." Over the course of a 162-game season, we should expect even a mediocre team to occasionally win or lose seven or eight games out of ten, and we shouldn't just assume that it was caused by any real change in the team itself. In fact, a perfectly mediocre team (one that has a 50% chance of winning each of its games) would be expected to win, and to lose, seven or more games out of ten several times throughout the course of a season. It's a result of binomial uncertainty and it's inescapable. All teams have to be streaky to some degree.
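To see just how common those extreme ten-game stretches are, here's a minimal sketch of the coin-flip idea: simulate a 162-game season for a true-talent .500 team and count how many ten-game windows come out 7-3 or better, or 3-7 or worse. (This is an illustration of the binomial point above, not the model used later in the article.)

```python
import random

random.seed(1)

def streaky_windows(n_games=162, p_win=0.5):
    """Count 10-game stretches in one simulated season where a
    coin-flip team goes 7-3 or better, or 3-7 or worse."""
    results = [random.random() < p_win for _ in range(n_games)]
    count = 0
    for i in range(n_games - 9):
        wins = sum(results[i:i + 10])
        if wins >= 7 or wins <= 3:
            count += 1
    return count

# Average over many simulated seasons
sims = 10_000
avg = sum(streaky_windows() for _ in range(sims)) / sims
print(f"Average extreme 10-game windows per season: {avg:.1f}")
```

Since any single ten-game window hits one of those extremes about a third of the time (from the binomial distribution), a typical season contains dozens of overlapping "hot" or "cold" stretches by pure chance.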
So, we know that a perfectly consistent team is still going to have its ups and downs, but just how many ups and downs should a baseball fan expect? To help answer that question, I'm going to use the first 126 games of the Brewers' current season as a test case. The Brewers have had about as streaky a season as you can find, so it will be interesting to see how much of that streakiness is easily explainable. For each game, I will assume that the simulated team has exactly the same talent level. To account for binomial uncertainty, I'll simply simulate the season 10,000 times and find the average result.
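The simulation loop itself is simple. Here's a sketch of the replication step, using a constant .500 win probability for every game purely for illustration (the article's actual per-game probabilities come from the adjustments described next):

```python
import random

random.seed(7)

def simulate_season(win_probs):
    """One simulated season: a 1 (win) or 0 (loss) for each game,
    drawn from that game's win probability."""
    return [int(random.random() < p) for p in win_probs]

# Illustration only: a constant-talent .500 team over 126 games,
# replicated 10,000 times
probs = [0.5] * 126
n_sims = 10_000
avg_wins = sum(sum(simulate_season(probs)) for _ in range(n_sims)) / n_sims
print(f"Average wins over {n_sims} simulated seasons: {avg_wins:.1f}")
```

Each replication is an independent 126-game season, so averaging over 10,000 of them washes out the coin-flip noise in any one season while preserving the typical amount of streakiness.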
I'll adjust for the level of competition by using the Log5 equation to calculate an expected win probability for each simulated game (adding 4% to the probability for home games and subtracting 4% for road games). I don't have an easy way to account for personnel changes so I'll be forced to ignore them.
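The Log5 adjustment described above can be sketched like this; the flat four-point home/road shift is the one stated in the text, applied here as a simple additive bump:

```python
def log5(p_a, p_b):
    """Log5: probability that team A (true win% p_a) beats
    team B (true win% p_b)."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)

def game_win_prob(p_team, p_opp, home):
    """Per-game win probability: Log5 result shifted 4 points
    up at home, 4 points down on the road."""
    p = log5(p_team, p_opp)
    return p + 0.04 if home else p - 0.04

# A .500 team hosting a .400 opponent
print(game_win_prob(0.5, 0.4, home=True))
```

Note the sanity checks that make Log5 appealing: two .500 teams come out at exactly 50%, and a .500 team facing a .400 opponent comes out at .600 before the home/road shift.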
Let's first compare the streaks and stretches of the simulated season to the actual one:
Right away, you should notice just how inconsistent even the simulated team is. In fact, the real Brewers haven't been any more inconsistent than expected, at least with respect to the numbers above. There are other ways to measure consistency, however. What if we graph the Brewers' "last 10 games" record for the entire season? If we do, we get something like this:
It's pretty clear that the Brewers' season can be broken up into four general sections of great, terrible, great, and terrible play. They haven't really had any significant stretch of playing right at their expected performance level, and my previous attempts to quantify streakiness didn't pick that up. How much time should a .500 team expect to spend playing .500 ball, though?
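For reference, the "last 10 games" series behind a graph like that is easy to construct. Here's a minimal sketch, taking a list of game results (1 for a win, 0 for a loss) and producing the rolling ten-game win count starting at game 10:

```python
def last10_series(results):
    """Wins over the preceding 10 games, one value per game
    starting with game 10. `results` holds 1 (win) / 0 (loss)."""
    return [sum(results[i - 10:i]) for i in range(10, len(results) + 1)]

# Hypothetical stretch: 15 straight wins followed by 5 straight losses
sample = [1] * 15 + [0] * 5
print(last10_series(sample))
```

Even in this toy example you can see how the rolling window smears a sharp change in results across ten games, which is part of why "last 10" records look so streaky.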
To get a feel for the average look of a graph from a simulated season, I'll start by showing ten randomly selected simulated seasons. The red line is the actual Brewers season and the blue line is a simulated season:
Simulated Season Examples: Number of Wins Over Preceding 10 Games
The simulated seasons are very streaky as well (look at that first one!), but it's hard to say whether they are any more or less streaky than the actual season. We need another way to quantify streakiness. What if we looked at the average number of games a team stayed above or below their expected "last 10" record? Graphically, that's just the average length of the vertical lines on the chart below:
Let's compare that to the average of 20,000 simulated seasons:
Ave. Games Away From Expected | Chance of Diff. Being Random
The difference between 1.61 and 1.28 games is pretty significant (over one standard deviation). If my model were perfect, a season as streaky as the one the Brewers are currently having would only happen by chance about once every 13 years. Still, my very simple model explains a large portion of that streakiness.
While it's fair to characterize the Brewers' season as unusually streaky, we have also found that even a theoretically consistent team isn't very consistent at all. Even if we assume that a team has exactly the same talent level for each game, it will often exhibit the qualities of a hot or cold team. While it's possible that psychological or other factors may cause teams to play above or below their normal talent level for short stretches, it's difficult (if not impossible) to identify when that's the case. As a result, when we are trying to estimate a team's chances in an upcoming series, we're probably better off not weighing the results of recent games any more heavily than any other games from the season. A team may have won its last five games, but that doesn't necessarily mean it will win the sixth.