Predicting MLB Attendance: Multiple Regression Analysis of MLB Attendance and Ticket Prices
by Timothy Moreland (Bball3345)
Major League Baseball has grown into a billion-dollar industry. Owners seem to be concerned less with winning than with how much winning will increase their profits. The purpose of this study was to quantify the impact of an array of variables on yearly attendance, focusing specifically on the impact of building a new ballpark. A change in attendance also brings a change in gate receipts, a driving factor in team revenue, so the study also aims to predict a team’s expected ticket price based on its current situation.
An initial study isolated the new stadiums built from 1991-2005, which included fourteen new facilities. At first glance, every new stadium in this period increased its team’s Att+ (attendance expressed as a percentage of the league average, with 100 being average), with the exception of Florida and Atlanta. Florida’s attendance drop was dramatic and was determined to be an outlier. Atlanta’s drop was less significant. Because capacity changed from the old stadiums to the new ones, another variable was introduced: Cap%, the percentage of the maximum capacity filled during the season. By this measure Florida’s attendance still dropped after the new stadium was built, but Atlanta joined the rest of the teams in increasing attendance relative to capacity. The findings of this initial stage were a 49% increase in Att+ and a 35% increase in Cap% for the first year in a new stadium. For the average of the first five years, the increases were 40% and 30%, respectively.
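The two measures defined above can be sketched in a few lines of Python. This is only an illustration of the definitions: the function names and the sample figures are hypothetical, and the 81-game home schedule is an assumption rather than something stated for this stage of the study.

```python
def att_plus(team_attendance: float, league_avg_attendance: float) -> float:
    """Attendance as a percentage of the league average (100 = average)."""
    return 100.0 * team_attendance / league_avg_attendance


def cap_pct(team_attendance: float, capacity: int, home_games: int = 81) -> float:
    """Percentage of the available seats filled over the season.

    home_games=81 is an assumption (a full home schedule).
    """
    return 100.0 * team_attendance / (capacity * home_games)


# Hypothetical figures for illustration only; not taken from the study's data.
example_attendance = 2_430_000
example_league_avg = 2_250_000
example_capacity = 45_000

print(round(att_plus(example_attendance, example_league_avg), 1))  # 108.0
print(round(cap_pct(example_attendance, example_capacity), 1))     # 66.7
```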
To make these findings meaningful to a team’s finances, the change in ticket prices from the old stadium to the new stadium was analyzed. This study showed that ticket prices in new stadiums increased by 34% the first year and 27% for the average of the first five years. These findings can then be converted into the increase in gate receipts by multiplying the increased ticket price by the increased attendance. Doing this resulted in a 94% increase in gate receipts for the first year and 78% for the average of the first five years.
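The conversion to gate receipts is a product of growth factors. The sketch below applies it to the rounded percentages quoted in the text, using the Att+ increases as the attendance input (an assumption, since the article does not say which attendance measure was used). The five-year result matches the reported 78%. The first-year result from the rounded figures comes out near 100% rather than the reported 94%, which presumably reflects unrounded or per-team inputs not given here.

```python
def combined_increase(price_increase: float, attendance_increase: float) -> float:
    """Fractional increase in gate receipts from fractional increases
    in ticket price and attendance: (1 + p) * (1 + a) - 1."""
    return (1 + price_increase) * (1 + attendance_increase) - 1


# Rounded percentages quoted in the article (ticket price, Att+).
five_year = combined_increase(0.27, 0.40)
first_year = combined_increase(0.34, 0.49)

print(f"{five_year:.1%}")   # 77.8%
print(f"{first_year:.1%}")  # 99.7%
```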
Next, the study became a multiple regression analysis, expanding the sample to all stadiums instead of only new stadiums. The sample included all MLB teams from 1997-2005, with some exclusions. Tampa Bay and Arizona were left out for the 1998 season because it was their first season as franchises, so previous-season data did not exist. Seattle’s 1999 season was also excluded, because part of the season was played in the new stadium and part in the old ballpark. Boston’s capacity changed between day and night games for 2001-2003, so these seasons could not be used for the Red Sox. The Canadian teams, Montreal and Toronto, could not be used at all, because Canadian population and median income were too difficult to include in the study. In total, two hundred and forty-four team-seasons formed the sample.
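The sample size can be checked by counting team-seasons from the exclusions listed above. The only outside fact used is that MLB had 28 teams in 1997 and 30 from 1998 on, when Arizona and Tampa Bay joined; the variable and function names are illustrative.

```python
SEASONS = range(1997, 2006)  # 1997 through 2005 inclusive


def us_teams(season: int) -> int:
    """Number of MLB teams in a season, excluding Montreal and Toronto."""
    total = 28 if season < 1998 else 30  # Arizona and Tampa Bay joined in 1998
    return total - 2


# Individual team-seasons excluded in the article.
excluded = {
    ("Tampa Bay", 1998),
    ("Arizona", 1998),
    ("Seattle", 1999),
    ("Boston", 2001),
    ("Boston", 2002),
    ("Boston", 2003),
}

sample_size = sum(us_teams(s) for s in SEASONS) - len(excluded)
print(sample_size)  # 244
```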
This study created one model for predicting attendance and another for predicting ticket price. In the attendance equation, the most significant factors were found to be how many All-Stars a team had on the roster, how many games behind in the standings the team was in the current season, the previous season’s attendance, and whether it was the first or second year for the stadium. Overall, the model produced a high r-squared of 0.8846.
After graphing the predicted attendance against the actual attendance, all teams fit into the range, with the exception of five outliers. St. Louis (1998) and the Los Angeles Angels of Anaheim (2003) far outperformed their predicted attendance; however, both could be explained fairly easily. St. Louis in 1998 featured Mark McGwire and his assault on Maris’ home run record of 61. The added drama brought people to the ballpark at a rate that greatly exceeded the rate under normal circumstances. The Angels, on the other hand, were coming off an exciting World Series victory, and the impact surpassed the usual impact of winning a championship. On the other side of the outliers, Cleveland (2003) was in a rebuilding year and had traded away many key players, Tampa Bay (1999) was in its second year as a franchise and had finished fifty-one games behind the Yankees in its first year, and the White Sox (1998) had dumped players in 1997 while still in a pennant race, losing the faith of fans.
To test the reliability of this model beyond the r-squared, the 2004 seasons of Philadelphia and San Diego, both of which opened new stadiums that year and saw dramatic changes in attendance, were used as test cases. First, the multiple regression equation was refit without Philadelphia’s 2004 season. This new model then predicted the attendance for that season. It expected an attendance of 3,029,990, while the Phillies actually drew 3,250,092. That is a difference of only 2,717 fans per game. Repeating this with San Diego, the expected attendance was 2,822,020 against an actual attendance of 3,016,752, a difference of only 2,404 fans per game. Both results fell within the standard error of the model. The model therefore holds up on both a mathematical and a practical level.
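The per-game figures follow from dividing each season-long miss by the number of home games. A minimal check, assuming a full 81-game home schedule (the article does not state the divisor, but 81 reproduces both quoted numbers):

```python
HOME_GAMES = 81  # assumption: full home schedule

# (predicted, actual) season attendance, as quoted in the article.
holdouts = {
    "Philadelphia 2004": (3_029_990, 3_250_092),
    "San Diego 2004": (2_822_020, 3_016_752),
}

per_game_miss = {}
for team, (predicted, actual) in holdouts.items():
    season_miss = actual - predicted
    per_game_miss[team] = round(season_miss / HOME_GAMES)
    print(team, season_miss, per_game_miss[team])
# Philadelphia 2004 220102 2717
# San Diego 2004 194732 2404
```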
In the ticket price model, the significant factors were the previous season’s ticket price, the payroll of the team, whether or not it was the first year of a new stadium, and the capacity of the park. Again, the r-squared was a strong 0.8787.
One of the important findings of this study was that new stadiums boosted attendance tremendously in the first season, after which attendance slowly began to drop. In the first season, a park could expect an attendance boost of 758,308 fans. In the next season, the impact dropped by 228,558 fans, but this still leaves the attendance 529,750 above the expected attendance had the stadium not been built (758,308-228,558=529,750). The effect of building a new stadium does not appear to wear off within the first five years, although the variables after year two are not statistically significant. Likewise, ticket prices soared in the new stadiums. Prices increased greatly in the first year and gradually fell over the next four, yet they stayed above the price expected had the team not built a new stadium. None of the ticket-price effects after the first year were statistically significant.
Another important finding dealt with the relationship between ticket prices and winning a World Series. Owners noticeably raised ticket prices in the year following a championship (an additional $6.46 on top of $5.41 for making the playoffs), and the effect still lingered two years after the World Series ($2.83 above non-World Series prices). Assuming the demand for the team increases after a Series ring, this business move makes sense. However, in the attendance analysis, winning a championship had no significant effect beyond that of making the playoffs. Therefore, owners may not be justified in charging such sharply higher ticket prices following a World Series. It also appears that star players do bring more people to the ballpark. Assuming the average ticket price of $22.21, each player who makes the All-Star team adds approximately 39,777 fans, which equals approximately $880,000 in ticket revenue. Most players receive a bonus for making the All-Star team, which appears reasonable given the extra attendance revenue.
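The dollar figures in this paragraph are simple arithmetic on the quoted coefficients. A short check, with illustrative constant names:

```python
FANS_PER_ALL_STAR = 39_777   # attendance coefficient per All-Star
AVG_TICKET_PRICE = 22.21     # average ticket price assumed in the article

# Extra gate revenue attributed to one All-Star.
all_star_revenue = FANS_PER_ALL_STAR * AVG_TICKET_PRICE
print(f"${all_star_revenue:,.0f}")  # $883,447

# Ticket-price premium the year after a championship:
# the playoff effect plus the additional World Series effect.
PLAYOFF_PREMIUM = 5.41
WORLD_SERIES_PREMIUM = 6.46
total_premium = PLAYOFF_PREMIUM + WORLD_SERIES_PREMIUM
print(f"${total_premium:.2f}")  # $11.87
```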
Overall, this analysis provides a reasonably accurate method of predicting attendance. This information could prove useful to a team considering building a new stadium. Owners could also use it to estimate how much extra they can spend on players in the offseason, based on the expected rise or fall in attendance the next year.
Star Players = the number of All-Stars on the roster for the current season
SquareT$ = the price of the ticket expressed as a percentage of the league average, then squared
PrvSquareT$ = SquareT$ for the previous season
PayrollAdjusted = the current season’s payroll expressed as a percentage of the league average
Playoffs = equals 1 if the team made the playoffs in the current season
Years Since Playoffs = the number of years since the team last made the playoffs
World Series = equals 1 if the team won the World Series the season before
World Series2 = equals 1 if the team won the World Series two seasons prior
PGB = the number of games behind the first place team the year prior to the current season
GB = the number of games behind the first place team in the current season
Population = the population of the city from the 2000 Census Bureau
Median Income = the median income of the city from the 2000 Census Bureau
Previous Att. = the attendance total from the year prior to the current season
Stadium 1-5 = the season of the stadium, from 1 for its first season to 5 for its fifth
Capacity = the total capacity of the stadium
Grass Surface = equal to 1 if the stadium had a grass surface
Data Sources: www.ballparks.com, www.baseballreference.com, www.teammarketing.com, www.forbes.com, www.census.gov
RESULTS (Equation w/ significant terms):
Predicted Att. = 266119 + 39777*StPl - 6184*GB + 0.7475*PrvAtt + 758308*Year1 - 228558*Year2
R-Sq=0.8846
StPl = number of All-Stars on the roster
GB = games behind
PrvAtt = attendance in the previous year
Year1 = whether or not it is the first year of the stadium
Year2 = whether or not it is the second year of the stadium
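As a sketch, the attendance equation above can be written as a small Python function. The coefficients are those reported; the function name, the `stadium_year` argument, and the example inputs are hypothetical and chosen only for illustration. Year1 and Year2 are treated as 0/1 indicators, which is an assumption consistent with how the other yes/no variables are defined.

```python
def predict_attendance(all_stars, games_behind, prev_attendance, stadium_year=None):
    """Predicted season attendance from the reported regression equation.

    stadium_year: 1 or 2 for the first or second season in a new stadium,
    anything else (including None) for no new-stadium effect.
    """
    year1 = 1 if stadium_year == 1 else 0
    year2 = 1 if stadium_year == 2 else 0
    return (
        266_119
        + 39_777 * all_stars
        - 6_184 * games_behind
        + 0.7475 * prev_attendance
        + 758_308 * year1
        - 228_558 * year2
    )


# Hypothetical team: 2 All-Stars, 10 games behind, 2,000,000 fans last year.
baseline = predict_attendance(2, 10, 2_000_000)
new_park = predict_attendance(2, 10, 2_000_000, stadium_year=1)

print(round(baseline))             # 1778833
print(round(new_park))             # 2537141
print(round(new_park - baseline))  # 758308
```

Holding the other inputs fixed, the gap between the two predictions is exactly the Year1 coefficient, which is the first-season boost quoted in the text.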
Predicted Ticket Price = 0.8859*PvTickPr + 10.6*Pyrll + 6558.7*Yr1 - 0.0546*Capacity
R-Sq=0.8787
PvTickPr = ticket price in the previous year
Pyrll = payroll of the team
Yr1 = whether or not it is the first year of the stadium
Capacity = capacity of the stadium
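The ticket price equation can be sketched the same way. The coefficients are those reported, but the units are an assumption: the article does not state them, and the size of the Yr1 coefficient suggests the model works in the SquareT$ units defined earlier (ticket price as a percentage of the league average, squared), with payroll as a percentage of the league average. Under that assumption, a square root converts a prediction back to a percentage of the league average. The function name and example inputs are hypothetical.

```python
import math


def predict_ticket_price(prev_ticket_price, payroll, capacity, first_year=False):
    """Predicted ticket price from the reported regression equation.

    Assumed units (not stated in the article): prev_ticket_price and the
    result are squared percentages of the league average (SquareT$);
    payroll is a percentage of the league average.
    """
    yr1 = 1 if first_year else 0
    return (
        0.8859 * prev_ticket_price
        + 10.6 * payroll
        + 6558.7 * yr1
        - 0.0546 * capacity
    )


# Hypothetical team: league-average price last year (100% squared = 10,000),
# league-average payroll (100), and a 45,000-seat park.
baseline = predict_ticket_price(10_000, 100, 45_000)
new_park = predict_ticket_price(10_000, 100, 45_000, first_year=True)

# Convert back to a percentage of the league-average price.
print(round(math.sqrt(baseline), 1))  # 86.4
print(round(math.sqrt(new_park), 1))  # 118.4
```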


