Introduction
(This post is inspired by this post on SBNation.) In a previous post, Mean Opponent- and Venue-Adjusted Margin (MOVAM) was introduced. From this value and the standard deviation of performances around this mean value, the remaining games in the season may be simulated many times to give probabilities for each team's finishing position.
However, these simulations assume that a team's observed level of performance exactly reflects that team's actual ability. This may not be true, although as more games are played, confidence increases that the observed level is representative of the true level.
An analogy may be drawn with rolls of a dice: if the first roll is a six, it is still unlikely that the dice is anything other than fair, since in most cases, dice are fair, and one observation is insufficient to say for certain that the dice is unfair. If, however, the first 10 rolls are sixes, the situation changes and it becomes likely that the average roll that could be expected from this dice is greater than 3.5.
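The dice analogy can be put into numbers with a minimal sketch. The figures here are assumptions made purely for illustration and do not come from any data: 1 dice in 100 is taken to be loaded, and a loaded dice is taken to show a six half of the time.

```python
from fractions import Fraction

def posterior_loaded(n_sixes, prior_loaded=Fraction(1, 100), p_six_loaded=Fraction(1, 2)):
    """Posterior probability that the dice is loaded after n_sixes sixes in a row.

    prior_loaded and p_six_loaded are illustrative assumptions, not measured values.
    """
    p_six_fair = Fraction(1, 6)
    # Numerator of Bayes' Theorem for each hypothesis: prior * likelihood.
    loaded = prior_loaded * p_six_loaded ** n_sixes
    fair = (1 - prior_loaded) * p_six_fair ** n_sixes
    # Dividing by the sum of the numerators normalises the two hypotheses.
    return loaded / (loaded + fair)

print(posterior_loaded(1))          # 1/34, so the dice is still very likely fair
print(float(posterior_loaded(10)))  # about 0.998
```

Under these assumptions a single six barely moves the estimate, while ten sixes in a row make a loaded dice almost certain.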
In this post, home and away MOVAMs are calculated for each team after every game played up to the end of January, to give observed values from the sample of games played. From this, the "best guess" at a team's "actual" MOVAM is calculated and, from this, probabilities are given for each team's final league position.
Method
Home and away MOVAMs from this season were calculated as described in this post. Historic, season-long MOVAMs were also calculated for every team in every season of the Premier League since 1995-6, the first 38-game season. From this, a distribution of MOVAMs was created.
A trial-and-error approach was used to find the "best guess" for the actual MOVAM level. For each proposed level, two probabilities are multiplied together: the probability of a team having that level, given the historic distribution of full-season MOVAMs, and the probability of achieving the observed level in the number of games played, given that the proposed level is the actual level. This is based on Bayes' Theorem:
P(Proposed level | Observed level) =
P(Proposed level) * P(Observed level | Proposed level) / P(Observed level)
In this case, the observed level is known, but P(Observed level) is not. However, this value does not change with the proposed level, so for each proposed level it is necessary only to calculate the numerator of the fraction. Having done this, the proposed level which gives the highest numerator is retained as the "best guess".
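The search over proposed levels might be sketched as follows. This is not the code used for this post. It assumes that the historic distribution of MOVAMs can be approximated by a normal distribution, and that the mean of n games is normally distributed around the actual level with a standard deviation of game_sd / sqrt(n). The function name, the grid limits and all of the numbers in the usage lines are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def best_guess_movam(observed, n_games, game_sd, prior_mean, prior_sd,
                     step=0.01, lo=-3.0, hi=3.0):
    """Try each proposed level on a grid and keep the one with the highest numerator."""
    # Standard deviation of the observed mean after n_games (assumed normal).
    se = game_sd / math.sqrt(n_games)
    best_level, best_numerator = None, -1.0
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        level = lo + i * step
        # P(Proposed level) * P(Observed level | Proposed level)
        numerator = normal_pdf(level, prior_mean, prior_sd) * normal_pdf(observed, level, se)
        if numerator > best_numerator:
            best_level, best_numerator = level, numerator
    return best_level

# Hypothetical inputs: observed home MOVAM of +1.0 goals per game.
print(best_guess_movam(1.0, 11, 1.5, 0.0, 0.6))  # after 11 home games
print(best_guess_movam(1.0, 38, 1.5, 0.0, 0.6))  # after 38 games, for comparison
```

With the same observed value, the best guess sits closer to the observation when more games have been played, and closer to the historic mean when fewer have.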
Results
Below are shown the home and away MOVAMs after 22 games of the Premier League season, along with the best guess of the actual MOVAMs:
Shown here is how the observed home and away MOVAMs compare with the best guesses at the actual MOVAMs throughout the season for the top four sides after 22 games, presented in alphabetical order. Note that the observed and best-guess values converge as the number of observations increases.
And for the bottom three:
Below is a table showing probabilities regarding the finishing position of each team, calculated using the results of the first 22 games of the season and a prediction for the remaining 16 games using the best guess at the actual MOVAM:
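For readers who wish to produce this kind of table themselves, the simulation step might be sketched as below. Several things here are assumptions rather than the method used for this post: the expected margin of a game is taken as the home side's home MOVAM minus the away side's away MOVAM, the simulated margin is drawn from a normal distribution and rounded to whole goals, and teams level on points are separated at random. The team names, points and MOVAMs in the usage lines are made up.

```python
import random
from collections import defaultdict

def simulate_positions(points, home_movam, away_movam, fixtures,
                       margin_sd=1.5, n_sims=2000, seed=0):
    """Estimate each team's finishing-position probabilities by simulation."""
    rng = random.Random(seed)
    teams = list(points)
    counts = {t: defaultdict(int) for t in teams}
    for _ in range(n_sims):
        pts = dict(points)  # start each simulated season from the current table
        for home, away in fixtures:
            # Assumed model: expected margin is home MOVAM minus away MOVAM.
            expected = home_movam[home] - away_movam[away]
            margin = round(rng.gauss(expected, margin_sd))
            if margin > 0:
                pts[home] += 3
            elif margin < 0:
                pts[away] += 3
            else:
                pts[home] += 1
                pts[away] += 1
        # Rank by points; ties are broken at random (an assumption).
        ranking = sorted(teams, key=lambda t: (pts[t], rng.random()), reverse=True)
        for position, team in enumerate(ranking, start=1):
            counts[team][position] += 1
    return {t: {pos: c / n_sims for pos, c in sorted(counts[t].items())}
            for t in teams}

# Hypothetical three-team league with two fixtures remaining.
current_points = {"Team A": 30, "Team B": 29, "Team C": 27}
home = {"Team A": 0.8, "Team B": 0.5, "Team C": 0.1}
away = {"Team A": 0.4, "Team B": 0.2, "Team C": -0.3}
remaining = [("Team A", "Team B"), ("Team C", "Team A")]
print(simulate_positions(current_points, home, away, remaining))
```

Each row of the resulting table is one team's dictionary of position to probability. Using the best guess in place of the raw MOVAM only changes the values passed in as home_movam and away_movam.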
Conclusions
This method of estimating a team's ability should be more accurate than using the raw MOVAM, as it allows for the expectation that a team's performances will revert towards the mean, with the observed results counting for more as more games are played.
Conclusions can be drawn from the probabilities shown above:
- Manchester City are now more likely than not to finish the season as champions
- The Champions League spots are very likely to be occupied by the same four teams at the end of the season, albeit perhaps in a different order
- While it looks like relegation is inevitable for Fulham, the remaining two spots are still relatively open: Cardiff and West Ham appear most likely but Sunderland and Crystal Palace certainly are not safe yet
- As most people have probably concluded, the best that Manchester United can hope for is a 5th place finish, but even that looks more likely to be taken by Everton