Friday 5 October 2012

Shooting: comparing goalscorers - Part 2

In the previous post, the different factors affecting goalscoring rates were defined and the equation

G = TOB * SOB/TOB * SOTOB/SOB * GOB/SOTOB + TIB * SIB/TIB * SOTIB/SIB * GIB/SOTIB


was presented, along with a benchmark goalscorer built from a very rough and unscientific formulation. The performance of some of the Premier League's top goalscorers was then given in terms of the metrics above. In this post, these top goalscorers are compared to the benchmark and the effect of each factor is measured. This allows a certain amount of classification of the goalscorers, depending upon which factors caused the greatest uplift in goalscoring compared to the benchmark.
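For readers who prefer code, the equation can be sketched as a small Python function. This is only an illustration: the dictionary layout and the example numbers are my own assumptions (the numbers are made up and belong to no real player), and I am reading TOB and TIB as touches per 90 minutes outside and inside the box.

```python
def goals_per_90(f):
    """Goals per 90 minutes from the eight factors in the equation."""
    # Outside the box: touches -> shots -> shots on target -> goals
    outside = f["TOB"] * f["SOB/TOB"] * f["SOTOB/SOB"] * f["GOB/SOTOB"]
    # Inside the box: the same chain of ratios
    inside = f["TIB"] * f["SIB/TIB"] * f["SOTIB/SIB"] * f["GIB/SOTIB"]
    return outside + inside


# Hypothetical values for illustration only, not real player data.
example = {
    "TOB": 40.0, "SOB/TOB": 0.025, "SOTOB/SOB": 0.4, "GOB/SOTOB": 0.1,
    "TIB": 5.0, "SIB/TIB": 0.4, "SOTIB/SIB": 0.5, "GIB/SOTIB": 0.3,
}

rate = goals_per_90(example)
print(round(rate, 4))
```

Each chain of ratios telescopes, so the outside product is simply goals outside the box per 90 minutes, and likewise for the inside product.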

Method

To calculate the uplift in goalscoring caused by each factor, a process known as a walk is used. To illustrate, compare two rectangles, one with dimensions (width x length) 1 x 2 and the other, 3 x 5. The increase in area is 3 x 5 - 1 x 2 = 13, but the allocation of these 13 units between the increases in width and length is unclear.

A walk updates each dimension in turn and measures the uplift in area caused after each update. So in this example, 1 x 2 = 2 -> 3 x 2 = 6 -> 3 x 5 = 15. The uplift caused by the change in width, therefore, is 6 - 2 = 4; that caused by the change in length is 15 - 6 = 9. It can be verified that 4 + 9 = 13, the total increase in area.

However, the order in which the dimensions are updated is important and affects the result. If instead the length is updated first, the situation changes: 1 x 2 = 2 -> 1 x 5 = 5 -> 3 x 5 = 15. This time the uplift caused by the change in width is 15 - 5 = 10; that caused by the change in length is 5 - 2 = 3.

If these uplifts are averaged over the two orders, the change in area caused by the increase in width is (4 + 10)/2 = 7 and that caused by the increase in length is (9 + 3)/2 = 6. These still sum to 13.

Likewise for the factors involved in a player's goalscoring performance, if the factors are updated in every possible order and the uplift of each is averaged, the effect of each factor can be found.
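The averaging over every order can be sketched in Python as follows. This is a minimal sketch rather than the exact code used for the results: the function name walk_uplifts and the dictionary-based interface are my own choices for illustration. It is demonstrated on the rectangle example above.

```python
from itertools import permutations


def walk_uplifts(func, base, target):
    """Average uplift attributed to each factor over all update orders.

    func   -- maps a dict of factor values to a single number
    base   -- starting factor values (e.g. the benchmark)
    target -- final factor values (e.g. the player)
    """
    names = list(base)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(base)          # start each walk from the base values
        value = func(current)
        for name in order:
            current[name] = target[name]   # update one factor
            new_value = func(current)
            totals[name] += new_value - value  # uplift from this step
            value = new_value
    return {name: totals[name] / len(orders) for name in names}


def area(d):
    return d["width"] * d["length"]


uplifts = walk_uplifts(area,
                       {"width": 1, "length": 2},
                       {"width": 3, "length": 5})
print(uplifts)  # {'width': 7.0, 'length': 6.0}
```

The same function would accept a goalscoring function of the eight factors, with the benchmark as base and the player as target. With eight factors there are 8! = 40,320 orders, which is still quick to evaluate.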

Results

To enable the uplifts caused by each factor to be considered more easily, Goals/38 games is considered here, rather than Goals/90 minutes. This merely changes the scale used so that the numbers are more reasonable (of order 1 rather than order 0.01). The top ten goalscorers are listed below, with the uplift over the benchmark goalscoring rate (8.94 goals/38 games) caused by each factor:
 
The numbers in bold font show the factors which caused the largest uplift over the benchmark goalscorer. Conclusions may immediately be drawn:
  • The most prolific goalscorers reach such a position through their work inside the box, rather than outside. While goals scored outside the box can provide a surprisingly large proportion of a player's output (see this post by James Amey for more detail), the most important factor for each goalscorer here is one of those regarding work inside the box. In fact, for Dimitar Berbatov, the most prolific goalscorer of the 2011-12 season, every "out of the box" factor is worse than for the benchmark striker.
  • Further to this, top goalscorers touch the ball less frequently outside the box than a "benchmark" striker. With the exception of Wayne Rooney, the rate of touches outside the box caused a decrease in goalscoring output compared to the benchmark.
  • GIB/SOTIB appears to be the most important factor. For 6 out of the top 10, this is the factor which causes the greatest uplift over the benchmark striker, and for all except two goalscorers the uplift is greater than 5 goals/38 games.
  • The two exceptions to this are Robin van Persie and Edin Dzeko. For these players, the most important factor was TIB/90, suggesting that they benefited from good service from their teammates. Berbatov also benefited from this, but to a lesser degree than from GIB/SOTIB.
  • The most "rounded" striker is Mario Balotelli: the standard deviation of the uplift caused by each factor for him is 3.38. The least "rounded" is Vellios; his standard deviation is 8.07.
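The "roundedness" measure in the last point can be sketched as below. The uplift values are hypothetical and chosen only to make the arithmetic easy to follow, and the use of the population standard deviation (rather than the sample version) is an assumption on my part.

```python
from statistics import pstdev

# Hypothetical uplifts in goals/38 games, one per factor.
# These are NOT the values from the table above.
uplifts = {
    "TOB": -1, "SOB/TOB": 1, "SOTOB/SOB": 1, "GOB/SOTOB": 1,
    "TIB": 2, "SIB/TIB": 2, "SOTIB/SIB": 4, "GIB/SOTIB": 6,
}

# A smaller spread means the uplift is shared more evenly between
# the factors, i.e. a more "rounded" striker.
roundedness = pstdev(list(uplifts.values()))
print(roundedness)
```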

By considering the most important and second most important factors for each player, broad categories are defined:

  • Shot efficiency (GIB/SOTIB)/Touch frequency (both inside box) - Berbatov, Jelavic, Anichebe
  • Shot efficiency/Shot accuracy (both inside box) - Vellios, Djibril Cisse
  • Shot efficiency/Shot frequency (both inside box) - Papiss Cisse
  • Touch frequency/Shot accuracy (both inside box) - Balotelli, van Persie
  • Shot frequency/Shot efficiency (both inside box) - Rooney
  • Touch frequency/Shot frequency (both inside box) - Dzeko
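Picking out the two most important factors can be sketched as a simple ranking. The uplift values below are hypothetical, and apart from shot efficiency (GIB/SOTIB) the mapping from factor to label is my reading of the category names rather than something defined explicitly above.

```python
# Assumed mapping from inside-the-box factors to the labels used above.
LABELS = {
    "TIB": "Touch frequency",
    "SIB/TIB": "Shot frequency",
    "SOTIB/SIB": "Shot accuracy",
    "GIB/SOTIB": "Shot efficiency",
}


def top_two(uplifts):
    """Return the names of the two factors with the largest uplift."""
    ranked = sorted(uplifts, key=uplifts.get, reverse=True)
    return ranked[0], ranked[1]


# Hypothetical uplifts (goals/38 games) for one player, not real data.
player = {
    "TOB": -1.5, "SOB/TOB": 0.2, "SOTOB/SOB": -0.3, "GOB/SOTOB": 0.1,
    "TIB": 4.0, "SIB/TIB": 1.0, "SOTIB/SIB": 2.5, "GIB/SOTIB": 6.0,
}

first, second = top_two(player)
category = "/".join(LABELS.get(name, name) for name in (first, second))
print(category)  # Shot efficiency/Touch frequency
```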

Conclusion 

Eight different factors are considered here, with the idea that there would be diversity in striking strengths. While this is true, it is clear that GIB/SOTIB is crucial to being a prolific Premier League striker. It is also clear that goalscorers can be grouped into categories with different strengths and methods of success.

While only 10 goalscorers are shown here, it is possible to create such a walk for each player in the Premier League. Each player can then be categorised as has been done above. I believe this is a useful piece of work for comparing the relative merits of goalscorers and categorising them. Please let me know what you think and suggest improvements in the comments below. Alternatively, get in touch on Twitter: @hpstats
