PREDICTIVE VALUE OF EXPECTED GOALS VS OTHER METRICS
(Nov 26, 2013)
First I wanted to explore the overall predictive value of a suite of measures over the entire season. Basically, if I had no idea how the teams finished the season in terms of points, how well could I predict those points using a full season’s values of each of a list of independent variables (separately, not together)? I know team x had y Corsi, so I predict they’d have z points, and so on.
We also see traditional shot metrics like Corsi and Fenwick being further down the list in terms of r-squared correlations. The advantage they have in accumulating large sample sizes loses potency over 82 games, but this particular application is never really their strong suit. Their close measures do get somewhat closer to goals.
In any case, this is just stage setting. Let’s have a look at how good these measures are at predicting the future.
What I’m trying to do is take each measure for all teams up to this point in the season, and try to predict what their points percentage would be for the rest of their games based on them.
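As a rough illustration of that setup, here is a minimal Python sketch that scores each to-date measure by its r-squared against rest-of-season points percentage. The team values are invented for the example, and `r_squared` and the metric names are hypothetical helpers, not part of the original analysis.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)

# Hypothetical to-date values for five teams (not real data).
to_date = {
    "corsi_close_pct": [54.1, 51.3, 49.8, 47.2, 45.0],
    "goals_pct": [48.0, 57.5, 44.0, 52.0, 50.5],
}

# Hypothetical rest-of-season points percentage for the same five teams.
rest_of_season_pts_pct = [0.610, 0.560, 0.520, 0.480, 0.450]

# Rank the measures from most to least predictive, one measure at a time.
ranking = sorted(
    ((r_squared(values, rest_of_season_pts_pct), name)
     for name, values in to_date.items()),
    reverse=True,
)

for score, name in ranking:
    print(f"{name}: r-squared = {score:.3f}")
```

Each measure is tested separately rather than in a combined regression, which matches the one-variable-at-a-time comparison described here.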
After 8-13 games have been played by each team:
The metrics that most mainstream media guys would use to diagnose team strength — goals and points — are about as accurate as throwing darts at a dartboard at picking how well those teams will do for the rest of the year. CorsiClose is the best predictor at this early point in the season, followed closely by just plain old Corsi. Corsi will have the highest sample size of all of these metrics, which is why it stars early, and I’m guessing the unbiasing effect of looking at close situations just slightly helps CorsiClose overcome its sample size problems. Clearly in third place is raw Expected Goals.
Between 15-19 games played:
CorsiClose is still clinging to the lead in predicting how a team will play from that point onwards, but is being quickly approached by a pack of upstarts. Raw Expected Goals is second, raw shots within 25 feet is third, and expected goals in close situations is fourth. Bringing up the rear are again season to date points percentage and goals percentage.
After 32-36 games played:
By the Christmas break, shots taken within 25 feet seem to have a very high level of utility for predicting the rest of the season, with Expected Goals % now in second spot. Again, the worst are goals and season to date points percentage.
EXPECTED GOALS ARE A BETTER PREDICTOR THAN CORSI OR GOALS
For sports like hockey and soccer, where goals are inherently random and scarce, expected goals models have proved to be particularly useful at predicting future scoring. This is because they take into account shot attempts, which are better predictors of a team’s or player’s performance than goal totals alone.
Shot quality has been the subject of spirited debate despite evidence suggesting that it plays an important role in predicting goals. The evidence shows that shot characteristics like distance and angle can significantly influence the probability of a certain shot resulting in a goal.
This model also takes into consideration shooter talent, which we know varies significantly from player to player. Accounting for shooting talent makes intuitive sense, as we expect that shots attempted by Brad Marchand on average have a higher likelihood of resulting in goals than shots taken by, say, Tanner Glass. To this end, a “Shot Multiplier”*** was developed to approximate a player’s effect on each shot’s probability of resulting in a goal.
Finally, each player’s shot was multiplied by their Shot Multiplier.
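That last step can be sketched in a few lines of Python. The multiplier values and the per-shot probabilities below are invented for illustration (the source does not give them), and `adjust_for_shooter` is a hypothetical helper name.

```python
# Hypothetical per-shot goal probabilities before any shooter adjustment.
shots = [
    {"shooter": "Brad Marchand", "xg": 0.08},
    {"shooter": "Tanner Glass", "xg": 0.08},
    {"shooter": "Unknown Rookie", "xg": 0.05},
]

# Invented Shot Multiplier values, for illustration only.
shot_multiplier = {"Brad Marchand": 1.15, "Tanner Glass": 0.85}


def adjust_for_shooter(shot, multipliers, default=1.0):
    """Scale one shot's goal probability by its shooter's multiplier.

    Shooters without a known multiplier fall back to `default` (an assumption).
    """
    return shot["xg"] * multipliers.get(shot["shooter"], default)


adjusted_xg = [adjust_for_shooter(shot, shot_multiplier) for shot in shots]
```

Two otherwise identical shots end up with different goal probabilities once the shooter is accounted for, which is the intuition behind the Marchand and Glass comparison.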
To test how this expected goals model performs against previous models like score-adjusted Corsi and goals %, year-to-year correlations were performed.
xG At Team Level:
At the team level, xG has the same predictive power at the 20-game mark as score-adjusted Corsi (CF%) and Goals For (GF%) but proves to be a far superior predictor of future goals past that mark.
xG At Player Level:
xG also predicts future goals better than (score-adjusted) CF% and GF% at the player level.
As early as the 10-game mark, xG outperforms previous models.
xG And Individual Performance:
xG also better estimates future individual scoring. As seen in Figures 3-4 and Tables 5-6 below, individual xG per 60 minutes (ixG/60) outperforms iCF% and Sh% across the board.
xG is significantly more predictive of future goal scoring compared to previous models. In addition to being predictive, xG also appears to have superior descriptive power as it explains more of the variance in GF% than score-adjusted Corsi at the team level.
An obvious future direction would be splitting forwards and defensemen for analysis at the player level. Presumably, variables included in this xG model can vary in their descriptive and predictive value when testing for defensemen and forwards separately.
Lastly, future work will also include looking at special teams, as one would expect that the significance of predictor variables would differ from even-strength situations.
PREDICTIVE VS DESCRIPTIVE EXPECTED GOALS VS CORSI
(Jan 2016)
So let’s just cut straight to the chase and see if eGF is a better descriptive stat for forwards than CF.
As you can see eGF20 outperformed CF20 by a pretty wide margin (statistically significant to 99.99%) and was in turn outperformed by PTS20 by a similar margin (statistically significant to 100%).
A couple more points of explanation here: GF20 is a team level stat, meaning it is the observed number of goals that a team scored while that player was on the ice. eGF20 is also a team stat in that it is the expected goals for the team while that player was on the ice. PTS20 is an individual stat, and represents the points that individual received (per 20 minutes) on the GF that his team got. Now, it is obviously problematic to use points because points can only exist if goals are scored, so by default the correlation with points will be reasonably high. The other problem with points is that it has no value as a descriptive stat for defensive measures.
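For readers unfamiliar with the "20" suffix, a minimal sketch of the per-20-minute conversion follows. The function name `per_20` and the sample numbers are hypothetical; only the idea of scaling a count to 20 minutes of ice time comes from the text.

```python
def per_20(count, toi_minutes):
    """Convert a raw on-ice or individual count into a per-20-minute rate."""
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return count / toi_minutes * 20


# Hypothetical forward: 900 minutes played, 45 on-ice goals for, 30 points.
gf20 = per_20(45, 900)   # team goals scored while he was on the ice
pts20 = per_20(30, 900)  # points he personally collected on those goals
```

Because a player can earn at most one point per on-ice goal, PTS20 can never exceed GF20, which is part of why the two correlate so strongly.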
PTS20 has almost no correlation with GA20, and as a descriptive measure is useless. eGA20, on the other hand, has a reasonable correlation (it’s low because I believe forwards have less control over GA than defense does) that is significantly higher than that of CA20 (statistically significant to 99.8%). So eGA is a better descriptive measure than CA and eGF is a better descriptive measure than CF for forwards in this data set.
For descriptive purposes there is no reason to use Corsi when you could use eGF for forwards.
Now the question is – is there a reason to use it for predictive purposes?
The short answer is no, because there are no particularly great ways to predict next season’s performance (as measured by GF20/GA20/GF%).
We cannot say that there is a significant difference (>95% confidence) between any of the measures in predicting future GF20. That being said, previous GF20 (pGF20) was the best predictor in that data set followed by eGF20, then by CF20. There appears to be some ability to guess a forward’s ability to generate goals for based on his previous year’s stats (no matter which one you use).
None of the correlations are different from each other to a statistically significant degree (95% confidence), but they are all significantly different than zero and show some ability to predict GF20 based entirely on the previous seasons’ numbers. Once again pGF20 outperformed eGF20 which outperformed CF20 (for this data set), but again not to a statistically significant level.
This is more or less in line with what I’d expect. A forward has a significant degree of control over the offensive zone – but he is just one of 10 players on the ice. Offensive pressure might be primarily controlled by 5 (or 6) players (being the 3 attacking forwards, 2 defense and possibly a center) so a player’s offensive skill (or lack thereof) should be somewhat consistent season to season – though masked by other factors (linemates, opposition, system, etc.). For defensive play I would expect that the average forward has much less to do with defensive success and so I would expect to see less correlation with future GA, which is exactly what you see.
Systems, defense, and goaltending are likely bigger parts of predicting GA20 than forwards are.
The reason I included GA20 in the prediction stats is the predilection people have to using CF% or GF% as a measure to evaluate a player. As we can see, for forwards what’s most in their control (as it relates to goals) is goals for – they have much less control over goals against.
Predicting a GF% for a player relies on so many things going right. On the offensive side he needs to maintain his SH%, maintain the quality of his chances, and likely maintain a system and his regular linemates. On the defensive side he has less control (as mentioned previously) but has to rely on his goalie saving the same percentage of chances and those quality of chances against remaining the same. Through system changes, player changes, and injuries it is very unlikely that the prediction endeavour (based on predicting his GF%) will be successful.
eGF has been shown to be a superior descriptive statistic to Corsi in measuring both goals for and against, for forwards, while simultaneously being statistically similar as a predictive measure for goals for, against, and GF%.
In my (very biased) opinion, this removes the need to look at Corsi metrics to quantify the play of individuals as eGF is equal or superior in every (statistical) regard.
CORSI BETTER THAN EXPECTED GOALS AFTER 25 GAMES
(Dec 2016)
25 games is about the time when shot attempt metrics such as Corsi carry the most predictive power.
I decided to compare the predictive power of Corsi to the predictive power of scoring chances as well as Don’t Tell Me About Heart’s expected goals model.
Future goals for percentage correlates the most with Corsi for percentage, with expected goals coming in a very close second.
This basic analysis just kind of confirms what we already know, which is that Corsi and expected goals are the way to go when it comes to predicting the future; goals and scoring chances just don’t carry the predictive power that those two do.
Also, we can see that Corsi stays fairly stable from the first 25 games to the rest of the season; the R^2 for the correlation between CF% for the first 25 games and CF% for the remaining games is 0.597, as compared to 0.385 for expected goals and 0.239 for scoring chances.
So not only is CF% the best predictive variable, it’s also the most consistent. Shot attempts are the best current statistic we have for analysis, mainly due to the rate at which they accumulate.
This eliminates any issue that might arise due to a small sample size, and does a decent enough job at eliminating randomness from our measurements. Obviously Corsi isn’t a perfect metric, and the best predictions for future goals for percentage come from regressing Corsi 70 percent to the mean, but it’s the best we currently have.
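To make "regressing Corsi 70 percent to the mean" concrete, here is a small worked sketch. It assumes a league-average CF% of 50; the function name `regress_to_mean` is hypothetical.

```python
def regress_to_mean(observed_pct, league_mean=50.0, regression=0.70):
    """Pull an observed percentage toward the league mean.

    With regression=0.70, only 30 percent of the gap between the
    observed value and the mean is kept.
    """
    return league_mean + (1.0 - regression) * (observed_pct - league_mean)


# A team at 55 CF% through 25 games projects to roughly 51.5 going forward.
projected = regress_to_mean(55.0)
```

In other words, a team five points above average in Corsi share is treated as about one and a half points above average when forecasting future goal share.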
CORSI IS BETTER THAN EXPECTED GOALS
Early writers in hockey statistics discovered that you could do a better job of predicting how well teams would score at even strength by looking at shot attempts instead of goals.
Over time many people began to argue that because there are differences in the quality of shots, improvements could be made by adjusting each shot for its likelihood of becoming a goal, based on factors like how close to the net the shot is and what type of shot was taken. We call this adjusted measure Expected Goals, or xG.
According to the DA article, expected goals are better at predicting future results than Corsi is. This was the breakthrough that many people had been waiting for, a metric that tried to account for the quality of shots rather than just their quantity.
I tested how well you could predict a team’s goal ratio (GF%) in the second half of the season based on their results in the first half of the season in four different metrics: Corsi (all shot attempts), xG (shots on net and missed shots, adjusted for shot quality), scoring chances (a hybrid approach that counts all shots from a “scoring chance” area of the ice, excluding shots from further away), and goals.
I find that no matter what model you use, Corsi is always better at predicting future results than expected goals are.
Corsi is simply a better measurement of team quality, and hockey fans should probably stop using expected goals, at least at the team level.
Also of note is that scoring chances have been better at predicting future goals than Corsi has over nearly a decade now. This lines up with research I published two years ago, and it’s interesting to see that trend continue.
Scoring chances are also more predictive than expected goals at every point measured.
Since 2009, scoring chances and Corsi have been comparably predictive, and for nearly a decade scoring chances have in fact provided better predictive value. I think there is a pretty good argument at this point that scoring chances are actually the superior metric.
However, I would also argue that at this point there is no argument for using expected goals for evaluating teams. Scoring chances are consistently better by a reasonably large margin during the entire period for which we have data.
There is no point at which xG is the most predictive of the statistics I’ve looked at.
It is true that scoring chances have been better at predicting team level goal ratio over the past decade or so than Corsi has. But it is important to note that this improvement is based entirely on being better at predicting goal SCORING. Corsi is still better, by quite a large margin, at predicting goal PREVENTION.
You will have a better idea which teams are good at scoring goals by looking at quality-adjusted metrics, but you’ll get a better idea about which teams are good at preventing goals by looking at pure shot attempts.
In Conclusion
· Expected goals are not the most predictive measure over any time period for which we have shot location data.
· Over the full time period available, Corsi is the most predictive metric.
· In more recent years scoring chances have been the best metric, although it’s not clear whether that’s due to something fundamental changing in the results or if it’s just natural variation over time.
· You’re better off describing offence and defence using different metrics, because shot quality seems to matter more on offence than on defence.
EXPECTED GOALS BETTER THAN CORSI
One of the key benchmarks of an expected goal model is its predictive power. In fact, the predictive power of expected goals in general has recently been called into question in an article by an analyst known as DragLikePull. He compared the correlation between 5-on-5 score-adjusted Corsi/expected goal shares in the first half of the season and 5-on-5 goal shares in the second half of the season and found that Corsi was better overall at predicting second-half goals than expected goals were. Based on these findings, he concluded that fans and analysts should discard expected goals at the team level and return to using shot attempts.
My research has led me to different conclusions on the predictive power of expected goals, but before I get into that, I want to address my issue with this line of thinking. A part of me wishes that we had stuck to calling expected goal models “Shot Quality” models instead, because I think that the term “Expected Goals” implies that these models are solely predictive in nature, which isn’t necessarily the case. Even if expected goal shares were completely useless for predicting future goals at the team level, expected goals would still be extremely useful for describing past events and telling us which teams relied heavily on goaltending and shooting prowess, or were weighed down by poor shooting and goaltending, and even which shots the goaltender deserved most of the blame for. So I disagree with the premise that hockey fans should stop using expected goals at the team level if they are not as predictive as Corsi.
I accounted for the following variables in my model:
· Shot distance and shot angle. (The two most important variables.)
· Shot type.
· The type of event which occurred most recently, the location and distance of this event, how recently it occurred, which team the perpetrator was, and the speed at which distance changed since this event. (The inclusion of the last variable was inspired by Peter Tanner of Moneypuck.)
· Whether the shooting team is at home.
· Contextual variables such as the score, period, and seconds played in the game at the time the shot was taken.
· Whether the shooter is shooting on their off-wing. (For example, a right-handed shooter shooting the puck from the left circle is shooting from the off-wing, and a left-handed shooter shooting from the same location is not.)
· Additionally, I chose to make an adjustment for scorekeeper bias.
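Of the variables listed above, the off-wing flag is the easiest to show in code. The sketch below follows the example given in the list (a right-handed shooter in the left circle is on the off-wing); the mirrored case for left-handed shooters, the function name `is_off_wing`, and the "left"/"right" encoding are assumptions, not details from the model itself.

```python
def is_off_wing(shoots, side_of_ice):
    """Return True when the shooter's handedness is opposite to the side shot from.

    shoots: "L" or "R" (shooter handedness)
    side_of_ice: "left" or "right" (side of the offensive zone the shot came from)
    """
    if shoots not in ("L", "R") or side_of_ice not in ("left", "right"):
        raise ValueError("unexpected handedness or side value")
    return (shoots == "R" and side_of_ice == "left") or (
        shoots == "L" and side_of_ice == "right"
    )


# The example from the list: same spot, different handedness.
right_shot_left_circle = is_off_wing("R", "left")
left_shot_left_circle = is_off_wing("L", "left")
```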
As you can see by comparing my results to draglikepull’s (shown below), applying a score-adjustment does not change the predictive power of goal shares, but significantly increases the predictive power of Corsi.
However, it follows logically that a score-adjustment would also improve the predictive power of expected goals, and my expected goal values without a score-adjustment perform significantly better than Corsi does with a score-adjustment, and they blow unadjusted Corsi out of the water, so I am comfortable saying that they currently have more predictive power; especially the expected goal values with rebounds removed.
As draglikepull’s numbers show, my expected goal model is not the only one that has pulled ahead of Corsi in the last five years; Natural Stat Trick’s has done the same. This is partially because the predictive power of Corsi has declined and partially because the predictive power of expected goals has improved. I have a theory for why each of these changes has occurred.
Goodhart’s Law states that “When a measure becomes a target, it ceases to be a good measure.” Corsi gained ground as a popular measure that NHL front offices used to improve their team, and that NHL player agents began using to make the case for their clients in the early portion of the 2010s, right around the same time that Corsi’s predictive power began to decline. I would not say that we’re quite at the point where Corsi is no longer a good measure, but it has indisputably declined, and I believe that is because it’s become a target. I suspect that the predictive power of expected goals has improved because the quality of data provided by the NHL’s Real-Time Scoring System has improved.
One theme remained common: expected goals with rebound shots removed were by far the most predictive.
What does this mean? Going forward, should we only use expected goals with rebounds removed? No. Rebounds are real events that happened, and until the NHL decides that rebound goals no longer count, any descriptive metric of past events should include rebound shots. If you’re strictly looking to predict which team will be the best team in the future, it may be best to use a metric that excludes rebounds, but I don’t think that is how most people do or should actually use expected goal models. (I would also like to credit Peter Tanner of Moneypuck for bringing to my attention that expected goals with rebounds removed are more predictive.)
CORSI, FENWICK, AND OTHER STUPID WORDS
Be careful not to overvalue corsi or even xGoals. These stats never tell the whole story, which is why they don’t correlate all that well with winning.
It’s difficult to evaluate advanced statistics on an individual player level, because you have to use on-ice stats, which depend on 11 other players as well. Besides that, there’s no agreed-upon metric to measure the quality of a player.
It’s quite different, when we’re evaluating the performance of a team. Good teams win hockey games and outscore their opponents. This is the agreed upon measurement for team performance. Of course good teams can lose in the short term, but if they keep on losing, then they are really not a good team. It’s as simple as that.
This is why I evaluate stats at the team level. Then you can compare the statistics to team performance, and see how well they correlate. I have decided to use GF% as my measurement for team quality. If you outscore your opponents you’re a good team.
First I’m looking at the correlation between corsi (total shot attempts) and GF%.
Clearly there’s some correlation (R-squared = 0.2736) between corsi and winning, but it’s not nearly as big as you would think based on some writers’ almost religious usage of corsi.
Corsi is simply the number of shot attempts, so it seems intuitively right that corsi leads to shots which leads to goals. But it also seems intuitively obvious that you would have to factor in the quality of the chances, the goaltending and quality of your shooters. The graph shows that you can’t ignore these other factors.
Now let us turn our attention towards expected goals.
Expected goals correlates better (R-squared = 0.4771) with GF% than corsi does. This sounds about right, as we’re now factoring in the quality of each shot attempt.
The next metric I would like to discuss is PDO, and it is simply the sum of shooting% and save%.
Natural PDO depends on 3 things:
· Team strategy – If you value quality over quantity (think Trotz style hockey)
· Goaltending – A good goaltender leads to a higher save percentage
· Shooting – Good shooters and shot distribution matters. You want your best shooters to take as many shots as possible.
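Since PDO is just shooting% plus save%, it can be computed directly from goal and shot counts. The sketch below uses the 0-100 percentage scale (so an average team sits near 100); the function name `pdo` and the sample numbers are made up for the example.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Sum of shooting percentage and save percentage, on a 0-100 scale."""
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (1.0 - goals_against / shots_against)
    return shooting_pct + save_pct


# Hypothetical team: scores on 8 of 100 shots, allows 7 goals on 100 shots.
team_pdo = pdo(8, 100, 7, 100)  # about 101: 8.0 shooting% + 93.0 save%
```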
PDO correlates better with goal differential than xG and corsi. The problem with PDO, however, is that it’s difficult to predict and it depends on different factors (see above). Therefore, I think it makes more sense to try and interpret shot quantity, shot quality, goaltending and shooting separately.
If we combine shots and dPDO (100 - PDO) we will actually get a perfect correlation.
There’s absolutely nothing revolutionary about this finding though, as this is simply how you can define goal differential: G+/- = S+/- * dPDO
You want to factor out team defense as much as possible using individual goals and expected goals: dFSh% = (iG-ixG)/iF
The positive thing about using dFSh% is that you can compare players directly. It doesn’t matter if you’re a defender or a forward, if you play 5v5 or on the PP.
The only problem with dFSh% is that there’s nowhere you can find this stat directly, so you will have to calculate it yourself.
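Calculating it yourself is straightforward once you have the three inputs. A minimal sketch, assuming iF means individual Fenwick (a player's own unblocked shot attempts), which the text does not spell out; the function name and the sample player are hypothetical.

```python
def dfsh_pct(individual_goals, individual_xg, individual_fenwick):
    """dFSh% = (iG - ixG) / iF: goals above expected per unblocked shot attempt."""
    if individual_fenwick == 0:
        return 0.0  # no attempts, so no shooting result to measure
    return (individual_goals - individual_xg) / individual_fenwick


# Hypothetical shooter: 30 goals on 22.5 expected goals from 300 unblocked attempts.
marksman = dfsh_pct(30, 22.5, 300)  # 0.025, i.e. 2.5 extra goals per 100 attempts
```

A positive value means the player scored more than the quality of his attempts would suggest, and a negative value means he scored less.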
PREDICTABILITY OF ON-ICE MEASURES
Using Marcels to test the predictability of Corsi, xG, and GF%.
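For context, a Marcel-style projection is a weighted average of recent seasons that is then regressed toward the league mean. The sketch below uses the 5/4/3 weighting commonly associated with Marcel and an invented regression amount; the exact weights and regression used for these on-ice projections are not stated here, so treat every parameter as an assumption.

```python
def marcel_projection(last_three, weights=(5, 4, 3), regression=0.2, league_mean=0.0):
    """Project next season from three prior seasons, most recent first.

    The weights and regression amount are illustrative assumptions.
    """
    weighted = sum(w * x for w, x in zip(weights, last_three)) / sum(weights)
    return league_mean + (1.0 - regression) * (weighted - league_mean)


# Hypothetical on-ice goal differential per 60 over the last three seasons.
projected_diff = marcel_projection([1.0, 0.4, -0.2])  # about 0.4
```

The projected value is then compared with what the player actually posted the following season to get the R-squared figures quoted below.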
Let’s start by looking at on-ice goals. Here’s how the projected on-ice goal differential correlates with the actual on-ice goal differential.
It’s not easy to predict goal differential, and using goal differential from the past isn’t perfect. Even when we’re using data from 3 years the predictability is fairly low.
And here’s the predictability of corsi differential.
Now we see a much better predictability (R-squared = 0.311).
This really isn’t surprising. The reason for using corsi instead of goals is that the sample size and repeatability are bigger. The problem with corsi is that it doesn’t correlate particularly well with goal scoring.
Finally, here comes the predictability of expected goal differential.
It’s less predictable than corsi, but more predictable than goals. These findings are very much in line with what I expected.
Lastly, I have also looked at how predictive corsi differential and expected goal differential are of actual goal differential.
I’m a bit surprised that corsi predicts goals better than expected goals, but honestly neither works particularly well. In fact goals are a better predictor of goals (R-squared = 0.122).
I don’t think on-ice metrics are the best way to predict future player performances. You could probably make a decent model using and weighting a number of different on-ice metrics (including goals).
However, using one preferred on-ice stat out of context is not a great way to evaluate a player.