It’s well-known that both Corsi and shooting percentage change with the score of the game. When you’re up by one goal, your SH% goes up by almost a whole percentage point – and your Corsi goes down by four points. When you’re up two or more, the differences are even bigger.
That’s probably because when teams are ahead, they play more defensively. Their opponents, who are trailing, play more aggressively – they press in the offensive zone more, and get more shots.
So, teams in the lead see their shots go down. But their expected SH% goes up, because they get a lot of good scoring chances when the opposition takes more chances – more breakaways, odd-man rushes, and so on.
CORSI & SCORE EFFECTS
Playoff-bound teams are much more likely to dominate the shot counts in the clutch minutes of their games.
A low percentage is certainly a red flag for any team with cup aspirations.
For those who still don’t believe even-strength shot ratio is important, note that only two teams with a ratio below 50% made the playoffs, while only three teams above 50% missed the playoffs.
SCORE ADJUSTED FENWICK
It has been shown that if you want to predict a team’s future winning percentage, you should look at their shot differential with the score tied rather than at their place in the standings or goal differential.
This is very counter-intuitive, that goals and wins aren’t the best measure of whether a team will get more goals and wins. But the problem is one of sample size – it seems like a season is a long time and random bounces should even out, but with only two or three goals per game, they really don’t. Because there are ten times as many shots as goals, random fluctuations in shot differential get much more rapidly washed out and we get a better measure of a team’s talent.
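The shots-versus-goals noise argument can be checked with a quick simulation. This is a toy sketch, not from the article: it assumes roughly 6 goals and 55 unblocked shot attempts per game, and a team whose true share of events is 52%.

```python
import random

random.seed(0)

def season_share(events_per_game, true_share, games=82):
    """Fraction of events a team gets over one simulated season."""
    total = events_per_game * games
    won = sum(1 for _ in range(total) if random.random() < true_share)
    return won / total

# 1,000 simulated seasons each for goals (~6/game) and shots (~55/game)
goal_shares = [season_share(6, 0.52) for _ in range(1000)]
shot_shares = [season_share(55, 0.52) for _ in range(1000)]

def spread(xs):
    """Standard deviation of the observed season-long shares."""
    mean = sum(xs) / len(xs)
    return (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

print(spread(goal_shares) > 2 * spread(shot_shares))  # True
```

Over 1,000 simulated seasons, the observed goal share is roughly three times noisier than the shot share, which is why shot differential stabilizes so much faster as a measure of talent.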
So if sample size is so important, doesn’t it seem inefficient to throw away all of the results from when the score isn’t tied? Can’t we find a way to correct for the score effects instead?
If we can correct for score effects, we can include the non-tied situations and more than double the sample size behind our Fenwick statistics.
It’s easy enough to make those corrections and come up with a Score-Adjusted Fenwick total that uses 42.4 minutes of even-strength play per game, instead of the 17.9 minutes that go into Fenwick Tied or the 28.4 minutes that go into Fenwick Close (another attempt to reduce score effects by focusing on close games).
The average team spends 3.75 minutes per game down by 2 goals, 8.46 minutes down by 1, and 17.94 minutes tied, giving us the following formula for Score-Adjusted Fenwick:
Score-Adjusted Fenwick = [3.75 * (Fen_up_2 - 44.0%) + 8.46 * (Fen_up_1 - 46.1%) + 17.94 * (Fen_tied - 50.0%) + 8.46 * (Fen_down_1 - 53.9%) + 3.75 * (Fen_down_2 - 56.0%)] / 42.36 + 50%
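Translated directly into code, the formula looks like this. The minutes and league-average percentages come from the article; the example team's per-state Fenwick numbers are hypothetical.

```python
# score state -> (avg 5v5 minutes per game, league-average Fenwick%)
LEAGUE = {
    "up_2":   (3.75, 0.440),
    "up_1":   (8.46, 0.461),
    "tied":   (17.94, 0.500),
    "down_1": (8.46, 0.539),
    "down_2": (3.75, 0.560),
}
TOTAL_MINUTES = 42.36  # 3.75 + 8.46 + 17.94 + 8.46 + 3.75

def score_adjusted_fenwick(team_fen):
    """team_fen maps each score state to the team's Fenwick% there."""
    weighted = sum(minutes * (team_fen[state] - league_avg)
                   for state, (minutes, league_avg) in LEAGUE.items())
    return weighted / TOTAL_MINUTES + 0.50

# A hypothetical team that beats the league average by 2 points
# in every score state comes out at 52%:
team = {state: avg + 0.02 for state, (_, avg) in LEAGUE.items()}
print(round(score_adjusted_fenwick(team), 3))  # 0.52
```

Because the per-state minutes sum to exactly 42.36, a team that matches the league average in every state lands at exactly 50%.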
Early in the season, Score-Adjusted Fenwick does a substantially better job of predicting how many points a team will earn in the remainder of the season than Fenwick Tied does.
Fenwick is our best measure of a team’s inherent ability to get more scoring chances than their opponents, but analysts typically limit their sample to situations where score effects will have limited or no impact. Instead, we can make very simple adjustments and include all situations, which more than doubles our sample size.
SCORE ADJUSTED FENWICK IS THE BEST PREDICTOR EARLY IN SEASON
Just using shot differential does OK, but it gets confounded by the way teams adjust their strategy to suit the score. A team that’s leading will go into a bit of a defensive shell, allowing the other team to outshoot them. Accounting for that makes our predictions much better.
The simplest way to do that is to narrow things down to just look at shot differential when the score is tied, so there aren’t any score effects. But whenever you shrink the sample size, you make it so you need more data to make good predictions; if our goal is to be able to tell really early in the season what the final standings will look like, then we don’t want to throw out large swaths of the game if we can help it.
My preferred method is to include all of the data but correct for score effects. We know that the average team gets 56 percent of the shots when they are down by two goals, so if a certain team has gotten 58% when down by two, we know they were doing 2 percent better than average and can just factor that in.
The result is a formula that I called Score-Adjusted Fenwick, which averages together how much better or worse than average a team did in each game state. This turned out to be a better predictor than Fenwick Tied or Fenwick Close, especially early in the season.
POSSESSION EFFECTS IN PLAYOFFS
One thing to note right away is that you shouldn’t expect a team’s SAF in the regular season to be reflected in the postseason, for the simple reason that the competition is better.
The idea that SAF predicts playoff series takes for granted that good regular-season possession teams play strong possession hockey in the postseason, and this is what drives their success.
So, why does SAF appear to be such a robust predictor of series outcomes? Part of the answer is almost certainly luck: 77 of these series were won by the team with a higher series PDO. If you bring together puck possession and PDO, of course, you’re nearly unbeatable.
Another important piece to the puzzle appears to be home-ice advantage. No one disputes that teams with stronger regular-season possession numbers tend to win more, which implies that they’re more likely to have home ice in the playoffs: this brings them a real advantage in short series. In these 102 series, home teams won almost 57% of games, and the advantage didn’t differ depending on which team entered the series with a higher SAF.
The implication is obvious: home ice is an advantage in its own right, and better regular-season possession teams benefit from it more often simply because they earn it more often.
The best way for analysts to maintain their credibility is to avoid making bad predictions, and the unpredictability of hockey outcomes in short series makes them a poor fit for shot-based metrics better suited to bigger-picture research. Teams with better possession numbers will indeed tend to win more often in the playoffs, but this doesn’t mean that teams will necessarily play the way those numbers imply, or that they’ll be successful doing so. Possession matters in postseason hockey, but it’s hardly the beginning and end of meaningful analysis. In a short series, it’s still much, much better to be lucky than good.
HOCKEYVIZ - SCORE ADJUSTED FENWICK
(October 2014)
There are two common approaches to accounting for score effects in raw possession numbers: score-close and score-adjusted. The score-close method counts events as having value “one” if they occur when the score is tied, or within a goal in the first or second periods; all other events have value “zero” — they simply aren’t counted. Score-adjusted includes all events, but weights them according to score-situation; the established formula is due to Eric Tulsky (@BSH_EricT) in a 2012 article at Broad Street Hockey. In this article, I introduce another formula for computing score-adjusted fenwick, which has greater predictivity at all sample sizes, especially smaller sample sizes, is easier to compute, and is perhaps conceptually easier to understand.
For reference, the formula in Tulsky’s article is:
Score-Adjusted Fenwick =
[ 3.75 * (Fen_up_2 - 44.0%)
+ 8.46 * (Fen_up_1 - 46.1%)
+ 17.94 * (Fen_tied - 50.0%)
+ 8.46 * (Fen_down_1 - 53.9%)
+ 3.75 * (Fen_down_2 - 56.0%)] / 42.36 + 50%
The various numbers require a little explanation: 42.36 is the average number of minutes per game (in the data Tulsky examined) that teams were 5v5; 8.46 is the average number of minutes that teams were up-one/down-one, 3.75 the average number of minutes that teams were up-two/down-two (or more than two), and 17.94 the average number of minutes teams were tied. The obvious issue (which Tulsky points out) is that the formula can only hope to be useful when applied to a large enough sample to permit approximating the actual time spent by a given team in those score situations.
In general, the adjustment coefficient for a given team (home or away) in a given situation is the one which satisfies:
(Coefficient for given team) * (Events for given team) = Average events for both teams.
Notice that there is no need for any measurement of time. This makes it possible to compute the score-adjusted fenwick of any sample of events, no matter which score situations occur in it or for how long. Notice also that the adjustment coefficients for -3/+3 are substantially different from those for -2/+2, validating our earlier concerns about curtailing score differences at 2. Even with seven years’ data, however, there are hardly any events at a score difference of 4, so I decided to stop at 3.
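A minimal sketch of that coefficient definition, with hypothetical event counts standing in for the league-wide totals per (venue, score-differential) situation that a real implementation would use:

```python
def adjustment_coefficient(events_for_team, events_for_opponent):
    """Solves: coefficient * (events for team) = average events
    for both teams in that situation."""
    average = (events_for_team + events_for_opponent) / 2
    return average / events_for_team

# Suppose (hypothetically, matching the averages cited earlier) that
# teams trailing by two take 56 of every 100 events, so teams leading
# by two take 44:
up_two = adjustment_coefficient(44, 56)    # > 1: up-2 events up-weighted
down_two = adjustment_coefficient(56, 44)  # < 1: down-2 events down-weighted
print(round(up_two, 3), round(down_two, 3))  # 1.136 0.893
```

Each recorded event is then weighted by the coefficient for the situation it occurred in, and the adjusted Fenwick% is simply weighted events for divided by total weighted events.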
ADJUSTED POSSESSION METRICS
Score effects play virtually no role outside of the third period.
This article will show that, while score effects are magnified as the game wears on, time-adjustment for possession calculations is not justified.
Score effects are stronger when the home team is losing. In the last five minutes, for instance, the home team generates between 62% and 65% of the events when losing, whereas the road team generates between 58% and 61% when losing. Home-ice advantage, it seems, applies in all score situations, although not evenly at all times.
Score-adjustment produces substantially better measures, and further venue-adjustment somewhat better still.
There is clearly a non-trivial time-dependence to score effects, as the opening plot shows. However, from a modelling point of view, adjusting for this time-dependence gives no improvement.
Many effects are similar to what we have learned here about time-adjusted possession measures: they are clearly visible effects, the knowledge of which adds essentially nothing to our ability to make predictions.
Finally, and least obviously, we see that score-close possession metrics are utterly indefensible for any purpose at any time. Raw measures are preferable for conceptual clarity and for predictivity at almost all sample sizes, and adjusted measures are superior for predictivity at all sample sizes. Close measures purport to distill the essence of possession, when in fact they do great violence to the data by censoring large tracts of meaningful information and magnifying a smallish portion. Adjusted measures, by contrast, apply small nudges to the raw data; their seeming complexity masks how much closer they are to the raw data than ‘close’ measures are.
The evidence is clear: ‘close’ possession measures are misguided and must be done away with.
SCORE CLOSE VS SCORE ADJUSTED
(November 2014)
If the goal is comparing teams and individual skaters in the most “controlled” and least variable context possible, the restriction to 5v5 close statistics makes a modicum of sense; although if that is the objective, using 5v5 tied data frankly makes more sense. Typically, though, the main objective of analyzing underlying statistics is to predict and project future outcomes. The perception that analysis of 5v5 score-close situations permits identification of the true underlying skill of a team may well be accurate, but unfortunately the information lost in the process has value and meaning. We are restricting our sample of data to a far smaller amount than necessary.
Another method of accounting for score effects was first proposed in April of 2010 by Gabe Desjardins and then analyzed in more detail by Eric Tulsky in January of 2012. Rather than ignoring situations with teams trailing or leading by more than one goal, it was proposed that adjusting the Fenwick percentages in various score states to account for so called “score effects” would allow us to retain the value of the sample of all 5v5 data.
It appears that score and venue adjustments at 5v5 capture enough important information that they drastically improve on raw Corsi or Fenwick when it comes to predicting future goals and wins (by approximately 5% around the 20-game mark). Time-impact adjustments, on the other hand, do not seem to add any predictive power, most likely because score effects are largely collinear with the time effects, washing out any added value.
McCurdy showed that 5v5 close is actually WORSE than raw 5v5 Corsi or Fenwick when it comes to predicting future outcomes in terms of goals or winning percentage.
It appears that 5v5 Close is not necessarily the valuable asset we have long thought. Score and venue adjusted Corsi/Fenwick is the better way.
It is unlikely that 5v5 close will be wiped from the hockey analytics lexicon anytime soon. Multiple sites and end users reference it with regularity, and future research may again shift thought in favour of its use. Whichever way it goes though, the development of an improved array of adjusted measures to account for various contextual factors in our models – rather than methods that subtract useful data – will likely improve our holistic understanding of the game.
WHERE ARE TEAMS DRIVING GOOD POSSESSION
It has been shown previously that score-adjusted measures have greater predictive validity than “tied” measures, and even better than the “close” measures that have been widely used.
What I have done is taken each team’s CF% for each score situation (down 2+, down 1, tied, up 1, up 2+) and compared it to the league average for that score situation. It’s one thing to look at a team’s possession numbers when they are down on the scoreboard, but it really means nothing unless you compare it to how the league performs at that score state.
You must look at each team across each score state to see where they are driving good possession numbers.
A team that has possession numbers even with the league average at each score state would have a score-adjusted CF% of 50.0. This is a good baseline to measure a team’s performance at each state of the game throughout the season.
If a team has an above-average CF% when trailing, but a below-average CF% when leading, with relatively similar TOI numbers for each situation, it can show that they may work harder when trailing and tend to sit back and “protect” when they get the lead. Tied score states can tell which teams dominate possession when both teams are “equal” in terms of score effects (which is pretty much common sense).
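A small sketch of this per-state comparison, using the league averages quoted earlier in this document; the example team's numbers are hypothetical.

```python
# League-average CF% per score state (figures cited earlier)
LEAGUE_AVG = {"down_2": 0.560, "down_1": 0.539, "tied": 0.500,
              "up_1": 0.461, "up_2": 0.440}

def relative_cf(team_cf):
    """Each score state's CF% minus the league average for that state."""
    return {state: round(team_cf[state] - LEAGUE_AVG[state], 3)
            for state in team_cf}

# A hypothetical team that presses hard when trailing but shells up
# with a lead:
team = {"down_2": 0.580, "down_1": 0.550, "tied": 0.505,
        "up_1": 0.445, "up_2": 0.420}
diffs = relative_cf(team)
print(diffs)  # positive when trailing, negative when leading
```

A team sitting at zero in every state is exactly league average, matching the 50.0 score-adjusted baseline described above.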
HOW WELL DOES SCORE ADJUSTED FENWICK PREDICT PLAYOFF SUCCESS
Insofar as teams that consistently outshoot their opponents win games more often than not, it stands to reason that – in the aggregate – teams with better possession numbers will win more series than they lose.
My analysis here focuses on how the predictive accuracy of SAF varies by home-ice advantage and match-up, but also includes some sensitivity analyses.
Over the full sample of 150 series, SAF correctly predicted just 61.3% of playoff series.
So, what about confounding and home-ice advantage? In the full sample, home ice is 55.3% accurate in predicting playoff series. Over these 150 series, the team with the higher SAF opened on home ice 81 times (54%). Teams with home ice and better regular-season possession numbers won 65.4% of their series. Teams with superior score-adjusted Fenwick numbers who opened on the road, however, won just 56.5% of the time.
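As a sanity check, the overall 61.3% figure follows from weighting the home and road splits above:

```python
# 81 of 150 series: higher-SAF team opened at home, winning 65.4%;
# the other 69: higher-SAF team opened on the road, winning 56.5%.
home_series, home_win_rate = 81, 0.654
road_series, road_win_rate = 150 - 81, 0.565

overall = (home_series * home_win_rate +
           road_series * road_win_rate) / 150
print(round(overall * 100, 1))  # 61.3
```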
Wrap up:
Score-adjusted Fenwick is almost certainly not 70% accurate in predicting playoff series. Over the past ten seasons, SAF is 61% accurate, not far off from the 55% accuracy of home-ice advantage. The lesson in this: if you’re going to make bold claims, you should probably wait until you have a large sample of data to base them on.
To emphasize this last point, 150 series are still a pretty small sample, and I’m not convinced that 61% is a much more reliable estimate of SAF’s predictive accuracy than the 70% number.
The accuracy of SAF for predicting playoff series is confounded by home-ice advantage. In the full sample of 150 series, SAF’s accuracy dropped by 9 percentage points if the team with better possession numbers lacked home-ice advantage. This disparity shrinks when we remove the 2013 data and the superteams from our analysis, but this is mostly due to possession becoming less predictive among the home-ice teams.
In a close possession match-up, SAF is (unsurprisingly) almost useless for predicting the winner. If, like me, you think that changes to the NHL’s CBA will make ultra-dominant possession teams like the 2007-08 Red Wings and 2009-10 Blackhawks extremely difficult to assemble going forward, this doesn’t bode well for SAF’s future as a predictive metric.
Not addressed here, of course, are reasonable assumptions surrounding team Sh% and Sv%. In the long run, we’d assume that PDO will be close to league average in this analysis, but presumably analysts could easily improve on the accuracy of SAF by incorporating properly-regressed estimates of shooting and goaltending into their models.
This analysis also doesn’t dig into a critical assumption analysts make when they use SAF as a predictor; namely, that the team with superior possession numbers entering a series will actually control possession in that series, and will actually win because of it. In reality, this assumption is rarely borne out on the ice: in my analysis last year, teams underperformed their expected possession differential in 69% of series.
My final piece of advice: any time someone tells you that they’ve boiled a complex game like hockey down to a single number that explains everything, you should maintain your skepticism. More generally, try to remember that puck-possession differential is just a statistic. It measures a set of underlying processes like effective neutral-zone play, crisp exits from the defensive zone, and the ability to maintain possession on the attack, but there’s nothing magical about the number itself. The sound, consistent execution of the above processes is what ultimately drives both good possession numbers and winning; the stats themselves don’t drive anything.