How to Predict the Ladder in Nine Stupid Steps

You’re an intelligent person, probably, with opinions about football teams. Occasionally you might want to employ those qualities to predict what the ladder will look like at the end of the year.

So how, exactly, does someone do that? What is the ideal process?

The answer, my friend, is a journey through madness and despair. The first step is stupid, yet with each successive step, it somehow gets worse.

Let me walk you through it.

Step 1: Eyeball the teams and guess

Sure. Anyone can do that. Your ladder looks reasonable, but you’re not even properly considering the fixture. What about teams that have an easy or hard run home?

Step 2: Go through the fixture and manually tip all the games

There we go. You have now accounted for fixture bias. And you have a ladder with… wait, Geelong on 20 wins. They’re good, but that seems ambitious. How did that happen?

Oh, of course! You didn’t tip upsets. In reality, favourites lose about 30% of the time.

Step 3: Throw in a few upsets

Now things look more realistic. Geelong have 16.5 wins. You threw in a draw because you couldn’t bring yourself to say they’d lose to Sydney. You don’t actually expect that game to be a draw, of course. In fact, you don’t really expect most of your upsets to come true. That’s why they’re upsets: they’re unlikely by definition.

So… now your ladder is based on results even you don’t believe in. Uh.

Step 4: Calculate expected wins

All right. Time to piss off the ladder predictor and get serious. What you’re doing now is going through each game and awarding a percentage of a win to each team based on how likely it is. Collingwood are a 60% chance to beat North Melbourne, so that’s 0.6 wins to the Pies and 0.4 wins to North.

This is better. You’ve successfully accounted for the likelihood of upsets, without having to guess exactly when they will occur. You just averaged the possibility of them over the course of the season. Smart.
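Here is a minimal sketch of that tally in Python. Only the 60% Collingwood v North Melbourne figure comes from the text above; the other two games and their probabilities are invented to give the loop something to add up, and the function name `expected_wins` is just a label for this example.

```python
from collections import defaultdict

def expected_wins(fixtures):
    """Sum each team's win probability over its games.

    fixtures: iterable of (home, away, p_home_win) tuples.
    """
    totals = defaultdict(float)
    for home, away, p_home in fixtures:
        totals[home] += p_home          # a fraction of a win to the home side
        totals[away] += 1.0 - p_home    # the remainder to the away side
    return dict(totals)

# Illustrative fixture list; only the first probability comes from the text.
fixtures = [
    ("Collingwood", "North Melbourne", 0.60),
    ("Collingwood", "GWS", 0.50),
    ("GWS", "North Melbourne", 0.70),
]
wins = expected_wins(fixtures)
```

Every game hands out exactly one win in total, so the expected wins across all teams always add up to the number of games played.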

So let’s see. You now have Collingwood on 14.2 wins total, and right behind them, GWS on 14.1 with a much healthier percentage. Hmm. So you’re basically forecasting each team to win 14 games, and for GWS to have a better percentage, but for the Pies to finish above them.

Shit.

Step 5: Round those fuckers off

No-one wins 14.2 games! You can’t win a fraction of a game! What your number really means is that Collingwood will win about 14 games while leaning toward more rather than fewer. So if you round everything off, it works. Collingwood: 14 wins. GWS: 14 wins. Percentage comes into play. GWS go higher. Done.

Except… further down there’s North Melbourne on 10.5 wins and Essendon on 10.4. They’re almost identical, but you have to round them in different directions. That puts North one whole win ahead of Essendon. Well, that’s probably still okay. I mean, they’re still in the right order. And your numbers really do have North closer to 11 and Essendon closer to 10. So they’re rounded. Moving on.

Next is Fremantle on 9.5 wins with a better percentage than Essendon. So… the Dockers… also… round to… 10 wins… and move above the Bombers.

Now the rounding is messing with the order. You originally calculated that Essendon and North are in close competition with Fremantle a game behind, but after rounding, you’re putting North clearly ahead with Essendon third of the bunch. That’s not great.
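The reshuffle is easy to reproduce. In this sketch the win totals are the ones above, while the percentages are made up, chosen only so that Fremantle's is better than Essendon's as described. Note that Python's built-in `round()` rounds halves to the nearest even number, so the example uses its own round-half-up helper to match the rounding in the text.

```python
import math

def round_half_up(x):
    # Built-in round() gives round(10.5) == 10, so use floor(x + 0.5)
    # to send 10.5 to 11 and 9.5 to 10, as in the text.
    return math.floor(x + 0.5)

# (team, expected wins, percentage). Wins are from the text; the
# percentages are invented for illustration.
teams = [
    ("North Melbourne", 10.5, 98.0),
    ("Essendon", 10.4, 95.0),
    ("Fremantle", 9.5, 101.0),
]

# Order by raw expected wins.
by_expected = [t[0] for t in sorted(teams, key=lambda t: -t[1])]

# Order by rounded wins, breaking ties on percentage.
by_rounded = [
    t[0] for t in sorted(teams, key=lambda t: (-round_half_up(t[1]), -t[2]))
]
```

With these inputs, `by_expected` is North Melbourne, Essendon, Fremantle, while `by_rounded` lifts Fremantle above Essendon.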

And that’s not all! Look at the shit that transpires when there are two rounds to go! At that point, it’s logically impossible for certain teams to finish in certain spots, because of who plays whom, but your fractional wins are putting them in those spots anyway! What the fuck!

Step 6: Simulate games

You know what you need? A FUCKING COMPUTER. You can’t do all this shit on paper and spreadsheets. You need to write a GOD DAMN PROGRAM to run through every single game and roll a die or whatever a computer does to pick a random number. Then, because it can calculate footy stats all day and not get asked to take the dog for a walk or fix the wobbly chair, it can do that TENS OF THOUSANDS OF TIMES.

All right. All right. You now have a simulation that can figure out the likelihood that percentage comes into play when deciding ladder positions. You still have to average out finish placings, so you have the same issue with occasionally tipping logically impossible placings. Is mode better than mean here? Who knows. It’s an improvement. Moving on.
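A stripped-down sketch of such a simulator, assuming you already have a win probability for each remaining game. The function name, the toy three-team league and the random tie-break (standing in for percentage, which a real simulator would track through simulated scores) are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict

def simulate_ladders(fixtures, current_wins, n_sims=10_000, seed=1):
    """Count how often each team finishes in each ladder position.

    fixtures: list of (home, away, p_home_win); current_wins: {team: wins}.
    Ties on wins are broken at random here, standing in for percentage.
    """
    rng = random.Random(seed)
    finishes = defaultdict(Counter)
    for _ in range(n_sims):
        wins = dict(current_wins)
        for home, away, p_home in fixtures:
            winner = home if rng.random() < p_home else away
            wins[winner] += 1
        # Sort by wins, most first; the random key breaks ties.
        ladder = sorted(wins, key=lambda t: (-wins[t], rng.random()))
        for pos, team in enumerate(ladder, start=1):
            finishes[team][pos] += 1
    return finishes

# Toy end-of-season case: A and B (10 wins each) play each other,
# and C (8 wins) cannot catch either of them.
finishes = simulate_ladders(
    [("A", "B", 0.5)], {"A": 10, "B": 10, "C": 8}, n_sims=2000
)
```

Fractional wins would put A and B level on 10.5 each, which cannot happen. Every simulated season has one of them on 11 and the other on 10, so the raw counts in `finishes` only ever contain placings that are actually possible.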

Wait. Some numbers seem a bit wacky. There might be a bug or two in those hundreds of lines of code you just wrote. Yep. Go fix those.

And while you’re poking around, ask yourself: Does the language you used employ a half-arsed random number generator that prioritizes speed over correctness, which completely falls apart when you call it forty thousand times per minute? Well shit! Yes it does! Now you’re reading the documentation, you see that for actual randomness, you need to use a special module with an interface written in Russian! And don’t forget to ensure your machine has an adequate source of entropy! What the hell is entropy? Where do I get that from? The entropy shop?

Step 7: Fix bugs and supply adequate entropy

This simulator seems pretty damn sure of itself, you have to say. You fixed its bugs and gave it all the entropy it could desire, but this thing insists there’s no way a low-ranked team could ever make a late run for the finals. It’s guaranteeing Geelong top spot even though they’re only two games ahead with half a season to play.

It’s overconfident. It’s treating each match as an independent random event, but you know that if Fyfe’s knee blows out, Fremantle’s results will start looking pretty goddamn dependent. You need to simulate the chance that each team can get fundamentally better or worse as the season progresses. How do you do that? Oh, the experts disagree. Super, super.

Step 8: Simulate seasonal changes in fundamental team ratings

You did it. You created a full-bore simulator made from belted-together hunks of stolen code and occasionally you discover a horrifyingly fundamental bug but god damn it, it works. It mostly works.
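One possible approach, and only one of several given the disagreement mentioned above, is to let a team's rating take a small random step after every game, so a slump or a surge carries into later results. The sketch below is not Squiggle's method: the logistic win formula, the 30-point scale, the drift size of 6 points and the 22-game season against average opponents are all assumptions picked for illustration.

```python
import math
import random
import statistics

def win_prob(rating_a, rating_b, scale=30.0):
    """Logistic win probability from a rating gap measured in points."""
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))

def season_wins(start_rating, n_games, drift_sd, rng):
    """Wins for one team against average (rating 0) opponents.

    After each game the rating takes a random step with standard
    deviation drift_sd, so form changes persist into later games.
    """
    rating, wins = start_rating, 0
    for _ in range(n_games):
        if rng.random() < win_prob(rating, 0.0):
            wins += 1
        rating += rng.gauss(0.0, drift_sd)
    return wins

rng = random.Random(7)
fixed = [season_wins(10.0, 22, 0.0, rng) for _ in range(5000)]
drifting = [season_wins(10.0, 22, 6.0, rng) for _ in range(5000)]

# Spread of season win totals with and without rating drift.
spread_fixed = statistics.pstdev(fixed)
spread_drifting = statistics.pstdev(drifting)
```

With drift switched on, the spread of season win totals comes out wider than with fixed ratings, which is the cure for the overconfidence described in Step 7: late runs and collapses become possible.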

Of course, you had to make a lot of design decisions along the way. You’re maybe not a hundred percent confident in all of those choices. To test them, you need to run this thing against real-world results, a lot of them. Like decades’ worth. And that requires a method of scoring your ladders’ accuracy. Hmm. There are several different ways of doing that. They’re all complicated.

Step 9: Revise model based on score against large data sample

I’m not sure what happens after this. I’m sure it’s something. This is as far as I’ve made it.

At this point, you can pause, reflect on your efforts, and observe that your ladder predictions are often outperformed by random BigFooty posters employing the eyeball-and-guess method.

God damn it.

Squiggle Ladder Predictor: Predict the final ladder!

Rate My Ladder: Score your prediction!

Who Won the Round?

When you come off a good win, you don’t just want to analyze how great you were compared to the other team; you want to see how great you were compared to ALL the other teams.

Sadly, it’s hard to establish objectively how much better (or worse) Richmond’s defeat of Hawthorn was than Collingwood’s thumping of St Kilda, for example, or any of the round’s other games.

Until now! Squiggle now offers an algorithmic ranking of who had the best round. Using data from the aggregate Projected Ladder, which brings together the predictions of many different excellent AFL prediction models, this determines how the weekend’s results impacted each team, by comparing how their predicted ladder finishes changed.

This is all based on pre-round expectations, so an upset win can be hugely meaningful for a team, radically improving its prospects of finishing higher on the ladder. Equally, a shock loss can be catastrophic, as the cold-hearted computer models begin shaving down its finals chances.

The importance of “eight-point games” is clearly visible, too, where teams that defeat an opponent competing for the same ladder spots are recognized both for advancing their own position and damaging their competitor’s.

To have an outstanding weekend outside of “eight-point games,” teams need to rely on other results falling fortuitously, so that teams around them lose, while teams too far above or below to matter win.

The current algorithm is a bit experimental, since it applies a weighting to decide the relative importance of changes in predicted ranks vs wins vs percentage. It also applies its own ideas in determining how much to scale these based on the predicted “closeness” of teams, and therefore who is competing with whom for which spots. So it’s currently in beta.
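To give a feel for the general shape of such a weighting, here is a rough sketch. It is not the actual Squiggle algorithm: the weights, the field names and the two teams' projections are all invented, and it leaves out the "closeness" scaling entirely.

```python
def round_score(before, after, w_rank=1.0, w_wins=2.0, w_pct=0.05):
    """Score one team's round from the change in its projected finish.

    before/after: dicts with projected 'rank', 'wins' and 'pct'.
    The weights are placeholders, not Squiggle's actual values.
    """
    rank_gain = before["rank"] - after["rank"]   # moving up is positive
    wins_gain = after["wins"] - before["wins"]
    pct_gain = after["pct"] - before["pct"]
    return w_rank * rank_gain + w_wins * wins_gain + w_pct * pct_gain

# Made-up projections for two teams either side of a round.
projections = {
    "Upset winner": (
        {"rank": 10, "wins": 10.2, "pct": 98.0},
        {"rank": 8, "wins": 11.0, "pct": 99.5},
    ),
    "Shock loser": (
        {"rank": 4, "wins": 14.5, "pct": 115.0},
        {"rank": 6, "wins": 13.8, "pct": 113.0},
    ),
}
scores = {team: round_score(b, a) for team, (b, a) in projections.items()}
best_round = max(scores, key=scores.get)
```

Changing the three weights changes who "won" the round, which is exactly why the real thing is labelled experimental.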

But I think it offers a pretty good map of the round, allowing a peek into the changing fortunes of each team, as prognosticated by the internet’s finest models.

The Aggregate Projected Ladder

In the same way that Squiggle Dials aggregate predictions from the internet’s best AFL computer models, so does the new auto-updating aggregate Projected Ladder!

As I write (post-Round 6), it looks like this:

There are some funny quirks to projected ladders, which are quite a bit weirder than they first appear. You can read some discussion of that at the bottom of that page, but the fundamental question is: What are we trying to predict? It’s not at all clear how we should rate the accuracy of a ladder prediction — for example, is it more valuable to correctly tip who finishes 1st than who finishes 12th? How much better? How do you score a ladder that gets the ranks right but had the number of wins all wrong, compared to one that was very close on wins but had some incorrect ranks?
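A small sketch makes the dilemma concrete. The two metrics below (mean absolute rank error and mean absolute wins error) are just two candidates among many, and the three-team league is made up purely to show them disagreeing.

```python
def order_of(wins):
    """Teams sorted by wins, most first (no percentage tie-break)."""
    return sorted(wins, key=lambda t: -wins[t])

def rank_error(predicted_wins, actual_wins):
    """Mean absolute difference in ladder position per team."""
    actual_pos = {t: i for i, t in enumerate(order_of(actual_wins))}
    predicted = order_of(predicted_wins)
    return sum(abs(i - actual_pos[t]) for i, t in enumerate(predicted)) / len(predicted)

def wins_error(predicted_wins, actual_wins):
    """Mean absolute difference in wins per team."""
    return sum(
        abs(predicted_wins[t] - actual_wins[t]) for t in actual_wins
    ) / len(actual_wins)

# Made-up three-team league.
actual = {"A": 16, "B": 15, "C": 8}
ladder_1 = {"A": 13, "B": 12, "C": 11}   # right order, wins well off
ladder_2 = {"A": 15, "B": 16, "C": 8}    # wins close, top two swapped

rank_scores = (rank_error(ladder_1, actual), rank_error(ladder_2, actual))
wins_scores = (wins_error(ladder_1, actual), wins_error(ladder_2, actual))
```

Lower is better on both. The first ladder is perfect on ranks but three wins out per team; the second is nearly spot-on for wins but has the top two the wrong way around. Which one is "more accurate" depends entirely on the metric you pick.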

It’s worth noting also that a ladder prediction is not the best way to answer questions like, “What are the chances that my team makes finals?” You can find those kinds of estimates from many Squiggle-friendly models, including FMI’s Swarms, Graft’s Projection Boxes and PlusSixOne’s Finishing Positions. They aren’t aggregated here, but are better targeted to those kinds of questions.

In the background, the Projected Ladder is recording the ladder predictions of each contributing model, so in the future it should be possible to go back and see how they evolved. We could even score them on how accurate they were — once we establish what it is, exactly, that we want to score.

Speaking for myself, I’m pretty sure that my Live Squiggle ladder predictions are quite a lot less intelligent than my game tips, simply because there isn’t a clear way to rate them, which makes them more difficult to refine and improve. A standard metric of some kind would help.

If you’d like to play around with this data, it’s available in a machine-readable format via the Squiggle API!

Podcast: Chilling With Charlie

There’s a terrific new podcast on sports analytics available from Robert Nguyen, author of the site Analysis of AFL and co-creator of the very popular R data package fitzRoy.

All the episodes are worth your time, but this one features me talking about the torment of Richmond fans and the genesis of Squiggle:

You can find it on iTunes Podcasts by searching for “Chilling With Charlie,” or via this link.

Introducing Fat Stats

One more model sneaks in ahead of the season! It’s Fat Stats, with a machine learning-based player metric model incorporating Elo.

That brings the number of new models to four, and the total field to 16 this year, including Punters, which is our aggregate of bookies.

That’s a lot of models! It’s double the number from only two years ago, and many (most?) are now player-aware, which means they take into account who’s actually taking the field each week, rather than modeling teams as a single entity.

Rise of the Machines

Squiggle will track two new machine learning AFL models in 2019: one from AFL Gains and another from AFL Lab.

I believe these are the first public models to lay claim to a machine learning heritage, so this is a good opportunity to see how they go in action, at least until the inevitable robopocalypse when they destroy us all.

In related news, most of 2018’s models are already up and running for the new year, including Massey Ratings, Swinburne University and reigning champ Live Ladders!

It’s 2019!

Here is a January 1 ladder prediction:

                        WINS
 1. RICHMOND            14.8
 2. MELBOURNE           14.8
 3. GWS                 14.3
 4. WEST COAST          14.1
 5. GEELONG             13.6
 6. COLLINGWOOD         13.2
 7. ESSENDON            13.0
 8. ADELAIDE            12.8
 9. Hawthorn            11.9
10. Port Adelaide       11.6
11. North Melbourne     10.6
12. Brisbane             9.5
13. St Kilda             9.5
14. Western Bulldogs     8.7
15. Sydney               8.3
16. Fremantle            7.3
17. Carlton              4.8
18. Gold Coast           3.6

Some notes:

  • Teams are ranked by average wins from 100,000 simulated seasons.
  • Unlike a regular ladder, it doesn’t round off wins to whole numbers and tie-break on percentage. That’s why it’s different to the quick-and-dirty Live Squiggle Ladder Predictor you may see on the right. This way is better.
  • That said! Historically, season-long predictions aren’t much more accurate than tipping everyone to win the same number of games as they did last year. So, you know.
  • This takes into account off-season list changes.
  • Predictions will continue to evolve as Squiggle is able to factor in pre-season results, major off-season injuries, and Round 1 team selections.
  • Sydney are that low because they tailed off badly in 2018 despite being able to put out something close to a full-strength team most weeks, and did not significantly bolster their team in the trading period.

How the Fixture Screwed St Kilda

It’s hard to prove the fixture affects any team’s finishing position. To say that, you need to find games that were so close that the result could easily have been different, and establish that they were scheduled in a lopsided way.

The only really clear examples of this tend to be self-inflicted wounds, where a club sells a home game and loses it narrowly — such as Richmond’s famous after-the-siren loss in Cairns to the Gold Coast in 2012, or St Kilda’s 3-point defeat by Brisbane in New Zealand in 2014.

These cases are nice and clear: Computer modeling can tell us with a high degree of certainty that Richmond’s home ground advantage at the M.C.G. is worth more than 2 points, and therefore they should have won the game if it had been played there. Likewise, St Kilda should have defeated Brisbane if they’d stuck to their regular home ground in the Docklands. Of course, you can point to many things that could have changed a close result, but one of them is the venue.

Otherwise, though, the picture is muddier. You can establish that a team had a favourable fixture — weaker opponents than average, or games at friendlier venues — but you can’t really say it was worth a certain number of wins. When Fremantle played Gold Coast “away” in Perth this year, due to the unavailability of Carrara, that was certainly unbalanced fixturing… but the Dockers won the game by 28 points, so would the result have been different in Queensland? Modeling suggests probably not.

However, you can say one thing for sure: St Kilda got screwed.

St Kilda (2018)
Net Benefit from Fixture: -123 points (ranked 18th)
Opposition: 18th; Timing: 12th; Home Ground Advantage: 18th

St Kilda is no stranger to scheduling screwings. It had the AFL’s worst fixture in 2014, the worst in 2015, the 9th best in 2016, and the worst in 2017. This year, it was the worst again, and that was also the worst fixture of any team in five years.

This analysis rests on three key factors: who you play (Opposition), when you play them (Timing), and where you play (HGA).

(Two factors that come up sometimes in these kinds of discussions, and which aren’t included because they don’t matter, are Six-Day Breaks and The Cumulative Effect of Travel. There is more on these at the end of the article.)

Opposition

The simplest factor is the strength of opposition. All clubs face each other at least once, of course, and the AFL attempts to equalize the competition by ensuring that Bottom 6 teams have no more than one double-up game against a Top 6 opponent.

This is fine in theory, but since this year’s performance isn’t a reliable guide to next year, it often works out less well in practice.

St Kilda was a Middle 6 team in 2017, finishing 11th. They were duly given a 2/2/1 split of double-up games: That’s two matches against Top 6 opponents, two against Middle 6 opponents, and one against a Bottom 6 opponent.

This was already a touch on the mean side, since two other Middle 6 teams were assigned the more favourable 1/2/2 split.

But the Saints’ real misfortune came from how its double-up opponents — Richmond, GWS, Melbourne, Hawthorn, and North Melbourne — performed unexpectedly well. This turned their 2/2/1 split into 4/1/0: four double-up games against this year’s Top 6, one against the Middle 6, and none against the Bottom 6.

And that put the Saints into a whopping 95-point hole for the season.
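The split itself is simple bookkeeping: sort each double-up opponent into the top, middle or bottom third of the ladder and count. In this sketch the function name is made up, and the 2017 finishing positions for the five opponents are assumed for illustration (only North Melbourne's 15th is stated in this article), so check them before relying on them.

```python
def double_up_split(opponent_ranks, n_teams=18):
    """Count double-up opponents in the top, middle and bottom thirds."""
    third = n_teams // 3
    top = sum(1 for r in opponent_ranks if r <= third)
    bottom = sum(1 for r in opponent_ranks if r > n_teams - third)
    middle = len(opponent_ranks) - top - bottom
    return top, middle, bottom

# Assumed 2017 finishing positions for St Kilda's five double-up opponents
# (Richmond, GWS, Melbourne, Hawthorn, North Melbourne). Only North
# Melbourne's 15th is given in the article.
ranks_2017 = [3, 4, 9, 12, 15]
split_2017 = double_up_split(ranks_2017)
```

With those positions, `split_2017` comes out as `(2, 2, 1)`, the split the Saints were handed. Feed the same function the opponents' strength a year later and you get the split they actually faced.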

You can also see that in practice there isn’t a whole lot of equalization going on here. Many of 2017’s stronger teams had weaker than average double-up opponents (Melbourne, Port Adelaide, GWS, Hawthorn, Richmond), while many of last year’s lower teams faced harder than average opposition (Gold Coast, Brisbane, Fremantle, Western Bulldogs, St Kilda).

At least North Melbourne, the recipient of this year’s most beatable opponents, had a generous fixture by design. After finishing 15th in 2017, the Kangaroos were given a 1/2/2 split, which turned out to be 0/1/4, with repeat games against Brisbane, Gold Coast, Western Bulldogs, St Kilda, and Sydney.

Timing

There’s a problem with considering strength of opposition like this: It assumes that teams are equally strong all season long. That’s clearly not the case. Team strength fluctuates both on a weekly basis, as star players get injured and miss games, and over the medium and long term, as a club gets stronger or weaker for any number of reasons: fundamental gameplan changes (Essendon), deciding the season is lost and looking to the future (Carlton), ramping up toward finals (West Coast, Melbourne), or simply having the wheels fall off from no discernible cause (Port Adelaide).

Each club will happen to play some opponents around their weakest point and meet others near their peak. Lucky clubs will meet more opponents at times of relative weakness; unlucky clubs will run into more at times of relative strength. It should average out, but doesn’t always. And since team form can rise and fall quite a lot, it can make a real difference.

After a stirring Round 1 victory over Adelaide, the Bombers lurched from one unconvincing performance to another, culminating in their Round 8 defeat at the hands of Carlton. This led to a very public dissection, staff changes, and a very different looking team for the remainder of the season.

As such, it was quite a lot better to face Essendon early in the year. In fact, it’s possible to identify the worst possible time to play Essendon: Round 21. This is when the Bombers were performing well as a team, but just before they lost Orazio Fantasia (Round 22) and Tom Bellchambers (Round 23). As it happened, the team they played in Round 21 was St Kilda.

(Note: This is a naive rating, which means it rates the apparent strength of a team each round before they played, in order that it remain uncontaminated by the performance of the team they played against. It means there’s often a real difference between how beatable a team appeared to be and how they performed on the day. Essendon provide a fine example of this, too, in Round 9, when, after looking abysmal against the Blues and then losing Hurley and Parish, they upset Geelong.)

In truth, though, St Kilda weren’t particularly screwed by timing; not this year. They rank around the middle of the league on that metric, receiving a mix of good luck (Geelong and North Melbourne early in the season) and bad (Adelaide early, Essendon and Hawthorn late).

The worst timing belongs to Fremantle, who managed to encounter a string of opponents near their peak: Collingwood (Round 23), Hawthorn (Round 19), Essendon (Round 18), Port Adelaide (Round 17), Sydney (Round 9), Carlton (Round 13), and Gold Coast (Round 3).

The Blues had the most fortunate timing, thanks to repeatedly running into teams who were losing key players to injury — although perhaps this is less to do with good fortune than their opponents seizing the opportunity to rest players. The Blues also had early-season games against teams who would dominate the season later: West Coast, Collingwood, Richmond, and Melbourne.

Home Ground Advantage

But back to the screwing. In theory, home ground advantage (HGA) is balanced: every team receives roughly the same advantage from its home games that it must face in its away games.

West Coast, for example, play ten home games against opponents traveling from interstate, and ten games to which they must travel, plus two local derbies. There’s no real HGA in a derby, and the benefit the Eagles receive from playing interstate sides at home is neatly counter-balanced by the penalty of traveling interstate to play away.

A Melbourne-based team such as Collingwood, by contrast, plays many games against other local opponents at relatively neutral venues in the M.C.G. and Docklands while hosting interstate sides about five times and traveling away about the same number.

Either way, the net benefit is roughly zero.

But there are exceptions. Sometimes teams give up a home venue they’re entitled to, such as Melbourne playing in the Northern Territory. Hawthorn and North Melbourne occasionally drag a Melbourne-based team to Tasmania, creating more venue-based advantage than there would otherwise be. And occasionally there are weird situations like the Commonwealth Games depriving Gold Coast of a home ground, sending the Suns to play a home game against Fremantle in Perth.

Also, sometimes a team is given unbalanced travel.

Now, HGA is hard to define, and various models use different methods. You can get a reasonable approximation simply by assigning 10 or 12 points of HGA whenever a team hosts an opponent visiting from out of state. A more sophisticated, but also more fragile, strategy is to attempt to divine particular teams’ affinity for particular venues from the historical record of how they under- or over-perform there.
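The simple out-of-state rule is easy to sketch. This is the crude approximation, not the Ground Familiarity model, and it assumes every away game is played in the opponent's home state. The function name and the 10-point default are choices made for this example; the game lists follow the descriptions in this article.

```python
def net_hga(team_state, games, points=10.0):
    """Net home ground advantage over a list of games, in points.

    games: list of (is_home, opponent_state). A game only counts when the
    opponent is from another state: +points at home, -points away.
    Assumes every away game is played in the opponent's home state.
    """
    total = 0.0
    for is_home, opponent_state in games:
        if opponent_state != team_state:
            total += points if is_home else -points
    return total

# West Coast as described above: ten interstate visitors, ten interstate
# trips, two derbies. "VIC" is a placeholder for any other state.
west_coast = (
    [(True, "VIC")] * 10 + [(False, "VIC")] * 10 + [(True, "WA"), (False, "WA")]
)

# St Kilda's 2018 interstate games from the list in this article:
# hosting Brisbane, Sydney, GWS, Adelaide; travelling to GWS, West Coast,
# Port Adelaide, Fremantle, Gold Coast.
st_kilda = [(True, s) for s in ("QLD", "NSW", "NSW", "SA")] + [
    (False, s) for s in ("NSW", "WA", "SA", "WA", "QLD")
]

west_coast_net = net_hga("WA", west_coast)
st_kilda_net = net_hga("VIC", st_kilda)
```

West Coast nets out to zero and St Kilda to minus 10 points. The rule is blind to trips to Kardinia Park or Tasmania, because those opponents are Victorian too, which is one reason to want something more sophisticated.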

I employ something in between, which is essentially a Ground Familiarity model. This awards HGA based on how familiar a team is with the venue and location; in practice, it serves as a proxy for an array of factors that probably comprise HGA in the real world, including crowd noise, travel time, and psychological disruption.

There’s a fair argument that Sydney wasn’t actually the second-most HGA advantaged team this year, because Sydney didn’t play its home ground very well. Similarly, many believe that Richmond received more advantage from M.C.G. games this year than the average M.C.G. tenant. Such ideas are popular, but tend to be transient and based on few data points. For example, for me, Richmond’s much-discussed four interstate losses are more easily explained by the fact that those were its four hardest games. So there is no attempt here to model that kind of theory.

Then again, a Ground Familiarity model has quirks of its own. Much of the reason Sydney scores well on this measure is that the Swans played 18 matches at just three grounds: the S.C.G., the M.C.G., and Docklands. They traveled to Western Australia just once and South Australia not at all. This means the Swans frequently play away at grounds they’re reasonably familiar with, while their opponents don’t have the same experience at the S.C.G.

This small but persistent imbalance affects Docklands tenants in reverse: They are almost always less familiar with their opponents’ home grounds than their opponents are with theirs. For example, the Saints are relatively familiar with Perth for a Victorian team, being dispatched there at least once a year, and four times in the last two years. But when they last met the Eagles in Melbourne, that was the Eagles’ fifth trip to Docklands that year alone. The venue still offered St Kilda an advantage (especially compared to Perth), but it was a little less than it might have been.

Therefore, under a Ground Familiarity model, being based at the Docklands is the worst option. You probably don’t even get to play home finals there.

But the real culprit behind St Kilda’s poor rating on home advantage is their persistent travel imbalance:

St Kilda’s Games vs Non-Melbourne Opponents

Year     Home  Away  Difference
2014     5     7     -2
2015     5     7     -2
2016     5     6     -1
2017     6     6      0
2018     4     7     -3
Average  5.0   6.6   -1.6

Having to travel to your opponents more often than they travel to you is a clear source of home ground advantage disparity. This is rare for non-Victorian teams, who almost always have a 10/2/10 split, but common for those in Melbourne: They will often benefit a little more or a little less from travel than their opponents. For the Saints, almost every year, it’s less.

This year’s version was the most extreme yet. St Kilda enjoyed home advantage from hosting travelling teams only four times (Brisbane, Sydney, GWS, and Adelaide), while facing disadvantage from travelling interstate five times (GWS, West Coast, Port Adelaide, Fremantle, and Gold Coast), as well as having to visit Geelong at Kardinia Park (with no return home match), and Hawthorn in Tasmania.

The Saints have actually been sent to play North Melbourne or Hawthorn in Tasmania for five years in a row now, each time turning what would be a neutral game at Docklands or the M.C.G. into one where the venue favours the opposition.

Three extra games of significant disadvantage is quite a lot. The ratio is eyebrow-raising, too; the equivalent of a non-Victorian team playing 8 home games and 14 away.
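The arithmetic behind the averages and that 8-and-14 comparison can be checked in a few lines, using only the numbers from the table of St Kilda's games against non-Melbourne opponents. The scaling to a 22-game season is the same one the comparison implies.

```python
from fractions import Fraction

# St Kilda's games against non-Melbourne opponents, from the table.
home = {2014: 5, 2015: 5, 2016: 5, 2017: 6, 2018: 4}
away = {2014: 7, 2015: 7, 2016: 6, 2017: 6, 2018: 7}

avg_home = sum(home.values()) / len(home)
avg_away = sum(away.values()) / len(away)
avg_diff = avg_home - avg_away

# 2018 had 4 home and 7 away; scale that share up to a 22-game season.
share_2018 = Fraction(home[2018], home[2018] + away[2018])
equiv_home = share_2018 * 22
equiv_away = 22 - equiv_home
```

Four home games out of eleven is the same ratio as 8 out of 22, leaving 14 away.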

Sooner or later, the law of averages will ensure that St Kilda get lucky with their double-up opponents, or else their timing. But their unbalanced travel is an enduring, deliberately fixed anchor. It ensures that whenever fortune smiles on the Saints, it’s more of a smirk.

St Kilda’s Fixture: A History

Year  Fixture Rank  Oppo  Timing  HGA
2014  18th          12th  11th    18th
2015  18th          12th  17th    18th
2016  9th           2nd   10th    14th
2017  18th          13th  16th    14th
2018  18th          18th  12th    18th
