Virtually Season 2020

Since there is no actual football, we will do the next best thing: simulate it.

With the support of the world’s best football computer models, Squiggle will play out each and every cancelled game in real-time, as if it were really happening.

Goals. Behinds. Score worms. Quarter time breaks. They will all unfold here at the exact same time the match is supposed to be played.

It will look like this:

This will continue all season long, game by game, until actual games resume. We will track a virtual Ladder and Top Eight. If, God help us, we don’t get real football back by September, we will hold virtual Finals and award a virtual Premier.

This Thursday night at 7:25pm Eastern Time, Collingwood will play Richmond in the first virtual match in real-time, here on this site. You can check in and see it happen.

Why Tho

I believe Australia needs football. I need football. Or, in the absence of the real thing, a simulated version from computer models.

How It Works

Usually models make predictions about the most likely outcome of a game (e.g. “Collingwood by 4 pts”). But they can also generate batches of simulations: if Collingwood is a 60% chance to beat Richmond, then in 100 sims, Collingwood will win about 60 of them, and Richmond the other 40. (In a few, unusual things might happen, like a team scoring over 120 points.)

Participating models supply Squiggle with their sims. At match time, Squiggle randomly plucks one out and unspools it in real-time. No-one knows in advance which sim it will be.
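The mechanic above can be sketched in a few lines: generate a batch of sims whose win rate matches the model's probability, then pluck one at random at match time. This is a toy illustration, not any participating model's actual code, and the margins are just made-up noise.

```python
import random

def generate_sims(p_home_win, n=100, seed=None):
    """Toy batch of simulated final margins (home team's perspective).

    Roughly p_home_win of the sims are home wins, matching the model's
    win probability. Margin sizes here are purely illustrative.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n):
        if rng.random() < p_home_win:
            sims.append(rng.randint(1, 60))    # home win by 1-60
        else:
            sims.append(-rng.randint(1, 60))   # away win by 1-60
    return sims

# At match time, pluck one sim at random and "play" it out.
sims = generate_sims(0.60, n=100, seed=1)
chosen = random.choice(sims)
```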

My Promise To You

This is as rigorous a process as I can make it, drawing from the work of highly talented football analysts and math wonks who created the world’s best football models.

There will be no bias or fiddling. Just hard maths and cruel random variation.

It’s not the real thing. But it’s virtually season 2020.

Squiggle’s Ladder Prediction for 2020

Here’s Squiggle’s own in-house ladder prediction for 2020 (not to be confused with the Aggregate Ladder, which combines this plus predictions from many other AFL models).

This prediction accounts for:

  • 2019 form
  • Trades, retirements, delistings and returns
  • 2020 preseason form
  • Injuries to players listed as “Season” or “Indefinite”
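One way to picture how these four factors might combine is a simple weighted sum. The weights and scales below are entirely hypothetical, invented for illustration; Squiggle's real model and its weighting are not described here.

```python
# Hypothetical weights for the four factors listed above.
# The real model's weights and scales are not public.
FACTORS = ("form_2019", "list_changes", "preseason_2020", "injuries")
WEIGHTS = {"form_2019": 0.55, "list_changes": 0.20,
           "preseason_2020": 0.15, "injuries": 0.10}

def combined_rating(team_scores):
    """Weighted sum of per-factor scores (each on a comparable scale)."""
    return sum(WEIGHTS[f] * team_scores[f] for f in FACTORS)

# Illustrative numbers only.
richmond = {"form_2019": 10.0, "list_changes": 0.5,
            "preseason_2020": -2.0, "injuries": 0.0}
```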

The league was very even in 2019, so it’s going to be harder than ever to make a good ladder prediction. But this is what I’ve got:

2019 Form

Squiggle’s top teams at the end of 2019 were Richmond, Geelong, Hawthorn, and Collingwood. The Cats were widely lambasted for their post-bye form last year, but it wasn’t actually that bad – it was just clearly less good than their 11-1 start (at a percentage of 151%).

The Hawks had a strong finish, Collingwood were 4 points shy of a Grand Final, and Richmond were, well, Richmond.

Notably absent from Squiggle’s 2019 Top 4 were West Coast (8th), Brisbane (5th), and, despite their late surge, the Bulldogs (6th). Squiggle was bearish on the Eagles throughout 2019, primarily because of their reliance on high goalkicking accuracy to win matches – something that, despite much effort, no team has ever been able to sustain for long.

Trades, Retirements, Delistings and Returns

Squiggle uses AFL Player Ratings to gauge the likely impact of list changes between 2019 and 2020, including the return of players who missed games late last year. This last factor is often the important one, as most clubs put out weakened teams towards the end of 2019.

On this measure, the clubs with the most upside are Fremantle (regaining Lobb, Ryan, Wilson, Hogan*, Hill, Pearce, Colyer, and Cox, while recruiting Acres and Aish) and GWS (regaining Coniglio, Whitfield, and Ward, while recruiting Sam Jacobs), followed by Gold Coast, Collingwood, and Carlton.

At the other end of the scale, the only club to have gone backwards is Adelaide (losing Greenwood, Jacobs, Douglas, Betts), while Brisbane (losing Hodge), Richmond, and North Melbourne have relatively little to add to their sides in 2020.

There’s been a lot of talk about Tim Kelly, but despite his stellar numbers, his trade doesn’t single-handedly drag Geelong or West Coast out of the pack (in either direction).

2020 Preseason Form

The preseason usually contains a few hints about regular season form, and at this time of the year, we don’t have much else. The best preseason performers in 2020, after accounting for the quality of their opposition, were Gold Coast, GWS, St Kilda, Essendon, Port Adelaide and Melbourne.

The worst were Geelong, Carlton, Hawthorn, Richmond, Adelaide, and Sydney.

Long-Term Injuries

For most of the off-season, Squiggle rated Hawthorn a Top 4 team in 2020. But long-term injuries to Howe, Impey, and Hardwick have sent them tumbling to the lower reaches of the final 8.

Also hampered by long-term injury this year are Fremantle (Hogan, Hamling), Collingwood (Beams, Greenwood, Langdon), and Carlton (Curnow).

Summary

The punditry is big on West Coast this year, with the Eagles a popular flag tip and Top 4 lock. Most computer models, however, are much cooler, placing them no higher than 3rd and as low as 11th.

Models have a pretty good record in situations like this, when there’s a divergence of opinion but not because people know something that models don’t. However it shakes out, it’ll be interesting to watch.

Squiggle is high on the Bulldogs, ranking them 2nd, although only by a slim margin. What’s remarkable about the Dogs is how young they are: They’ve been fielding shockingly young teams for two years. Younger teams lose matches pretty reliably, so the ability of the Dogs to make finals in 2019 despite their age profile speaks to their potential upside.

More than any time since 2000 – perhaps since 1993 – we have a very even field entering the new season, so expect surprises! The ladder could be very volatile, with teams surging and plummeting, and a large middle cluster sitting within one or two games of each other.

All I Want for Christmas is an AFL API

If you want to do your own football analysis today – write an article, create a chart, build a neat online tool – you can’t legitimately acquire the most basic stats about AFL matches, not even the scores.

You can manually browse to a website and eyeball the scores. But these pages have Terms of Use that prohibit any downloading or reuse of content, like this one linked from AFL.com.au:

The Copyright Act 1968 (Cth) protects materials such as films, music, books and computer programs. You can break the law if you download, copy, share or distribute this material, unless you’re allowed to do so by the Copyright Act or you have the copyright owner’s permission. Please don’t use our services to do any of these things, because if you do, we might have to cancel your services (including your email account) and the copyright owner could take legal action against you.

In practice, small operators – armchair analysts and independent sites – either laboriously compile these stats themselves, or else ignore the Terms of Use and write programs to download them from somewhere else anyway. Not everyone can do this, though, and for those who can, it can be tedious and time-consuming, as whenever the website updates its format, the scraper stops working or begins pulling corrupt data. And sometimes the source just plain disappears.

The AFL could and should create an API: a simple online interface that publicly serves up very basic football data such as match scores in a computer-readable format. It could do this simply, cheaply, and without exposing any advanced stats that Champion Data rightly consider to be proprietary and valuable.

This would:

  1. Dramatically lower the barrier to entry for anyone with an interest in building something on top of football stats, allowing them to get started with a bunch of basic, legal data.
  2. Signal an interest in and acknowledgment of the growing amateur/semi-pro analytics community and its audience.
  3. Grant the AFL some control over what’s happening. At the moment, it has a fence around every single piece of data, a bunch of tunnels going underneath, and no idea who’s digging them or why. If it added a gate to the fence, many people would use it, because gates are easier.
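To be concrete about how basic such an API could be: something like the JSON payload below would cover the minimum. The endpoint, field names, and values are all hypothetical; no such AFL API exists, which is the point.

```python
# A sketch of the minimal, computer-readable payload such an API
# could serve. Field names and values are illustrative inventions,
# not a real AFL endpoint.
import json

match = {
    "round": 1,
    "date": "2020-03-19",
    "hteam": "Richmond",
    "ateam": "Carlton",
    "hscore": 105,
    "ascore": 81,
    "venue": "MCG",
    "complete": True,
}

payload = json.dumps({"games": [match]})
```

No advanced stats, no proprietary Champion Data numbers — just the scores that currently sit behind Terms of Use.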

Today there are excellent free APIs for practically all major world sports, except AFL. There are dozens for cricket and rugby, and hundreds for soccer. In the US, you can’t move for tripping over a baseball, basketball, or football API. But for AFL: nothing.

Regardless of where you land in the wider debate over exactly which stats should or shouldn’t be kept secret, surely no-one is being served when basic match scores are kept under legal lock and key. Fixing this could create a platform for analytics innovation, discussion, and expansion.

Please, Santa?

The Squigglies 2019: Home & Away

With the home & away season tucked away, it’s time to look back and see whether all those preseason predictions that were floating around earlier in the year turned out to be prescient… or putrid.

It’s time for the Squigglies 2019! (Home & Away edition)

Every Expert Preseason Ladder Rated

Best Ladder Prediction: AFL Lab

Head and shoulders above the rest, AFL Lab correctly tipped Geelong for the minor premiership and Sydney to finish bottom 4. It was one of few ladders to resist the temptation to fit Adelaide into the Top 8. While, like every ladder, it had Melbourne much too high (3rd) and Brisbane much too low (12th), it is otherwise excellent, with no fewer than 13 teams tipped within 1 rung of their actual position. Score: 71.5

Runner-up: HPN Footy (68.3)

Best Ladder by a Human: Paul Bastin

In March, AFL.com.au gushed forth ladder predictions from no fewer than 15 journos. Only one of them squeezed a computer model out of the top 5: Paul Bastin. Paul was bullish on Brisbane (8th) and bearish on Sydney (14th), but done in by his faith in Adelaide (3rd) and lack thereof in the Bulldogs (15th). Score: 67.7

Runner-up: Nat Edwards (65.5)

Best Ladder by a Crowd: AFL.com.au readers

A few media outlets ran preseason fan surveys, drawing on the wisdom of the crowd to compile ladder predictions. Some crowds were more prescient than others. The best was from AFL.com.au, which finished 6th overall, beating out every single expert from the media but Paul Bastin. Score: 66.7

Of other crowd-sourced predictions, The Roar and The Age were also better than most pundits. Reddit r/AFL’s attempt, however, was only marginally better than taking the 2018 ladder and guessing it would be the same again.

Runner-up: The Roar readers (63.3)

Worst Ladder: Damien Barrett

Look, predicting the ladder is hard. It makes fools of us all. Unfortunately, someone has to be last, and this year it’s Damien Barrett, who tipped Adelaide for the minor premiership and Fremantle to storm into finals, alongside Sydney and Melbourne. Damien didn’t have enough faith in Brisbane (15th) or the Bulldogs (14th), and expected Geelong to slide out of finals contention. With only one of the top four correct (Richmond), half of the top eight missing, and ten teams wrong by three or more rungs, it’s a shocker. Score: 50.4


How to Predict the Ladder in Nine Stupid Steps

You’re an intelligent person, probably, with opinions about football teams. Occasionally you might want to employ those qualities to predict what the ladder will look like at the end of the year.

So how, exactly, does someone do that? What is the ideal process?

The answer, my friend, is a journey through madness and despair. The first step is stupid, yet with each successive step, it somehow gets worse.

Let me walk you through it.

Step 1: Eyeball the teams and guess

Sure. Anyone can do that. Your ladder looks reasonable, but you’re not even properly considering the fixture. What about teams that have an easy or hard run home?

Step 2: Go through the fixture and manually tip all the games

There we go. You have now accounted for fixture bias. And you have a ladder with… wait, Geelong on 20 wins. They’re good, but that seems ambitious. How did that happen?

Oh, of course! You didn’t tip upsets. In reality, favourites lose about 30% of the time.

Step 3: Throw in a few upsets

Now things look more realistic. Geelong have 16.5 wins. You threw in a draw because you couldn’t bring yourself to say they’d lose to Sydney. You don’t actually expect that game to be a draw, of course. In fact, you don’t really expect most of your upsets to come true. That’s why they’re upsets: they’re unlikely by definition.

So… now your ladder is based on results even you don’t believe in. Uh.

Step 4: Calculate expected wins

All right. Time to piss off the ladder predictor and get serious. What you’re doing now is going through each game and awarding a percentage of a win to each team based on how likely it is. Collingwood are a 60% chance to beat North Melbourne, so that’s 0.6 wins to the Pies and 0.4 wins to North.

This is better. You’ve successfully accounted for the likelihood of upsets, without having to guess exactly when they will occur. You just averaged the possibility of them over the course of the season. Smart.
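The expected-wins calculation is just summing each game's win probability, as in the Collingwood example above. A minimal sketch (fixture and probabilities illustrative):

```python
from collections import defaultdict

def expected_wins(fixture):
    """Sum each game's win probability per team.

    fixture: list of (home, away, p_home_win) tuples.
    A 60% favourite banks 0.6 of a win; the underdog banks 0.4.
    """
    xw = defaultdict(float)
    for home, away, p in fixture:
        xw[home] += p
        xw[away] += 1 - p
    return dict(xw)

fixture = [("Collingwood", "North Melbourne", 0.60)]
xw = expected_wins(fixture)  # Pies 0.6, North 0.4
```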

So let’s see. You now have Collingwood on 14.2 wins total, and right behind them, GWS on 14.1 with a much healthier percentage. Hmm. So you’re basically forecasting each team to win 14 games, and for GWS to have a better percentage, but for the Pies to finish above them.

Shit.

Step 5: Round those fuckers off

No-one wins 14.2 games! You can’t win a fraction of a game! What your number really means is that Collingwood will win about 14 games while leaning toward more rather than fewer. So if you round everything off, it works. Collingwood: 14 wins. GWS: 14 wins. Percentage comes into play. GWS go higher. Done.

Except… further down there’s North Melbourne on 10.5 wins and Essendon on 10.4. They’re almost identical, but you have to round them in different directions. That puts North one whole win ahead of Essendon. Well, that’s probably still okay. I mean, they’re still in the right order. And your numbers really do have North closer to 11 and Essendon closer to 10. So they’re rounded. Moving on.

Next is Fremantle on 9.5 wins with a better percentage than Essendon. So… the Dockers… also… round to… 10 wins… and move above the Bombers.

Now the rounding is messing with the order. You originally calculated that Essendon and North are in close competition with Fremantle a game behind, but after rounding, you’re putting North clearly ahead with Essendon third of the bunch. That’s not great.

And that’s not all! Look at the shit that transpires when there are two rounds to go! At that point, it’s logically impossible for certain teams to finish in certain spots, because of who plays whom, but your fractional wins are putting them in those spots anyway! What the fuck!
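The rounding trap above is easy to reproduce. Using the numbers from the text, near-identical expected wins round apart, and Fremantle pulls level with Essendon:

```python
import math

def round_half_up(x):
    # Python's built-in round() uses banker's rounding (round(10.5) == 10),
    # so round half up explicitly to match the ladder-predictor instinct.
    return math.floor(x + 0.5)

xwins = {"North Melbourne": 10.5, "Essendon": 10.4, "Fremantle": 9.5}
rounded = {team: round_half_up(w) for team, w in xwins.items()}
# North: 11, Essendon: 10, Fremantle: 10 — a 0.1-win gap becomes a
# full win, and Fremantle's better percentage lifts it past Essendon.
```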

Step 6: Simulate games

You know what you need? A FUCKING COMPUTER. You can’t do all this shit on paper and spreadsheets. You need to write a GOD DAMN PROGRAM to run through every single game and roll a die or whatever a computer does to pick a random number. Then, because it can calculate footy stats all day and not get asked to take the dog for a walk or fix the wobbly chair, it can do that TENS OF THOUSANDS OF TIMES.

All right. All right. You now have a simulation that can figure out the likelihood that percentage comes into play when deciding ladder positions. You still have to average out finish placings, so you have the same issue with occasionally tipping logically impossible placings. Is mode better than mean here? Who knows. It’s an improvement. Moving on.
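The core of that GOD DAMN PROGRAM is small. Here's a toy version: roll every game independently, tally wins, repeat thousands of times, and count how often each team finishes on top. (It ignores percentage tiebreaks and everything else the later steps complain about.)

```python
import random
from collections import Counter

def simulate_season(fixture, n_sims=10_000, seed=42):
    """Monte Carlo season sim: roll each game as an independent
    coin weighted by its win probability, tally wins, and count
    how often each team tops the win table.

    fixture: list of (home, away, p_home_win) tuples.
    Returns {team: probability of finishing with the most wins}.
    """
    rng = random.Random(seed)
    top_counts = Counter()
    for _ in range(n_sims):
        wins = Counter()
        for home, away, p in fixture:
            wins[home if rng.random() < p else away] += 1
        top_counts[max(wins, key=wins.get)] += 1
    return {team: c / n_sims for team, c in top_counts.items()}
```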

Wait. Some numbers seem a bit wacky. There might be a bug or two in those hundreds of lines of code you just wrote. Yep. Go fix those.

And while you’re poking around, ask yourself: Does the language you used employ a half-arsed random number generator that prioritizes speed over correctness, which completely falls apart when you call it forty thousand times per minute? Well shit! Yes it does! Now that you’re reading the documentation, you see that for actual randomness, you need to use a special module with an interface written in Russian! And don’t forget to ensure your machine has an adequate source of entropy! What the hell is entropy? Where do I get that from? The entropy shop?

Step 7: Fix bugs and supply adequate entropy

This simulator seems pretty damn sure of itself, you have to say. You fixed its bugs and gave it all the entropy it could desire, but this thing insists there’s no way a low-ranked team could ever make a late run for the finals. It’s guaranteeing Geelong top spot even though they’re only two games ahead with half a season to play.

It’s overconfident. It’s treating each match as an independent random event, but you know that if Fyfe’s knee blows out, Fremantle’s results will start looking pretty goddamn dependent. You need to simulate the chance that each team can get fundamentally better or worse as the season progresses. How do you do that? Oh, the experts disagree. Super, super.
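One simple way to model that (the experts disagree, remember) is a between-round random walk on each team's underlying rating, so strength isn't fixed for the whole season. The `sd` parameter is a made-up tuning knob:

```python
import random

def drift_ratings(ratings, sd=1.5, rng=None):
    """Between-round random walk on underlying team ratings.

    Each team's rating takes a small Gaussian step, so a team can
    genuinely improve or decay as the simulated season progresses.
    (One simple approach among several; sd is a tuning parameter.)
    """
    rng = rng or random.Random()
    return {team: r + rng.gauss(0, sd) for team, r in ratings.items()}
```

Resample the drift inside each simulated season, and the sim stops guaranteeing Geelong top spot with half a season to play.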

Step 8: Simulate seasonal changes in fundamental team ratings

You did it. You created a full-bore simulator made from belted-together hunks of stolen code, and occasionally you discover a horrifyingly fundamental bug, but god damn it, it works. It mostly works.

Of course, you had to make a lot of design decisions along the way. You’re maybe not a hundred percent confident in all of those choices. To test them, you need to run this thing against real-world results, a lot of them. Like decades’ worth. And that requires a method of scoring your ladders’ accuracy. Hmm. There are several different ways of doing that. They’re all complicated.
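The text is right that there are several complicated ways to score a ladder's accuracy. Here is one of the simplest, purely as an illustration: mean absolute difference in rungs between predicted and actual order. (This is not necessarily the metric behind the Squigglies scores.)

```python
def ladder_error(predicted, actual):
    """Mean absolute rung error between two ladder orderings.

    predicted, actual: lists of team names in ladder order.
    0 means a perfect ladder; bigger is worse. One simple metric
    among the several complicated options the text alludes to.
    """
    pos = {team: i for i, team in enumerate(actual)}
    return sum(abs(i - pos[t]) for i, t in enumerate(predicted)) / len(predicted)
```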

Step 9: Revise model based on score against large data sample

I’m not sure what happens after this. I’m sure it’s something. This is as far as I’ve made it.

At this point, you can pause, reflect on your efforts, and observe that your ladder predictions are often outperformed by random BigFooty posters employing the eyeball-and-guess method.

God damn it.

Squiggle Ladder Predictor: Predict the final ladder!

Rate My Ladder: Score your prediction!

Who Won the Round?

When you come off a good win, you don’t just want to analyze how great you were compared to the other team; you want to see how great you were compared to ALL the other teams.

Sadly, it’s hard to establish objectively how much better (or worse) Richmond’s defeat of Hawthorn was than Collingwood’s thumping of St Kilda, for example, or any of the round’s other games.

Until now! Squiggle offers an algorithmic ranking of who had the best round. Using data from the aggregate Projected Ladder, which brings together the predictions of many different excellent AFL prediction models, it determines how the weekend’s results impacted each team by comparing how their predicted ladder finishes changed.

This is all based on pre-round expectations, so an upset win can be hugely meaningful for a team, radically improving its prospects of finishing higher on the ladder. Equally, a shock loss can be catastrophic, as the cold-hearted computer models begin shaving down its finals chances.

The importance of “eight-point games” is clearly visible, too, where teams that defeat an opponent competing for the same ladder spots are recognized both for advancing their own position and damaging their competitor’s.

To have an outstanding weekend outside of “eight-point games,” teams need to rely on other results falling fortuitously, so that teams around them lose, while teams too far above or below to matter win.

The current algorithm is a bit experimental, since it applies a weighting to decide the relative importance of changes in predicted ranks vs wins vs percentage. It also applies its own ideas in determining how much to scale these based on the predicted “closeness” of teams, and therefore who is competing with whom for which spots. So it’s currently in beta.
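The weighting described above might look something like this sketch. The structure (rank change vs win change vs percentage change) comes from the text; the weight values and function shape are invented for illustration, since the real algorithm is still in beta:

```python
def round_impact(before, after, w_rank=1.0, w_wins=0.5, w_pct=0.1):
    """Score each team's round by changes in its predicted ladder finish.

    before, after: {team: (predicted_rank, predicted_wins, predicted_pct)}
    snapshots of the aggregate Projected Ladder, pre- and post-round.
    Weights are illustrative; the post calls its real weighting
    experimental.
    """
    impact = {}
    for team, (r0, w0, p0) in before.items():
        r1, w1, p1 = after[team]
        impact[team] = (w_rank * (r0 - r1)      # rungs climbed
                        + w_wins * (w1 - w0)    # extra projected wins
                        + w_pct * (p1 - p0))    # percentage gained
    return impact
```

An upset winner jumps on all three terms at once, which is why shock results dominate the ranking.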

But I think it offers a pretty good map of the round, allowing a peek into the changing fortunes of each team, as prognosticated by the internet’s finest models.