Thursday, September 3, 2020

Welcome, and an introduction to ZaphBot







ZaphBot is an AFL football tipping algorithm/model many years in the making, and of questionable quality.

When I say 'many years' I'm not exaggerating: I wrote the first version of this algorithm back in ~1981/82 on a TI-99/4A, when there were only 12 teams in the competition.  In fact, I still have the disk - even if it is a little worse for wear:




I continued to tweak the algorithm on and off over the years, using it as my basis for tipping competitions and usually doing "OK".  It took a few hits when the number of teams increased, but I still fire it up to enter competitions and tweak the algorithm.

The algorithm never had a name, but when it came time to post on the net I realised I had to call it something, so ZaphBot was born - simply my nickname with 'Bot' attached.  I'm sure I'll come to regret that!

So, what is the algorithm and why am I posting stuff on the internet?

I'm posting this here (and my tips on Twitter) because I want to interact more with others who have made algorithms and models, including the Models Leaderboard over at Squiggle.  Maybe one day I could join them!

I am also trying to write a better algorithm that can beat it - so you can follow my adventures as I do that here!

The Algorithm

ZaphBot is made up of lots of bits of algorithms that I found over the years.  The key thing is that the only information it has is the results of games (scores, teams).  It currently doesn't even look at the ground the match is at; whichever team is named first is assumed to be playing at home (worked great in the '80s... not so great in 2020!).

The core is the concept of looking for previous matches that involve the current teams, and combinations or chains with those teams.  This comes from an article my Dad showed me when I was about 15 - long lost now, but it was interesting back then!  The idea is that if Geelong beat Carlton, and Carlton beat Hawthorn, then Geelong will probably beat Hawthorn.

The bot grabs all such chains from the last X weeks, applies weights to the results and averages them.  It also finds longer chains, such as Geelong d Richmond, Richmond d Carlton, Carlton d Hawthorn, and so on.
Of course it's never that easy.  Often it is Geelong d Carlton and Hawthorn d Carlton (sorry Blues supporters!), and then it comes down to margins etc.  A home-ground advantage is also applied.
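To make the chain idea concrete, here is a minimal sketch in Python.  It is not ZaphBot's actual code: the `(team_a, team_b, margin)` result format, the function name, and the `max_links`, `decay` and `home_advantage` parameters are all made up for illustration.  It assumes the results list has already been cut down to the last X weeks, and it simply adds up margins along each chain and trusts longer chains less.

```python
from collections import defaultdict

def chain_margin(results, home, away, max_links=3, decay=0.5, home_advantage=0.0):
    """Estimate home's margin over away from chains of past results.

    results: list of (team_a, team_b, margin), where margin is team_a's
    score minus team_b's score (a hypothetical format for this sketch).
    Returns None if no chain connects the two teams.
    """
    # edges[a][b] is a's margin over b (negative if a lost).
    # If two teams met more than once, the later result overwrites the earlier.
    edges = defaultdict(dict)
    for a, b, margin in results:
        edges[a][b] = margin
        edges[b][a] = -margin

    estimates = []  # one (implied margin, weight) pair per chain found

    def walk(team, total, links, seen):
        if team == away and links > 0:
            # Each extra link in the chain multiplies the weight by `decay`.
            estimates.append((total, decay ** (links - 1)))
            return
        if links == max_links:
            return
        for nxt, margin in edges[team].items():
            if nxt not in seen:
                walk(nxt, total + margin, links + 1, seen | {nxt})

    walk(home, 0, 0, {home})
    if not estimates:
        return None
    weight_sum = sum(w for _, w in estimates)
    # Weighted average of all chain estimates, plus an assumed flat home bonus.
    return sum(m * w for m, w in estimates) / weight_sum + home_advantage
```

With Geelong d Carlton by 20 and Carlton d Hawthorn by 10, the sketch tips Geelong over Hawthorn by 30.  If instead Hawthorn d Carlton by 35, the same chain gives Geelong 20 - 35 = -15, so the margins decide it and Hawthorn gets the tip.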

The weightings are then randomly adjusted until the bot starts to get better results over a test set of data.  Note that I was doing this as a teenager in the '80s - long before I knew about AI models.  It's probably closest to a genetic algorithm, but honestly it's just a mess!
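The random adjustment can be sketched roughly as below.  Strictly speaking this is random hill climbing rather than a genetic algorithm, and it is only an illustration: the `tune_weights` name, the dictionary of weights and the `score` function (imagine "number of correct tips over the test set") are assumptions, not the real thing.

```python
import random

def tune_weights(weights, score, rounds=1000, step=0.1, seed=0):
    """Randomly nudge one weight at a time, keeping changes that score better.

    weights: dict of weight name -> float (hypothetical names).
    score:   function taking a weights dict and returning a number,
             where higher is better.
    """
    rng = random.Random(seed)       # seeded so runs are repeatable
    best = dict(weights)            # copy, so the caller's dict is untouched
    best_score = score(best)
    for _ in range(rounds):
        candidate = dict(best)
        name = rng.choice(sorted(candidate))          # pick one weight
        candidate[name] += rng.uniform(-step, step)   # nudge it a little
        candidate_score = score(candidate)
        if candidate_score > best_score:              # keep only improvements
            best, best_score = candidate, candidate_score
    return best, best_score
```

One thing to watch with this kind of tuning is that it happily overfits the test set, so it pays to score the final weights on games the tuner never saw.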

In the last 10 years I added some Elo ranking bonuses to the mix, as well as individual team travel penalties, and some more ideas.
I simply add each idea to the mix with a randomized weighting, and if the addition appears to be valuable it gets included and starts to make a difference.
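For anyone who hasn't met Elo before, this is the standard rating update in Python.  It only shows the textbook formula; how the ratings get turned into a bonus inside ZaphBot isn't covered here, and the function name and the `k=32` default are just common illustrative choices.

```python
def elo_update(rating_a, rating_b, result_a, k=32):
    """Return the new (rating_a, rating_b) after one match.

    result_a: 1.0 if team A won, 0.5 for a draw, 0.0 if team A lost.
    k:        how far a single result can move a rating (assumed value).
    """
    # Expected score for A given the rating gap (standard Elo formula).
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))
    change = k * (result_a - expected_a)
    # Whatever A gains, B loses.
    return rating_a + change, rating_b - change
```

Two evenly rated teams on 1500 end up on 1516 and 1484 after a win, while an upset win over a much higher-rated team moves the ratings further.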

The end result is a bot that, in a normal season, competes reasonably well but not perfectly.  Last season it finished third in our work tipping contest (~50 people), and also won the last-one-standing contest (pick a different winner each week, until you get one wrong or run out of teams).  Historically it's always managed to be in the upper end of the tipping, but never clearly on top.
I also run it in the Monash University Probabilistic Tipping competition.

Now I'm going to see if the old bot can survive, or if I can come up with an improved model!

That's it for this post.  Next I'll talk about what I'm doing now in trying to beat it - and the progress so far.

