Thursday, September 24, 2020

Round 18 recap, & Round 19 tips

Round 18 was a pretty good round for the bots (and many other people).

ZaphBot managed 8/9, missing only on its Collingwood tip.
ZaphBot-New picked all 9, since it only differed on that one match.  I'm not reading too much into that result at this point.

ZaphBot Round 18 Results
✔ Richmond by 39, 80% chance
✔ Brisbane by 27, 75% chance
❌ Collingwood by 1, 52% chance
✔ Melbourne by 7, 56% chance
✔ Western Bulldogs by 1, 51% chance
✔ Hawthorn by 19, 65% chance
✔ West Coast by 38, 80% chance
✔ StKilda by 22, 75% chance
✔ Geelong by 34, 80% chance


ZaphBot-New Round 18 Results

✔ Richmond by 49, 90% chance
✔ Brisbane by 28, 77% chance
✔ Port Adelaide by 2, 53% chance
✔ Melbourne by 7, 57% chance
✔ Western Bulldogs by 8, 57% chance
✔ Hawthorn by 11, 61% chance
✔ West Coast by 33, 82% chance
✔ StKilda by 14, 64% chance
✔ Geelong by 26, 75% chance

Standings after Round 18

                     Tips  Bits  MAE   Correct | Round by Round
ZaphBot-New          109  23.06  23.0   71.2%  | 654668366677875559
ZaphBot              105  18.02  24.1   68.6%  | 844458457666864668
Autobot-PureELO      100   9.75  45.3   65.4%  | 762466548754864477
Autobot-HomeTeam      87   1.68  26.0   56.9%  | 664659134467645623

Here are the tips for Round 19.  Only one game that is anything more than a coin-toss according to the bots - although they disagree on which match that is!
The two bots again differ by one tip, and again it is the one that ZaphBot has the least confidence in.
ZaphBot:
Port Adelaide by 16, 65% chance
Richmond by 4, 54% chance
Western Bulldogs by 1, 51% chance
West Coast Eagles by 1, 52% chance
ZaphBot-New:
Port Adelaide by 4, 55% chance
Richmond by 4, 54% chance
StKilda by 3, 53% chance
West Coast Eagles by 13, 63% chance

Monday, September 21, 2020

How good does your Tipping Bot actually need to be?

Here's an interesting question that I have been pondering.  How good does a Tipping Bot (model) need to be?




The first question is, what is the target?
A few options:

1. Beat all the other models in the Squiggle Leaderboard.
    This would require tipping at a rate of between 66% and 74%, going by the last few seasons.

2. Beat all the other models, and the punters on the Squiggle Leaderboard.
    In the last 4 seasons only one model has had more correct tips than the Punters, MasseyRatings in 2019 - so this is clearly a difficult task.

3. Beat all the tipsters on Footytips.com.au (or your preferred tipping site)

4. Tip every game correctly!   Let's ignore this option and assume it is impossible.


My goal:

For ZaphBot my original target was to do well in office tipping competitions, which I have (usually top-5 placing).  I've decided to step things up a notch and now target the Squiggle Leaderboard.

So, in order to do that what do I need to improve?


Right now, with one game left in the Home and Away season, the best bots on Squiggle are scoring 110-111, with the punters at 112.
ZaphBot, at this same moment, is at 105 - and let's focus on that one for now, since it is still good enough to place in the top 10 at Squiggle (just).

The difference between my bot and AFLalytics - the best bot on Squiggle - is just 6 tips (possibly 7 after Monday night).  Let's call it 7, over 18 rounds.
If I could get 1 more tip right every 2 weeks, I'd be on top of the leaderboard.

But I'm going to diverge for a moment, and the numbers below are from before the start of Round 18 so forgive me if they don't match the above data - I started this post a week ago!

The gap between Bots and the best Humans:
Let's pretend we've got an algorithm equal to the current leader, AFLalytics, and we want to beat all humans on footytips.com.au.  At the end of Round 17 there was an 8 game gap to the top place.  To be the best in the world it would need to change the tip for 8 out of the 34 tips that it got wrong.  Doesn't sound impossible.


My quest now is to see if we can find those 34 games.

Model Agreement:
Here's how the Models on Squiggle have agreed this year, game by game. I need a better way of visualising this, but it will do for now.  100% means all models agreed, 50% would be half - there is an odd number of bots most weeks so it doesn't hit 50% exactly:



So far this season there have been 65 occasions where every single model agreed on the outcome.  Of those games, they still got it wrong 16 times (~25% of the time).  So let's assume those 16 are genuine upsets that we were never going to get right, since nobody else's model did, and put them in the too-hard basket.


This leaves us with improving 8 out of (34-16) = 18 tips.  That's close enough to half, and in each case at least one of the models tipped the correct result, so it is not impossible.
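
The agreement figure and the arithmetic above can be sketched in a few lines of Python. This is only an illustration: the `agreement` helper and the three sample games are made up for the example, and only the season totals (34 wrong tips, 16 unanimous upsets, an 8 game gap) come from the post.

```python
from collections import Counter

def agreement(tips):
    """Return (consensus team, share of models that tipped it) for one game."""
    team, votes = Counter(tips).most_common(1)[0]
    return team, votes / len(tips)

# Hypothetical data: (tips from each model, actual winner) per game.
games = [
    (["Geelong", "Geelong", "Geelong"], "Geelong"),      # unanimous and right
    (["Carlton", "Carlton", "Carlton"], "Sydney"),       # unanimous upset
    (["Richmond", "Richmond", "Essendon"], "Essendon"),  # split, minority right
]

unanimous = [(tips, winner) for tips, winner in games if agreement(tips)[1] == 1.0]
unanimous_wrong = [(tips, winner) for tips, winner in unanimous
                   if agreement(tips)[0] != winner]
print(f"{len(unanimous)} unanimous games, {len(unanimous_wrong)} of them wrong")

# Season totals quoted in the post.
leader_wrong = 34      # tips the leading model got wrong
unanimous_upsets = 16  # wrong tips that every model shared
gap_to_top = 8         # games behind the best human tipster
fixable = leader_wrong - unanimous_upsets
share_needed = gap_to_top / fixable
print(f"need to flip {gap_to_top} of {fixable} tips ({share_needed:.0%})")
```

With real data, `games` would hold one entry per match with the tips of every model on the leaderboard.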

The goal now becomes identifying those 18 games, while not flipping tips on other games we have correctly tipped.  If we can find just half of those 18 games, and have our model flip its tip for them, then we'll be on top of the world.  At least for this season.

It may be that there are ~2 games per round (on average) that are true 50-50 games that are impossible to tip, but as a Hawthorn supporter who watched Geelong win almost every 50-50 game for many years, there may be something to look for.

My next analysis will be to look and see if this trend occurred in previous seasons, and to see if there is any pattern in the games that are tipped wrong.
Stay Tuned!

Thursday, September 17, 2020

Round 17 recap, and Round 18 tips

In Round 17 ZaphBot did OK, 6/9 - must try harder!
The new bot only managed 5/9 - maybe it's not as good as I hoped :-( 
The only really bad tip was the Carlton game; even Geelong was not a very confident tip.

ZaphBot Round 17 results

West Coast by 1, 51% chance
Geelong by 10, 65% chance
Fremantle by 3, 53% chance
Port Adelaide by 31, 80% chance
GWS by 6, 59% chance
Carlton by 36, 80% chance
Western Bulldogs by 8, 57% chance
Brisbane by 9, 59% chance
Collingwood by 39, 80% chance


ZaphBot-New Round 17 results

West Coast by 10, 60%
Geelong by 14, 64%
North by 1, 51%   ** This is the only tip the two bots differ on
Port Adelaide by 39, 87%
GWS by 5, 55%
Carlton by 34, 83%
Western Bulldogs by 5, 55% 
Brisbane by 15, 65%
Collingwood by 34, 82%


Season so far - the new bot is slightly ahead in tips, and has a good lead in Bits: 

                     Tips  Bits  MAE   Correct | Round by Round
ZaphBot-New          100  19.14  23.2   69.4%  | 65466836667787555
ZaphBot               97  14.28  24.3   67.4%  | 84445845766686466
Autobot-PureELO       93   6.16  45.9   64.6%  | 76246654875486447
Autobot-HomeTeam      84   2.42  25.8   58.3%  | 66465913446764562



Here are the tips for Round 18 - you might notice a few 80% tips, that's because I put a hard limit on the maximum tip probability to avoid large penalties for getting it wrong.  Those could probably be 90+ if I left it alone

ZaphBot Round 18 Tips
Richmond by 39, 80% chance
Brisbane by 27, 75% chance
Collingwood by 1, 52% chance
Melbourne by 7, 56% chance
Western Bulldogs by 1, 51% chance
Hawthorn by 19, 65% chance
West Coast by 38, 80% chance
StKilda by 22, 75% chance
Geelong by 34, 80% chance


For the new bot I've increased that cap to 90%, which may account for why it is doing better in Bits - I'll need to analyse that at the end of the season.

ZaphBot-New Round 18 Tips
Richmond by 49, 90% chance
Brisbane by 28, 77% chance
Port Adelaide by 2, 53% chance
Melbourne by 7, 57% chance
Western Bulldogs by 8, 57% chance
Hawthorn by 11, 61% chance
West Coast by 33, 82% chance
StKilda by 14, 64% chance
Geelong by 26, 75% chance
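
To see why a cap on the tip probability matters for Bits, here is a small Python sketch. It assumes the scoring formula commonly used in probabilistic tipping competitions such as the Monash one (1 + log2(p) when the tipped team wins, 1 + log2(1 - p) when it loses) and ignores draws; the function names and the 95% raw probability are made up for the example, and this is not the bots' actual code.

```python
import math

def bits(p_tipped, correct):
    """Bits for one game (assumed formula): 1 + log2(p) if the tipped team
    wins, 1 + log2(1 - p) if it loses. Draws are ignored in this sketch."""
    return 1 + math.log2(p_tipped if correct else 1 - p_tipped)

def capped(p, cap):
    """Clamp a probability into the range [1 - cap, cap]."""
    return max(1 - cap, min(cap, p))

# Compare the two caps mentioned in the post for a very confident raw tip.
for cap in (0.80, 0.90):
    p = capped(0.95, cap)
    print(f"cap {cap:.0%}: win {bits(p, True):+.2f}, loss {bits(p, False):+.2f}")
```

Under that formula, lifting the cap from 80% to 90% gains roughly 0.17 bits on a correct tip but costs a full extra bit on a wrong one, so a higher cap only pays off if the very confident tips are right often enough.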

Thursday, September 10, 2020

Round 17 Tips

Here are the tips from the two bots this week, Round 17 2020.  They differ by only 1 tip, although there are a lot of close games being predicted.

ZaphBot:

West Coast by 1, 51% chance
Geelong by 10, 65% chance
Fremantle by 3, 53% chance
Port Adelaide by 31, 80% chance
GWS by 6, 59% chance
Carlton by 36, 80% chance
Western Bulldogs by 8, 57% chance
Brisbane by 9, 59% chance
Collingwood by 39, 80% chance


The New Bot:

West Coast by 10, 60%
Geelong by 14, 64%
North by 1, 51%   ** This is the only tip the two bots differ on
Port Adelaide by 39, 87%
GWS by 5, 55%
Carlton by 34, 83%
Western Bulldogs by 5, 55% 
Brisbane by 15, 65%
Collingwood by 34, 82%


The New Bot (needs a name) uses a different method, including some questionable mathematics that just happens to cancel out and end up tipping OK.
I'm not sure I'd be ready to switch to the new bot yet.  I have another alternative ZaphBot that uses the same algorithm but is trained to get a better Bits score rather than the current Tips score; it does much worse on margins but ends up tipping a similar number of winners.   This new bot does better than that, and is also trained to improve its Bits score even at the expense of getting tips wrong.


Round 16 Recap

How did ZaphBot fare in Round 16?   Pretty much the same as everyone else!

ZaphBot managed 6/8, and the new bot got 5/8

ZaphBot Tips - Round 16, Season 2020

✔ Port Adelaide d North by 25, 75% chance  
✔ StKilda d Hawthorn by 23, 75% chance
✔ Geelong d Essendon by 40, 80% chance
✔ Western Bulldogs by 1, 52% chance
❌ Melbourne d Fremantle by 23, 75% chance
❌ GWS d Adelaide by 18, 65% chance
❌ Carlton d Sydney by 8, 62% chance
✔ Brisbane Lions d Gold Coast Suns by 31, 80% chance

The new bot got 5/8, but the one differing tip could also have gone either way in the final minute!

New Bot Tips - Round 16, Season 2020

✔ Port Adelaide by 26, 75% 
✔ StKilda by 17, 67% 
✔ Geelong by 39, 87% ** I would probably cap this at 80%
❌ West Coast Eagles by 9, 59%  ** This is the one differing tip between the two models
❌ Melbourne by 17, 67% 
❌ GWS by 18, 68% 
❌ Carlton by 9, 60% 
✔ Brisbane Lions by 35, 84%  ** another one I'd cap at 80%


Season so far - the new bot is slightly ahead, but I'm not really happy with it: 

                     Tips  Bits  MAE   Correct | Round by Round
ZaphBot-New           95  19.06  23.0   70.4%  | 6546683666778755
ZaphBot               91  14.48  24.0   67.4%  | 8444584576668646
Autobot-PureELO       86   3.69  46.6   63.7%  | 7624665487548644
Autobot-HomeTeam      82   3.56  25.2   60.7%  | 6646591344676456

Saturday, September 5, 2020

Round 16 2020 Tips

 Here are ZaphBot's tips for Round 16.  Note that I have not adjusted margin predictions this season, despite the games being shorter.

ZaphBot Tips - Round 16, Season 2020

Port Adelaide d North by 25, 75% chance
StKilda d Hawthorn by 23, 75% chance
Geelong d Essendon by 40, 80% chance
Western Bulldogs by 1, 52% chance
Melbourne d Fremantle by 23, 75% chance
GWS d Adelaide by 18, 65% chance
Carlton d Sydney by 8, 62% chance
Brisbane Lions d Gold Coast Suns by 31, 80% chance

For comparison, here are the tips for one of the new algorithms I've been working on - which currently has significant issues.  I'll explain this algorithm in more detail later in the year.
It is important to note that this algorithm is completely independent of ZaphBot, despite several tips being almost identical.

New Bot Tips - Round 16, Season 2020

Port Adelaide by 26, 75% 
StKilda by 17, 67% 
Geelong by 39, 87% ** I would probably cap this at 80%
West Coast Eagles by 9, 59%  ** This is the one differing tip between the two models
Melbourne by 17, 67% 
GWS by 18, 68% 
Carlton by 9, 60% 
Brisbane Lions by 35, 84%  ** another one I'd cap at 80%

We'll see where those end up at the end of the round!




Friday, September 4, 2020

Baseline Bots

 When trying to create a Footy Tipping model, or Bot, it is important to have a baseline - something to compare against that sets a goal you want to improve on.  These are simple algorithms that do better than chance but are not particularly good.

With AFL tipping there are a couple of good baseline models that I like to use:


1. Pick the Home Team

Historically the home team in AFL matches has a slight advantage, although the COVID-19 impacted 2020 season has made defining 'home team' a little more difficult.  For the purposes of this I have simply used the first named team as the home team - assuming an advantage from either crowd, ground knowledge, lack of travel, or a combination of those.

Over the last ~2000 games, home teams have won ~57% of the time (counting draws as wins).  This means if you tip home teams every week you will, on average, get about 5.1 tips correct in a nine-game round (0.57 x 9).
So far in season 2020, picking the first named team would net you 59% correct tipping - a good baseline to start with.

Note that on any given week, the baseline bot may do better than you, but across a season you should be able to do better.
Rounds 6 & 7 in season 2020 show this well - in Round 6 every single home team won. 9/9 - perfect, but the very next week only one home team won (Richmond), giving this model just 1/9 and a cumulative 10/18 for the two weeks.
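
As a rough sketch, the home-team baseline and the numbers above look like this in Python. The function names and the three-game sample round are invented for the example; only the 57% rate and the 9/9 and 1/9 rounds come from the text.

```python
def home_team_tip(home, away):
    """Baseline 1: always tip the first-named (home) team."""
    return home

def tip_rate(results):
    """results: list of (home, away, winner), with winner None for a draw.
    Draws are counted as correct, as in the ~57% figure quoted above."""
    correct = sum(
        1 for home, away, winner in results
        if winner is None or winner == home_team_tip(home, away)
    )
    return correct / len(results)

# Hypothetical mini-round, purely for illustration.
sample_round = [
    ("Richmond", "Sydney", "Richmond"),
    ("Carlton", "Port Adelaide", "Port Adelaide"),
    ("Geelong", "Collingwood", None),
]
print(f"sample tip rate: {tip_rate(sample_round):.1%}")

# Long-run expectation for a nine-game round at a 57% home win rate.
expected_per_round = 0.57 * 9
print(f"expected correct tips per round: {expected_per_round:.2f}")

# Rounds 6 and 7 of 2020: 9/9 followed by 1/9.
two_round_rate = (9 + 1) / 18
print(f"rounds 6-7 combined: {two_round_rate:.1%}")
```

The combined rate for those two rounds lands close to the long-run average, which is the point: a single week says very little about a baseline.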


2. Elo (not ELO)

I have a long history with the Elo algorithm, since I was involved in the Australian Scrabble tournament scene back in the 80's and 90's thanks to my brilliant Maths teacher who was an active competitor and official.  He introduced me to the Elo algorithm as it was used to rank the players, and I've loved it ever since.

Elo is a mathematical model and is used all over the world - it can be used to predict the strength of two opponents and the probability of one winning over the other.  You can read up on the model in lots of places, the Wikipedia page is a great start.

For the purposes of this baseline bot I'm talking about a "Pure Elo" implementation.  Giving 1.0 for a win, 0.5 for a draw, and 0.0 for a loss.  No home ground advantage.
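
A minimal sketch of such a "Pure Elo" bot in Python follows. The starting rating of 1500, the K-factor of 32 and the 400-point scale are the usual textbook conventions and are assumptions here, not the values the baseline bot actually uses; the two sample results are made up.

```python
from collections import defaultdict

START_RATING = 1500.0  # assumed starting rating for every team
K = 32.0               # assumed K-factor (how fast ratings move)

def expected_score(rating_a, rating_b):
    """Standard Elo expectation: probability-like score for team A."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, score_a, k=K):
    """Return new ratings after one game.
    score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Hypothetical results: (team_a, team_b, score for team_a).
ratings = defaultdict(lambda: START_RATING)
results = [("Geelong", "Carlton", 1.0), ("Carlton", "Hawthorn", 0.5)]
for team_a, team_b, score_a in results:
    ratings[team_a], ratings[team_b] = update(ratings[team_a], ratings[team_b], score_a)

# Tip whichever team has the higher rating; no home ground advantage.
p_geelong = expected_score(ratings["Geelong"], ratings["Hawthorn"])
print(f"Geelong to beat Hawthorn: {p_geelong:.1%}")
```

Because every update adds to one team exactly what it takes from the other, the total of all ratings stays constant, which is a handy sanity check when running it over a full season.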

Over the last ~1500 games (sorry, my data sources are not all the same size!) a pure Elo baseline bot picks the winning team an impressive 66.2% of the time.  This is more or less 6/9 every week.

So far in season 2020, the Pure Elo baseline model that I use has picked the winner 64.6% of the time - still a good result.

Here's a graph that shows the relationship between the difference in the Elo scores of the home and away sides, and the average winning margin (-ve means away team win) based on the last 1500 games.  There is a clear relationship here, with the team that has the higher Elo score averaging a higher margin in their favor.  Note that there are still lots of games on either side of the margin axis, so this doesn't guarantee picking a winner, but it's not bad.


The astute amongst you may have spotted that the graph doesn't cross at the 0 point on the X axis, which implies that even with a small -ve value (Away team has higher Elo), the home team still averages a slightly higher score.  This is effectively showing a home ground advantage, or at least that when the home team wins they tend to win by more than the away teams win by.


Season 2020 so far - ZaphBot vs Baseline Bots.

Here's ZaphBot vs the two baseline bots this season (at the end of Round 15).  ZaphBot is having a tough season, and would be sitting in 11th place on the Squiggle leaderboard right now.  I think we can do better than that.

                     Tips  Bits  MAE   Correct | Round by Round
ZaphBot               85  13.10  24.5   66.9%  | 844458457666864
Autobot-PureELO       82   4.25  47.4   64.6%  | 762466548754864     
Autobot-HomeTeam      76   2.86  25.4   59.8%  | 664659134467645


Other Baseline Models

Glicko
There are some other ratings schemes that can be used as baselines.  The one I'm most interested in is the Microsoft TrueSkill ranking system, which can rank players within team matches.  I used to make video games back in the 90's and 2000's and have long been intrigued by it and its application to team sports.  TrueSkill is Microsoft property, but there is a comparable public domain rating system called Glicko - it's something I've been thinking of implementing but keep delaying, because the way I would want to do it requires getting all the player information and match lists.
There is a Glicko based AFL model that I'll be keeping a close eye on:  @AFLGlickoRatings 

Punters
Probably the best baseline out there is the gambling community - it's also the one where, if you can consistently beat it, there is the potential to make money gambling (if you are into that sort of thing).
Most seasons, the Punters will be at the top or very close to the top of the tipping ladder.  So far in 2020 the Punters are leading the Squiggle Leaderboard.  Unfortunately it's not a great model to use because you cannot repeat it or generate tips from data - you simply have to wait for the odds just before the match starts and use those.  While I do like to compare against this, I don't actually model it myself.

I hope some of that has been interesting for people (and robots) - I'll be back with more in the next week.

Thursday, September 3, 2020

Welcome, and an introduction to ZaphBot







ZaphBot is an AFL football tipping algorithm/model many years in the making, and of questionable quality.

When I say 'many years' I'm not exaggerating: I wrote the first version of this algorithm back in ~1981/82 on a TI-99/4A, when there were only 12 teams in the competition.  In fact, I still have the disk - even if it is a little worse for wear:




I continued to tweak the algorithm on and off over the years, using it as my basis for tipping competitions and usually doing "OK".  It took a few hits when the number of teams increased, but I still fire it up to enter competitions and tweak the algorithm.

The algorithm never had a name, but when it came time to post on the net I realised I had to call it something, so ZaphBot was born - simply my nickname with 'Bot' attached.  I'm sure I'll come to regret that!

So, what is the algorithm and why am I posting stuff on the internet?

I'm posting this here (and my tips on Twitter) because I want to interact more with others who have made algorithms and models, including the Models Leaderboard over at Squiggle.  Maybe one day I could join them!

I am also trying to write a better algorithm that can beat it - so you can follow my adventures as I do that here!

The Algorithm

ZaphBot is made up of lots of bits of algorithms that I found over the years.  The key thing is that all the information it has is the results of games (scores, teams).  It currently doesn't even look at the ground the match is at; whoever is named first is simply assumed to be the home team (worked great in the 80's... not so great in 2020!).

The core is the concept of looking for previous matches that involve the current teams, and combinations or chains with those teams.  This comes from an article my Dad showed me when I was about 15 - long lost now, but it was interesting back then!   The idea is that if Geelong beats Carlton, and Carlton beats Hawthorn, then Geelong will probably beat Hawthorn.

The bot grabs all such chains from the last X weeks, applies weights to the results and averages them.  It also finds Geelong d Richmond, Richmond d Carlton, Carlton d Hawthorn.  Etc.
Of course it's never that easy, often it is Geelong d Carlton and Hawthorn d Carlton (sorry Blues supporters!) - then it comes down to margins etc. A Home-Ground advantage is also applied.
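
The chain idea can be sketched roughly as below. This is not ZaphBot's actual code: adding the margins along a chain, halving the weight for each extra link (`decay`), and the `max_links` limit are all assumptions made for the illustration, and recency weighting is left out entirely.

```python
from collections import defaultdict

def build_graph(results):
    """results: iterable of (team_a, team_b, margin_for_a).
    Each result is stored both ways with a signed margin."""
    graph = defaultdict(list)
    for team_a, team_b, margin in results:
        graph[team_a].append((team_b, margin))
        graph[team_b].append((team_a, -margin))
    return graph

def chain_margins(graph, start, target, max_links=3, decay=0.5):
    """Yield (implied margin, weight) for every chain from start to target.
    Assumption: margins add up along a chain, and each extra link halves
    the weight of that chain."""
    stack = [(start, 0.0, [start])]
    while stack:
        team, total, path = stack.pop()
        links = len(path)  # links in the chain once the next team is added
        for opponent, margin in graph[team]:
            if opponent in path:
                continue  # never revisit a team within one chain
            if opponent == target:
                yield total + margin, decay ** (links - 1)
            elif links < max_links:
                stack.append((opponent, total + margin, path + [opponent]))

def predict_margin(results, home, away, home_advantage=0.0):
    """Weighted average of all chain margins, plus a flat home advantage."""
    chains = list(chain_margins(build_graph(results), home, away))
    if not chains:
        return home_advantage
    total_weight = sum(weight for _, weight in chains)
    return sum(m * weight for m, weight in chains) / total_weight + home_advantage

# Geelong d Carlton by 20, Carlton d Hawthorn by 10 (made-up margins).
print(predict_margin([("Geelong", "Carlton", 20), ("Carlton", "Hawthorn", 10)],
                     "Geelong", "Hawthorn"))
```

Storing each result in both directions with a signed margin is what lets the "Geelong d Carlton and Hawthorn d Carlton" case fall out naturally: the chain still exists, and the margins decide which way it points.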

The weightings are then randomly adjusted until it starts to get better results over a test set of data.  Note that I was doing this as a teenager in the 80's - long before I knew about AI models.  It's probably closest to a genetic algorithm, but honestly it's just a mess!

In the last 10 years I added some Elo ranking bonuses to the mix, as well as individual team travel penalties, and some more ideas.
I simply add the idea to the mix, with a randomized weighting for each and then if it appears the additions are valuable then they get included and start to make a difference.
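
The "randomly adjust the weights and keep what works" loop described above is close to random hill climbing. A minimal sketch, assuming Gaussian nudges and a user-supplied scoring function (`score_fn` here is a stand-in; in practice it would count correct tips over a test set of games), might look like this. None of the names or settings come from ZaphBot itself.

```python
import random

def tune_weights(score_fn, n_weights, iterations=1000, step=0.1, seed=0):
    """Random hill climbing: nudge every weight at random and keep the
    new set only if it scores strictly better than the best so far."""
    rng = random.Random(seed)
    best = [1.0] * n_weights
    best_score = score_fn(best)
    for _ in range(iterations):
        candidate = [w + rng.gauss(0.0, step) for w in best]
        score = score_fn(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy scoring function standing in for "tips correct on the test set":
# higher is better, with a peak at the weights (2.0, -1.0).
def toy_score(weights):
    target = (2.0, -1.0)
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

weights, score = tune_weights(toy_score, n_weights=2)
print(weights, score)
```

A new idea fits in by adding one more weight; if nudging that weight never improves the score, it simply stays where it started and the idea has little effect.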

The end result is a bot that, in a normal season, competes reasonably well but not perfectly.  Last season it finished third in our work tipping contest (~50 people), and also won the last-one-standing contest (pick a different winner each week, until you get one wrong or run out of teams).  Historically it's always managed to be in the upper end of the tipping, but never clearly on top.
I also run it in the Monash University Probabilistic Tipping competition.

Now I'm going to see if the old bot can survive, or if I can come up with an improved model!

That's it for this post.  Next I'll talk about what I'm doing now in trying to beat it - and the progress so far.
