Skill, Luck and Morning Lines
In PhotoFinish.Live every scrap of information is precious, and maybe with Morning Line Prices, owners have some free alpha right under their noses.
Petrocker
7/27/2024, 10 min read
At the most basic level any horse’s performance is some combination of skill versus luck. How do you differentiate between the two? Well, skill is generally something that suggests a consistent level of performance over time. Luck is something that is hard to replicate.
In the NFL the Baltimore Ravens have had a top 3 scoring defence in 4 of the last 5 years. That is a measure of skill. During the same period their takeaway ranking has been 29th, 20th, 9th twice and 6th. There is much more luck involved in takeaways than there is in scoring defence (although we probably can’t call it pure luck), hence the inconsistency. At the risk of alienating Cowboys fans, scoring defence is a better metric than takeaways for understanding the skill of a defence.
Horse performance is a combination of skill and luck. Skill (getting a consistent outcome over repeated attempts) is some combination of attributes (which are hidden to us), preference stars, suitability for the race and optimal distance. Previous posts looked at how to determine suitability and find optimal distance. It is safe to say that skill is a combination of the skill of the horse and the skill of the owner in giving the horse the best chance to win. That maiden horse you like so much is probably a maiden for a reason, and no amount of graded stakes entries will bring its first victory any closer. Any mature horse that consistently runs unprofitably has two skill issues, one being the owner’s skill.
Our definition of luck here is something that can’t be reliably replicated over time or repeated events.
So warm up stretches over. Do morning line odds reveal something hidden about skill or luck, both or neither?
Do morning line prices leak some alpha?
Every race, whether betting is allowed or not, publicly displays morning line prices in PhotoFinish.Live. These prices simulate the market interest in a horse’s chance to win the race, although no money has changed hands.
The above race details were taken from the next race at the time of writing. A 4f 3k claimer with seven runners. After the post number and before the grade of the horse is a column that contains the morning line (ML) price. The -x- represents that this race is not open for betting on, so live prices are not available. However every race has ML prices for all runners. Note for prices >10, numbers on screen are truncated so that 10.9 would be represented as 10 and 11.1 would be represented as 11.
The first thing to notice is that if you add up the implied chance of winning for each horse (1/price), it always comes to 125%. This number represents the 100% chance that one horse will win plus a 25% margin for the house. All betting in PhotoFinish.Live is parimutuel, which means that the house takes in all the money and collects a rake (also called vig or margin) before paying the winners in proportion to their stake. Interestingly, the margin on actual betting is different from the margin built into the ML prices. There are several transformations, some clever and some simple, for getting to the real chances of winning. We are just going to rebase to 100% by dividing each horse's implied chance by 1.25 (see table below).
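The rebasing step can be sketched in a few lines of Python. The prices here are made-up placeholders rather than the runners from the race above, chosen so that their implied chances sum to 125%, and the function name is my own.

```python
def adjusted_chances(prices):
    """Convert decimal ML prices into win chances that sum to 100%."""
    implied = [1.0 / p for p in prices]  # raw implied chance per runner
    book = sum(implied)                  # about 1.25 for ML prices, per the text
    return [q / book for q in implied], book


# Hypothetical prices, not taken from a real race.
prices = [2.0, 4.0, 4.0, 8.0, 8.0]
chances, book = adjusted_chances(prices)
print(f"book = {book:.0%}")
for price, chance in zip(prices, chances):
    print(f"price {price:>4}: adjusted chance {chance:.1%}")
```

Dividing by the measured book total rather than a hard-coded 1.25 is a small design choice: it keeps the chances summing to 100% even when the prices you copy off the screen have been truncated.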
From this race we can see that the two higher grade horses had the shortest prices, despite not having the highest stars or benchmark. Netting the Big One has the highest benchmark and the 3rd highest stars but has the second worst price at 7. The two S grade horses each had an 18% chance of winning and the S- horses ranged from 10.4% to 15.8%. Not a massive difference at the end of the day.
It feels like the ML price is being influenced by the attributes of each horse and maybe the relevance of those attributes to the optimal distance. Despite there being an even-ish split between RDF and RDS archetypes, in sloppy conditions, this doesn't seem to separate their prices. We are going to have to review a race where all of the runners have retired so we can compare the ML prices before the race in light of what we now know about the horses and their attributes.
The race below is a 12-horse race in which all horses have the same grade and carry the same weight. This is an attempt to isolate the important factors from other variance. The race in question is https://photofinish.live/races/season0009-custom-sweet-berry-trophy-stakes-7336, an S- juvenile stakes race over 8 furlongs in good conditions.
For each horse the table shows the archetype, preference stars, morning line price, implied chance of winning and adjusted chance of winning (implied chance divided by 1.25), followed by the attributes (visible because all the runners are now retired) and those attributes converted to integers so we can do some simple maths (D- = 1, S- = 13 on our scale) and calculate an overall score (called pts). From this we can calculate the subgrades above or below the overall average. An average S- grade horse has 78 (6 x 13) points on this scale. Finally the finishing position is included, but remember we are trying to isolate skill, and the finishing position contains both skill and luck, so we shouldn't be influenced by how the race actually finished (crazy right?).
Firstly, the horses with higher points tend to be at the top of the price list, so maybe ML is a good indicator of underlying subgrades within a grade. This race was run in good conditions and 4 out of the top 5 horses are LDFs. Being an 8F race, there is most likely less importance on which attributes are higher. A previous post suggests that at 8F the importance of different attributes is broadly even. It's not clean, but the colour coding on the adjusted chance and pts columns is quite similar, suggesting there might be a relationship between the two.
Secondly, a less clear pattern emerges with preference stars. Foreign Jockey only has 4 stars but is the 4th favourite, whereas Uma has 7.5 stars and has the second worst implied chance. Obviously we are looking at a very small sample of one race here - so take everything with a pinch of salt. Overall I would say, purely from eyeballing this data, there would appear to be some relationship between attributes and ML prices. Let's put that to the test.
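The points conversion is easy to reproduce. In this sketch only the two endpoints (D- = 1 and S- = 13) come from the scale described above; the steps in between are my assumption of an even ladder, and the example horse is invented.

```python
# Assumed grade ladder: only D- = 1 and S- = 13 are given in the text.
# Grades above S- would continue the ladder; extend the list if you need them.
GRADES = ["D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+",
          "A-", "A", "A+", "S-"]
GRADE_POINTS = {grade: i for i, grade in enumerate(GRADES, start=1)}


def total_points(attribute_grades):
    """Sum the integer value of each attribute grade (the pts column)."""
    return sum(GRADE_POINTS[g] for g in attribute_grades)


def subgrades_vs_average(attribute_grades, average_grade="S-"):
    """Points above (+) or below (-) a horse with every attribute at average_grade."""
    baseline = GRADE_POINTS[average_grade] * len(attribute_grades)
    return total_points(attribute_grades) - baseline


# Hypothetical horse with six attribute grades, not a runner from the race.
horse = ["S-", "S-", "A+", "S-", "A", "S-"]
print(total_points(horse))          # 13 * 4 + 12 + 11 = 75
print(subgrades_vs_average(horse))  # 75 - 78 = -3
```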
Taking a much larger dataset, we can build a model to understand the relationship between ML price and attributes. We have to choose a distance - because weightings will most likely change by distance. In the last post we tested various statistical techniques and today we need to use a slightly different one. Generalised least squares (GLS) regression is good for dealing with highly correlated data (which we have here) and is still a simple approach. Going much deeper than this is outside the scope of this post. A quick model has been built attempting to predict the ML price based on attributes and preferences.
This is a typical model output. The model in question is just for 4F races - as the weightings are different for different distances - and commingling them would dilute performance. Top right you can see the R-squared score of 0.511 - this means just over 50% of the variance in ML prices is explained by just these features. I have checked different distances and the variance explained is always very similar. For 4F, the order of importance here is heart, start, temper, speed, surface (prefs), stamina, direction, finish and finally conditions (see the coefs column). Apart from conditions, all parameters are statistically significant (I would use t>2 or 4).
So, does the ML price predict something about the attributes of a horse? I would say yes. It's not the whole picture, but it's a free piece of information that owners should be using to their advantage.
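For readers who want to see where coefficients and an R-squared figure come from, here is a rough sketch. To be clear about what it is not: this is plain ordinary least squares written in standard-library Python, a simpler stand-in for the GLS model described above, and the six rows of data (two attribute columns I have labelled heart and start, plus an ML price) are invented for illustration. A statistics library with a proper GLS routine would be the better tool on real data.

```python
def fit_ols(rows, targets):
    """Least-squares fit with an intercept; returns (coefficients, r_squared)."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    coefs = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(c * v for c, v in zip(coefs, x)) for x in X]
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - f) ** 2 for y, f in zip(targets, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return coefs, 1.0 - ss_res / ss_tot


# Invented data: (heart pts, start pts) per horse and its ML price.
rows = [(13, 12), (12, 13), (11, 11), (14, 12), (12, 10), (13, 14)]
prices = [10.5, 11.2, 12.0, 9.8, 12.4, 10.1]
coefs, r2 = fit_ols(rows, prices)
print("intercept, heart, start:", [round(c, 3) for c in coefs])
print("R-squared:", round(r2, 3))
```

A negative coefficient here would mean that more points in that attribute go with a shorter price, which is the direction you would expect if ML prices track attributes.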
How accurate are Morning Line prices anyway?
There are a couple of ways to assess the accuracy of ML odds. The first is to see whether the actual win percentage of different bands of ML prices matches the implied chance. The following table has the data split into deciles (10% bands) by implied odds of winning. As you can see, the implied odds and the actual odds are fairly close. If you could bet at ML prices - all ranges would be unprofitable (because of the 25% house margin), but the higher the band (the higher the implied chance of winning) the less unprofitable things become. Overall I would say the ML prices are a fairly good representation of real outcomes.
This table says the bottom 10% of runners by ML price (decile=1) have an implied 6.6% chance of winning. If we strip out the 25% house margin, that adjusts to a real implied chance of 5.3%. In reality 4.4% of these horses won. If you could bet on every one of these runners at ML prices you would have had an ROI of -31.5%, which is worse than the roughly 20% loss that a 125% book implies for fairly priced runners (1/1.25 = 0.8 returned per unit staked). At the other end, decile 10 has an adjusted implied chance of winning of 19% and in the game won 19.8% of the time. I would say this is fairly close. If you could bet on all these horses at ML prices you would have been 16.8% down: still bad, but better than the roughly -20% you would expect from picking a horse at random. Horses with short ML prices are performing (marginally) better than their price would suggest.
Let's look at the same data but ordered by ML price rank, where the favourite is rank 1, the 2nd favourite is rank 2 and so on. The table below shows us something interesting with favourites - if you could bet on them at ML prices, you would make a 5.4% profit, even with a house edge of 25%.
This profit is sadly hypothetical - there is no free lunch. None of this article is designed to be gambling advice - you will have to figure that out for yourselves. That said, ML prices are clearly quite a reliable indicator of predicted performance - but just remember a favourite with a 20% implied chance of winning will still lose 4 out of 5 times - so just blindly betting the favourite will generate long losing streaks. If you are into picks and gambling in PFL, you need to take more factors into consideration. A good place to start, particularly ahead of the major races, is PFL Forms https://pfldrf.com/.
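If you want to build this kind of band table from your own race data, a minimal sketch might look like the following. The record layout (implied chance, won or not) and the function names are my own assumptions, and the four demo records are invented purely to show the shape of the output.

```python
def calibration_bands(records, n_bands=10, book=1.25):
    """Summarise runners in equal-sized bands ordered by implied chance.

    records: list of (implied_chance, won) pairs, where implied_chance is
    1 / ML price and won is True for the race winner.
    """
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    table = []
    for b in range(n_bands):
        band = ordered[b * n // n_bands:(b + 1) * n // n_bands]
        if not band:
            continue  # fewer runners than bands
        implied = sum(q for q, _ in band) / len(band)
        actual = sum(1 for _, won in band if won) / len(band)
        # One unit staked on every runner; a winner returns its price, 1 / q.
        returned = sum(1.0 / q for q, won in band if won)
        table.append({
            "band": b + 1,
            "implied": implied,
            "adjusted": implied / book,
            "actual": actual,
            "roi": returned / len(band) - 1.0,
        })
    return table


def approx_roi(win_rate, implied_chance):
    """Shortcut ROI from band averages: win rate times average price, minus the stake."""
    return win_rate / implied_chance - 1.0


# Made-up records, only to show the output format.
demo = [(0.05, False), (0.10, False), (0.25, True), (0.30, False)]
for row in calibration_bands(demo, n_bands=2):
    print(row)

# Shortcut applied to the decile 1 figures quoted in the text.
print(f"{approx_roi(0.044, 0.066):.1%}")
```

The shortcut gives roughly -33% for decile 1, close to but not the same as the -31.5% quoted, which is what you would expect if the table totals the returns bet by bet rather than working from band averages. It also shows why a perfectly calibrated price in a 125% book comes out at about -20%: approx_roi(0.20, 0.25) returns -0.2.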
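To put a number on "long losing streaks", here is a small sketch. It assumes every bet is independent and has the same 20% chance of winning, which is a simplification since real favourites vary from race to race.

```python
def prob_all_lose(win_chance, n_bets):
    """Chance that n_bets independent bets, each with win_chance, all lose."""
    return (1.0 - win_chance) ** n_bets


def expected_bets_until_win(win_chance):
    """Average number of bets up to and including the first winner."""
    return 1.0 / win_chance


for n in (5, 10, 20):
    print(f"{n} straight losers: {prob_all_lose(0.20, n):.1%}")
print("bets per winner on average:", expected_bets_until_win(0.20))
```

Under those assumptions, ten favourites in a row all losing happens a little over 10% of the time, so a run like that is unremarkable rather than a sign that something has changed.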
What conclusions can we draw from ML prices?
It does appear that ML prices are telling us something about the underlying skill of a horse. They may also include some of the skill of the owner in putting their horse in the right race. It is unclear whether some random element has been added, but we don’t care about that too much because we are getting a clear signal, as demonstrated by the GLS model output (>50% of variance explained). The favourite doesn’t always win because of luck factors and the way the game is programmed.
How can we apply this new knowledge? I think the most obvious place is claimers and the marketplace. Here are the questions I would ask to shortlist a potential horse for purchase if it was the ML favourite:
Does the favourite have a lower grade than other horses? That would be interesting.
Are there multiple higher grade horses running that aren’t the favourite? That would be very interesting.
Is the favourite being run at its preferred distance? If not, I would be very interested. A horse that is better than the rest despite not being run in optimal conditions would be a good sign.
Is the difference between the favourite and the other horses something that preference stars can’t explain? If there is parity of stars then the difference possibly lies in the attributes alone.
If you are looking across multiple races, how often is the horse in question the favourite? I would avoid horses that are regularly ranked as outsiders.
Is there a significant gap between the favourite and second favourite? This might indicate a superior horse particularly if it is a similar or lower grade.
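A few of these questions can be checked mechanically if you have the race card in hand. The sketch below is only a starting point: the runner layout, the field names and the idea of a numeric grade_rank (higher meaning a better grade) are all my own assumptions, and the runners are invented.

```python
def favourite_checks(runners):
    """Flag the grade and price-gap questions for the ML favourite.

    runners: list of dicts with 'name', 'ml_price' and 'grade_rank'
    (an integer where a higher number means a higher grade).
    """
    ordered = sorted(runners, key=lambda r: r["ml_price"])
    fav, second = ordered[0], ordered[1]
    higher = [r["name"] for r in ordered[1:]
              if r["grade_rank"] > fav["grade_rank"]]
    return {
        "favourite": fav["name"],
        "higher_grade_rivals": higher,
        # Ratio of the second favourite's price to the favourite's price.
        "price_gap_to_second": second["ml_price"] / fav["ml_price"],
    }


# Hypothetical runners, not from a real race.
race = [
    {"name": "Horse A", "ml_price": 3.0, "grade_rank": 13},
    {"name": "Horse B", "ml_price": 4.5, "grade_rank": 14},
    {"name": "Horse C", "ml_price": 6.0, "grade_rank": 14},
    {"name": "Horse D", "ml_price": 9.0, "grade_rank": 13},
]
print(favourite_checks(race))
```

What counts as a "significant" gap is left to you; the function only reports the ratio. Distance suitability and preference stars still need a human look.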
This information can also provide feedback on whether you are picking the right races for your horse. If you are, expect to be in the top 2-3 ML prices each time. If you are constantly the outsider you have either got the distance or the competitiveness level wrong and this is a skill issue (for you and your horse!).
PFL is a game of skill and luck. ML prices seem to leak some insight (dare I say alpha) into the underlying skill of a horse. Use this to your advantage in picking the right races to enter, the right juveniles to persevere with, the right horses to claim, or buy or breed with. I am going off to figure out which horses had shorter prices than Sprinter Sacre and Charlie Chaplin...
Join the fun and put these insights into practice at PhotoFinish.Live and if you are considering starting your own stable please consider using my referral code: PADDOCK or just click on this link: https://signup.photofinish.live/?referralCode=PADDOCK
Please remember this is a web3 game where you spend your own money. Nothing I write about should be considered financial or investment advice.
Other blog posts:
Finding the optimal distance: https://aipaddock.com/understanding-optimal-distances
Understanding breeding: https://aipaddock.com/understanding-the-most-successful-breeding-horses-in-pfl
What are subgrades: https://aipaddock.com/what-are-subgrades-and-how-do-they-work
Fastest horses: https://aipaddock.com/who-are-the-fastest-horses-in-the-game
Racing profitability: https://aipaddock.com/how-much-profit-do-you-make-racing-horses
Trueskill pvp ratings: https://aipaddock.com/which-horse-is-the-goat-in-pfl
Breeding: https://aipaddock.com/trying-to-understand-breeding-in-pfl
Evaluating horses: https://aipaddock.com/how-good-is-my-horse
Understanding performance: https://aipaddock.com/understanding-the-true-performance-of-your-horse
Do stars matter: https://aipaddock.com/how-much-do-preferences-matter
FF Rating vs Finish Time: https://aipaddock.com/the-difference-between-ff-rating-and-finishing-time
Are horses getting faster: https://aipaddock.com/are-pfl-horses-getting-faster