Charting the uncharted in breeding within PFL
This post attempts to remove more of the fog of war around breeding in web3 game PhotoFinish.Live. Why are some horses better parents than others? What is signal and what is random noise?
Petrocker
9/7/2024 · 10 min read
I don't know about you but I feel like my theories on breeding change from season to season, and next season's juveniles in my humble little stable are the product of wild theories I had 3 months ago, and I shake my head at each and every underperforming runner, cursing the dumb ideas I had 3 months ago.
Well here are some more up-to-date dumb ideas that may influence how you go about breeding, selecting studs, matching mares and studs together. As always I wish I had all the answers, but what I do have is some interesting evidence, and as always I encourage you to make your own mind up. My job here is to present the case and let you decide, rather than present some grandiose unifying theory of everything. If I had all the answers, I would be winning majors!
What makes a good stud?
The answer to this question depends on what you are trying to achieve. I think the main reasons (not including all the cool exotic breeding programs some stables excel in) are: 1) you want an uproll to the highest grade of horse possible, whether that is the next SS-, the first SS, or at least a competitive S+; 2) you want a dominant horse for grade-restricted racing, as the schedule has moved to reward this type of racing; 3) something else that probably won't be covered here today.
If a stud produces a major race winner, does that make it a good stud? In his seminal book Fooled by Randomness, Nassim Nicholas Taleb says something like: if a person presents you with a monkey that has written the complete works of Shakespeare, the question you should ask is how many monkeys there are. If the answer is a few, this is the smartest monkey you have ever seen. If the answer is a very large number, then it could just be random luck that this monkey slapped the keys in the right order. So it is with studs. If a stud has produced hundreds of foals and one happens to win a major, then it might well be luck. Yes, it probably has to be an S+ grade. Yes, it helps if the mare (probably even more important given their scarcity) is an elite S+.
You can't just look at the winners; you need to look at the entire output from a stud. If one out of a hundred foals went on to win a major, are those odds good enough for you to part with $500? If a stud has sired 10 horses and 1 won a major, then it's probably a very good stud. If it sells out 35 breeds a season for 8+ seasons, then it might just be brute force luck that it's a parent to a major winner. Alluring as it is to breed with the sire of a major winner, ask yourself whether this is survivorship bias: basing judgements only on the good outcomes. We can't use major wins as a good signal without taking into account how many crap, non-competitive foals were also produced.
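To put a rough number on the "how many monkeys" question, here is a minimal sketch of the chance that at least one foal wins a major purely by luck. The per-foal probability `P_MAJOR` is a made-up illustrative value, not something measured from the game, and the calculation assumes every foal is an independent roll with the same odds.

```python
def chance_of_at_least_one(p_per_foal: float, n_foals: int) -> float:
    """Probability that at least one of n independent foals wins a major."""
    return 1.0 - (1.0 - p_per_foal) ** n_foals

# Hypothetical per-foal chance of winning a major (an assumption, not game data).
P_MAJOR = 0.005

small_stud = chance_of_at_least_one(P_MAJOR, 10)      # a stud with 10 foals
busy_stud = chance_of_at_least_one(P_MAJOR, 35 * 8)   # 35 breeds x 8 seasons

print(f"10 foals:  {small_stud:.1%}")
print(f"280 foals: {busy_stud:.1%}")
```

Under these assumed odds the busy stud is far more likely to have sired a major winner through volume alone, which is why the foal count matters as much as the headline win.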
Breeding is a numbers game; every breed is a roll of the dice. In the last post we looked at what can influence that roll of the dice, but just having more rolls of the dice is an excellent strategy in any game of chance. It's not all luck though, and there are things we smart owners can do to try to tilt the odds in our favour.
All of the data in this post is based on retired horses at the beginning of September 2024. Retired horses obviously allow us to see the attributes so we can objectively measure horses in relation to how good or bad their rolls were. The downside to this is we are not analysing all the horses running today or yet to mature. That said the sample is sufficient to draw some conclusions, which is what we want.
In researching this post, the first weird chart I stumbled across was the one below.
The y-axis is the aggregate win rate of all offspring from a set of parents and the x-axis is the similarity between parents, measured by how closely their attributes align. If both parents have exactly the same attributes for start, speed, stamina, finish, heart & temper, then the gap between them is 0. If one has, say, an S for start and the other has an A+, then that gap is 2 (S to S- to A+ is two steps). Add up the gaps across all six attributes and you get the number across the bottom of the chart. The chart suggests that the more closely aligned the parents are, the higher the win rate of the foals produced. Now, there might be some unrevealed factor at work here. However, removing S, S+ and SS- horses leaves the pattern the same, so it's not that top horses are just closer together in attributes.
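The gap calculation described above can be sketched as follows. The grade ladder `GRADES` is my assumption about the ordering (one step per sub-grade, running from D- up to SS); only the one-step-per-sub-grade spacing matters for the result.

```python
# Assumed grade ladder, lowest to highest; each step counts as one gap point.
GRADES = ["D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+",
          "A-", "A", "A+", "S-", "S", "S+", "SS-", "SS"]
RANK = {grade: i for i, grade in enumerate(GRADES)}
ATTRIBUTES = ("start", "speed", "stamina", "finish", "heart", "temper")

def parent_gap(sire: dict, mare: dict) -> int:
    """Sum of absolute grade differences across the six attributes."""
    return sum(abs(RANK[sire[a]] - RANK[mare[a]]) for a in ATTRIBUTES)

# Illustrative parents: identical except the mare has A+ for start.
sire = dict.fromkeys(ATTRIBUTES, "S")
mare = {**sire, "start": "A+"}

print(parent_gap(sire, sire))  # identical parents
print(parent_gap(sire, mare))  # S versus A+ on start only
```

Identical parents score 0, and the S versus A+ pair scores 2, matching the example in the text.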
This is an interesting area and one that requires more analysis, but the theory here is that better matched parents are more likely to produce higher performing foals at all levels. You might say, well, a 14% win rate isn't that high, which is true, but at the midpoint of the chart, around a gap of 16, the win rate is basically 30%+ lower than the peak on the left. Why might this be the case? Well, my guess is that when well matched parents meet, the process takes 50% of DNA from each parent, and if they are well matched then both contribute to the upside. If one is dominant, it's hard for the lower horse to contribute upside with its 50% of the DNA. But guess is the operative word.
So learning number one, matching parent grades seems to be beneficial (but we don't know why exactly).
Some studs are better than others
I love the market trader aspect of owners hawking their studs every month. The graphics get better, the pitches more appealing and the claims more outrageous every season. With the mix switching to in-house breeds recently, it's important to understand what makes a good stud.
One objective way to do this is to look at the offspring of a stud and compare their attributes versus what we would expect. For this analysis I have set the mid-point between the parents' attributes as zero. So if the parents have start attributes of S and A+, then S- is zero. If the attributes are S and S-, then the mid-point between S and S- is zero, so a roll of S would be +0.5 and a roll of S- would be -0.5. Add up the six numbers for the attributes and you can quantify how lucky a roll is (if it is luck). To set the scene, here are the luckiest rolls on record to date; remember we can only see this for retired horses:
The luckiest roll of all time (so far) is 9 points above the average of the two parents. Overall, uprolls and downrolls are balanced around zero (which is the average). The bottom end of this table includes all the inbreeds, which have obviously been altered from random chance. So the upper limit on how much luck you can get is 9 attribute points, or one and a half grades per attribute (9 / 6 attributes). This doesn't explain performance; Bridgetown was a crap A+ and Crown Atlas an excellent S+, but it shows the limit of how lucky you can get.
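The mid-point scoring described above can be sketched like this. As before, the grade ladder is an assumed ordering with one step per sub-grade, and the horses are illustrative rather than real records.

```python
# Assumed grade ladder, lowest to highest; each step is one attribute point.
GRADES = ["D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+",
          "A-", "A", "A+", "S-", "S", "S+", "SS-", "SS"]
RANK = {grade: i for i, grade in enumerate(GRADES)}
ATTRIBUTES = ("start", "speed", "stamina", "finish", "heart", "temper")

def roll_luck(sire: dict, mare: dict, foal: dict) -> float:
    """Total attribute points the foal sits above (+) or below (-) the parents' mid-point."""
    total = 0.0
    for a in ATTRIBUTES:
        midpoint = (RANK[sire[a]] + RANK[mare[a]]) / 2
        total += RANK[foal[a]] - midpoint
    return total

all_s = dict.fromkeys(ATTRIBUTES, "S")
mare_a_plus = {**all_s, "start": "A+"}   # S x A+ on start: mid-point is S-
mare_s_minus = {**all_s, "start": "S-"}  # S x S- on start: mid-point is between them

print(roll_luck(all_s, mare_a_plus, {**all_s, "start": "S-"}))   # on the mid-point
print(roll_luck(all_s, mare_s_minus, all_s))                     # start rolled S
print(roll_luck(all_s, mare_s_minus, {**all_s, "start": "S-"}))  # start rolled S-
print(9 / len(ATTRIBUTES))  # the record roll expressed as grades per attribute
```

The three rolls come out at 0, +0.5 and -0.5, as in the text, and the record of 9 points works out to 1.5 grades per attribute.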
Using the same concept of mid-points as zero, all breeding rolls are normally distributed around zero. The chart below illustrates this point. It is the aggregate of all rolls for start, regardless of grade, and you can see that the rolls are most likely to be in the middle and rarely at the ends. As I have said before, chasing wicks is like buying a lottery ticket: you are way more likely to get a roll in the box of your breeding report boxplot than at the extremes.
As you can see, you are most likely to get a roll which is the average of the parents, and your likelihood diminishes the further out you go from the average (zero). Extreme rolls (good or bad) do occur but you can't rely on them. That's why many people think it's better to have more tickets in the lottery than try to predict the lottery winning numbers. From this obvious normal distribution we can infer that the process is fairly mechanical and produces well balanced outcomes. This chart is for start only, but all attribute charts look similar.
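If you want to reproduce this kind of chart for your own stable, a minimal sketch is to tally each foal's deviation from the parents' mid-point and check that the tallies pile up around zero. The `toy_rolls` list below is invented purely to show the shape of the output; it is not the data behind the chart.

```python
from collections import Counter
from statistics import mean

def summarise_rolls(deviations):
    """Return the average deviation and a count of rolls at each deviation value."""
    counts = Counter(deviations)
    return mean(deviations), dict(sorted(counts.items()))

# Invented deviations for one attribute (foal grade minus parents' mid-point).
toy_rolls = [-1.0, -0.5, -0.5, 0.0, 0.0, 0.0, 0.5, 0.5, 1.0]

average, histogram = summarise_rolls(toy_rolls)
print(f"average deviation: {average:+.2f}")
for value, count in histogram.items():
    print(f"{value:+.1f} {'#' * count}")
```

With real data, a bell shape centred on zero is what the population-level picture described above would look like in text form.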
This analysis is at a population level. There are however horses that seem to defy the orthodoxy. Is this random noise or an edge we can exploit? If we look at the individual studs and their offspring, we can see which studs produce foals that are above average and which ones are below. Here is a table of studs that produce the most above average foals (regardless of the mare):
These are the top 20 studs ordered by how far their foals end up above the average of the two parents. To read this table, start with the last column: Liam Neighson's foals are 1.91 attribute grades above expectation on average, across 16 foals. The cutoff is a minimum of 10 foals (who have retired, so we can see their attributes). This is to rule out one dominant foal biasing the outcomes. Note that, of the famous old school studs, only Saturn appears on this list.
If we look at the columns from start to temper, we see similar figures at an attribute level. So for Liam Neighson (I have edited out a Taken pun here), his foals have 0.59 more start attributes than we would expect. In other words on average, his foals are half a grade higher in start and almost the same on speed, not much on the next three and then again half a grade higher on temper.
The remaining column is uproll, which is the average increase in overall grade versus expected. I wouldn't worry too much about this - the overall grade is obviously an aggregate of the other attributes. Liam Neighson's foals are half a grade higher (0.47) than we would expect on average.
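For anyone wanting to build a similar table, here is a rough sketch of the ranking logic as described: group retired foals by stud, drop studs with fewer than 10 foals, and sort by the average deviation from the parents' mid-point. The function name, the input shape and the toy stud names are my own illustrative choices, and the numbers are invented.

```python
from collections import defaultdict
from statistics import mean

MIN_FOALS = 10  # cutoff from the text: at least 10 retired foals

def rank_studs(foals, min_foals=MIN_FOALS):
    """foals: iterable of (stud_name, total_deviation) pairs, one per retired foal.

    Returns (stud, average deviation, foal count) rows, best average first.
    """
    by_stud = defaultdict(list)
    for stud, deviation in foals:
        by_stud[stud].append(deviation)
    rows = [(stud, mean(devs), len(devs))
            for stud, devs in by_stud.items() if len(devs) >= min_foals]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Invented example: Stud C has a huge average but too few foals to qualify.
toy_foals = ([("Stud A", 2.0)] * 10
             + [("Stud B", -1.0)] * 12
             + [("Stud C", 6.0)] * 3)

for stud, avg, n in rank_studs(toy_foals):
    print(f"{stud}: {avg:+.2f} across {n} foals")
```

The cutoff does the same job here as in the table: a stud with a handful of lucky foals never reaches the list.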
I know what you are thinking, but where are all the 40k studs everyone tells me are the best ones to breed with? That obviously depends on your strategy and goals. From a purely results basis, when looking at improving grades, or beating RNG, these horses came out top. Shaun E Bear, a very popular stud, the plague of many bloodlines, is ranked 490th out of 550+ qualifying studs with an average downgrade of 0.9 attributes versus average, just below Balloon Arch at 488th.
How can we interpret these figures and what causes them? One theory I have, and it's just a theory, is that the figures above for start to temper represent how strong the stud's attributes really are. We always hear of a hidden layer of detail behind the letters that we don't have access to, and of a strong S+ attribute versus a weak S+. Maybe these values are a direct representation of the horse's hidden attributes, and where foals are produced above the average, that reflects a starting point for the stud that is actually above average. Maybe Liam Neighson's start and speed attributes are very strong versions, so the offspring inherits them: the starting point for speed is not, say, 14 but 14.5, so that when we see the foal, it's not the foal that is above average but the starting point of the sire. I find this hard to explain, so let's use an individual horse as an example, Abominable Snowman.
Abominable Snowman has S+ for start, speed, stamina and finish. We call this a flat attribute profile. It would suggest he is best at 8F, the middle distance. His attributes are all above his overall grade, which is what makes him a very successful horse on the track (not some winning gene). But looking at his FF chart we see that he was actually most successful at 6F. Not 4F or 8F. This usually means that his start and speed are stronger than his stamina and heart, and start and speed are probably at a similar level, otherwise he would have been better at 4-5F if start was dominant, and 7F if speed was higher than start.
Looking at his entry line in the previous table (see very bottom of the above image) we see that his foals are average overall - right where we would expect them to be. However, his foals' start is 0.32 and their speed is 0.25 above expectation, both well above average. Stamina and finish are close to zero. This is consistent with what we would expect from his attributes and race performance: although all 4 racing attributes look the same, start and speed MUST be stronger because he ran optimally at 6F. So my conclusion is that when a stud's foals are consistently produced above the expected average, which we saw earlier is the centre of a roughly normal distribution, it is a reflection of how strong or weak the stud's attributes are at a hidden level.
Learnings about breeding from this post
Regardless of the folklore and mystique, breeding is really rather mechanical. There are some real things that might give you an edge:
Matching parents' grades seems to result in foals' on track performance being better
The luckiest rolls are a maximum of 9 attribute points above the average of the parents
Attribute rolls are normally distributed around zero. Your roll is most likely the average of the parents
Some horses buck this trend and produce foals that are systematically above average
This is probably due to having hidden stronger variants of attributes you can't see
Quickly go and check whether all of Liam Neighson's stud slots have been "Taken" (sorry)
Be methodical with your breeding. It's a numbers game. Don't fall for survivorship bias (do I have to show the plane with the bullet holes in the wings?). Combining this and the last post, think about +subgrade horses, matching attributes, and playing the averages, and you are more likely to succeed. You just have to wait 3 seasons to be proven right or wrong!
Join the fun and put these insights into practice at PhotoFinish.Live and if you are considering starting your own stable please consider using my referral code: PADDOCK or just click on this link: https://signup.photofinish.live/?referralCode=PADDOCK
Please remember this is a web3 game where you spend your own money. Nothing I write about should be considered financial or investment advice.
Other blog posts:
Know your odds of breeding success: https://aipaddock.com/know-your-odds-before-you-roll-the-dice
Further down the rabbit hole of ML: https://aipaddock.com/deeper-down-the-morning-line-rabbit-hole-we-go
What can ML prices tell us: https://aipaddock.com/skill-luck-and-morning-lines
Finding the optimal distance: https://aipaddock.com/understanding-optimal-distances
Understanding breeding: https://aipaddock.com/understanding-the-most-successful-breeding-horses-in-pfl
What are subgrades: https://aipaddock.com/what-are-subgrades-and-how-do-they-work
Fastest horses: https://aipaddock.com/who-are-the-fastest-horses-in-the-game
Racing profitability: https://aipaddock.com/how-much-profit-do-you-make-racing-horses
Trueskill pvp ratings: https://aipaddock.com/which-horse-is-the-goat-in-pfl
Breeding: https://aipaddock.com/trying-to-understand-breeding-in-pfl
Evaluating horses: https://aipaddock.com/how-good-is-my-horse
Understanding performance: https://aipaddock.com/understanding-the-true-performance-of-your-horse
Do stars matter: https://aipaddock.com/how-much-do-preferences-matter
FF Rating vs Finish Time: https://aipaddock.com/the-difference-between-ff-rating-and-finishing-time
Are horses getting faster: https://aipaddock.com/are-pfl-horses-getting-faster