Racing and Data Analytics
James Knight, head of racing at Coral, last week put out three tweets that pretty much summed up where our sport is at in its relationship with data and analytics, writes Tony Keenan.
Horse Racing should be the best betting sport out there. Currently it isn't, so stuff needs to be done both by racing & betting industries
— James Knight (@jamesaknight) August 17, 2015
We should be looking for an integrity & data rich sport, which can be bet confidently to competitive margins. That's what football is.
— James Knight (@jamesaknight) August 17, 2015
...and it is what Racing is at the top level sometimes. — James Knight (@jamesaknight) August 17, 2015
I’m biased of course but couldn’t agree more with the sentiment that racing is the best of betting sports; it has a complexity that few, if any, other sports can match and this is one of its most appealing factors. This complexity lends itself to the creation of data from the nuts and bolts like ground, distance and form to deeper factors like breeding, times and run styles; the list really is endless. But racing, despite some progress lately, doesn’t exploit this extensive data to its full potential.
Much of this is cultural and I mean that not only within racing itself but with a broader Irish, British and European approach to engaging with sport. On this side of the Atlantic, statistics and numbers are not ingrained into the psyche of the sports fan as they are in America. This is changing, however. Take the company Football Radar for example – you can watch a clip introducing their methods here (https://www.youtube.com/watch?v=Y2ee1GoQdeI) – and you see what can be done with the analysis of soccer.
The Americans do it on a whole different level of course with data-heavy websites like Football Outsiders and Baseball Prospectus, though if you want to read a more palatable version of the numbers then Grantland is the place to go where writers like Bill Barnwell, Jonah Keri and Zach Lowe synthesise the vast array of statistics into cogent and well-written arguments. It’s all very mainstream in the States but it boils down to one thing; these numbers help explain why things happen and how sport works so this is something we should want for racing. And lest we forget, they have extensive betting utility too!
It’s important to differentiate between old and new data. By old data I mean the fundamentals that make up racing from age to weight carried to trainers. These basic details have been around forever but that’s not to say you can’t garner new insights from them; the book and film ‘Moneyball’, a prime example of sports analytics reaching the masses, shows this as Billy Beane/Brad Pitt exploits the perception that batting average was more valuable than on-base percentage in baseball.
We’re getting better at interpreting these old numbers in racing too and we now have access to the tools to do so; databases like Horse Race Base and, of course, Geegeez mean we can put our own filters on the data and find betting angles that were hitherto hard to calculate. We’ve learned that some numbers are better than others and by better I mean have more predictive value; pure strike rate is a fair indicator of success or otherwise but figures like impact value, actual over expected and percentage of rivals beaten give a truer insight.
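For readers who have not met these figures, here is a rough sketch of how they are commonly calculated. It assumes the usual definitions (impact value as a group's share of winners divided by its share of runners, actual over expected as winners divided by the winners implied by the odds, and percentage of rivals beaten averaged per run); the `Run` record, its field names and the sample numbers are made up for illustration and are not taken from any particular database.

```python
from dataclasses import dataclass


@dataclass
class Run:
    position: int        # finishing position, 1 = winner
    runners: int         # number of runners in the race
    decimal_odds: float  # starting price as decimal odds (hypothetical field)


def wins(runs):
    return sum(1 for r in runs if r.position == 1)


def strike_rate(runs):
    return wins(runs) / len(runs)


def actual_over_expected(runs):
    # Expected winners: sum of the win chances implied by the odds.
    # Raw odds include the bookmaker's margin, so this is only approximate.
    expected = sum(1.0 / r.decimal_odds for r in runs)
    return wins(runs) / expected


def impact_value(group, population):
    # Share of all winners that came from the group,
    # divided by the group's share of all runners.
    win_share = wins(group) / wins(population)
    run_share = len(group) / len(population)
    return win_share / run_share


def percentage_of_rivals_beaten(runs):
    # Each run scores (rivals beaten) / (rivals faced); average over all runs.
    scores = [(r.runners - r.position) / (r.runners - 1)
              for r in runs if r.runners > 1]
    return 100.0 * sum(scores) / len(scores)


# Invented sample: four runs for one trainer, six for everyone else.
trainer_runs = [Run(1, 10, 5.0), Run(5, 9, 4.0), Run(2, 11, 3.0), Run(1, 8, 2.0)]
other_runs = [Run(1, 10, 3.0), Run(4, 9, 6.0), Run(7, 11, 10.0),
              Run(3, 8, 8.0), Run(6, 12, 12.0), Run(1, 6, 2.5)]
all_runs = trainer_runs + other_runs

print(strike_rate(trainer_runs))
print(actual_over_expected(trainer_runs))
print(impact_value(trainer_runs, all_runs))
print(percentage_of_rivals_beaten(trainer_runs))
```

A figure above 1 for impact value or actual over expected suggests the group wins more often than its numbers, or the market, would imply, though small samples like this one prove nothing on their own.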
But it’s the new data that really interests me. Again, the Americans have led the way. Football Outsiders, the doyens of NFL analysis, use volunteers to chart the minutiae of each play and you can now see data on all the moving pieces of on-field actions, including the once-anonymous offensive linemen and cornerbacks, not just the skill positions like quarterback and wide receiver. Baseball is arguably even more advanced where each major league stadium has installed a PITCHf/x system which charts the trajectory and speed of every pitch thrown in the game. I have even read articles lately where they now have the technology to tell how much spin each pitch has and these are balls moving at upwards of 90 miles per hour.
Racing too has many areas where new data can be introduced, and chief among them has to be sectional timing. I have to admit to being a devotee of sectionals and an admirer of Simon Rowlands and his team at Timeform who have done so much in terms of education with the subject and in building a database of times for racing in the UK. I do some sectional timing of my own and they certainly have betting application with pace being so important in the outcome of a race.
Establishing sectional times at every track in Britain and Ireland would obviously be expensive but I would be surprised if it doesn’t come around eventually; in the interim racecourses need to get on board with people doing their own times and play ball in terms of getting the race distances right and advising of any changes as well as making furlong markers visible. The same applies to TV stations who can provide on-screen clocks and suitable camera angles that aid taking sectionals. Taking these figures can be a little laborious, especially when camera angles make things difficult, and I look forward to a day when the data is provided and I only have to interpret it.
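To give a flavour of what interpreting a sectional involves, one widely used figure is finishing speed expressed as a percentage of the horse's average speed for the whole race. The sketch below assumes that definition; the function name, the units (furlongs and seconds) and the sample times are my own illustration rather than any provider's method.

```python
def finishing_speed_pct(race_time_s, race_distance_f,
                        sectional_time_s, sectional_distance_f):
    """Speed over the closing sectional as a percentage of average race speed."""
    race_speed = race_distance_f / race_time_s                  # furlongs per second
    sectional_speed = sectional_distance_f / sectional_time_s   # furlongs per second
    return 100.0 * sectional_speed / race_speed


# Invented example: a mile (8f) run in 100s with the last 2f in 24s.
pct = finishing_speed_pct(100.0, 8.0, 24.0, 2.0)
print(round(pct, 1))
```

A figure well above 100 points to a steadily run race with a fast finish, while one well below 100 suggests the horse went too hard early; what counts as "par" varies by course and distance and would have to be established from the data.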
An extension of sectional times is the use of GPS in tracking the exact movement of horses within a race as each animal carries a chip to relay back information about its race position. We have only really seen this used in Dubai (where there is obviously an unlimited pot of money to spend on racing) and at the Breeders’ Cup, with the American company Trakus charting the specific breakdown of how each race went, but the numbers are fascinating. Not only does this provide us with the times for each horse but it also reveals the cost in distance of racing wide, an often underrated aspect of race analysis over here. Simple physics suggests that the shortest distance between two points is a straight line but we have no way of quantifying the cost of racing away from the rail in Ireland and Britain.
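In the absence of tracking data, the geometry at least gives a back-of-envelope estimate. The sketch below assumes the bend is a circular arc and that the horse holds a constant distance off the rail all the way round, in which case the extra ground covered is simply the angle turned (in radians) multiplied by that distance. The figure of 2.4 metres per length is my own rough assumption for converting to lengths, not a measured or official value.

```python
import math

METRES_PER_LENGTH = 2.4  # rough assumption for illustration only


def extra_distance_m(turn_degrees, metres_off_rail):
    """Extra ground covered on a circular bend when racing wide of the rail."""
    return math.radians(turn_degrees) * metres_off_rail


def extra_lengths(turn_degrees, metres_off_rail):
    return extra_distance_m(turn_degrees, metres_off_rail) / METRES_PER_LENGTH


# Invented example: a 180-degree turn raced three metres off the rail.
print(round(extra_distance_m(180, 3.0), 2))
print(round(extra_lengths(180, 3.0), 2))
```

Real bends are not perfect arcs and horses rarely hold one path throughout, which is exactly why measured tracking data would be so much better than this kind of estimate.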
Horse weights are used extensively in Hong Kong, a jurisdiction that many believe is the ideal in terms of racing run for the betting public. Whereas installing sectional timing and/or GPS tracking systems at every track in Britain and Ireland would be costly, the weighing of horses would not. The scales are relatively inexpensive, costing between €3,000 and €5,000 each, and it’s not as if horses aren’t used to them with many trainers using them at home. Knight mentioned integrity in his tweets and the weighing of horses would be a massive tool in the policing of the sport as the best way to stop a horse is not to give it a ride where it can’t win but rather to leave it half-fit for the race.
The obvious plus to the latter option for the dishonest trainer is that there is no way of proving it with the current system. Were the weighing of horses to become widespread, this would lead into a sort of big data around the published numbers; we could compare animals not just against themselves but also against others and over the years could get a sense of optimum racing weights and what sort of figures suggest a horse is not fit or even too fit and ready to go off the boil.
As I mentioned earlier, there are some aspects of American sports where charters note down the data on each and every play, working within a common framework that standardises the numbers; Football Outsiders do this and the volunteers get access to the information while others have to pay for it. This could certainly apply to racing though perhaps in different areas on the flat and over jumps. On the level, charters could look at the keenness of horses within races. As things stand, we can read in-running lines that say a horse ‘raced keenly’ but there are degrees with this and perhaps a one-two-three scale would be better, with one being not perfectly settled, two taking a right grip, and three pulling the jockey’s arms out thus giving itself no chance.
When this data is compiled, it could be placed alongside other information and provide insights. We would know which trainers’ horses are more keen than others (and which can win being keen and which can’t) and what jockeys are best at settling their mounts. We could find that certain tracks or races run at slower paces produce more keenness or even that how horses race is random. Backers of Golden Horn on his next start would certainly be keen to know this after his hard-pulling effort in the Juddmonte International; what are the chances he does the same next time?
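As a sketch of how such charted data might be summarised, the snippet below takes keenness scores on a one-to-three scale, with zero added by me to mean the horse settled, and works out for each trainer the average score and the win rate of the runners that raced keenly. The record layout, trainer names and results are all hypothetical.

```python
from collections import defaultdict


def keenness_by_trainer(records):
    """records: iterable of (trainer, keenness 0-3, won) tuples (hypothetical layout)."""
    totals = defaultdict(lambda: {"runs": 0, "score_sum": 0,
                                  "keen_runs": 0, "keen_wins": 0})
    for trainer, keenness, won in records:
        t = totals[trainer]
        t["runs"] += 1
        t["score_sum"] += keenness
        if keenness >= 1:  # any degree of keenness counts as a keen run
            t["keen_runs"] += 1
            t["keen_wins"] += int(won)

    summary = {}
    for trainer, t in totals.items():
        summary[trainer] = {
            "mean_keenness": t["score_sum"] / t["runs"],
            # None when the trainer had no keen runners at all
            "keen_win_rate": (t["keen_wins"] / t["keen_runs"]
                              if t["keen_runs"] else None),
        }
    return summary


# Invented charting data for two made-up trainers.
records = [
    ("Trainer A", 0, True), ("Trainer A", 2, False), ("Trainer A", 3, False),
    ("Trainer B", 1, True), ("Trainer B", 1, False),
]
summary = keenness_by_trainer(records)
for trainer, stats in summary.items():
    print(trainer, stats)
```

The same grouping could be done by jockey, track or early pace, and comparing a horse's score from one run to the next would be a first step towards answering whether keenness repeats.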
This could also apply over obstacles with a horse’s jumping ability graded one-two-three at each hurdle or fence. Again, we would find out which trainers’ horses jump best and whether bad jumping is repeated from one start to the next; we all have our own ideas on this but it would be better to put a number on it. It could also answer some difficult questions like was Zaarito, who fell three times in 2010 with races at his mercy, one of the unluckiest chasers in recent memory or simply a terrible jumper?
With all this data, there will be things that people get completely wrong, numbers that we use that really have little value. But these blind alleys don’t matter in the big picture as mistakes help push racing analytics on. Big data is here to stay in sport and as fans who have become accustomed to seeing it in other sports, many of us want it in racing too. Let’s hope we don’t get left behind: there is no reason why we should with the amount of technical angles we could exploit.
- Tony Keenan
You can connect with Tony on twitter at @RacingTrends
Very good article Tony and one which I fully agree with. The cynic in me does wonder whether the bookmakers in the UK would want this though…
Sectional timing includes individual distances raced and stride lengths all within the one package. Some Australian companies can provide all that data at £25 a race. Costs are tumbling further as you can do similar on a smart phone nowadays. So cost is a total red herring they use to stop people demanding progress.
Where it all falls down is if you ask people to spend more time and effort to go deeper into the data there is a diminishing or zilch return, as bookmakers stop your bets and Betfair charges up to 60% tax.
40% of would-be punters who will never make a lasting profit in a month of Sundays are also knocked back to zilch.
Horseracing data and service providers seem asleep at the wheel as there is absolutely no point paying for their services if you cannot recoup your money in paid out winning bets rather than paper trading ones. These firms have, and increasingly will, gradually go to the wall. By the time the data comes in, if ever, there could be hardly any firms left interested in providing it. The original Turftrax once provided full sectional data for all AW meetings at £5 a month. So few were interested that they soon went bankrupt.
Unless legislation comes in that bookmakers have to take bets as part of their licence then racing will continue to die a slow death. The regulators have to wake up to the problem first and take the initiative that full data provision is their problem to organise and fund.
James Knight should be asking himself why Corals and all the other bookmakers restrict or close winning accounts.
UK Horse racing does itself no favours by crazy source data and licensing costs meaning the cost to entry for smaller firms or individuals with skills to interpret, display and market them to a wider audience is largely out of reach. You are looking at between 10-15k annually just to get historical and ongoing data from the likes of the Press Association/RDC, hence why many daily newspapers and cards have stopped or reduced their coverage.
You can pick up all historical baseball results/data for free from the Lahman database online and elsewhere which has led to the growth of such sites like Fangraphs etc which produce outstanding analysis.
Unfortunately, the UK hinders growth in data analytics by placing that data in the hands of a few outfits like Timeform who can afford the exorbitant initial data costs.
………………………………………………………eventually.
It’s hard not to be jaundiced after a lifetime watching racing’s administrators ‘in action’.
Great article. Tony is ahead of his time on this: stats/data analytics is the way forward for horse racing betting. Bookmakers should be concentrating on volume punters rather than restrictions.
Would be great in the future to see a Fantasy Horse Racing Model being allowed, same idea as DraftKings etc, the horses are as much a superstar as pro players in my eyes, I heard a rumour Paddy Power and Betfair are in talks to merge, this should be interesting…
Agree with comments of ‘jo’ and ‘psnich’ above. No point of better data when even the ‘big bookmakers’ just restrict you. To illustrate point, CORALS were offering Evens about Mirror City at Kempton this evening. I tried to place a ‘huge bet’ of £10 (TEN POUNDS AT EVENS) but get restricted to £5 (FIVE POUNDS AT EVENS)! I doubt one could recover costs of better data with these restrictions in place. Surely if bookmakers want to keep licences, they must be prepared to take a reasonable sum. What is most galling is even some minutes after placing my £5 bet, CORALS are still offering EVENS!
Agree with Joe above. More information and data for what reason? Coral recently closed my account down even though I was only breaking even over the past three months. Clearly they saw signs that they did not like and rather than take a chance on losing money just shut me off straight away. Reading the stories on sites like this it is the same for the vast majority turning a profit (or suggesting they might). It would appear more data and more studying will just lead to accounts closed quicker and the only option left will be Betfair with its poor early liquidity which hardly rewards the system finder. Not sure what can be done about it but losing does seem the only option available to punters. Not sure how that fits with researching for new angles and having more data available.
Surely sectional timings are only of use when you know that all the horses have been put into a race to win. Too many races, particularly on flat turf are erroneously entered into the form book amidst a smoke screen of failed jockey tactics, horses that were never put into the race and those that were never ridden out when it became obvious that a placed result was the best the jockey could hope for.
Until these scurrilous practices are stamped out the form book will continue to be full of misleading information and of little use to the punter with the exception of the valuable prize money races where we know that everyone involved is busting their ass to pull off the win.
I’m not sure I see the point of more of the same.
Very interesting article but I agree with a lot of the comments above and do find it surprising that the Head of racing at Coral seems so keen on horse racing being “the best betting sport out there” whilst admitting that currently it isn’t.
Coral were the first of the major bookmakers to withdraw BOG from me and then restrict the size of my bets. I closed my account.
In my experience only Bet365, William Hill and Betway deliver what they promise. Paddy Power who ARE merging with Betfair don’t allow me to put say £10 on any horse that is going to return me more than about £100.
The betting industry stinks and is so weighted against any punter that gets anywhere near making a small regular profit. I agree with comments about licences for Bookmakers forcing them to accept reasonable bets (anything under £100) on the prices they have on offer. They would still make a fortune but at least maybe some of us might also make a regular (albeit not so large) profit.
A thought-provoking article, Tony, for which you are to be congratulated.
I would love to see horses weighed before each race. Years ago, when there was ample time to look at the horses on TV in the parade ring it was possible to make some assessment of fitness by each horse’s ‘condition’. Nowadays, Channel 4 allow so little time to see the horses before a race that this ability is lost. A database of weights, linked to past performances, would help punters quickly sort out the trainers who run horses that are not fully fit.
Lots more in the article that we should commend to the race authorities, perhaps through this new racing body that was announced recently?
I look forward to seeing more written and discussed about the various topics in Tony’s article in the future.
Well done, Tony.
David
I can see this new horse racing for punters initiative being just a whole load of hot air with meetings for the sake of a meeting. Hope not but that’s the way it looks.
Hi Jim,
I hope not too, as I’ll be in those meetings!!
We certainly don’t intend for them to be that…
Matt
Ps if you’d like to contribute, we’d love to hear your thoughts.
There is plenty of value to be found on Betfair which I have pretty much used exclusively to achieve profits over the last 6 years or so, never really understand why people continue with bookmakers unless they are not winning of course. You don’t have to get on board early on the exchanges either – you can wait till just before the off and snag plenty of value if you are armed with the right data and have a contrarian approach from the crowd. Litigant was available at 70’s on the exchange around 10-15 mins before the race and there was a late plunge, just have to be a bit smarter. It’s not a new thing that bookmakers always want losing punters and will restrict/close the smart ones but if more people simply boycotted bookies and used the exchanges then liquidity would increase and you would achieve better value as a result. It may also give bookies a kick up the backside although they will always attract mug punters swayed by their ‘offers’.
You can look at all the stats you want; racing is corrupt, plain and simple. I’ve seen horses with poor form, beaten by 20l or 35l and not in the frame, go on to win their next race by 25l. How can that be? They hit form for one race and are then never seen again in the winner’s enclosure. If a trainer wants to win or try, he will; if not, it doesn’t matter how many stats you look into. Horses should be finishing a lot closer; those results tell us some try and some don’t. You now have the likes of Paddy Power merging with Betfair and holding 52% of the shares. It’s just a closed club and soon more will go that way. The idea of punters betting against each other is long gone: you’re betting three seconds slower than someone with an SIS live feed, so they can cherry-pick your bets off you. And bookies don’t care if you don’t use them because they know that will never happen. Plus bookies offload and use the exchange as well, and they’ve got inside trainers to tell them if they’re trying or not. I’ve been in the ring when a top jockey came out of the room telling a trainer the fav in the next race was not going to win, so lay it. And you can’t say it doesn’t happen as I was there when it was said. Sure enough, the fav was 4th.
I see BET365 mentioned as one of the “big” bookmakers. Well, walk a mile in my shoes. Restricted to £0.23, yes 23p, on some bets. Who the f**k goes into a bookies and bets 23p on a horse? BET365 are a farce although they still give me BOG unlike Ladbrokes, Sky or Stan James.
Well, 12p is my ‘bet’ on BFSB the same as someone paying £100K to BF annually, I found out last week.
I’m a small bettor.
Wow! I thought Bet365 were bad restricting me to 50p a bet (and strictly no accumulators) even though, over a year, I’d actually lost a tiny amount with them.
It would appear you are more than twice as dangerous to their bottom line as I am!
It seems more and more that accounts will get restricted and closed for making a small profit. It will be left to mug punters playing on the casino and putting on silly accumulators or doing offers. Having said that, the bookmakers are quite happy to talk up punters who win large sums for small stakes. I suspect that with all the consolidations and mergers that more and more bookies are getting their accounts in order, and banning anyone who has made small regular profits or limiting accounts to very small stakes so it is not worth bothering. I feel that all successful punters will end up on the exchanges and trading (that is my next learning curve)
Re being too keen and making mistakes whilst jumping, Timeform record 2 degrees of both in their In Play symbols; p or P for the former and x or xx for the latter. (That they only apply these, and other, symbols haphazardly is a constant bone of irritation, but having said that their product is simply invaluable for form analysis.)
I think that there should be another factor recorded when it comes to keenness too; how long the horse was keen. There’s obviously a huge difference between pulling for half a furlong and pulling for a mile or more – but such distinctions are extremely rare in form analysis.
Good article. Shame about the dullards who prefer to drone on about something completely unrelated to its subject matter in the comments.
Interesting points you make, James. But, unlike you, I also believe those struggling to get a bet on make interesting points, too. There must be a correlation between the effort associated with creating better data – or even good/better use of existing data – and the ability to get a bet on.
So I don’t consider these people “dullards” at all. Sadly, I think that diminishes the value of your other points personally.
Matt
We could add a final sectional at every race course for the price of an ice cream a day frankly. The courses have the power and no will to do it. It would be an ideal start and an 80/20 except the 20% cost would be 0.0001% [80/0.0001 solution not as rolling off tongue]. Then we can build from there.
Thus far it has been like we are asked to choose between a Ferrari or nothing. It would surely be better to start with something. More money has already been wasted and set aside on sectionals than would cost to provide 1 or 2 sections a race (Can be any distance on flat between 2 and 3 pole dependent on course layout as long as distance to line accurate). BBC create an athletics track in Manchester every year and offer split times so how hard is it? We have a finishing line another beam how hard is it? Most courses could probably do it tomorrow if they wanted.
There seems no will and as said already racing has wasted more money on trials and grants than to provide it for every flat and jumps track.
While I think the article was extremely interesting I would like to point out something that I think is overlooked a lot of the time. I understand that bookies operate on mathematics and successful punters, those who have accounts closed, do the same. But these, like Matt for example, have all day and every day to devote to racing.
I, as an ‘amateur’, do not have this time to devote to trawling through the mountainous quantity of information already in existence. There is of course the always invaluable Racing Post site, Geegeez, Racing UK offering tons of info, facts and figures.
On top of that I am constantly bombarded by people throwing ‘systems’ at me, almost always based on trainers’ records, or a horse’s breeding, or jockey, trainer, track combinations, or just about any other combination you can think of.
For example you will often find an analysis offered concentrating on a type of race, over a certain distance, dominated by 3/4 trainers and how ‘backing them blind’ would have produced x profit. None of these systems explain what to do when more than one of these trainers has a runner in a race. In essence these are not much different from ‘metrics’.
What is missing from all this is ‘gut instinct’, something all punters recognise. Sometimes this is not based on ‘form’, but just something your unconscious has absorbed and you see something and just ‘know’. For example Ralphy Boy on Wednesday: a ‘feeling’ popped into my head, I had a quick shufti and got a nice result. I did not bother looking at the other runners.
Tom Segal once wrote somewhere about keeping it simple and not overloading his brain with information and stats. I used to live in the USA and the focus on stats to me meant that people started to miss the poetry in a game of baseball. Football is heading the same way.
I backed the winner of the Ebor at 40/1 the night before the race from a reading of the form and ‘gut feeling’ (and the bloody second, why didn’t I do a combi forecast from three choices?) but only for pennies. The reason my account is never closed is because I only back in pennies.
Anyway my point is that the sport has to be a great deal bigger than numbers and you can get swamped by information to the point that you can see all the individual trees but not the wood.
How often, and I will bet it is often, do you see a horse’s name and your ‘gut’ tells you to back it, you have a gander at the form, change your mind and your gut choice romps in and your ‘rational’ pick gets timed with a calendar?
Philip
Philip
You are completely and fundamentally wrong. I have all day to devote to my business and to my family. I have, on a good day, an hour to devote to racing. That’s why I spent so much time and money building a suite of tools and reports to give me what I need to know without trawling for it.
The “it’s all right for you” argument is trite and misguided. I thought you knew my operation better than that…
Matt
Dear Matt,
Apologies. I assumed as a successful punter and dealing with geegeez all day you would spend a great deal longer on form analysis. I also know from experience that reading form is very time consuming and tools such as geegeez are useful in cutting that time. However the really successful professional gambler either has to have rock solid inside info on a regular basis or spend huge periods of time chained to a desk.
But I still think that we can get overwhelmed with stats, info, and systems and sometimes I think ‘instinct’ or a mental affinity with say particular trainers, or jockeys, allows you to make picks that analytics will simply disregard.
Best wishes and apologies for any offense.
Philip
Hi Philip
No offence taken, I just fundamentally disagree with you. I also don’t agree that to win you need ‘inside info’ (most of which doesn’t account for anything except one horse in a race, and is surely the biggest white elephant in racing) or to spend hours and hours chained to a desk.
If you had to manually calculate, for instance, the handicap first time trainer stats, then yes, sure, it would be that. But the whole point of Gold (and other form databases worth their salt) is that it does much of the number-crunching for you. As an example, backing all The Shortlist selections at Betfair SP would have made you a lot of money. Backing them at SP would have made you a small amount of money. That’s actually a very ‘crude’ report in terms of sophistication, but it objectifies the form profiling process (i.e. it takes sentiment out of betting decisions).
‘Instinct’ and ‘affinity’ are often the result of thousands of hours of effort over time which has distilled into a sub-conscious form study methodology. Part of Gold’s reporting and racecard content is based on me ‘reverse engineering’ the pieces of my own form study rationale that I could steal back from my sub-conscious!
Best,
Matt
I have an example to prove my point. Despite not being stable jockey any longer Paul Hanagan has a decent strike rate for Richard Fahey and often at good prices. I have just looked at the 5.50 at Newmarket and seen Jan Van Hoof running. I think that is the horse to back on ‘gut’ feeling. Racing Post verdict said: “Got found out off 6lb higher at Goodwood”. The form says “badly hampered”. JVH may not win, or even be placed, but I think it is a good wager. The price is e/w, which is what I always look for, so I don’t see why that is not a good choice.
Philip
As an Aussie visiting the UK – don’t mention the cricket – I just want to comment on horses and their position in a race, viz. how deep they run. I know that at certain courses, such as Goodwood, where they track from side to side, this doesn’t apply, but in Australia, except for the final run to the post, any jockey with his horse more than two wide is considered to have given a bad ride – except on bog tracks where the inside running is badly cut up.
This quote from the article sums it up.
Not only does this provide us with the times for each horse but it also reveals the cost in distance of racing wide, an often underrated aspect of race analysis over here. Simple physics suggests that the shortest distance between two points is a straight line but we have no way of quantifying the cost of racing away from the rail in Ireland and Britain.
Simple physics doesn’t just suggest the shortest distance between two points is a straight line, it mathematically proves it. The industry standard is one length for every position off the rail. I’ve watched 5 furlong races here where runners are three, four and even five wide. When a short head can determine a finish and these horses are losing two or more lengths in what is a dash, I shake my head in disbelief. On straight courses this isn’t a factor and races over more than a mile I discount. But there is a wealth of short races run around bends in this country where I believe it does count and where an astute punter can make good money. I have done particularly well at Chester, probably the biggest course in the country, backing horses drawn low that can take up a forward position over 5, 6, and 7 furlongs. Once on the rails and in front at Chester you can safely pocket the money.
I meant to say Chester is the bendiest course in the UK but good old predictive text got me again. Sorry for that.
If you are constantly beating the price and backing “warm horses” your account will probably get restricted.
A punter backing the winner of every big handicap throughout the year won’t get restricted.