## The Draw Analyser Challenge

Sadly, for those of us who love UK and/or Irish racing, it looks like we're in limbo until at least June 1st. The good news, relatively at least, is that the odds of a restart on that date are shortening all the time. Assuming nothing untoward occurs during these next few weeks, we ought to be ready to get cracking just 20 days from now. Everything crossed, of course.

In the meantime, it's time to further tool ourselves up, and so I've come up with another challenge!

So that everyone can play, I've made our absolutely awesome, best-in-breed Draw Analyser tool available to all registered users: if you have a geegeez account, free or paid, you can join in. This is only for the duration of the challenge, which is one week.

Here's what I'd like you to do:

### Step 1 - Visualisation

The first thing to do is to bring some logic to the party. It is all too easy to walk straight into the data without thinking about the problem at hand. That casual approach lends itself readily to back-fitting, because you're not trying to prove - or disprove - a theory. Rather, you're looking at the numbers and trying to work back from there. Whilst such an approach is not completely without merit, it is less rigorous than beginning with a notion of what you're hoping to find.

A way to do this when considering potential draw biases is to first look at the track layout. Let's use an example, York racecourse in this case.

#### 1a. Go to our UK racecourses page and choose a course.

I've linked to it there, and you'll find it in the top menu under Courses/Fixtures.

Hint: try to avoid obvious ones like Chester; we're looking for angles that might not be over-exposed.

In the top right corner of the racecourse page, you'll see a course map. Clicking on it will expand it and display the locations of the race starts.

#### 1b. Scan for possible draw-affected race distances.

I'm immediately drawn to the mile (1m) and 1m1f distances because of that sharp bend at the top of the home straight that comes up fairly quickly. I wonder if, in bigger fields, that might inconvenience wide/high draws and, therefore, favour low to middle stalls.

So that's the assumption I'm going to test. (I think it's possible that in bigger-field two mile races there might be a similar bias for a similar reason given the number of left-handers the field takes, but we'll save that for another day).

### Step 2 - Set up the tool

So now we need to set up the Draw Analyser. We're going to do this in a specific way so we test apples against apples, as it were.

The Draw Analyser has a series of options at the top of the page to allow us to configure things as we'd like.

So we're going to use a standard set of parameters, shown above and ignoring course and distance for now, as follows:

- Set 'Draw' to Actual - this will review the data based on the actual stall positions of the horses, removing any non-runners from consideration (so, for example, the horse drawn six would have an actual draw of five if one of the horses drawn inside him was declared a non-runner, and so on).

- Set 'Going' to Hard to Heavy (you could use Firm or, at most courses, Good to Firm, but we'll do this for now).

- Set 'Runners' to 10 to 16+

- Set 'Races' to Hcap (so we're only looking at handicaps)

- Set 'Dates' to 2009 to 2020

Once these are set up they will only change when we change them, as all data below the options area updates auto-magically 🙂

Now select your course and distance combination from the dropdowns.
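
For anyone who likes to see the 'Actual' draw setting spelled out, here is a minimal Python sketch of the renumbering described in the first bullet above. The function name and the ten-runner race are my own illustration, not how the Draw Analyser does it under the bonnet.

```python
def actual_draws(stalls, non_runners):
    """Map each declared stall number to its 'actual' draw once non-runners are removed."""
    withdrawn = set(non_runners)
    remaining = sorted(s for s in stalls if s not in withdrawn)
    # Position in the sorted list of remaining stalls is the actual draw (1-based).
    return {stall: position for position, stall in enumerate(remaining, start=1)}


# Hypothetical ten-runner race in which the horse drawn 3 is a non-runner.
draws = actual_draws(range(1, 11), non_runners=[3])
print(draws[6])  # the horse drawn six now has an actual draw of five
```

Horses drawn inside the non-runner keep their stall number; everything outside it shuffles down one.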

### Step 3 - Review the data

If we've performed steps 1 and 2 correctly we should have some data in the tool which may or may not support our theory. Let's review that to see if it is starting to tell us anything.

#### 3a. Consider the course and distance draw 'all going' data

We can see from the chart that there's a lovely linearity - a straight line - from low to high. That is a very good start, and normally things will be less cut and dried at this stage. N.B. Do make sure you check the left-hand scale, because you might see a line like this with very few percentage points from the top of the scale to the bottom.

The table above the chart tells us a number of things:

- There have been 65 races that match our criteria (wins column, 32 + 21 + 12) so a reasonable sample

- The win percentage drops as we move from low to middle to high; so, too, does the place percentage

- The A/E and IV figures for low are both above 1.00, a strong sign
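
If you're wondering what sits behind those two figures, the sketch below uses the commonly quoted definitions: IV as share of wins divided by share of runners, and A/E as actual wins divided by the wins implied by the odds. Treat it as an assumption about the general approach rather than the tool's exact calculation. The win counts come from the table discussed above; the runner counts are made-up placeholders.

```python
def impact_value(group_wins, group_runners, total_wins, total_runners):
    """Share of wins divided by share of runners; 1.00 means winning at the expected rate."""
    return (group_wins / total_wins) / (group_runners / total_runners)


def actual_over_expected(wins, decimal_odds):
    """Actual wins divided by the wins the market implied (sum of 1 / decimal odds)."""
    expected = sum(1.0 / odds for odds in decimal_odds)
    return wins / expected


# Win counts are from the text; runner counts are HYPOTHETICAL placeholders.
wins = {"low": 32, "middle": 21, "high": 12}
runners = {"low": 300, "middle": 300, "high": 300}

total_wins = sum(wins.values())  # 65 races
total_runners = sum(runners.values())
for section in wins:
    iv = impact_value(wins[section], runners[section], total_wins, total_runners)
    print(f"{section}: IV {iv:.2f}")
```

A figure above 1.00 on either measure means that part of the draw is doing better than its numbers (IV) or its prices (A/E) suggest it should.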

#### 3b. Consider going subsets

At some courses the favoured sector of the draw/track can change markedly on differing ground. For example, at Epsom and Brighton, jockeys will chart a course to the polar opposite side of the home straight on soft or heavy ground due to the way the camber leans and, therefore, the way the rainwater drains (it is always softest at the bottom of a hill or incline).

So we must check for any variance by going. I divide things into two simple subsets, fast and slow. Fast is 'Good or quicker', and slow is 'Good to Soft or slower'. [For all-weather, I include all AW going in a single range.]

N.B. When using going ranges, the faster going must go in the top box or you will get no data returned.
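
The fast/slow split, and the 'faster going first' rule, can be sketched in Python as below. The scale is the usual UK turf going order from fastest to slowest; that the tool's dropdown labels match these names exactly is my assumption, and `going_range` is a made-up helper.

```python
# UK turf going scale, fastest first. Label spellings are assumed to match the tool.
GOING_SCALE = ["Hard", "Firm", "Good to Firm", "Good", "Good to Soft", "Soft", "Heavy"]


def going_range(fastest, slowest):
    """Return every going description from `fastest` to `slowest`, inclusive."""
    start, end = GOING_SCALE.index(fastest), GOING_SCALE.index(slowest)
    if start > end:
        # Mirrors the N.B. above: the faster going has to come first.
        raise ValueError("faster going must be given first")
    return GOING_SCALE[start:end + 1]


FAST = going_range("Hard", "Good")           # 'Good or quicker'
SLOW = going_range("Good to Soft", "Heavy")  # 'Good to Soft or slower'
print(FAST)
print(SLOW)
```

The two subsets don't overlap, and between them they cover the full Hard to Heavy range set in Step 2.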

Let's bisect our York mile data in this way:

Fast:

Slow:

In this case there is very little of note: the slow group has only a few races in it and it appears progressively tougher for high drawn horses to prevail, but there is not really enough evidence to be categorical about that.

What we can say is that the bias is 'going agnostic', that is, it manifests largely the same regardless of the state of the ground.

#### 3c. Retest on date range subsets

Racecourse husbandry is an extremely complex business. I, and many others who value data in their wagering decisions, have given clerks of the course a hard time on occasion for their misleading reporting, but there is little doubt that all of them operate to a high level of skill in their field (pun intended!). Advances in irrigation (watering) and drainage, as well as tactical rail movements, have reduced or eliminated many historical biases and so it is important to check our data against different periods of time.

Dave Renham, our main resident draw expert (along with Jon Shenton, who takes a broader sweep in his course analyses), has recently taken to following the Mordin approach of rolling five-year subsets (e.g. 2009-2013, 2010-2014, 2011-2015, etc) and that is a great way to go if you have the time and inclination. For now, though, we'll break the data into two groups, 2009-2014 - the oldest six years in our database - and 2015-2019, the most recent five years. Again we're looking for any material change in the bias.
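
If you do fancy the rolling five-year route, the date ranges to plug into the tool can be listed with a few lines of Python. This is just a convenience sketch; `rolling_windows` is a made-up helper name.

```python
def rolling_windows(first_year, last_year, span=5):
    """Yield (start, end) pairs for every `span`-year window within the range."""
    for start in range(first_year, last_year - span + 2):
        yield start, start + span - 1


for start, end in rolling_windows(2009, 2019):
    print(f"{start}-{end}")
```

Each pair is one 'Dates' setting to test, running from 2009-2013 up to 2015-2019.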

Hint: Remember to reinstate the full going range

2009-2014

2015-2019

While the sample sizes are quite small, the general principle is the same: low favoured, middle less favoured, high unfavoured. So we appear to have a bias that is consistent against both time and going. These are rare birds so do not fret if you don't find such a clean and consistent relationship with your chosen course and distance combination; after all, mine was cherry-picked for example purposes!

### Step 4 - Fine Tuning and Scoring

The last step, assuming there is anything of note to this point, is to fine tune and score your course/distance combination. Actually, there is value in noting that there is little or no bias over a course and distance. No knowledge is bad knowledge and knowing that draw is not a factor in certain races enables an unencumbered focus on other aspects of the puzzle.

#### 4a. Fine tuning

The fine tuning comes first; it's not really fine tuning as such, because we are working within the fixed parameters of field size, going and date ranges to resist accusations of convenience fitting.

But... it is sometimes the case that, for instance, very wet (heavy) ground or the biggest fields accentuate a bias, and it is worth noting that alongside the 'fixed parameter' work.

For my mile handicaps at York research, I wanted to see if a bigger field would emphasise the advantage to those drawn inside and the disadvantage to those drawn highest.

This is really interesting: in the 30 qualifying races, low has readily outstripped middle and high. But looking at the constituent draw data we can see that stalls six and thirteen, on either cusp of the middle draw section, have kept that group afloat. It does appear that either the inside stalls 'get away' or the wider drawn horses sweep around the outside to prevail. Those berthed in the middle have had a tough time being neither one nor the other of those things: not getting first run, and being potentially trapped behind horses in the straight preventing them getting the late run also.

That is conjecture on my part to some degree, but it's credible enough. Of course, I welcome alternative theories!

The IV3 chart at the bottom of the image above (IV3 being the average Impact Value of a stall and its immediate neighbours) demonstrates the middle drawn hinterland as well as the low-draw safe haven for punters. The constituent draw table reveals that ten of the 30 races in the sample were won by horses drawn 1, 2 or 3: that's a third of the winners from less than a fifth of the runners.
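
To make the IV3 idea concrete, here is a rough Python sketch that averages each stall's IV with its immediate neighbours. How the end stalls are handled is my assumption (I average the two values available), and the per-stall IVs are invented purely for illustration.

```python
def iv3(stall_ivs):
    """Average each stall's IV with its immediate neighbours.

    Assumption: the end stalls, which have only one neighbour, are averaged
    over the two values available.
    """
    smoothed = []
    for i in range(len(stall_ivs)):
        window = stall_ivs[max(0, i - 1):i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


# HYPOTHETICAL per-stall IVs for stalls 1 to 6, for illustration only.
example = [1.8, 1.6, 1.7, 0.9, 0.6, 1.2]
print([round(v, 2) for v in iv3(example)])
```

The smoothing is the point: one freakishly good or bad stall gets tempered by its neighbours, so the shape of the bias is easier to see.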

#### 4b. Scoring

The last part of the process is to try to score the utility of any observed bias. It may be useful from an elimination perspective - that is, avoid high draws unless their form/value case is irresistible - or, more generally, from a 'mark up' perspective: in other words, bonus points to the case for a horse optimally housed.

The score should be more than a mere number, because there is normally a qualitative element to our observations as well as the quantitative component.

For example, in my York mile example, I will score the bias as a solid 7 at this stage. When I've worked through a few more course/distance combinations, I might revisit that score and nudge it up or down a bit, but 7 feels about right for now.

The fact that it's somewhat 'feel-based' - we could use percentage scoring bases, but this challenge is not intended to be too academic in its rigour - adds ballast to the need for the qualitative element: some commentary on what we've discovered.

In this example, my final comments are thus:

York, 1m - 7/10 LOW

Strong linearity from low to high, the widest-drawn runners unfavoured. Bias has been consistent over time and on all going, and is accentuated in bigger fields (8/10 in 16+ runner handicaps), where the bottom three stalls have won a third of the 30 races in review.

### Step 5 - The Challenge

This challenge may be considered a little more in-depth than the horse profiling one from last week, but it's actually about the same once you get into a rhythm. It would be easy to go through all of the distances at a given track in 30-40 minutes, and to select and review the most likely distance(s) in 15 minutes or so.

I'd very much welcome readers of a curious bent taking up the challenge and adding a comment below in the style of my York 1m note and score. I'll add it to the comments as an example, and hope it's not a lone comment!

Good luck,

Matt