## Gold Reports: What do the numbers really mean?

As of today, a subset of Gold reports has two extra columns. In this post, I'll outline not just the two extra columns but the whole reporting structure, and how best to use the reports.

### Gold-en Reporting

There are currently eleven reports in the Gold suite - more will follow in 2016 - and each has its tell-tale pointers for the day's punting.

The first five in the banner above - called a mega menu apparently, and what you see when you hover on the 'Reports' link in the top navigation - all have individual layouts which I'll not go into here. That's for two reasons:

- Their layout hasn't changed from the format displayed in the User Guide (click to download)
- I don't want this post to be too long!

The remaining six, from Trainer Statistics top right and all the way through the bottom row, all follow a common format, as follows:

Allow me to share the layout. The report is split into two areas: the 'control' at the top and the 'content' beneath.

### The Report Controls

In the top section, users can define how and what they want to see. For instance, if you like to lay horses, you might change the win percentage filter to be from 'Any' to '10%', and look for fancied runners representing trainers with poor form in the context of the relevant period.

The "relevant period" is whichever blue button on the right is checked - displaying in a lighter colour. In this example, it is Course 5 Year Form.

The other two blue buttons, to the left, allow Gold users to look at either today's racing or, for you evening students, the following day's action.

Oh, and if you're mainly interested in handicap races, you can select the 'HCAP' filter at the top, to show the report output for handicap races only - pretty neat when trying to sift the "always trying" from the "occasionally tuned up to do a job"!

### The Meat of the Matter

Once you've set your parameters, click the big grey 'UPDATE' button and the content area will display those entities - trainers in this case - that fit your criteria. On subsequent visits to the report, your settings will be recalled, so if you have a fixed approach you need only set this once. (You're welcome!)

The content area is split into four parts: qualifying entities, record, profit/loss, and statistical significance. It is the last part which is new, but let me quickly touch on each to set the scene.

The example above has been set to display trainers whose course record in the last five years includes at least 20 runners. All other filter parameters have been set to 'Any'. I have then clicked the column heading by which I want to sort: the default is number of wins, and I have changed it to Win %.

We can see at the top of the content area that the picks of the trainers with runners today *on this configuration* are Nicky Henderson at Doncaster and Warren Greatrex at Bangor.

Looking to the race record next, both have struck at better than one-in-three under these conditions, and both have a better than 50% place record - which might be of interest to placepot players, as well as exacta/trifecta protagonists.

Of course, we need to consider win/place strike rates in the context of profitability (or 'lossability' if you're a layer), which is where the third set of columns comes in. As well as win profit and loss we also calculate each way performance, and that data is displayed to the right of the performance record. Here, we can see that both Hendo and 'the great Rex' have rewarded support - both win and each way - historically in this context, as we might expect having sorted on win percentage.

Clicking on any row in the report will reveal inline information relating to that report row's entries. In the example above, Greatrex has two runners at Bangor, Ballyculla and Ma Du Fou. Clicking on a row in the inline display will open a new window for that race. (The aim is to make these reports both useful and usable!)

### A Measure of Utility

And so to the 'new stuff'. The final two columns, on the far right, are A/E and IV. These are measures of statistical significance where a score of 1.00 is the norm (i.e. neither significant nor insignificant). The job of A/E is subtly different from that of IV.

#### Actual vs Expected (A/E)

A/E, or the ratio of Actual versus Expected, attempts to establish the value proposition (profitability in simple terms) of a statistic. The 'actual' and 'expected' are the number of winners.

Eh? Expected number of winners? What's them then? Let me try to explain.

So we're happy the actual number of winners is just that. In the case of Nicky Henderson above, he has had 40 winners from 113 runners. Actual then is 40.

But how do we calculate the 'expected' number of winners? It is the sum of every runner's starting price expressed in percentage terms (you could just as easily use Betfair Starting Price or even tote return if you were sufficiently minded - we've used SP). That gives us a simple formula for A/E, thus:

A/E = Actual number of winners / Sum of ALL [entity] runners' SPs (in percentage terms)

which we know at this stage to be 40 / Sum of ALL [entity] runners' SPs (in percentage terms)

To establish a runner's SP in percentage terms, we do the sum 1/(SP + 1).

For instance, 4/1 SP would be 1/(4 + 1), or 1/5, which is 0.20.

1/4 SP would be 1/(0.25 + 1), or 1/1.25, which is 0.8.

And so on...
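For the geeks, the conversion above can be sketched in a few lines of Python. This is only an illustration: the function name `implied_probability` and the idea of passing the SP as a string like `'4/1'` are my own assumptions, not anything taken from the report code.

```python
from fractions import Fraction


def implied_probability(sp: str) -> float:
    """Convert a fractional SP such as '4/1' into 1 / (SP + 1).

    Hypothetical helper: it only understands 'a/b' or a bare number,
    so a price written as 'Evens' would need to be passed as '1/1'.
    """
    numerator, _, denominator = sp.partition("/")
    odds = Fraction(int(numerator), int(denominator or 1))
    return float(1 / (odds + 1))


print(implied_probability("4/1"))  # 0.2
print(implied_probability("1/4"))  # 0.8
```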

The sum of Henderson's 113 runners' starting prices in the last five years, calculated in the above fashion, is 32.078.

Our A/E then is 40 / 32.078 which is 1.247, or 1.25 for cash.

We can then say that Hendo's Donny horses have performed 25% above market expectation in the last five years. Based on a reasonable sample size of 113 runners (all sample sizes for these reports are unacceptably small for categorical pronouncements, but in many cases are perfectly sufficient for wagering chances to be taken), we can say that in general terms his horses look worth following.
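If it helps to see the A/E sum as code, here is a minimal Python sketch. The function names and the three-runner toy sample are hypothetical and purely for illustration; only the Henderson figures (40 winners, 32.078 expected) come from the example above.

```python
def expected_winners(fractional_sps):
    """Sum 1 / (SP + 1) over every runner.

    SPs are fractional odds written as numbers: 4/1 -> 4.0, 1/4 -> 0.25.
    """
    return sum(1 / (sp + 1) for sp in fractional_sps)


def a_over_e(actual_winners, expected):
    """Actual winners divided by market-expected winners."""
    if expected <= 0:
        raise ValueError("expected winners must be positive")
    return actual_winners / expected


# Toy sample (made up): runners at 4/1, 1/4 and evens, two of which won.
toy_expected = expected_winners([4.0, 0.25, 1.0])  # 0.2 + 0.8 + 0.5
print(round(toy_expected, 2))                      # 1.5
print(round(a_over_e(2, toy_expected), 2))         # 1.33

# Henderson at Doncaster, using the totals quoted in the post.
print(round(a_over_e(40, 32.078), 2))              # 1.25
```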

***********

#### Important Note!

*Before I go on, I want to make it plain that you absolutely do NOT need to understand how the numbers are arrived at. I am adding this info for geeks and the generally inquisitive.*

*What you need to know is that better than 1.00 is good, and worse than 1.00 is not good (for backers); and that the further away from 1.00 the better/worse the stat may be.*

***********

#### Impact Value

But wait. What if a sample has been skewed by a 66/1 winner? Or two big-priced horses? The A/E figure might look very attractive, but what are the chances of such an event repeating itself?

Good question, and I'm glad you asked! 😉

We use Impact Value to help reveal skewed datasets. Impact Value is essentially a glorified winners/runners ratio, except that it looks at things in the context of the micro (e.g. Nicky Henderson's 5 year record at Doncaster) versus the macro (e.g. all races at Doncaster in the last five years).

Here's how it goes...

IV = %age of Donny 5 year winners trained by N Henderson / %age of Donny 5 year runners trained by N Henderson

To work out the first bit, we need to know how many winners Hendo trained, and how many winners there were at Donny in the last five years overall. We already know Hendo trained 40 winners in that time, but what we didn't hitherto ken is that there were 1177 winners ridden at Doncaster (flat and jumps).

So, our "%age of Donny 5 year winners trained by N Henderson" is 40 / 1177 = 0.033985 (expressed as a decimal)

For the second bit, we already know that Henderson ran 113 horses at Doncaster in the period, and I can reveal that there were 12146 total runners in that time.

"%age of Donny 5 year runners trained by N Henderson" is then 113 / 12146 = 0.009303 (expressed as a decimal)

Impact Value therefore is 0.033985 / 0.009303 = 3.652905, or 3.65 to two decimal places.

T'riffic, Matt, but what does it all mean?

It means that, at Doncaster in the last five years, a horse trained by Nicky Henderson is more than three-and-a-half times more likely to win than the norm.
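Again for the inquisitive, the Impact Value sum can be sketched in Python as below. The function name and parameter names are my own assumptions; the four numbers are the Henderson and Doncaster totals quoted above.

```python
def impact_value(entity_wins, entity_runs, all_wins, all_runs):
    """Share of all winners divided by share of all runners."""
    share_of_winners = entity_wins / all_wins
    share_of_runners = entity_runs / all_runs
    return share_of_winners / share_of_runners


iv = impact_value(entity_wins=40, entity_runs=113, all_wins=1177, all_runs=12146)
print(round(iv, 2))  # 3.65
```

Rearranging the same fraction shows why it is "a glorified winners/runners ratio": it equals Henderson's strike rate (40 / 113) divided by the overall strike rate at the course (1177 / 12146).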

### The Perfect Combination

For backers, then, the ideal world is a reasonable dataset - in this context, more than 50 is fair, more than 100 is good - and both A/E and IV showing some way above 1.00.

For layers, the same principle applies regarding sample size but, of course, you'd be looking for A/E and IV figures as far south of 1.00 as possible (and, naturally, prices you'd be comfortable laying at!).
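As a rough illustration only, those rules of thumb could be written as a small Python check. The function name `verdict`, its labels and the strict "more than 50 runners" cut-off are assumptions on my part; how far from 1.00 counts as "some way" is left to your own judgement.

```python
def verdict(runs, ae, iv, min_runs=50):
    """Rough reading of one report row, using the rules of thumb above."""
    if runs <= min_runs:
        return "sample too small"
    if ae > 1.0 and iv > 1.0:
        return "of interest to backers"
    if ae < 1.0 and iv < 1.0:
        return "of interest to layers"
    return "mixed signals"


# Henderson at Doncaster: 113 runners, A/E 1.25, IV 3.65.
print(verdict(113, 1.25, 3.65))  # of interest to backers
```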

An example, which Sod's Law dictates will now win, is Chantara Rose. Her trainer, Peter Bowen, has a five year record at Doncaster of 0 from 34. This gives scores of 0.00 twice, so one needs to look at the place record too. He's had just six of those 34 make the frame. In that light, Chantara Rose may have her work cut out and can be laid at 7.8.

### Important Final Note

It is really, *really*, really important to note that the report output is best used as a starting point for your deliberations. Moreover, any horse can win any race, so - obviously - wagering on the basis of report output, or indeed on any basis, should be undertaken as part of a long game with a bank that supports and, where necessary, sustains it.

I know you know this, but I just want to be clear that good data allied to good punting sense is the way forward. Geegeez readers generally, and Gold users in particular, tend to 'get' this more than most punters... which sets us up rather nicely to profit from the sport we love.

All of the team here at Geegeez and I very much hope this information will enhance the value of the 'bare stats' on the reports, and perhaps enable you to see more (or, just as importantly sometimes, less) value in the content of those data lists.

Matt

p.s. if you've any questions, just pop a comment below and I'll get back to you.