Posted: 4/10/2012 7:20:16 PM
OK, so here's how this works:
1) I've used ALL scores from the first 2 rounds, including players who missed the cut. This is different from just about any other site I've seen doing statistical correlations between key stats and score. Many of them use only the top 5 or 10, or at best all players who made the cut. We've discussed in the past that it's often more effective to find "fades" than "fors", so I think it's important to have the bottom end of the correlations in there.
2) The first thing I do is normalize all scores and stats to fall between 0 and 1. 0 = sucks, 1 = the best.
3) Then, the numbers in the table above come from a simple linear regression of Y (score) on X (each stat in turn).
Slope = the slope of the regression line. The closer this value is to 1, the better an indicator that stat is of a player's score.
R2 = this is R-squared. A pure statistician probably wouldn't like my simplification, but you can think of it as the confidence you have in the slope number. Strictly speaking, it's the fraction of the variation in score that the stat explains. R2 also has a max of 1. In other words, the closer it is to 1, the better the data fits the regression line, or the fewer "outliers" we have.
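For anyone who wants to try this at home, here's a rough sketch of steps 2 and 3 in Python. A few things in it are my assumptions and not spelled out above: I'm assuming plain min-max scaling for the normalization, I flip any number where lower is better (like score) so that 1 still means "the best", and the function names and sample numbers are made up purely for illustration.

```python
def normalize(values, lower_is_better=False):
    """Min-max scale values to the 0..1 range, where 1 is the best.

    Assumption: min-max scaling; the exact normalization used for the
    table is not stated in the post.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every player has the same value, so there is nothing to rank.
        return [0.5 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    # Flip the scale when a low raw number is the good outcome.
    return [1.0 - s for s in scaled] if lower_is_better else scaled


def linear_fit(x, y):
    """Least-squares slope, intercept and R-squared for y regressed on x.

    Assumes x and y each contain at least two distinct values.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single X variable, R-squared is the squared correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Made-up numbers for five hypothetical players, NOT real tournament data.
gir = [72.2, 61.1, 55.6, 66.7, 50.0]   # greens in reg %, higher is better
score = [139, 143, 148, 141, 150]      # two-round total, lower is better

x = normalize(gir)
y = normalize(score, lower_is_better=True)
slope, intercept, r2 = linear_fit(x, y)
print(f"gir: slope={slope:.2f}  R2={r2:.2f}")
```

Run the same thing once per stat and you get one slope/R2 row per stat. A stat like ppr, where fewer is better, would need lower_is_better=True as well.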
The first column lists the different stats:
gir=greens in reg
ppr=putts per green in reg