Dateline: Wednesday 24th August, 2011 – the nation waits with bated breath for the latest L2 (or if you’re the media, you will insist on calling it GCSE) results to be unveiled. Let’s see if we can predict tomorrow’s headlines:
- Best GCSE results ever
- GCSEs are getting easier
- Schools underachieving in [insert your favourite whipping boy subject here]
- “Gender gap” widens in [insert your favourite whipping boy subject here]
- League tables are great
- League tables are useless
- Schools choose BTEC for league tables
- Afro-Caribbean boys underperform in [insert your favourite whipping boy subject here]
- We are failing students on Free School Meals….
- Drinking water gives you cancer (couldn’t resist a dig at the Daily Mail!!!)
Put aside all the hard work that learners, teachers, parents and everyone else involved in achieving success at L2 have put in, and we are left with the media storm that will inevitably kick off between 8:30am and 10:30am tomorrow. Just wait – you know I’m right.
Drum roll for my annual rant: The stats are misleading – they are based on MEANS of data sets.
Take, for example, the gender differential, which is bound to be discussed tomorrow. The data will show a difference between boys’ and girls’ performance – either overall, or for specific subjects. Educational experts will be interviewed and opinions offered as to why girls outperform boys (generally) and why in physics (for example) boys beat girls.
Importantly, though, these opinions are based on the MEANS of the data, and relying on the mean alone as a measure of central tendency often masks a more profound underlying cause.
This is the data from our school for the 2007 results. The chart shows that the mean of the average point score on leaving for Girls is higher than for Boys. That factoid led to the school setting up working parties to “close the gap” between the genders.
The chart also shows the range of the data – this simple addition should be sufficient to demonstrate that the mean is not the end of the story. The girls’ performance has a higher top end than the boys’, and these high achievers are pulling up the overall mean for the girls. Indeed, if you remove these 5 extra high performers there is little to differentiate the girls from the boys.
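To make the point concrete, here is a minimal sketch of how a handful of high achievers can drag a mean upwards. The scores are invented purely for illustration (they are not our school’s data), and `summarise` and `drop_top` are hypothetical helper names.

```python
from statistics import mean, median

# Hypothetical point scores, invented for illustration (not the school's data).
boys = [30, 34, 36, 38, 40, 42, 44, 46, 48, 50]
girls = [31, 33, 37, 38, 40, 42, 43, 47, 48, 50, 70, 72, 74, 76, 78]


def summarise(scores):
    """Return the mean, median and (lowest, highest) range of a list of scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "range": (min(scores), max(scores)),
    }


def drop_top(scores, n):
    """Return the scores in ascending order with the n highest removed."""
    ordered = sorted(scores)
    return ordered[:len(ordered) - n]


boys_summary = summarise(boys)
girls_summary = summarise(girls)
girls_trimmed = summarise(drop_top(girls, 5))

for label, summary in [
    ("boys", boys_summary),
    ("girls", girls_summary),
    ("girls minus top 5", girls_trimmed),
]:
    print(label, summary)
```

In this made-up example the girls’ mean sits well above the boys’, but once the five highest scores are dropped the two means are almost the same. Printing the range alongside the mean is what gives the game away.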
Statistics has a tool for deciding whether data sets are significantly “different”: analysis of variance. Loosely speaking, it gives you a number that represents how confident you can be that the data sets really are different, rather than differing by chance. In business and science, we normally talk about a 99.5% chance that the means are really different. (In the case of my data, the value comes out at 18%, so we would conclude that something more significant is causing the split in the data, and we would keep looking.)
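If you want to try this yourself without a stats package, the sketch below is one way to do it. It computes the one-way analysis of variance F statistic by hand, then (as a swapped-in shortcut, not the textbook F-distribution lookup) estimates significance with a permutation test: shuffle the group labels many times and count how often chance alone produces a split at least as big. The data and the function names are assumptions for illustration only, and the number it returns is the chance of seeing such a split with no real difference, so small means “probably real”.

```python
import random
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F: between-group variance divided by within-group variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_shuffles=2000, seed=0):
    """Fraction of random relabellings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]  # a copy, so the inputs are untouched
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical groups: one pair that overlaps heavily, one that clearly does not.
similar = [[40, 42, 44, 46, 48, 50], [41, 43, 45, 47, 49, 51]]
separated = [[1, 2, 3, 4, 5, 6, 7, 8], [21, 22, 23, 24, 25, 26, 27, 28]]

p_similar = permutation_p_value(similar)
p_separated = permutation_p_value(separated)
print("similar groups:", p_similar)
print("separated groups:", p_separated)
```

The overlapping pair has different means too, but shuffling the labels reproduces a gap that size most of the time, which is exactly the situation where acting on the mean alone gets you into trouble.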
So we dig – and end up segmenting the data (boys and girls now lumped back together) by teacher attendance (for Science) and looking only at Science scores.
We find that the difference between students whose teacher is present less than 90% of the time and those whose teacher is present more than 95% of the time corresponds to more than 1 GCSE grade. This is significant to 100% – the data shows that teacher attendance (in this case, for these learners) was more important than the learners’ gender.
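The segmenting step itself is simple enough to sketch. The records below are made up (pairs of teacher attendance percentage and Science point score), the band boundaries follow the 90% and 95% cut-offs mentioned above, and `attendance_band` and `mean_score_by_band` are hypothetical names rather than anything from our actual analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (teacher attendance %, science point score) pairs, for illustration only.
records = [
    (85, 34), (88, 36), (89, 40),
    (92, 40), (94, 42),
    (96, 44), (97, 46), (99, 48),
]


def attendance_band(pct):
    """Place a teacher attendance percentage into one of three bands."""
    if pct < 90:
        return "under 90%"
    if pct > 95:
        return "over 95%"
    return "90% to 95%"


def mean_score_by_band(rows):
    """Group scores by attendance band and return the mean score for each band."""
    bands = defaultdict(list)
    for pct, score in rows:
        bands[attendance_band(pct)].append(score)
    return {band: mean(scores) for band, scores in bands.items()}


by_band = mean_score_by_band(records)
gap = by_band["over 95%"] - by_band["under 90%"]
print(by_band)
print("gap between top and bottom band:", gap)
```

A gap between bands is still only a difference in means, so it deserves the same significance check as the boys and girls comparison before anyone sets up a working party about it.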
But so what?
I bring this rant out of the closet every year because I am tired of hearing the same old analysis of the results. More importantly, that analysis leads to my having to fix something (the gender differential) that might not actually be there in the first place.
Means are really blunt tools for analysing data, and in the real world are next to useless on their own in determining whether data sets are different. Take this example: students living in ODD numbered houses and EVEN numbered houses. I guarantee that if you did that analysis there would be some difference between the means for ODD and EVEN houses – are we to conclude that this is a problem and needs fixing? Same for left / right handed, XBox vs PS3, Love / Hate Marmite, etc etc.
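You don’t have to take my word for the ODD and EVEN houses claim. Here is a rough simulation, assuming scores drawn at random from a bell curve (mean 40, spread 8, both numbers picked out of thin air) and house numbers assigned completely independently of the scores. The function name `odd_even_gap` is hypothetical.

```python
import random
from statistics import mean


def odd_even_gap(n_students=200, seed=1):
    """Mean score of odd-house students minus mean score of even-house students."""
    rng = random.Random(seed)
    odd, even = [], []
    for _ in range(n_students):
        house_number = rng.randint(1, 200)
        score = rng.gauss(40, 8)  # drawn independently of the house number
        if house_number % 2:
            odd.append(score)
        else:
            even.append(score)
    return mean(odd) - mean(even)


# Repeat the "study" for 20 imaginary year groups.
gaps = [odd_even_gap(seed=s) for s in range(20)]
for year, gap in enumerate(gaps, start=1):
    print(f"year {year}: odd minus even = {gap:+.2f}")
```

Every imaginary year group shows a gap between ODD and EVEN houses, sometimes one way and sometimes the other, even though house number has nothing to do with the scores by construction.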
Just because the MEAN shows something doesn’t make it real, so let’s be grown up and analyse the data properly. At one point we all took a first degree, and most of us will have used statistics to write up our final year dissertations. Use those skills now (or relearn them if you need to)!!
So, do you act on the MEANS of data sets?