Several schools have asked about the interaction between ‘Scaled Scores’ (as reported following End of Key Stage tests) and Standardised Scores (as reported by standardised tests such as those provided by Rising Stars, GL Assessment, CEM or NFER). Whilst these scores look similar, most people in school are by now aware that they are actually quite different. Given the superficial similarity between a Scaled Score of ‘100’, say, and a Standardised Score of ‘100’, it isn’t hard to see why the question arises.
I wrote a blog for CEM which explains how Standardised Scores are created. As I said in the blog, ‘For nationally standardised tests, a mean and standard deviation based on a representative sample of the population will give an indication of a student’s position within the national population of those taking the test.’ This means that a Standardised Score of 100 tells you that the underlying test score is the same as the mean score on the test recorded by a reasonable national sample of those taking the test.
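As a rough illustration of that calculation, the sketch below rescales a raw score against a reference sample. It assumes the common convention of a scale with a mean of 100 and a standard deviation of 15, which is not stated above, and the sample figures are invented for the example. Real test publishers use their own norming tables, often with age adjustments, so treat this as a picture of the idea rather than any provider's method.

```python
def standardised_score(raw, sample_mean, sample_sd, scale_mean=100.0, scale_sd=15.0):
    """Rescale a raw test score against a reference sample.

    Assumption: a mean-100, SD-15 reporting scale (a common convention,
    not taken from the article).
    """
    if sample_sd <= 0:
        raise ValueError("sample_sd must be positive")
    # How many sample standard deviations the raw score sits from the sample mean.
    z = (raw - sample_mean) / sample_sd
    return scale_mean + scale_sd * z


# Hypothetical national sample: mean raw score 30, standard deviation 8.
print(standardised_score(30, 30, 8))  # 100.0 (exactly the sample mean)
print(standardised_score(38, 30, 8))  # 115.0 (one standard deviation above)
```

A child scoring exactly the sample mean gets 100, whatever the raw mark happened to be.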
Standardised Scores have various limitations, but in principle they are effective when it comes to ranking children against a reference group of children. They do not, however, give any information about the performance of children against a set of standards.
Scaled Scores as reported for Key Stage 2 are used to place children on a scale from 80 to 120. These scores are intended to provide a numerical indication of children’s performance, largely so that a Value Added calculation can be made for use within the government’s accountability structure for schools.
A panel of experts is convened to designate the raw test score which is deemed to be ‘of the expected standard’. Anything above is higher than the expected standard; anything below does not meet it. The raw scores are then converted into an 80 to 120 scale, where 100 is the ‘expected standard’. The tables for the most recent KS2 Scaled Score conversions can be found here.
The government has muddied the waters a little more/made things easier to understand by introducing the terms ‘working towards the expected standard’, ‘working at the expected standard’ and ‘working at greater depth than the expected standard’. Any score between 80 and 99 is ‘working towards the expected standard’, between 100 and 109 is ‘working at the expected standard’ and a score from 110 to 120 is ‘working at greater depth than the expected standard’.
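The three bands can be written down as a simple lookup. This is only a sketch of the cut-offs described above; the function name is made up for the example.

```python
def ks2_band(scaled_score):
    """Return the KS2 descriptor for a Scaled Score on the 80 to 120 scale."""
    if not 80 <= scaled_score <= 120:
        raise ValueError("KS2 Scaled Scores run from 80 to 120")
    if scaled_score >= 110:
        return "working at greater depth than the expected standard"
    if scaled_score >= 100:
        return "working at the expected standard"
    return "working towards the expected standard"


for score in (99, 100, 110):
    print(score, ks2_band(score))
```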
How Scaled Scores and Standardised Scores interact
This is where some interpretation of the two different types of scores is necessary. Head teacher Michael Tidd notes that standardised tests which report Standardised Scores are different to statutory End of Key Stage tests which report Scaled Scores, saying that “while only 50% of children can score over 100 on the standardised test, around ¾ can – and do – on the statutory tests.”
As Michael notes, “Scoring 95 on one year’s standardised test is no more an indicator of SATs success than England winning a match this year means they’ll win the World Cup next year.”
We do have some data which helps to understand the interaction between Scaled Scores and Standardised Scores. Data analyst Jamie Pembroke has produced blogs on converting the 2017 and 2018 KS2 scaled scores to standardised scores, the latest of which suggests that a Standardised Score between 90 (most generous) and 95 (least generous) is – very roughly – likely to be similar to a Scaled Score of 100.
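One way to see why a figure of around 90 is plausible: if roughly three-quarters of children reach a Scaled Score of 100, the comparable Standardised Score is the one that about three-quarters of a national sample exceed. The sketch below rests on two assumptions that are mine rather than the bloggers': that Standardised Scores are normally distributed with a mean of 100 and a standard deviation of 15, and that both kinds of test rank children in the same order. It is a back-of-envelope check, not a reconstruction of the published conversions.

```python
from statistics import NormalDist


def equivalent_standardised_score(proportion_reaching_standard, mean=100.0, sd=15.0):
    """Standardised Score exceeded by the given proportion of children.

    Assumptions: scores follow a normal distribution with the given mean
    and SD, and both tests rank children identically.
    """
    if not 0 < proportion_reaching_standard < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    # The cut-off sits at the (1 - proportion) quantile of the distribution.
    return NormalDist(mean, sd).inv_cdf(1 - proportion_reaching_standard)


# Around three-quarters of children reach 100 on the statutory tests.
print(round(equivalent_standardised_score(0.75)))  # 90
# Only half can score over 100 on a standardised test.
print(round(equivalent_standardised_score(0.50)))  # 100
```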
Rising Stars (producers of PIRA, PUMA and GAPS tests) suggest a Standardised Score of 94 and above indicates ‘working at the expected standard/greater depth’. They also suggest that ‘Greater Depth’ is indicated by a Standardised Score of 115 and above.
What should Databusting Schools do?
Broadly, schools should use Standardised Tests where possible to generate unbiased pupil performance data. This data can then be used (alongside the various other sources of information) in discussions about children’s development. Administering Standardised Tests should generally be done in Year 3 and above, and – unless you have a particular reason to do so – it should be done no more than once a year.
Children in each cohort should then be placed into three broad groups:
All of these groups should be expected to make good progress through good classroom teaching. Children in Group C will generally need additional targeted support, with the aim where possible of moving into Groups B/A over time.
The cut-offs for each of these groups are broadly as follows:
With interesting noises coming from Ofsted, Primary Schools are thinking more and more about what progress and attainment actually mean for their school, and what they should be doing to ensure that they have a sensible system for monitoring children’s development as they move through school.
Using Standardised Scores to generate unbiased indications of children’s relative performance, and grouping children into three broad categories each year, will help schools to build up a picture of a child’s relative performance over time. Linking Standardised Scores to the standards expected at the end of key stage is not without issues, but can help Databusting schools to direct their resources to best support the children in their care.