• Question: Some schools use CAT tests or similar to try to predict outcomes for students across a whole range of subject areas. Do we know how reliable they are and if not, is there a better alternative?

    Asked by jtracy86 to Paul, Michael, Katherine, Daniel, Catriona, Anna on 24 Apr 2015.

      Michael Thomas answered on 24 Apr 2015:

      The short answer: CAT tests probably predict about a third of the outcome. Psychologists would say there are other things we could measure to tell us more (e.g., personality, home environment) but we would still be unable to explain about half of the outcomes in educational achievement based on our measures.

      The long answer (with some stuff on genetics and measuring brain structure):

      SAT tests carried out at age 11 assess children’s attainment in maths, English and science – they measure what skills and knowledge children have acquired through teaching thus far. CAT tests (Cognitive Ability Tests) taken at 12 years of age are designed to test children’s underlying ability with respect to verbal (words), quantitative (numbers) and non-verbal (shapes and space) skills – a bit like an intelligence test. Because they measure underlying ability rather than attained skills, the reasoning is that CAT tests should be a better indicator of a child’s future potential.

      Performance on SAT tests should improve as children practise the content of these tests, because they are mastering the relevant knowledge. CAT tests are designed to assess abilities, not knowledge. Revising and practising CAT tests shouldn’t help (though some kids can do worse in CAT tests just because the format is completely unfamiliar to them).

      I don’t have figures to hand on the actual predictive power of CAT tests. But some indication can be gauged from a recent study that reported how well intelligence test scores predicted GCSE results for 13,000 UK sixteen-year-olds. This was part of a study assessing environmental versus genetic causes of variations in educational achievement (see http://www.pnas.org/content/111/42/15273.full). The correlation between intelligence and GCSE performance was .58.

      Is .58 high or low? It’s worth noting the distinction between a correlation and the amount of variation explained. Technically, the amount of variance explained is the square of the correlation. I think the variance explained is perhaps the more interesting measure. It tells you how much of the explanation of variation in GCSE scores we have, just based on measuring intelligence. Here, it’s 34% (.58 x .58). So things other than intelligence are explaining 66% of what makes a child do well or poorly at GCSEs.

      From the same study, here are some other correlations / variance explained predicting educational achievement: self-efficacy .49 (24%), school environment .34 (12%), home environment .17 (3%), personality .28 (8%), well-being .26 (7%), parent-reported behaviour problems .33 (11%), child-reported behaviour problems (11%), and health .08 (1%).
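
      The squaring step described above is easy to check for yourself. Here is a minimal Python sketch that applies it to the correlations quoted in this answer. The function name and the dictionary are just illustrative, not anything taken from the study, and child-reported behaviour problems are left out because only the percentage is quoted for that measure.

```python
def variance_explained(r: float) -> float:
    """Share of variance explained by one predictor: the squared correlation."""
    return r * r


# Correlations with GCSE performance, as quoted in this answer.
correlations = {
    "intelligence": 0.58,
    "self-efficacy": 0.49,
    "school environment": 0.34,
    "home environment": 0.17,
    "personality": 0.28,
    "well-being": 0.26,
    "parent-reported behaviour problems": 0.33,
    "health": 0.08,
}

# Percentage of variance explained by each measure taken on its own.
percentages = {
    name: round(variance_explained(r) * 100) for name, r in correlations.items()
}

for name, pct in percentages.items():
    print(f"{name}: r = {correlations[name]:.2f}, variance explained = {pct}%")
```

      Each percentage here treats one measure in isolation, which is why the rounded values match the figures in brackets above.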

      You may feel like we’re beginning to get a good picture here of what’s explaining variation in GCSE scores, as the percentages add up. But the complication is, these measures aren’t separate from each other.

      So, for instance, the children with high self-efficacy (belief in their ability to succeed) tended to have higher intelligence (correlation of .35). Variation in personality was also associated with higher intelligence (.18) and higher self-efficacy (.42), and so on. Because these measures used to predict GCSE scores are all related to each other, we end up with a big slice of GCSE performance unexplained by any of these measures (55% unexplained in fact). We could say, it’s up for grabs!
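
      To see why overlapping measures explain less together than their separate percentages suggest, here is a rough sketch using just two of them, intelligence and self-efficacy. It uses the standard textbook formula for the variance explained by two correlated predictors. This is only an illustration, not the analysis the study ran, and the function and variable names are made up for the example.

```python
def r_squared_two_predictors(r1: float, r2: float, r12: float) -> float:
    """Variance explained jointly by two predictors, from correlations alone.

    r1, r2: each predictor's correlation with the outcome.
    r12:    the correlation between the two predictors themselves.
    """
    return (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)


# Figures quoted in this answer.
r_intelligence = 0.58   # intelligence with GCSE performance
r_self_efficacy = 0.49  # self-efficacy with GCSE performance
r_between = 0.35        # intelligence with self-efficacy

# Adding the two separate percentages counts the overlap twice.
naive_sum = r_intelligence**2 + r_self_efficacy**2

# The joint figure allows for the overlap between the two measures.
combined = r_squared_two_predictors(r_intelligence, r_self_efficacy, r_between)

print(f"sum of separate shares: {naive_sum:.0%}")
print(f"joint share:            {combined:.0%}")
```

      With these figures the two measures jointly explain roughly 43% of the variation, not the 58% you get by adding 34% and 24%. If the two measures were completely unrelated (a correlation of 0 between them), the formula would reduce to the simple sum.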

      This study set out to uncover the origins of the variability in GCSE scores, using what’s called the ‘twin study’ method. It revealed that 62% of variation came from genetic origins, 26% from shared environmental factors (that is, environmental influences that make kids more similar) and 12% from unique environmental influences (that is, environmental influences that make kids different). 62% genetic seems like a lot, but this doesn’t mean that outcomes are inevitable. As the authors say: “heritability describes what is; it does not predict what could be. For example, despite high heritability, with sufficient educational effort, nearly all children could reach minimal levels of literacy and numeracy” (p.15276). See here for some more on genetics: http://www.psyc.bbk.ac.uk/research/DNL/personalpages/Thomas_etal_MBE_uncorrectedproof.pdf
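
      For a feel of how a twin study can split variation three ways, here is a sketch of the classic back-of-envelope approach (Falconer’s formulas), which compares how alike identical and non-identical twin pairs are. It is not necessarily the method the authors used, and the two twin correlations below are hypothetical values chosen so that the output lines up with the percentages quoted above. They are not figures reported in the study.

```python
def falconer_ace(r_mz: float, r_dz: float) -> tuple[float, float, float]:
    """Rough three-way split of variance from twin correlations.

    r_mz: correlation between identical (monozygotic) twins.
    r_dz: correlation between non-identical (dizygotic) twins.
    Returns (genetic, shared environment, unique environment).
    """
    genetic = 2 * (r_mz - r_dz)   # identical twins share twice as much genetic variation
    shared_env = 2 * r_dz - r_mz  # similarity left over once genetics is accounted for
    unique_env = 1 - r_mz         # whatever makes identical twins differ
    return genetic, shared_env, unique_env


# HYPOTHETICAL twin correlations, chosen for illustration only.
r_mz_example = 0.88
r_dz_example = 0.57

genetic, shared_env, unique_env = falconer_ace(r_mz_example, r_dz_example)
print(f"genetic: {genetic:.0%}, shared: {shared_env:.0%}, unique: {unique_env:.0%}")
```

      The three shares always add up to 100%, because the last one is simply whatever the first two leave over.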

      Of course, these are all behavioural measures, and as the questioner rightly says, we should be suspicious of their reliability. Kids can have off days with tests. Different tests are not directly comparable. Kids can be trained to the test. And so forth.

      A couple of years back, we ran a study giving intelligence tests to a group of 13- and 14-year-olds, to assess their verbal and non-verbal IQs. At the same time, we put them in a scanner and measured the structure of their brains. Then we came back 4 years later and re-ran the tests. The verbal and non-verbal IQ scores were pretty similar. But not identical. In fact, the teenagers’ verbal and non-verbal intelligence could change by up to 20 IQ points up or down over these four years. Perhaps this just shows the unreliability of the intelligence tests? But when we looked at their brain scans, we found that there were local changes in brain structure over the 4 years that correlated with whether the kids’ IQ scores had gone up or down over that time. This tells us that the changes in verbal and non-verbal intelligence were real (see http://www.psyc.bbk.ac.uk/research/DNL/personalpages/Ramsden_etal_2011.pdf).

      Three points to take from this study. First, neuroscience can offer a separate perspective to help us understand behavioural measures better, even in the area of intelligence. Second, we don’t know exactly why these kids’ verbal and non-verbal abilities changed over the teenage years – it could have been the unfolding of genetic potential, it could have been the selection of particular classes to take (e.g., arts or sciences), it could be both. Third, it shows us that intelligence itself is not yet stable or fixed by the teenage years, so using it to predict educational outcomes will likely be a hit-and-miss affair.