• Question: Remembering that Hattie said only an effect size of 0.4 or greater was worth considering (as almost every initiative in education makes some difference), I'm trying to work out how to read scientific reports to judge this. In this report (http://onlinelibrary.wiley.com/doi/10.1111/j.1469-8749.2010.03661.x/full) am I right in thinking that none of the results have an effect size that comes anywhere near this, despite the causal claims made?

    Asked by Abena to Alex, Brian, Carolina, Courtney, Paula, Richard, Sara on 13 Mar 2018.
    • Photo: Carolina Kuepper-Tetzel

      Carolina Kuepper-Tetzel answered on 13 Mar 2018:


      Yes, with that size of sample, researchers may find effects of tiny size that reach statistical significance, but the question indeed is whether these effects are worth implementing on a larger scale in the classroom – as they are unlikely to make a real difference in learning and teaching.
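
      To illustrate how sample size alone can push a tiny effect past the significance threshold, here is a minimal sketch. It assumes two equal-sized groups and uses a normal approximation; the effect size of 0.05 and the sample sizes are hypothetical values chosen for illustration, not figures from the cited report.

```python
import math


def two_sided_p_value(d, n_per_group):
    """Approximate two-sided p-value for a standardized mean difference d
    between two equal-sized groups (normal approximation)."""
    # For two groups of size n, the test statistic is roughly d * sqrt(n / 2).
    z = d * math.sqrt(n_per_group / 2)
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


d = 0.05  # hypothetical tiny effect, far below Hattie's 0.4
for n in (50, 500, 5_000, 50_000):
    print(f"n per group = {n:>6}: p = {two_sided_p_value(d, n):.4f}")
```

      The effect size stays the same in every row; only the p-value shrinks as the groups get larger.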

    • Photo: Courtney Pollack

      Courtney Pollack answered on 13 Mar 2018:


      An additional way to look at results is in the original metric of the outcome (i.e., mean overall performance score). As Carolina said, with very large samples, very small differences can be statistically significant but lack practical significance. For example, Figure 2 shows that in Week 2 there was a statistically significant difference between Groups A and B, but the difference is only two percentage points in mean overall performance score, which seems like a very small difference. It would be even better if we knew the standard deviation of the outcome. Then we could also think about two percentage points in standard deviation units (e.g., is two percentage points half of a standard deviation? One hundredth of a standard deviation?) and whether an effect of that size is meaningful in the context of the specific outcome.
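
      As a rough sketch of that conversion: the two-point difference is the one described for Figure 2, but the standard deviations below are hypothetical, since the true value is not known here. The 0.4 threshold is the Hattie figure from the question.

```python
def standardized_difference(raw_difference, standard_deviation):
    """Express a raw mean difference in standard deviation units."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return raw_difference / standard_deviation


raw_difference = 2.0  # percentage points, as described for Figure 2
hattie_threshold = 0.4  # threshold mentioned in the question

# Hypothetical standard deviations; the real one is not reported here.
for sd in (4.0, 10.0, 20.0, 40.0):
    d = standardized_difference(raw_difference, sd)
    verdict = "at or above" if d >= hattie_threshold else "below"
    print(f"SD = {sd:>4}: difference = {d:.2f} SD units ({verdict} 0.4)")
```

      The same two points can look moderate or negligible depending on how spread out the scores are, which is why the standard deviation matters.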

      It may also be helpful to think about the difference between relationships that are statistically significant and relationships that are causal, since these are different things. An ability to make causal claims largely comes from the design of a study. Indeed, it’s possible that the Classroom Exercise Program caused higher performance scores, but only by a very small bit!

    • Photo: Brian Butterworth

      Brian Butterworth answered on 15 Mar 2018:


      Depends on the test used. A medium effect size using Cohen’s d is about 0.5, using eta-squared it’s about 0.06 with ANOVA, etc etc. So it depends a bit. The cited paper on the effects of exercise seems to use eta-squared with ANOVA, but I found the analysis of the results rather confusing.
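
      To see how the two scales relate, here is a minimal sketch of a commonly used conversion. It assumes a simple comparison of two equal-sized groups, so it is only an approximation for other designs and does not apply directly to partial eta-squared from more complex ANOVAs.

```python
import math


def d_to_eta_squared(d):
    """Convert Cohen's d to eta-squared, assuming two equal-sized groups."""
    return d ** 2 / (d ** 2 + 4)


def eta_squared_to_d(eta_squared):
    """Convert eta-squared back to Cohen's d under the same assumption."""
    if not 0 <= eta_squared < 1:
        raise ValueError("eta-squared must be in [0, 1)")
    return 2 * math.sqrt(eta_squared / (1 - eta_squared))


for d in (0.2, 0.4, 0.5, 0.8):
    print(f"d = {d:.1f}  ->  eta-squared = {d_to_eta_squared(d):.3f}")
```

      Under this assumption, d = 0.5 comes out close to the 0.06 benchmark for eta-squared, which is why the two "medium" values line up.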

    • Photo: Richard Churches

      Richard Churches answered on 15 Mar 2018:


      Hattie’s meta-analysis shows the average effect across a very diverse range of domains as being d = 0.4. It is hard to extrapolate what this means for an individual study, especially one from a domain that was not included. It is better to take the effect for a specific domain within Hattie and compare an individual study in the same area against it (e.g. an individual study on feedback compared to the effect across a large number of feedback studies). In addition, the majority of effects in his analysis are from uncontrolled research (just pre- and post-test data from exposure to a single condition). Controlled studies generally produce smaller effects because pupils in the control condition will also probably have made a gain, which reduces the difference between control and intervention.

      The question, if a small effect is found, is which pupils were affected and how important that effect is. For example, in medical research, it was discovered that taking half an aspirin improved outcomes for patients at risk of a heart attack. The effect was only 0.2, but it was a very important effect for those whose lives may have been saved (including my mother). A 0.2 effect is essentially a 14% non-overlap between the conditions. In other words, about 14% of the combined area of the two score distributions is not shared, and the rest overlaps. If you find a small effect, you need to look to see which pupils account for it. For example, what if the pupils who benefited are all African Caribbean heritage boys in care?
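
      The 14% non-overlap figure for d = 0.2 matches Cohen's U1 statistic. A minimal sketch of how it can be computed, assuming both conditions are normally distributed with equal spread (an assumption of the statistic, not something established for any particular study):

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def cohens_u1(d):
    """Share of the combined area of two distributions that is not shared."""
    p = _STANDARD_NORMAL.cdf(abs(d) / 2)
    return (2 * p - 1) / p


def cohens_u3(d):
    """Share of the intervention group scoring above the control mean."""
    return _STANDARD_NORMAL.cdf(d)


def overlap_coefficient(d):
    """Area shared by the two distributions' density curves."""
    return 2 * _STANDARD_NORMAL.cdf(-abs(d) / 2)


for d in (0.2, 0.4, 0.8):
    print(
        f"d = {d:.1f}: U1 = {cohens_u1(d):.1%}, "
        f"U3 = {cohens_u3(d):.1%}, overlap = {overlap_coefficient(d):.1%}"
    )
```

      U3 is often the easier number to explain: for d = 0.2 it says that a little under 58% of the intervention group would score above the control group's average, compared with 50% if there were no effect.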
