Question: Remembering that Hattie said only an effect size of 0.4 or greater was worth considering (as almost every initiative in education makes *some* difference), I'm trying to work out how to read scientific reports with this in mind. In this report (http://onlinelibrary.wiley.com/doi/10.1111/j.1469-8749.2010.03661.x/full) am I right in thinking that none of the results have an effect size that comes anywhere near this, despite the causal claims made?
- Keywords:
  causal claims,
  causality,
  effect size
Asked by Abena to Alex, Brian, Carolina, Courtney, Paula, Richard, Sara on 13 Mar 2018.
- Carolina Kuepper-Tetzel answered on 13 Mar 2018:
  
  Yes, with that size of sample, researchers may find effects of tiny size that reach statistical significance, but the question indeed is whether these effects are worth implementing on a larger scale in the classroom – as they are unlikely to make a real difference in learning and teaching.
- Courtney Pollack answered on 13 Mar 2018:
  
  An additional way to look at results is in the original metric of the outcome (i.e., mean overall performance score). As Carolina said, with very large samples, very small differences can be statistically significant but lack practical significance. For example, Figure 2 shows that in Week 2 there was a statistically significant difference between Groups A and B, but the difference is only two percentage points in mean overall performance score, which seems like a very small difference. It would be even better if we knew the standard deviation of the outcome. Then we could also think about two percentage points in standard deviation units (e.g., is two percentage points half of a standard deviation? One hundredth of a standard deviation?) and whether an effect of that size is meaningful in the context of the specific outcome.
  
  It may also be helpful to think about the difference between relationships that are statistically significant and relationships that are causal, since these are different things. An ability to make causal claims largely comes from the design of a study. Indeed, it’s possible that the Classroom Exercise Program caused higher performance scores, but only by a very small bit!
- Brian Butterworth answered on 15 Mar 2018:
  
  Depends on the test used. A medium effect size using Cohen’s d is about 0.5, using eta-squared it’s about 0.06 with ANOVA, etc etc. So it depends a bit. The cited paper on the effects of exercise seems to use eta-squared with ANOVA, but I found the analysis of the results rather confusing.
- Richard Churches answered on 15 Mar 2018:
  
  Hattie’s meta-analysis shows the average effect across a very diverse range of domains as being d = 0.4 – it is hard to extrapolate what this means for an individual study from, for example, a non-included domain. Better to look at the effect for a domain within Hattie and compare a study in the same area to an individual study that looks at the same thing, to see how it compares (e.g. an individual study on feedback compared to the effect across a large number of feedback studies). In addition, the majority of effects in his analysis are from uncontrolled research (just pre- and post-test data from exposure to a single condition). Controlled studies generally produce smaller effects because pupils in the control condition will also probably have made an effect size gain, reducing the difference between control and intervention. The question, if a small effect is found, is which pupils were affected and how important that effect is. For example, in medical research, it was discovered that taking half an aspirin improved outcomes for patients at risk of a heart attack. The effect was only 0.2 – but a very important effect for those whose lives may have been saved (including my mother). A 0.2 effect is essentially a 14% non-overlap between the conditions. In other words, about 14% of pupils in the intervention scored higher that the highest score in the control condition. If you find a small effect, you need to look to see who those pupils were. For example, what if those 14% are all African Caribbean heritage boys in care?
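
The answers above mention three ways of putting a number on an effect: a raw difference expressed in standard deviation units, eta-squared from an ANOVA, and the percentage of non-overlap between two groups. The Python sketch below shows one way these relate. It is an illustration only, not the analysis used in the cited paper. The function names are made up for this example, the standard deviations in the loop are hypothetical (the thread notes that the real one is not known), the eta-squared conversion assumes exactly two groups of equal size, and the overlap figures assume normally distributed scores with equal spread in both groups.

```python
from math import sqrt
from statistics import NormalDist


def standardized_difference(raw_difference, standard_deviation):
    """Express a raw difference between two group means in SD units."""
    return raw_difference / standard_deviation


def eta_squared_to_d(eta_squared):
    """Approximate Cohen's d from eta-squared.

    Assumes exactly two equal-sized groups, where d = 2 * f and
    Cohen's f = sqrt(eta^2 / (1 - eta^2)).
    """
    return 2 * sqrt(eta_squared / (1 - eta_squared))


def cohen_u1(d):
    """Cohen's U1: proportion of non-overlap between two normal distributions."""
    u2 = NormalDist().cdf(abs(d) / 2)
    return (2 * u2 - 1) / u2


def cohen_u3(d):
    """Cohen's U3: share of the higher-scoring group above the other group's mean."""
    return NormalDist().cdf(d)


if __name__ == "__main__":
    # Hypothetical standard deviations for a 2-percentage-point difference.
    for sd in (4.0, 10.0, 20.0):
        d = standardized_difference(2.0, sd)
        print(f"SD = {sd:>4}: difference = {d:.2f} standard deviations")

    print(f"eta-squared 0.06 -> d = {eta_squared_to_d(0.06):.2f}")
    print(f"d = 0.2 -> U1 = {cohen_u1(0.2):.3f}, U3 = {cohen_u3(0.2):.3f}")
```

Under these assumptions, an eta-squared of 0.06 converts to a d of about 0.5, which is consistent with both being described as "medium". For d = 0.2, U1 comes out at roughly 0.15, close to the 14% non-overlap figure quoted above, and U3 at roughly 0.58, meaning about 58% of the intervention group would be expected to score above the control group's average.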

Comments

Richard commented on 15 Mar 2018:

Correction – higher than* not higher that

- Abena commented on 15 Mar 2018:
  
  Thank you Richard. That helps a lot in terms of making the numbers mean something concrete. And good advice about looking at related studies – that’s super helpful for me (and other interested teachers) moving forward.
  
Abena commented on 15 Mar 2018:

@Courtney – if you have any accessible links for teachers to understand standard deviation that would help. The issue is translating it into something most teachers (non-researchers) would understand. Thanks so much for giving us some starting points for further learning.

- Courtney commented on 16 Mar 2018:
  
  Hi Abena,
  Sure!
  I think the write-up on this page (http://www.shankerinstitute.org/blog/what-standard-deviation) does a nice job of introducing the concept of standard deviation and how it relates to interpreting effect sizes, with an example in education. I think it also does a nice job of touching on some points made above about interpreting effect size carefully and thinking about context.
  For folks who want more, there is a video on Khan Academy (https://goo.gl/6wdzAU) that goes through an introduction to what standard deviation is and then walks through an example of how to calculate it (which can be helpful in thinking about what it is). I hope this helps!
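
As a companion to the linked explanations, here is a small Python sketch of the calculation itself. The scores are made up for illustration and the function name is hypothetical. It uses the sample formula (dividing by n - 1), which is one common convention; dividing by n instead gives the population version.

```python
from math import sqrt
from statistics import stdev


def sample_standard_deviation(scores):
    """Compute the sample standard deviation step by step."""
    n = len(scores)
    # Step 1: the average score.
    mean = sum(scores) / n
    # Step 2: how far each score is from the average, squared.
    squared_deviations = [(score - mean) ** 2 for score in scores]
    # Step 3: average the squared distances (dividing by n - 1 for a sample).
    variance = sum(squared_deviations) / (n - 1)
    # Step 4: take the square root to get back to the original units.
    return sqrt(variance)


if __name__ == "__main__":
    # Hypothetical test scores, not taken from any study.
    hypothetical_scores = [62, 70, 75, 75, 81, 88]
    by_hand = sample_standard_deviation(hypothetical_scores)
    print(f"Step-by-step SD:  {by_hand:.2f}")
    print(f"statistics.stdev: {stdev(hypothetical_scores):.2f}")
```

The two printed values should agree, since Python's built-in `statistics.stdev` uses the same sample formula.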
  