How common is it for researchers to replicate their own work?

- Keywords: evidence, replication
Asked by Abena to Sara, Richard, Paula, Mike, Courtney, Carolina, Brian, Alex on 12 Mar 2018.
- Richard Churches answered on 12 Mar 2018:
  
  Replication in education research is a rare thing… but essential to scientific method. In the 19 teacher trials we have several teachers who are doing parallel replications. One teacher has trialled spaced learning in KS1 and KS2 lessons in parallel to look at the effects of the approach on different age groups and with different lesson content.
- Paula Clarke answered on 12 Mar 2018:
  
  I think it is very common, especially with smaller scale experimental work. It is more challenging with RCTs as they are typically expensive to run and it can take time to see the impact of intervention. However, I guess I am thinking about it from the perspective of psychological research – I agree replication seems to be less common in education research.
- Brian Butterworth answered on 12 Mar 2018:
  
  Excellent question. In the best cases, we try to replicate within the experiment, rather than in a separate experiment. For example, if we have a large sample, we can divide it in two, and see if the same effect applies in both halves. In genetic experiments, it is standard procedure nowadays to try to replicate in a different sample of people. Sometimes, we will try a slightly different procedure to see if it comes up with a concordant result: this is a way of testing whether our explanation is a valid construct.
- Alex Hodgkiss answered on 13 Mar 2018:
  
  Just following up on what Brian said about different procedures, with an example (although this is from previous associational research rather than an RCT). In an initial study, I found a link between spatial cognition and science achievement in the 7-11 age range; this was based on broad curriculum-based science measures. The recent follow-up study looked at the relationship in a more fine-grained manner, based on children’s participation in a science lesson on a specific topic, but with similar spatial tasks. The findings of the second study mapped onto the first (e.g., the same spatial task was the best predictor), but went further (e.g., took into account a greater range of possible confounds), and was more valid to the classroom because the assessment followed learning. So not a direct replication, but it replicated many of the results.
- Carolina Kuepper-Tetzel answered on 13 Mar 2018:
  
  Wonderful question that is impossible to answer for two reasons:
  
  1) For the longest time, if you replicated a finding, it would have been impossible to publish it because it was just a replication. Things in research are changing in that respect, but the change is slow.
  2) Failed replications, which in most cases probably reveal a null finding, are extremely hard (read: impossible) to publish because, again, the mindset in research has long been that non-significant effects are not worth reporting. So, non-significant findings went into the file drawer.
  
  In both cases, successful and unsuccessful replication attempts could usually not be published, and doing replication studies was not rewarded by the scientific community. As a consequence, many researchers would decide to stay away from replications altogether. Taken together, no one knows what the rate of replication actually is. Personally, I think this is a shame, and many other researchers think so too. So currently there is a great push towards a more transparent approach to replicability, and to research in general. I hope that with this new movement things will get clearer in the future.

Comments

Abena commented on 13 Mar 2018:

Thanks for taking the time to respond.

- Richard commented on 15 Mar 2018:
  
  One of the challenges with some of the large-scale RCTs is that, although they have larger sample sizes, their cost can preclude replication (they become too expensive to repeat). Replication is an essential part of scientific method. In controlled experimental studies, all we can ever talk about is the probability that the result may have been arrived at by chance. A threshold is set for significance, by convention a less than five in a hundred possibility that the result may have occurred by chance (p < 0.05, or 1 in 20). Even if a single study crosses this threshold, it could be the 1 in 20 anomaly. So you need to replicate to know if that result was an anomaly. If you are a science teacher you will know that every time you get a class to do an experiment you will find that it fails sometimes, even if it is a well-established phenomenon.
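
The 1 in 20 point can be illustrated with a small simulation. This is only a sketch under stated assumptions, not anything taken from the trials discussed here: both groups are drawn from the same normal distribution (so there is no true effect), the standard deviation is treated as known, and a simple two-sided z-test stands in for whatever analysis a real study would use. The function names and parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def two_sided_p(group_a, group_b, sd=1.0):
    """Two-sided z-test p-value for a difference in means (SD assumed known)."""
    n = len(group_a)  # assumes both groups have the same size
    z = (mean(group_a) - mean(group_b)) / (sd * math.sqrt(2 / n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_studies=2000, n_per_group=50, alpha=0.05, seed=1):
    """Share of simulated studies with p < alpha when there is no true effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Both groups come from the same distribution: any "effect" is chance
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits / n_studies


rate = false_positive_rate()
print(f"Share of no-effect studies crossing p < 0.05: {rate:.3f}")
# Chance that two independent no-effect studies both cross the threshold
print(f"Both of two independent studies: {0.05 ** 2:.4f}")
```

The first figure should come out close to 0.05. The chance of two independent studies both crossing the threshold when there is no true effect is 0.05 × 0.05 = 0.0025, which is one way of seeing why a replicated result is more convincing than a single one.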
  
Richard commented on 15 Mar 2018:

Brian makes a really good point above. If you can afford to have a sample size of 2,000, it is better to do two replications with 1,000 participants each. This is the approach we have taken in the smaller scale teacher-led neuroscience-informed RCTs. For example, one teacher has done a parallel replication with KS1 and KS2 (single lesson of spaced learning), another teacher a parallel replication of retrieval practice with the same protocol in primary and secondary schools in Lincolnshire (in this case, with 900 pupils across both replications). Even doing this, further replication with the different age groups will be required before a theory can be developed.
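
The idea of dividing one large sample into two replications can be sketched in a few lines. This is a hypothetical illustration, not the analysis used in any of the trials mentioned: the data are simulated with an assumed true effect of 0.5 standard deviations, the standard deviation is treated as known, and a two-sided z-test is used for simplicity. All names are made up for the example.

```python
import math
import random
from statistics import NormalDist, mean


def effect_and_p(treated, control, sd=1.0):
    """Mean difference and two-sided z-test p-value (SD assumed known)."""
    diff = mean(treated) - mean(control)
    se = sd * math.sqrt(1 / len(treated) + 1 / len(control))
    p = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, p


def split_in_two(scores, rng):
    """Randomly divide one group's scores into two halves."""
    shuffled = scores[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(42)
# Hypothetical data: 2,000 participants, assumed true effect of 0.5 SD
treated = [rng.gauss(0.5, 1) for _ in range(1000)]
control = [rng.gauss(0.0, 1) for _ in range(1000)]

treated_1, treated_2 = split_in_two(treated, rng)
control_1, control_2 = split_in_two(control, rng)

# Analyse each half as if it were its own study of 1,000 participants
results = [
    effect_and_p(treated_1, control_1),
    effect_and_p(treated_2, control_2),
]
for i, (diff, p) in enumerate(results, start=1):
    print(f"Replication {i}: difference = {diff:.2f}, p = {p:.4f}")
```

If the effect is real, both halves should show a difference in the same direction and of similar size; if only one half shows it, that is a warning sign.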

Richard commented on 15 Mar 2018:

In relation to the above, for those unfamiliar with the p-value, the p stands for probability. p = 0.50 is a 50 in a hundred probability the result may have occurred by chance; 0.01 a one in a hundred probability; 0.001 a one in a thousand probability. The probability statistic is a combination of the strength of the effect (effect size) and the sample size. Thus, a large effect with a small sample could produce the same p-value as a small effect with a large sample.
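
The last point can be checked with a short calculation. This is a simplified sketch, assuming two equal-sized groups, a known standard deviation and a two-sided z-test, so the test statistic is the standardised effect size multiplied by the square root of half the group size. The function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def p_from_effect(effect_size, n_per_group):
    """Two-sided z-test p-value for a standardised mean difference.

    Assumes two equal-sized groups and a known standard deviation.
    """
    z = effect_size * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Large effect (0.8 SD) with 20 per group
p_large_effect = p_from_effect(effect_size=0.8, n_per_group=20)
# Small effect (0.2 SD) with 320 per group
p_small_effect = p_from_effect(effect_size=0.2, n_per_group=320)

print(f"Large effect, small sample: p = {p_large_effect:.4f}")
print(f"Small effect, large sample: p = {p_small_effect:.4f}")
```

Under these assumptions the two p-values are the same, because an effect one quarter the size needs sixteen times the sample to give the same test statistic. This is why a p-value on its own says little about how large an effect is.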

Courtenay commented on 22 Mar 2018:

This is becoming more common, but it is perhaps even more useful to have other researchers replicate your findings in different samples. It is more common to tweak the original setup. For example, in my autism work I have been very interested in how variation in core language skills affects performance on a number of tasks. So the basic design (compare autistic children with ‘good’ core language to those with language impairments) and the finding that the autistic children with language impairment are much more severely impaired on social and pragmatic tasks have been replicated many times, using different sorts of tasks. And they have been replicated by other groups using similar tasks. So I now believe this is a true finding and an important one for education and clinical practice.

