A study published last year that effectively accused academic psychologists of producing unverifiable “psychobabble” has itself been criticised for misleading claims, inaccuracies and flawed methodology.
The 2015 study, a unique collaboration by more than 270 researchers, had claimed that just 39 per cent of the findings reported in 100 published studies could be reproduced unambiguously – suggesting that about six out of every 10 research claims in psychology were irreproducible hogwash.
However, a re-appraisal of the study by a team of Harvard academics has poured scorn on the way the research was carried out, claiming that it had introduced errors, biases and “infidelities” that grossly distorted the truth about the reliability of psychological research.
“The first thing we realised was that no matter what they found – good news or bad news – they never had any chance of estimating the reproducibility of psychological science, which is what the very title of their paper claims they did,” said Daniel Gilbert, professor of psychology at Harvard University, who led the re-appraisal.
“Let’s be clear, no one involved in this study was trying to deceive anyone. They just made mistakes, as scientists sometimes do,” Professor Gilbert said.
Nevertheless, when the study was published in the journal Science it made headlines around the world and even led to changes in policies governing the publication of psychology research.
“This paper has had an extraordinary impact. It was Science magazine’s number three ‘Breakthrough of the Year’ across all fields of science. It led to changes in policy at many scientific journals, changes in priorities at funding agencies, and it seriously undermined public perceptions of psychology,” Professor Gilbert said.
“So it is not enough now, in the sober light of retrospect, to say that mistakes were made. These mistakes had very serious repercussions,” he said.
However, the lead author of the 2015 study rejected suggestions that he and his colleagues had made mistakes when trying to replicate earlier work. Professor Brian Nosek of the University of Virginia in Charlottesville said that his study made it clear that failure to replicate may be due to key differences between the original research and the replication.
“Where we disagree with Gilbert and colleagues is that they are willing to draw a conclusion that this is what occurred. Gilbert and colleagues selectively report available evidence that favours their conclusion and ignores evidence that is counter to their conclusion,” Professor Nosek told The Independent.
The argument centres on whether the research findings in the 100 published studies could be replicated by other researchers while following the same scientific protocols. The study, published last year by a consortium known as Open Science Collaboration (OSC), tried to replicate each piece of research to answer that question.
They found that more than half of the findings from the 100 studies, which covered subjects ranging from racial stereotyping to perceptions of military service, could not be reproduced when a second set of experimenters supposedly followed the same methodological rules.
However, when the Harvard scientists analysed the OSC’s methodology they found that it had inadvertently introduced statistical errors which made accurate replication impossible, and they questioned whether the same methodology was truly followed.
“If you want to estimate a parameter of a population then you either have to randomly sample from that population or make statistical corrections for the fact that you didn’t. The OSC did neither,” said Gary King, professor of government at Harvard and co-author of the re-appraisal, which is also published in Science.
“And that was just the beginning. If you are going to replicate a hundred studies, some will fail by chance alone. That's basic sampling theory. So you have to use statistics to estimate how many of the studies are expected to fail by chance alone because otherwise the number that actually do fail is meaningless,” he said.
When the Harvard researchers performed the calculations as the OSC consortium should have done, they found that the number of failures the OSC observed was about what would be expected from chance effects alone – even if all 100 original findings were true.
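The arithmetic behind that point can be sketched in a few lines of Python. This is only an illustration, not the Harvard team's calculation: the function names are invented here, and the 90 per cent figure for the chance that a replication detects a genuine effect is a hypothetical value chosen for the example, not a number from either paper.

```python
from math import comb


def expected_chance_failures(n_studies: int, power: float) -> float:
    """Expected number of failed replications when every original finding is true.

    `power` is the assumed probability that a single replication detects
    a real effect (a hypothetical input, not a figure from the OSC study).
    """
    return n_studies * (1.0 - power)


def prob_at_least(n_studies: int, power: float, k: int) -> float:
    """Binomial probability of seeing k or more failures by chance alone."""
    p_fail = 1.0 - power
    return sum(
        comb(n_studies, i) * p_fail**i * (1.0 - p_fail) ** (n_studies - i)
        for i in range(k, n_studies + 1)
    )


if __name__ == "__main__":
    # Hypothetical scenario: 100 true findings, each replication has 90% power.
    print(expected_chance_failures(100, 0.9))
    # Chance of at least 15 failures under the same assumptions.
    print(prob_at_least(100, 0.9, 15))
```

Under these assumptions some failures are expected even when nothing is wrong with the original research, which is why a raw count of failed replications says little until it is compared with that baseline.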
The Harvard group also found important discrepancies between the ways the original studies and the replicating research were carried out.
For instance, one study involved showing white students at Stanford University a video of other Stanford students discussing university admissions policies in a racially mixed group. However, when the study was replicated by the OSC it was performed with students from the University of Amsterdam.
“They had Dutch students watch a video of Stanford students, speaking in English, about affirmative action policies at a university more than 5000 miles away,” Professor Gilbert said.
In a final twist, the Harvard analysis also found that the OSC study had left out some replicating studies that had actually confirmed the original research – an omission that made psychological research look less reproducible than it is.
“All the rules about sampling and calculating error and keeping experimenters blind to the hypothesis – all of those rules must apply whether you are studying people or studying the reproducibility of science,” Professor King said.
“If you violate the basic rules of science, you get the wrong answer, and that's what happened here,” he said.