The replication crisis has shaken our understanding of what rigorous research looks like in the social sciences. Practices that were once common, such as small samples or extensive re-analysis of data until a significant effect is achieved, are now frowned upon. But what does this mean for individual researchers who are confronted with flaws in their own research record? In psychology at least, it seems that many authors have opted for silence. New studies may be conducted according to more rigorous standards, but what happened in the past stays in the past.
This is not per se an obstacle to scientific self-correction, which can occur on the collective level even if authors remain silent about issues with past investigations, and even if they stubbornly cling to questionable prior findings. However, as we argue in our recent paper, scientific self-correction could be much more efficient if authors were willing to openly discuss problems with their past studies. For example, if somebody disclosed that a published finding was cherry-picked from a large number of statistical comparisons, this could inform others who were planning to build on said study, or who planned to replicate it. But are psychologists willing to disclose such information?
We launched a website on which we invited researchers to submit a statement describing how they lost confidence in one of their own published findings. We asked for cases in which the central result of an article was called into question, and in which there were theoretical or methodological problems for which the submitter took responsibility. The public reaction to our initiative was amazing: almost everybody agreed that such a project was urgently needed, and there was some early media coverage.
At the same time, barely anybody submitted an actual loss-of-confidence statement. Statements trickled in very slowly, and after repeated solicitation, we were able to collect 13 over the course of more than a year. The content of these statements was quite varied, with some common themes: misspecified models, invalid inferences, and, in more than half of the statements, some form of p-hacking. All statements can be found in the published article.
In surveys, researchers routinely reveal how widespread certain questionable research practices were. So why did we receive so few statements? We conducted an anonymous follow-up survey, querying researchers across fields for their experiences. Our sample was not representative and so we cannot provide any precise estimates, but we still believe that the survey results shed some additional light on the culture of self-correction. Almost half of the respondents had lost confidence in a previously published finding, and of these, about half believed that this was due to a mistake or shortcoming in judgment on their own part.
The overwhelming majority reported that their loss of confidence was not a matter of public record in any way, and the reasons for this were diverse. More than half of the respondents were insufficiently sure about the subject matter to proceed in any form; almost half believed that public disclosure was unnecessary because their finding hadn’t attracted much attention; many were concerned about their coauthors’ feelings or didn’t know an appropriate venue. Overall, it seems that losses of confidence occur frequently but are rarely reported, because of uncertainty about both the substantive matter and the best way to proceed.
What could we do to encourage public self-correction? Currently, such behavior is actively discouraged by academic incentive structures. Time spent on correcting past work is time that cannot be spent on creating new work, and researchers are frequently evaluated based on the quantity of their output. Assuming that we cannot change much about the focus on quantity, it may thus make sense to establish critical commentaries on one’s own work as an article category. But maybe it is also possible to shift the focus of evaluation from quantity to quality—after all, expectations regarding the quantity of publications in psychology exceed those in other social sciences.
Reputation also plays a role. About a quarter of our survey respondents reported concerns about how a public disclosure of a loss of confidence would be perceived, reflecting the nature of self-correction as a collective action problem, rather than an individual failing. However, worries may be exaggerated. It is, for example, unclear whether self-retractions actually damage researchers’ reputations. Recent high-profile cases of self-correction in psychology have received positive reactions from within the psychological community, and we may try to foster an alternative narrative: scientists make errors, and self-correction credibly signals that one cares about the correctness of the scientific record.
Beyond this, there are more pragmatic questions that need to be addressed. Journals and publishers have often been reluctant to publish corrections and criticism, sometimes even if the instigator was the original author. There is currently no standardized protocol for what to do if one discovers a major mistake in one’s own published work. Retractions are a standard option but are often associated with the notion of deliberate fraud. Alternative labels have been suggested (“authorial expression of concern”, “voluntary withdrawal”), though in many cases adding to the research record may be the more transparent and productive way forward. The form of such amendments may vary, but they will only be of use if they are directly linked to the original work, be it in established databases (such as PubMed) or directly on the website of the journal. In the end, the static article format may be antithetical to the idea of self-correction, and more dynamic systems incorporating version control (such as the Springer Nature Living Reviews journal series) may be needed to ultimately improve science.