Paper on Showing Significance Proves Very Significant Decade Later


You might expect that an academic paper asserting that nearly any research finding can be presented as significant would be widely read and cited. And in the case of “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant,” a 2011 paper by Joseph P. Simmons, Leif D. Nelson, and Uri Simonsohn published in Psychological Science, you’d be absolutely right.

That paper, with 4,100 citations, is the second-most cited paper published by SAGE Publishing in 2011. As a result, the paper and its authors recently received a 10-Year Impact Award for 2022.

SAGE, the parent of Social Science Space, started the 10-Year Impact Awards in 2020 as one way to demonstrate the value of social and behavioral science. While article citations and journal impact factors are the standard measures of literature-based impact in academia, these measures’ two- or five-year windows don’t account for papers whose influence grows over time or that are recognized at a later date. This problem is especially acute in the social sciences, where impact factors “tend to underestimate” the value of social science research because of time lags and social science’s interest in new approaches, rather than solely iterative ones.

Joseph P. Simmons, left, Leif D. Nelson, and Uri Simonsohn

In their paper, Simmons and Simonsohn, both of the Wharton School at the University of Pennsylvania (Simonsohn is now at Barcelona’s ESADE Business School), and Nelson, at the University of California, Berkeley’s Haas School of Business, focused on “how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis.” As they wrote then, looking at empirical psychology specifically, “in many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not.”  

Rather than leave a provocative statement like that hanging – and they did get some blowback – the trio also offered “a simple, low-cost, and straightforwardly effective disclosure-based solution” to address the issue with a “minimal burden” on the publication process. 

We asked the authors a few questions about their paper and its reception over the intervening decade. Nelson took the lead in answering, with a few additional thoughts offered by Simonsohn. 

In your estimation, what in your research – and obviously the published paper – is it that has inspired others or that they have glommed onto?  

I think a large part of it is that, at some fundamental level, our claims are simple and practical: Researchers have lots of degrees of freedom in their analyses; those degrees of freedom dramatically increase the likelihood of a false positive finding, but solving that problem could be as easy as asking researchers to transparently disclose what they did. 

None of that is fancy, but it can change how someone reads a published finding, what an editor looks for in a submitted manuscript, and what authors do to better assess their own work. 
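To make the "degrees of freedom" point concrete, here is a minimal simulation sketch. It is not the authors' own code or analysis. It assumes a two-group experiment with no true effect, 20 observations per group, and two outcome measures correlated at 0.5 (all illustrative choices), and it uses a known-variance z-test as a simplification. The names `simulate`, `correlated_pair`, and `z_test_p` are hypothetical. An "honest" researcher tests only the first measure; a "flexible" one tests both measures and their average and counts the study as a success if any of the three is significant.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
STD_NORMAL = NormalDist()


def z_test_p(group_a, group_b, variance=1.0):
    """Two-sided p-value for a difference in means, assuming known variance."""
    n = len(group_a)  # both groups have the same size in this sketch
    standard_error = (2 * variance / n) ** 0.5
    z = (fmean(group_a) - fmean(group_b)) / standard_error
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def correlated_pair(rng, r):
    """Draw two unit-variance measures with correlation r and no true effect."""
    dv1 = rng.gauss(0, 1)
    dv2 = r * dv1 + (1 - r * r) ** 0.5 * rng.gauss(0, 1)
    return dv1, dv2


def simulate(trials=5000, n=20, r=0.5, seed=1):
    """Return (honest, flexible) false-positive rates under a true null."""
    rng = random.Random(seed)
    honest_hits = 0
    flexible_hits = 0
    for _ in range(trials):
        a = [correlated_pair(rng, r) for _ in range(n)]
        b = [correlated_pair(rng, r) for _ in range(n)]
        p1 = z_test_p([x for x, _ in a], [x for x, _ in b])
        p2 = z_test_p([y for _, y in a], [y for _, y in b])
        # The average of two unit-variance measures has variance (1 + r) / 2.
        p_avg = z_test_p(
            [(x + y) / 2 for x, y in a],
            [(x + y) / 2 for x, y in b],
            variance=(1 + r) / 2,
        )
        honest_hits += p1 < ALPHA                    # one planned test
        flexible_hits += min(p1, p2, p_avg) < ALPHA  # report whichever "works"
    return honest_hits / trials, flexible_hits / trials


if __name__ == "__main__":
    honest, flexible = simulate()
    print(f"honest false-positive rate:   {honest:.3f}")
    print(f"flexible false-positive rate: {flexible:.3f}")
```

Under these assumptions, the honest rate should hover near the nominal 5 percent, while the flexible rate comes out noticeably higher, even though each individual test is valid on its own. Disclosing which measures were collected and analyzed is what lets a reader tell the two cases apart.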

What, if anything, would you have done differently in the paper (or underlying research) if you were to go back in time and do it again?  

I am sure that there are many ways in which we could have written a better paper, but those are always hardest for the authors to see.  

There is one finite small thing that we would definitely revisit. We suggested that journals require a minimum sample size of 20 observations per experimental treatment. We received lots of pushback on setting an arbitrary standard. I don’t think that we were wrong about setting an arbitrary standard (arbitrary standards are baked into lots of science), but we were very wrong about the number. N = 20 is ridiculously too low. And yet, a meaningful percentage of the citations for our paper are other researchers merely using our paper to explain why they used that low sample size. We partially caused that unfortunate outcome. 

Of course, we didn’t realize at the time that our suggestion was so low, but subsequent thinking and research has revealed that the threshold should be 100 or 200 or 500… We don’t know what the right number is, but we do know that it should be a lot higher. We should have had a better suggestion or no suggestion at all. 

Simonsohn added: I also would have chosen the sample size minimum as something I’d do different, but rather than choose a higher number I would have gone with what we ended up doing in our pre-registration site (AsPredicted), merely asking authors to indicate sample size was chosen *before* running the study. 

What direct feedback – as opposed to citations – have you received in the decade since your paper appeared?  

Lots of people tell us that it helped shape how they evaluate research. That is nice to hear. Some people say that it changed how they conduct research in their labs and what they encourage in the labs of others. That is nice to hear also. At the extreme, people occasionally tell us that we have improved the field. That is nice to hear also. I don’t know if those people are right. I hope so. 

We also used to get a fairly steady supply of people telling us that we were irresponsible, destructive, and mean. That is not as nice to hear. Over time there has been less of that. I like to hope that is because more people have become persuaded of our (authentically non-destructive and non-mean) argument, but I accept that it might just be because they have gotten bored with telling us how they feel.  

How have others built on what you published? (And how have you yourself built on it?)  

The last decade has seen truly dramatic changes in the practices of behavioral scientists. The norm has shifted to considerably more transparency in every stage of the research process. Many researchers post and publicly share their data, materials, and pre-registrations. Soon it will be “most researchers” instead of many. Furthermore, the entire endeavor of improving practices and meta-scientific inquiry more broadly has increased in volume and in audience. The practices of experimental psychology are very different than they were a decade ago.  

Our paper was only a small part of getting that started, and others have led the way on so many dimensions. 

Could you name a paper (or other scholarly work) that has had the most, or at least a large, impact on you and your work? 

Simonsohn: Eric Eich, editor at Psychological Science shortly after we published our paper, who did perhaps more than anybody else in psychology to impact the field in general, and to help our paper have impact in particular.

An interview with the lead author of the most-cited article in the 10-Year Impact Awards, on Amazon’s Mechanical Turk, appears here. And an interview with the lead author of the third most-cited article, on the Danish National Patient Register, appears here.



SAGE Publishing, the parent of Social Science Space, is a leading international publisher of journals, books, and electronic media for academic, educational, and professional markets. An independent company, SAGE has principal offices in Los Angeles, London, New Delhi, Singapore, Melbourne and Washington DC.
