The journal Psychological Science is taking steps to encourage would-be authors to give reviewers easy access to the data underlying the analyses reported in their manuscripts. This is part of a wider effort to promote transparency and replicability in works published in the journal. I discussed the rationale for encouraging authors to share data and materials in a recent editorial, “Sharing Data and Materials in Psychological Science.” Here I briefly highlight some of the principal points.
At the recent International Convention of Psychological Science, University of California, Davis psychologist Simine Vazire quoted the motto of the Royal Society, perhaps the oldest learned society for science: Nullius in verba, “On no one’s word.” That is to say, “No offense, mate, but show me the data.”
Science is rooted in data. Scientists use reason and rhetoric and a variety of other tools, but what makes science ‘science’ is that data are the ultimate arbiters of claims. Yet data are themselves subject to interpretation, and the summary statistics produced by data analysis do not always accurately capture the true nature of a data set. To give a simple example, if a set of scores has a bimodal distribution, then the mean and standard deviation of those scores do not accurately represent the distribution. Moreover, inferential statistical tests used to assess the “statistical significance” of results (e.g., the likelihood that two sets of scores would differ to the observed extent by chance alone) entail various assumptions about the data. When those assumptions are violated, the interpretation of such tests can be badly compromised.
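To make the bimodal example concrete, here is a minimal sketch in Python (the scores are simulated purely for illustration; nothing here comes from an actual study). Two tight clusters of scores yield a mean near a value that almost no subject produced, and a standard deviation dominated by the gap between the clusters rather than the spread within them:

```python
# Minimal sketch (simulated data, for illustration only): how the mean and
# standard deviation can misrepresent a bimodal distribution.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scores: half cluster near 20, half near 80 (bimodal).
scores = np.concatenate([
    rng.normal(loc=20, scale=3, size=500),
    rng.normal(loc=80, scale=3, size=500),
])

print(f"mean = {scores.mean():.1f}")       # ~50: a score almost no subject produced
print(f"sd   = {scores.std(ddof=1):.1f}")  # ~30: reflects the gap between modes,
                                           # not the spread within either mode

# A crude text histogram of the raw data reveals what the summaries hide.
counts, edges = np.histogram(scores, bins=10)
for count, lower_edge in zip(counts, edges[:-1]):
    print(f"{lower_edge:5.1f}  {'#' * (count // 10)}")
```

A reviewer who sees only the mean and standard deviation would picture one cloud of scores centered at 50; a reviewer who can inspect the data sees two distinct clusters and may interpret the results quite differently.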
In short, the devil is in the details of the data. It follows that reviewers of manuscripts submitted for publication can do a better job of assessing the extent to which the data support the manuscript’s claims if they can examine the data.
Psychological Science is not requiring submitting authors to share data as part of the review process, but we are strongly encouraging such sharing. Specifically, authors are asked to share their data with reviewers or to explain why they are not doing so. This fits with the Transparency and Openness Promotion (TOP) guidelines developed and championed by the Center for Open Science, which have been endorsed by more than 2,900 journals and organizations, including the Association for Psychological Science. Our submission procedure also satisfies the Peer Reviewers’ Openness Initiative.
It is not easy to create a data and analysis archive that is so clearly explained and well documented that an outsider can understand it. It’s not easy to be a good scientist. Planning for and developing an archive for data and analyses should be a standard part of conducting a hypothesis-testing project. Best practice is to preregister your plan for a project before you begin the hypothesis-testing phase. For further information about preregistration, see the Observer article I coauthored with Dan Simons and Scott Lilienfeld, “Research Preregistration 101.”
What constitutes “the data” is not always clear. Generally, authors do not provide the rawest form of the data; indeed, what is provided to reviewers is often several steps removed from that form. For example, some studies take as their raw data video recordings of subjects’ behavior, which are later summarized or scored in some way. It is helpful for reviewers (and readers) if samples of such videos are available, but practical and/or ethical constraints might preclude sharing all of the footage, and in any case it is the quantitative scores submitted to analysis that are generally of most interest to reviewers. The principle is that authors are encouraged to share the data that will enable reviewers to assess the claims advanced in the manuscript.
The ideal to which I hope to inspire authors is to post their de-identified data in immutable, time-stamped form on a third-party site (not necessarily with free and open access to the world; it need only provide some way for reviewers to access the data). For a registry of such repositories, see http://www.re3data.org/. I expect that authors who share data this way as part of the review process will be much more likely to make those data available to other psychologists in such a repository after publication.
Giving scientists easy access to one another’s data should increase the value of those data and the amount learned from them. This is not just a matter of detecting errors or shortcomings, but also of building creatively on prior data (e.g., via alternative analyses that shed new light on the meaning of a data set or mega-analyses that combine data from several studies).
Encouraging authors to share data (and materials) is just the latest step Psychological Science has taken to encourage transparency and replicability in the works it publishes. See, for example, Eric Eich’s 2014 editorial, “Business Not as Usual,” and my 2015 editorial, “Replication in Psychological Science.” There is evidence that these efforts are paying off (e.g., see the figure below from Kidwell et al., 2016).