Research

Maintaining Anonymity In Double-Blind Peer Review During The Age of Artificial Intelligence

By Leonard Bauersfeld, Angel Romero, Manasi Muglikar and Davide Scaramuzza

August 23, 2023

Finger pointing at the middle of a network of people. (Photo: Gerd Altmann/Pixabay)

Peer review stands at the core of the academic world: researchers diligently review each other’s findings before publication, ensuring the quality and integrity of scholarly work. The double-blind review process, adopted by many publishers and funding agencies, plays a vital role in maintaining fairness and impartiality by concealing the identities of authors and reviewers. However, in the era of artificial intelligence (AI) and big data, a pressing question arises: can an author’s identity be deduced even from an anonymized paper (in cases where the authors do not advertise their submitted article on social media)?

In a recent article, we investigate this very question by leveraging an artificial intelligence model trained on the largest authorship attribution dataset to date. The dataset, created from the publicly available manuscripts on the arXiv preprint server, comprises over 2 million research papers and tens of thousands of authors. Focusing purely on well-established researchers with at least a few dozen publications, our work demonstrates that reliable author identification is possible.

Our study examines an advanced AI model that uses the textual content of research papers and the references cited by authors to predict the likelihood of a given researcher being the author of a paper. The researcher with the highest predicted likelihood is the author “guessed” by the model. The AI model correctly predicts authorship for three out of four papers, even in a dataset with over 2,000 possible authors. For prolific researchers with extensive publication records (over 100 papers), the accuracy increases to over 85 percent.
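To make the “guessing” step concrete, here is a minimal sketch in Python. It is not the authors’ implementation: the function names `predict_author` and `top1_accuracy` and the example scores are hypothetical, and the sketch simply assumes that some model has already produced one likelihood per candidate author.

```python
from typing import Dict, List


def predict_author(likelihoods: Dict[str, float]) -> str:
    """Return the candidate with the highest predicted likelihood.

    `likelihoods` maps a candidate author name to the score a model
    assigned to that candidate (hypothetical input format).
    """
    if not likelihoods:
        raise ValueError("at least one candidate author is required")
    return max(likelihoods, key=likelihoods.get)


def top1_accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of papers for which the guessed author is the true author."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Made-up scores for a single anonymous paper.
scores = {"author_a": 0.10, "author_b": 0.70, "author_c": 0.20}
guess = predict_author(scores)
```

In this sketch, an accuracy of 0.75 from `top1_accuracy` would correspond to the “three out of four papers” figure quoted above.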

This article by Leonard Bauersfeld, Angel Romero, Manasi Muglikar and Davide Scaramuzza is taken from The London School of Economics and Political Science’s LSE Impact Blog, a Social Science Space partner site, under the title “AI Can Crack Double Blind Peer Review – Should We Still Use It?” It originally appeared at the LSE Press blog.

Following the recent successes of AI on language-related tasks (e.g. ChatGPT), these results may not be surprising, yet our findings have significant implications for the integrity of the double-blind review process. While our work shows that machine learning methods can be used to attribute anonymous research papers, understanding how the AI identifies an author yields guidelines that authors can follow to increase their anonymity:

Abstract and introduction: We find that the first 512 words of a paper, typically encompassing the abstract and introduction, provide sufficient information for robust authorship attribution. The AI’s performance is only marginally lower than when it considers the entire paper. We believe that the abstract and introduction frequently reflect the authors’ creative identity and their research domain. These distinct traits facilitate author identification, particularly as authors often rephrase introductions from their prior works.
Self-citations: Our analysis also highlights the role of self-citations in revealing authors’ identities. We confirmed the common hypothesis that authors cite themselves too often. On average, 10.8 percent of the citations in the papers in our dataset are self-citations, an easy giveaway of the authors’ identity. Thus, we encourage authors to cut back on self-citations in submissions to double-blind review to enhance their anonymity.
Citation diversity: Even when self-citations are omitted, the references cited in a paper can still be used to identify the author. By including citations to lesser-known papers, authors can bolster their anonymity while also promoting equal visibility for all research in their field.
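Authors who want to check their own manuscript against the first two points can do so with a few lines of code. The Python sketch below rests on stated assumptions rather than on the study’s method: a reference counts as a self-citation when it shares at least one author name with the paper, names are matched after simple lowercasing and whitespace cleanup, and “words” are whitespace-separated tokens. The function names are hypothetical.

```python
from typing import List


def _normalize(name: str) -> str:
    """Lowercase a name and collapse whitespace (a crude matching rule)."""
    return " ".join(name.lower().split())


def self_citation_share(paper_authors: List[str],
                        reference_authors: List[List[str]]) -> float:
    """Fraction of references sharing at least one author with the paper.

    `reference_authors` holds one list of author names per cited work.
    """
    if not reference_authors:
        return 0.0
    own = {_normalize(a) for a in paper_authors}
    hits = sum(
        1 for ref in reference_authors
        if own & {_normalize(a) for a in ref}
    )
    return hits / len(reference_authors)


def first_words(text: str, limit: int = 512) -> str:
    """Keep only the first `limit` whitespace-separated words of a text."""
    return " ".join(text.split()[:limit])


# Made-up example: two of the four references share an author with the paper.
share = self_citation_share(
    ["Ada Lovelace", "Alan Turing"],
    [["ada lovelace", "Someone Else"], ["Other Person"],
     ["Third Person"], ["Alan Turing "]],
)
```

A share well above the 10.8 percent average reported above would suggest that the reference list alone says a lot about who wrote the paper. `first_words` shows how little text is involved when only the opening of a paper is considered.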

While authorship attribution focuses on anonymous papers, our research also explores applications in the context of signed manuscripts to aid plagiarism and ghostwriting detection. By leveraging the AI model’s probability predictions, one can determine the likelihood of the person who signed the document being the actual author. Similarly, one can query the model for the most likely possible authors of a manuscript (e.g. the top five or top 10). This opens avenues for more elaborate methods to cross-validate the model’s initial selection of likely authors.
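The two queries described here, the likelihood of the signing author and the list of most likely authors, can be sketched as follows. This is an illustration under assumptions, not the published codebase: the per-author probabilities are taken as given, and `top_k_authors` and `signed_author_check` are hypothetical helper names.

```python
from typing import Dict, List, Tuple


def top_k_authors(likelihoods: Dict[str, float],
                  k: int = 5) -> List[Tuple[str, float]]:
    """Return the k candidates with the highest predicted likelihood."""
    ranked = sorted(likelihoods.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def signed_author_check(likelihoods: Dict[str, float],
                        signed_author: str,
                        k: int = 5) -> Tuple[float, bool]:
    """Report the signing author's predicted likelihood and whether that
    author appears among the k most likely candidates.

    Authors unknown to the model get a likelihood of 0.0 (an assumption).
    """
    probability = likelihoods.get(signed_author, 0.0)
    shortlist = [name for name, _ in top_k_authors(likelihoods, k)]
    return probability, signed_author in shortlist


# Made-up probabilities for one signed manuscript.
scores = {"author_a": 0.05, "author_b": 0.60, "author_c": 0.25, "author_d": 0.10}
shortlist = top_k_authors(scores, k=2)
result = signed_author_check(scores, "author_d", k=2)
```

A low probability for the signing author, or absence from the shortlist, would only be a signal for closer inspection; it is not proof of plagiarism or ghostwriting.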

In small research fields, experienced researchers can often correctly guess which research group an anonymous submission comes from, possibly biasing the peer-review process. Our published article is the first to offer insights into the potential vulnerabilities in maintaining anonymity during the double-blind review process in the age of AI and big data. While our AI model demonstrates the ability to attribute authors to anonymous research papers at large scale, we emphasize the importance of preserving the fairness and impartiality that the double-blind review process upholds. At present, simple measures, such as reducing self-citations and embracing citation diversity, could be implemented during the initial submission stage to enhance anonymity.

As peer review is such a fundamental pillar of science, we hope that this study encourages the research community to further explore how AI is changing peer review itself. We have open-sourced our codebase (https://github.com/uzh-rpg/authorship_attribution) in the hope that it serves as a starting point for scholars to pick up our work and build on it. Authorship attribution and plagiarism detection are vital to the continued integrity and trustworthiness of academic publishing, and improving them will benefit the entire scientific community.

Leonard Bauersfeld, Angel Romero, Manasi Muglikar and Davide Scaramuzza

Leonard Bauersfeld (pictured) is a Ph.D. student at the Robotics and Perception Group at the University of Zurich. He researches first-principle based and data-driven models for quadrotors. Angel Romero is also a Ph.D. student in the Robotics and Perception Group. He researches classic control and learning based control for autonomous flight. Manasi Muglikar is also a Ph.D. student at the Robotics and Perception Group. She researches event-based vision, vision-based navigation and more. Davide Scaramuzza is a professor of robotics at the University of Zurich, where he works on the autonomous navigation of microdrones and directs the Robotics and Perception Group.
