
DARPA Aims to Score Social and Behavioral Research

March 6, 2019

The Pentagon’s innovation incubator has set itself an ambitious task – ranking the reliability of social science research that might apply to national security. The Defense Advanced Research Projects Agency’s Defense Sciences Office is currently asking for “innovative research proposals” to algorithmically assign a confidence score to social and behavioral research.

DARPA has named this program, which aims to develop an artificially intelligent quantitative metric, Systematizing Confidence in Open Research and Evidence, or SCORE. As DARPA explains in its request for proposals:

These tools will assign explainable confidence scores with a reliability that is equal to, or better than, the best current human expert methods. If successful, SCORE will enable [Department of Defense] personnel to quickly calibrate the level of confidence they should have in the reproducibility and replicability of a given SBS result or claim, and thereby increase the effective use of SBS literature and research to address important human domain challenges, such as enhancing deterrence, enabling stability, and reducing extremism.

Outside observers have identified a wider collateral benefit to the academy from the proposal – a tool to address the so-called replication crisis in social science. An article by Adam Rogers at Wired, for example, is headlined “Darpa Wants to Solve Science’s Reproducibility Crisis With AI.”

DARPA implies that the replication crisis is itself a national security concern: “Taken in the context of growing numbers of journals, articles, and preprints, this current state of affairs could result in an SBS consumer mistakenly over-relying on weak SBS research or dismissing strong SBS research entirely.”

Last month, DARPA signed the Center for Open Science (COS) to a three-year agreement, worth $7.6 million, to create a database of 30,000 claims made in peer-reviewed and published papers. Alongside partners from the University of Pennsylvania and Syracuse University, COS will extract – automatically and manually – evidence about the claims, which will be merged with more traditional quality indicators like citations and whether the research was preregistered.

Three steps will follow once the database exists:

Experts will examine 10 percent of the claims, using surveys, panels and even prediction markets, for their likelihood of being replicated.
Other experts will create algorithms that examine the database’s contents and estimate, without human judgment, each claim’s likelihood of being replicated.
Other researchers will attempt to replicate a sample of the database’s claims, allowing both the humans’ and the computers’ efforts to be measured and scored.
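
The comparison described in the third step can be illustrated with a minimal sketch. The article does not say which scoring rule SCORE uses; the Brier score below is an assumption chosen for illustration, and the function name and all data values are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 means every forecast was exactly right.
    The choice of the Brier score is an illustrative assumption,
    not a metric named in the article.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: 1 means the claim replicated, 0 means it did not.
replicated = [1, 0, 1, 1, 0]

# Hypothetical replication probabilities for the same five claims.
expert_forecasts = [0.8, 0.3, 0.6, 0.9, 0.4]
algorithm_forecasts = [0.7, 0.2, 0.5, 0.8, 0.1]

expert_score = brier_score(expert_forecasts, replicated)
algorithm_score = brier_score(algorithm_forecasts, replicated)

print(f"experts:    {expert_score:.3f}")
print(f"algorithms: {algorithm_score:.3f}")
```

Under a rule like this, whichever group earns the lower score on the replicated sample has forecast replication outcomes more accurately.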

Appropriately, COS says its own work needs to be reproducible. “We are committed to transparency of process and outcomes so that we are accountable to the research community to do the best job that we can,” said COS program manager Beatrix Arendt, “and so that all of our work can be scrutinized and reproduced for future research that will build on this work.”

“Whatever the outcome,” according to Brian Nosek, COS’ executive director, “we will learn a ton about the state of science and how we can improve.”

Rogers quotes Microsoft sociologist Duncan Watts about the audacity of creating a scoring mechanism: “It’s such a DARPA thing to do, where they’re like, ‘We’re DARPA, we can just blaze in there and do this super-hard thing that nobody else has even thought about touching.’” Watts then adds, “Good for them, man.” (Further demonstrating its chutzpah, DARPA has specifically excluded from SCORE proposals “research that primarily results in evolutionary improvements to the existing state of practice.”)

Ideally the scores and how they were determined would be understandable to a non-specialist. In addition, the scores could change based on new information.

As it tries to grade social and behavioral research, DARPA clearly acknowledges the need to fully embrace social science. “Given the accelerating sociotechnical complexity of today’s world—a world that is increasingly connected but often poorly understood—there are growing calls to more effectively leverage Social and Behavioral Sciences (SBS) to help address critical complex national security challenges in the Human Domain,” DARPA wrote in a 41-page document announcing the program in June 2018.

In addition to citing work that has obvious applications to security, such as reducing extremism, the document cited other federal projects that have explicitly connected SBS and the Pentagon, such as the National Academies of Science’s Decadal Survey of Social and Behavioral Sciences for Applications to National Security and the Minerva Research Initiative (“Supporting social science for a safer world”).

Social Science Space


