Unlike many other kinds of sharing, sharing data doesn't consume it. (Photo: Elaine Casap /Unsplash)

Data Sharing: Let’s Do More Than Just What’s FAIR

February 4, 2025

Modern science yields unprecedented volumes of data from myriad sources. Combined with advances in digital research infrastructure and artificial intelligence, this is changing how we use and analyse data, creating ideal conditions for data-intensive science to flourish.

As this new era of science dawns, our best-practice framework for managing and sharing data, the FAIR Principles, is at risk of being left behind.

Introduced in 2016, the FAIR Guiding Principles (Findability, Accessibility, Interoperability and Reusability) were a significant step forward for open science. They were designed to make research outputs, like data, easier to find and integrate into studies with minimal human input. Eight years on, for research to be up to the task of tackling the complex environmental and societal challenges of our time, it’s time to extend the FAIR Principles so open, interoperable and AI-ready data isn’t just a goal, but part of scientific culture.

FAIR’s hidden challenges

In a research ecosystem where data is increasing in volume, variety and complexity, FAIR should be an enabler for open, collaborative and interdisciplinary science. But, like any guiding framework that is nearly a decade old, it is not without issues.

For example, metadata is a critical underpinning of FAIR. For FAIR to function, metadata should be well described, yet descriptions aren’t standardised across disciplines, making data messy and harder to locate. Data also isn’t always stored in open-access repositories, which excludes valuable datasets from being exchanged and reused. In some cases, data remains on a researcher’s hard drive or institutional file system until their analysis is published. Cultural and technical barriers such as these stem from the siloed nature of science and the incentives and pressures placed on researchers to publish.

This contrasts with best practices in data sharing that take a ‘One Health’-inspired approach to deliver systemic insights into a range of critical environmental questions. Take water quality as an example – 30 percent of people don’t have access to reliable supplies of clean water. Data-led approaches have real potential to unpick the complex interactions affecting the availability and quality of water resources for people and industries globally. To achieve this, there is a need to access and bring together data from a range of disciplines, including genomic data from the life sciences, ecosystems data from a variety of environmental science sub-disciplines, as well as economic, social and health-related datasets. The way researchers integrate this data is currently limited by the unique set of standards and interfaces that each domain uses for data access and storage – hampering what science can achieve.

Moving beyond FAIR

This systemic approach is central to how we carry out research at the UK Centre for Ecology & Hydrology. Rather than simply working to understand soils or water in isolation, we harness data and digital technologies to understand whole ecosystems and their interactions across the planet and its populations. For us to develop holistic solutions for the world, we must be able to harmonise and integrate data across different domains. Extending the scope of FAIR offers a route to achieve this:

  1. Findable – Discoverable is better than findable: Discoverable data goes beyond the ability to simply locate and access a specific dataset you are already aware of – it can also be found serendipitously. For example, when data is easier to discover, a search for the dataset you know you want can unearth useful data you didn’t know existed, such as additional data about the river catchment you are studying. This discovered data can provide an AI engine with contextual information that enriches research.
  2. Accessible – True accessibility for all: Currently, accessible simply means data can be accessed. A broader picture of accessibility would include an inclusive, cross-domain approach. Data would not just be findable and downloadable, but readily accessible to all by a variety of mechanisms, for example via applications and workflows that automatically discover, retrieve and process relevant data sources, rather than requiring researchers to search for them manually.
  3. Interoperable – Striving for interoperability across domains: Interoperability needs to harmonise data use across domains, so open data from different disciplines can come together to deliver powerful new insights. Standardising metadata descriptions, for example, is one important step to achieve this. More generally, there is a need to work on common standards and interfaces wherever possible, together with mechanisms to translate between domains where differences inevitably exist.
  4. Reusable – Building a culture of reusability: We need to move from periodically reusing data to building a culture where data reuse is the norm. Going beyond this, we need to consider the reuse of a broader range of digital assets including models and methods. For example, embracing a reuse and exchange-focused ethos would contribute to improving the sustainability of data analysis and modelling, which comes at a high energy cost. This would reduce the need to repeat experiments, limiting the environmental impact of research.
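
As a rough illustration of the interoperability and discoverability points above, the following Python sketch translates metadata records from two domains into one shared schema and then looks up everything recorded for a single river catchment. It is a minimal sketch under stated assumptions, not a description of any real catalogue: the field names, the FIELD_MAPS table, the CommonRecord schema and the sample records are all invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical mappings: each domain names the same concepts differently.
# Keys are the domain's native field names, values are the shared names.
FIELD_MAPS = {
    "hydrology": {"site_name": "title", "catchment_id": "catchment", "params": "keywords"},
    "genomics": {"study_title": "title", "sampling_region": "catchment", "tags": "keywords"},
}


@dataclass(frozen=True)
class CommonRecord:
    """Shared metadata schema (an assumption made for this example)."""
    title: str
    catchment: str
    keywords: tuple
    domain: str


def to_common(domain, record):
    """Translate a domain-specific metadata record into the shared schema."""
    mapping = FIELD_MAPS[domain]
    fields = {common: record[native] for native, common in mapping.items()}
    return CommonRecord(
        title=fields["title"],
        # Normalise spelling so records from different domains can match.
        catchment=fields["catchment"].strip().lower(),
        keywords=tuple(k.lower() for k in fields["keywords"]),
        domain=domain,
    )


def discover(catalogue, catchment):
    """Return every record about a catchment, whatever domain produced it."""
    wanted = catchment.strip().lower()
    return [r for r in catalogue if r.catchment == wanted]


# Invented sample records, one per native schema.
catalogue = [
    to_common("hydrology", {"site_name": "River Eden nitrate monitoring",
                            "catchment_id": "Eden", "params": ["Nitrate", "Flow"]}),
    to_common("genomics", {"study_title": "Eden sediment microbial eDNA survey",
                           "sampling_region": "eden ", "tags": ["eDNA", "16S"]}),
    to_common("hydrology", {"site_name": "River Thames flow gauging",
                            "catchment_id": "Thames", "params": ["Flow"]}),
]

for record in discover(catalogue, "Eden"):
    print(record.domain, "-", record.title)
```

A hydrologist searching for the catchment they already study would also see the genomics record here, which is the kind of serendipitous, cross-domain discovery the extended principles aim for. In practice the mapping step is the hard part, since it depends on communities agreeing shared vocabularies rather than on a lookup table.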

Extending the FAIR Principles in this way would encourage a more open economy of science where expertise and knowledge are valued above data. This ethos has taken hold in the open-source software movement, where developers readily invest their time and money, yet give their software (their data) away. In this case, value is in expertise and knowledge, not raw materials. Contrast this with the environmental sciences, where many see their data as intellectual property that’s worth holding on to – a culture we must move beyond.

Overcoming barriers

Culture change doesn’t happen overnight. Researchers, funders and institutions each have a role to play in coordinating a cross-disciplinary effort that establishes common standards, interfaces and vocabularies for sharing digital assets including data.

Good practices exist. One example is the Australian Research Data Commons. By bringing together thematic communities – people, planet, humanities, arts and social sciences – they’ve developed a set of common standards that emphasise interoperability and the translation of data between domains, all underpinned by a cloud-based infrastructure. The goal is to produce a national knowledge infrastructure that gives Australia’s researchers a competitive advantage.

In the UK, Health Data Research UK is working towards an open-first and reuse-based approach that brings together multi-modal data across the health research community. Similarly, the UKRI-funded BioFAIR is developing a BioCommons infrastructure for UK life sciences researchers, with shared commons and services to facilitate AI-readiness and improve open science practices. The UK Data Service takes a similar approach to economic, population and social research data.

While the UK’s approach is more siloed than Australia’s, we have the building blocks to build on FAIR and embrace a more inclusive, discoverable, interoperable and reuse-centred culture of data sharing.

From addressing climate change to ensuring food security, the world’s grand challenges demand a more united, data-driven response that rises above disciplinary silos and individual priorities. This would ensure the future of science is not only FAIR but truly fit for purpose.

Gordon Blair is head of environmental digital strategy at the UK Centre for Ecology & Hydrology and a distinguished professor of distributed systems at Lancaster University. His current research interests focus on the role of digital technology in supporting environmental science. He is particularly interested in the future of digital research infrastructure and how such infrastructure can support a new kind of science that is more open, collaborative and integrative.
