The use of the terms “real-world data” and “real-world evidence” in the context of health decision-making has grown substantially in the last 20 years, although unified and consistent definitions of these terms remain elusive. Often referred to as ‘administrative,’ ‘observational,’ ‘routine,’ ‘large,’ or even ‘big’ data sources, they have attracted increasing interest over the last decade from those conducting health technology appraisals, which provide policymakers with evidence to inform decision-making and to develop guidance on the reimbursement and administration of new health technologies within a care system.
In general, “real-world data” and “real-world evidence” are now used as terms to encompass data and evidence emerging from non-interventional sources, or from sources other than randomized controlled trials (RCTs). This includes administrative data (e.g. hospital episode statistics) or survey data on populations (e.g. Health Survey for England), which can comprise standalone or linked datasets. Compared to the often-perceived ‘gold standard’ of RCT data, real-world data presents particular challenges, especially around data protection. However, it can complement or stand in for RCT data. Analyzed in its own right, it can also provide descriptive information and be used to assess perceived associations between factors (e.g. to what extent a person’s or group’s frailty status is associated with their quality of life and with the care resources they require and consume).
As part of an NIHR-funded Unlocking Data project, we have been exploring sources of such real-world data, often held by local agencies such as councils and clinical commissioning groups in England, and how enabling broader and more transparent use of this data (e.g. for research purposes) can help to promote and protect health and prevent ill-health. 1
How real-world data can be used to promote and protect health and prevent ill-health
As stated in the Life Sciences Vision, Life Sciences Industrial Strategy, and NIHR Best Research for Best Health, unlocking the potential of real-world data provides huge opportunities to understand and improve the health outcomes of patients and populations, and to inform the development of interventions that optimize disease management and treatment. Realizing these opportunities requires partnership between the data collectors (e.g. NHS diagnostic labs, NHS data providers, social care, registries, private providers), owners and guardians of the data (e.g. health and social care providers, commissioners, NHS Digital), and health data users, including researchers, patients, and service providers. Working together to identify and overcome the barriers to accessing data from multiple sources will be key to successfully developing linked datasets at scale for secondary use.
Real-world data can be linked to create datasets that address population health by reflecting the whole spectrum of care experienced by patients, regardless of organizational boundaries. For example, we can link data for patients receiving palliative care from sources such as hospices with secondary care data to identify where patients might ‘fall through the gaps’ in care, and then establish how such gaps can be plugged, leading to improved quality of life.
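The hospice example above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the record types, field names, and sample values are hypothetical, and it assumes both sources already share a pseudonymised patient identifier so that records can be matched exactly. In practice, linkage is often probabilistic and takes place under formal governance arrangements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HospiceRecord:
    patient_id: str      # hypothetical pseudonymised identifier
    referral_date: str   # ISO date string, illustrative only


@dataclass(frozen=True)
class HospitalEpisode:
    patient_id: str      # same hypothetical identifier scheme
    admission_date: str


def link_records(hospice, hospital):
    """Attach each hospice patient's hospital episodes (a left join on patient_id)."""
    episodes_by_patient = {}
    for episode in hospital:
        episodes_by_patient.setdefault(episode.patient_id, []).append(episode)
    # Hospice patients with no matching episode get an empty list.
    return {
        record.patient_id: episodes_by_patient.get(record.patient_id, [])
        for record in hospice
    }


def patients_without_hospital_contact(linked):
    """Return hospice patients with no linked secondary care episode."""
    return sorted(pid for pid, episodes in linked.items() if not episodes)


# Made-up sample data for illustration.
hospice_data = [
    HospiceRecord("P001", "2021-03-01"),
    HospiceRecord("P002", "2021-03-04"),
    HospiceRecord("P003", "2021-03-09"),
]
hospital_data = [
    HospitalEpisode("P001", "2021-02-11"),
    HospitalEpisode("P001", "2021-04-02"),
    HospitalEpisode("P003", "2021-03-20"),
    HospitalEpisode("P004", "2021-01-15"),  # not a hospice patient
]

linked = link_records(hospice_data, hospital_data)
print(patients_without_hospital_contact(linked))
```

A patient with no linked episode is not necessarily receiving poor care; the output is better read as a prompt for further review than as a finding in itself.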
Linked datasets are also useful to identify at-risk cohorts, describe inequalities in access to care, model different possible outcomes of treatment, deliver efficient trials (e.g. for understanding compliance with drugs, drug interactions and repurposing of drugs for other uses), apply advanced statistical techniques (such as machine learning) for better risk prediction, or develop clinical decision tools.
Challenges to making better use of real-world data
Information systems are designed to efficiently deliver a specific service. Less consideration is given to integration with other systems, which leads to fragmentation of data within and between organizations. Documentation (metadata) of source systems, their functions, data stores and flows is crucial to understanding what real-world data exists and how it can be used.
The UK’s National Statistician has written that, “Being able to link data will be vital for enhancing our understanding of society, driving policy change for greater public good and minimising respondent burden.” The UK Government, the Office for National Statistics, ADR UK, and HDR UK all have corporate strategies that include increasing use of linked real-world data. This will require the sharing of data across organizational boundaries.
UK data protection legislation does not forbid considered and proportionate sharing of personal data for limited, clearly justified purposes. However, protections exist under common law for information provided in confidence (e.g. disclosed to a doctor in the course of a consultation; or provided to a local authority in connection with their functions). Without obtaining consent – which may be impractical or impossible – such information cannot be shared without risk to the sharing organization(s) unless through a specific legislated gateway.
Identifying, agreeing and documenting data sharing initiatives is not routine practice. In case of doubt, organizations are likely to avoid the additional risk of sharing data, but in doing so they also miss the potential benefits. As such, there is also a need to build trust in the use, linkage, and sharing of data.
How can we responsibly unlock real-world data?
The COVID-19 outbreak has heightened the need for regional, national and international population health management; this has led to significant developments in the use of real-world data, e.g. the regional development of the Yorkshire and Humber Care Record. Using real-time, real-world evidence, we have a better chance of tackling the challenges posed by a global pandemic. For example, making data more discoverable has been a key aim of HDR UK through the Innovation Gateway, alongside developments in, improvements to, and access to care metadata. From this, access, analysis, dissemination and transparency for research become possible, representing an important aspect of infrastructure which continuously needs ‘levelling up.’
The NHS has recently published, in draft form, a single data strategy for health and care: ‘Data Saves Lives: Reshaping Health and Social Care with Data’. This, together with the Life Sciences Vision and Clinical Research Implementation Plan, envisages much more widespread use of the data that the health and care system generates day-to-day to drive insight in support of population health, resource planning, clinical research and health-improving innovations.
To enable this, the role of trusted research environments in the health and care system is being defined. Trusted research environments are controlled digital environments used to store and analyze sensitive data securely. The main benefits are improvements in data quality, security, transparency and privacy. Data resides only on systems owned by accredited partners, and every interaction is recorded and audited.
In addition, conformance to consistent standards on access and governance is important so that patients, the public and health and care professionals can understand what trusted research environments do and how they work, and have confidence that they control access to data securely. Through this approach we can unlock real-world data for the benefit of all. In the short term, collaborations (e.g. between researchers in the NHS and universities) on further data analysis projects are required to fully understand the impact of the global pandemic. In the longer term, collaborations will broaden, relevant skill sets across disciplines will grow and merge, and the use of trusted research environments will allow analysis of a range of healthcare questions to improve patient outcomes and save lives.
Readers interested in linked data and its related challenges may also be interested in these two video animations: Benefits and Risks of Patient Data Sharing and What Happens to My Patient Data?