By Richard Gliklich, MD, CEO, OM1
In the rapidly evolving landscape of clinical trials, data collection remains one of the most time-consuming and cost-intensive processes. For decades, the manual abstraction of clinical data into electronic case report forms (eCRFs) has been the predominant method for gathering information from trial participants. This paradigm, largely unchanged since the 1990s, remains resource-intensive and relies heavily on site investigators and clinical staff. However, recent advancements in real-world and passive data collection have the potential to significantly reduce trial costs and enhance site satisfaction. This blog explores the current challenges associated with manual data collection, the emerging regulatory focus on real-world data, and the promising future of integrating active and passive data collection strategies in clinical trials.
The Current Paradigm: Manual Data Collection
Since the 1990s, the model for data collection in clinical trials has largely remained static. A significant portion of clinical data is still manually abstracted from patient medical records and entered into electronic case report forms (eCRFs). This process is not only labor-intensive but also error-prone: it depends on abstracting data from medical records and on patient recall, both of which can introduce errors. Site investigators must spend substantial time transcribing information, which detracts from the time they can devote to patient care and other clinical responsibilities.
The inefficiency of the manual process drives up costs by requiring additional site resources, such as clinical coordinators, to manage data entry. Additionally, the complexity of this data abstraction process increases the fair market value of data collection, resulting in higher payments to trial sites. The manual nature of data collection also limits the number of concurrent studies that a site can manage, further slowing down the recruitment and enrollment processes for clinical trials. Sites become constrained by the resources available to them, leading to longer trial durations and, consequently, higher overall costs. This outdated paradigm has persisted despite the growth of more sophisticated technologies in data collection, largely due to regulatory inertia and the absence of widely adopted alternatives.
New Regulatory Guidance on Real-World Data
In response to these challenges, regulatory agencies such as the U.S. Food and Drug Administration (FDA) have released new guidance documents focusing on the use of real-world data (RWD) in clinical trials. Real-world data refers to data that is routinely collected during the provision of healthcare, such as electronic medical records (EMRs), pharmacy claims, and laboratory results. The FDA’s guidance emphasizes the importance of ensuring that RWD is “fit for purpose,” meaning it must be relevant and reliable for the specific investigation being conducted.
Importantly, this regulatory shift highlights that real-world data is not confined to use in late-phase or observational studies. Instead, it can be applied across various phases of clinical research, provided the data is appropriate for the trial’s goals. This opens up opportunities for clinical trials to incorporate real-world data from EMRs and other secondary sources, reducing the reliance on manual data collection so that sites abstract only the data that is necessary and not already available. While some EMR to electronic data capture (EDC) tools exist, they often fall short in capturing the full spectrum of data required for clinical trials. This limitation hinders their potential to significantly reduce the data collection burden on sites.
The Shift to Active and Passive Data Collection
A new paradigm for clinical data collection is emerging, which involves the use of both active and passive data collection methods. Passive data collection refers to the automatic extraction of data that is already captured in systems of record as part of routine healthcare, while active data collection refers to the manual gathering of data not available in these systems, such as randomized trial results, specific clinical assessments, and patient-reported outcomes.
Passive data sources include electronic medical records, tumor registries, laboratory information management systems (LIMS), radiology image management systems, pathology systems, and third-party data such as Medicare claims or mortality data. These sources contain a wealth of information that can be leveraged in clinical trials without requiring additional data entry from site staff. By maximizing the use of passive data, trial sponsors can minimize the need for active data collection, reducing the burden on site investigators and clinical staff.
Systems like OM1® Aspen have been developed to support this new paradigm by integrating passive data collection in a manner that is regulatory compliant and easily interoperable with active data collection processes. These systems traceably capture, transform, and normalize both structured and unstructured data from clinical records and reports. Moreover, they map this data into eCRFs, ensuring that all required information is captured in a format suitable for regulatory submission. This approach is also fully auditable, allowing inspectors to verify the accuracy and integrity of the data being used in the trial.
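To make the split between passive and active collection concrete, here is a minimal Python sketch of the general idea: eCRF fields are filled from passive sources where a value exists, each filled value keeps a pointer back to its source record for audit, and whatever is left over is flagged for active collection. This is an illustration only and does not describe how OM1 Aspen or any other product is implemented. The class names, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SourceValue:
    """A value taken passively from a system of record (hypothetical model)."""
    value: str
    source_system: str  # e.g. "EMR" or "LIMS"
    record_id: str      # identifier of the originating record, kept for audit


@dataclass
class EcrfField:
    """One eCRF field, with provenance when it was filled passively."""
    name: str
    value: Optional[str] = None
    provenance: Optional[SourceValue] = None

    @property
    def needs_active_collection(self) -> bool:
        # No passive value was found, so site staff must collect it.
        return self.value is None


def populate_ecrf(field_names: List[str],
                  passive_data: Dict[str, SourceValue]) -> List[EcrfField]:
    """Fill each eCRF field from passive data where a value is available."""
    fields = []
    for name in field_names:
        hit = passive_data.get(name)
        if hit is None:
            fields.append(EcrfField(name))
        else:
            fields.append(EcrfField(name, hit.value, hit))
    return fields


# Hypothetical form definition and passive extract (illustrative values only).
ecrf_fields = ["date_of_birth", "hemoglobin_g_dl", "pain_score"]
passive = {
    "date_of_birth": SourceValue("1970-01-01", "EMR", "emr-001"),
    "hemoglobin_g_dl": SourceValue("13.2", "LIMS", "lims-042"),
}

form = populate_ecrf(ecrf_fields, passive)
remaining = [f.name for f in form if f.needs_active_collection]
print("Still needs active collection:", remaining)
```

In this toy example, the patient-reported pain score is the only field left for site staff to collect, while the two passively filled fields each carry the source system and record identifier an auditor would need to trace them.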
Changing the Research Paradigm
Case Examples: Reducing Costs and Increasing Site Satisfaction
Case examples of clinical trials that have adopted a combined active and passive data collection model demonstrate the significant cost savings associated with this approach. The cost reductions become more pronounced as the number of trial participants increases. For instance, in a study that involved more than one million women undergoing breast cancer screening, costs fell to single-digit dollars per subject. Even in studies with single-digit thousands of subjects, the cost savings are remarkable and grow as the number of data elements increases.
Sites participating in trials that use these systems consistently report higher satisfaction, as reflected in their net promoter scores (NPS). We have typically obtained scores above 70 at sites where most of the data collected is passive. The reduced administrative burden on site investigators and staff frees up time for other critical activities, such as patient care and trial management. By automating much of the data collection process, sites are able to conduct more studies concurrently, without being overwhelmed by the logistical demands of data entry.
Additionally, the accuracy of data is improved, as passive data collection minimizes the risk of transcription errors associated with manual abstraction. This leads to cleaner datasets, fewer queries, and faster trial completion times. As more sponsors adopt systems that integrate passive data collection, the overall efficiency of clinical trials is expected to improve, resulting in lower costs and shorter timelines for drug development.
Conclusion
The integration of real-world and passive data into clinical trials represents a significant opportunity to improve site satisfaction and reduce costs. By shifting away from the outdated model of manual data abstraction, clinical trials can leverage the wealth of data already available in electronic medical records, laboratory systems, and other healthcare databases. This approach not only reduces the burden on site investigators but also minimizes the potential for data entry errors, leading to cleaner and more reliable datasets. As regulatory agencies continue to emphasize the importance of using real-world data that is fit for purpose, the adoption of passive data collection systems will likely become a critical factor in the future success of clinical trials. Systems like OM1® Aspen are at the forefront of this transformation, offering a streamlined and auditable approach to data collection that benefits both sponsors and sites alike.