By Gary Curhan, MD, CMO, OM1

Real-world data (RWD) can help identify appropriate patients for observational studies and clinical trials, but determining study eligibility requires details that claims do not contain. Take heart failure with preserved ejection fraction (HFpEF) as an example: ejection fraction (EF) values are required but cannot be obtained from claims, and study planners often do not account for how EF can change over time. Inclusion/exclusion criteria also depend on details such as comorbidities and their severity, medications, and lab values (e.g., eGFR), which are likewise unavailable in claims.


Clinical Trials with Help from RWD

RWD and artificial intelligence (AI) make trial feasibility and enrollment challenges easier to manage. When deciding on inclusion/exclusion criteria and assessing protocol feasibility:

  • Ensure that the study population exists in sufficiently large numbers to complete the trial
  • Include subjects who are likely to achieve a study outcome during the trial
  • Exclude subjects who are more likely to experience a serious adverse event
  • Find sites that have access to qualified subjects
  • Help sites find and enroll those subjects
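
As a rough illustration of the first point in the list above, the sketch below counts how many patients in a small, made-up cohort currently meet a set of inclusion criteria. It uses the most recent LVEF reading rather than the first, since EF can change over time. The `Patient` class, its field names, and the eGFR cutoff of 30 are hypothetical choices for this example; the LVEF cutoff of 50% is a commonly used HFpEF threshold, not a criterion taken from any specific protocol.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Patient:
    patient_id: str
    egfr: float  # most recent eGFR, mL/min/1.73 m^2 (hypothetical field)
    # Dated LVEF readings as (measurement date, LVEF %) pairs.
    lvef_history: list[tuple[date, float]] = field(default_factory=list)


def latest_lvef(patient: Patient) -> float | None:
    """Return the most recent LVEF value, or None if none was recorded."""
    if not patient.lvef_history:
        return None
    return max(patient.lvef_history, key=lambda reading: reading[0])[1]


def is_eligible(patient: Patient, min_lvef: float = 50.0, min_egfr: float = 30.0) -> bool:
    """Apply illustrative inclusion criteria using the latest LVEF, not the first."""
    lvef = latest_lvef(patient)
    return lvef is not None and lvef >= min_lvef and patient.egfr >= min_egfr


def count_eligible(patients: list[Patient]) -> int:
    """Feasibility check: how many patients currently meet the criteria?"""
    return sum(is_eligible(p) for p in patients)


# Made-up cohort for illustration only.
cohort = [
    Patient("A", 60.0, [(date(2019, 5, 1), 40.0), (date(2023, 2, 1), 55.0)]),  # EF rose above cutoff
    Patient("B", 70.0, [(date(2020, 3, 1), 58.0), (date(2024, 1, 1), 45.0)]),  # EF fell below cutoff
    Patient("C", 25.0, [(date(2022, 7, 1), 62.0)]),  # eGFR below cutoff
    Patient("D", 80.0),  # no LVEF on record
]
print(count_eligible(cohort))  # prints 1
```

Patient B shows why repeat measures matter: a check against the first recorded LVEF would count this patient as eligible, while the latest reading does not.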


How does OM1 get RWD?

We focus on specialty data networks that go beyond general claims by providing additional clinical detail. Through these networks we connect to sites and automatically ingest and process their information, producing a richer, more powerful dataset. Once the data are received, OM1 Engine™ de-identifies and aggregates them, links them to each patient, and maps them to the healthcare systems patients have visited. Next, the data are cleaned, normalized, and enriched. Finally, they are divided into condition areas and can be used to determine outcomes.


OM1 PremiOM™ Heart Failure Dataset

From this process, we have been able to create the OM1 PremiOM Heart Failure Dataset, which covers 251,000 patients and includes:

  • Open and closed claims
  • Extracted and estimated NYHA scores
  • Left ventricular ejection fraction (LVEF) with repeat measures for many patients
  • Providers and procedures
  • Clinical EMR data and more


Because the dataset is built from RWD, the mean follow-up is 7 years per patient. For many patients, the record begins before their initial HF diagnosis, which allows us to follow them longitudinally.

Another advantage is the additional detail on patient characteristics, such as:

  • Comorbidities (Diabetes Type 1 & 2, CKD, COPD)
  • Select treatments (medications and reasons for discontinuation)
  • Select labs (BNP/proBNP, eGFR)


Critical Endpoints from Clinical Narratives

As you may already know, clinicians dictate their findings into clinical notes far more often than they click dropdowns or fill in fields. The data are in the record, but the record needs to go through several key processes to become useful information. First, unstructured data need to be de-identified, which is harder than de-identifying structured information. Second, the clinical narrative can be enriched in three ways:

  • Abstraction: clinical abstractors read through the notes and fill in data fields with their findings. This manual process can work for a rare disease cohort, but it does not scale to tens or hundreds of thousands of patients.
  • Medical language processing: an algorithm automatically extracts concepts in a reliable and validated manner.
  • Estimation: a key clinical endpoint is estimated from the totality of the narrative encounter. For example, we estimate disease activity scores and NYHA class.


Identifying Desired Sub-Populations

When choosing a subpopulation for a study, it is important to consider comorbidities. Patients with heart failure often have many comorbidities, but their prevalence varies by ejection fraction. Hypertension is the most common comorbidity for patients with preserved ejection fraction (pEF) and for those with reduced ejection fraction (rEF), while the second most common is obesity for pEF and ischemic HF for rEF.

Predicting Outcomes 

Our Data Science team is highly skilled in generating insights. Posters we have presented include "Machine Learning Generated Risk Model to Predict Unplanned Hospital Admission in Heart Failure" and "A Simple Predictive Score for Pre-Admission Identification of Risk of 30-Day Hospital Readmission or Death in Heart Failure."

The Data Science group also has the ability to impute or estimate disease activity scores, which we call endpoint amplification. In HF, we have done this with estimated NYHA scores. Endpoint amplification creates a more complete longitudinal picture by increasing the number of patients and the number of timepoints per patient with this valuable data point. 

Our team can do this because so many patients have NYHA recorded in their notes by their cardiologist. We use descriptions of patient activity, in the form of symptoms and conditions that do not specify NYHA, along with observed disease activity measurements. Medical language processing and machine learning are then used to develop a model that estimates or imputes NYHA scores for patients for whom it has not been reported.
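
To make the idea of endpoint amplification more concrete, here is a minimal sketch of the final fill-in step only. It is not OM1's model: `estimate_nyha` is a hypothetical toy lookup standing in for a trained estimator, and the `limitation` field and visit records are invented for illustration. The sketch keeps observed NYHA values, fills gaps with estimates, labels the source of each value, and compares coverage before and after.

```python
# Hypothetical stand-in for a trained model: maps a coarse activity-limitation
# label (as might be extracted from a note) to an NYHA class from 1 to 4.
LIMITATION_TO_NYHA = {"none": 1, "slight": 2, "marked": 3, "at_rest": 4}


def estimate_nyha(timepoint):
    """Toy estimator; returns None when there is nothing to estimate from."""
    return LIMITATION_TO_NYHA.get(timepoint.get("limitation"))


def amplify(timepoints, estimator=estimate_nyha):
    """Keep observed NYHA values and fill gaps with estimates, labeling each source."""
    result = []
    for tp in timepoints:
        if tp.get("nyha") is not None:
            result.append({**tp, "nyha_source": "observed"})
            continue
        estimate = estimator(tp)
        source = "estimated" if estimate is not None else "missing"
        result.append({**tp, "nyha": estimate, "nyha_source": source})
    return result


def coverage(timepoints):
    """Fraction of timepoints that carry an NYHA value."""
    return sum(tp.get("nyha") is not None for tp in timepoints) / len(timepoints)


# Made-up visit records for illustration only.
visits = [
    {"patient": "A", "visit": 1, "nyha": 2, "limitation": "slight"},
    {"patient": "A", "visit": 2, "nyha": None, "limitation": "marked"},
    {"patient": "B", "visit": 1, "nyha": None, "limitation": "none"},
    {"patient": "B", "visit": 2, "nyha": None, "limitation": None},
]
amplified = amplify(visits)
print(coverage(visits), coverage(amplified))  # prints 0.25 0.75
```

Keeping a source label on every value lets downstream analyses separate observed from estimated scores, or run sensitivity checks on observed values alone.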


Case Example: Finding the Right Patients

Real-world data can help find patients, but we don’t always know if a patient will have a specific condition. One case example is identifying patients who are likely to have an abdominal aortic aneurysm (AAA). Current screening guidelines capture only a minority of cases because they recommend screening men aged 65-75 who have smoked, unless there is another clinical indication. As a result, women and men who have never smoked are not screened, even though they can clearly develop AAA, which leaves a large number of patients undiagnosed.
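
The screening gap described above can be shown with a short sketch. The rule below encodes only the criteria as paraphrased in this article (men aged 65-75 who have smoked) and ignores the "other clinical indication" exception. The function name and the sample records are hypothetical.

```python
def meets_screening_rule(sex, age, ever_smoked):
    """Rule as paraphrased in the text: men aged 65-75 who have smoked."""
    return sex == "M" and 65 <= age <= 75 and ever_smoked


# Made-up individuals for illustration only.
people = [
    {"id": 1, "sex": "M", "age": 70, "ever_smoked": True},
    {"id": 2, "sex": "F", "age": 70, "ever_smoked": True},   # excluded: female
    {"id": 3, "sex": "M", "age": 68, "ever_smoked": False},  # excluded: never smoked
    {"id": 4, "sex": "M", "age": 80, "ever_smoked": True},   # excluded: outside age range
]

screened = [
    p["id"] for p in people
    if meets_screening_rule(p["sex"], p["age"], p["ever_smoked"])
]
not_screened = [p["id"] for p in people if p["id"] not in screened]
print(screened, not_screened)  # prints [1] [2, 3, 4]
```

Everyone in `not_screened` can still develop AAA, and that is the group a risk model would need to reach.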

We used a targeted approach with AI modeling to identify patients who may be at higher risk, including those who fall outside current screening recommendations. Using the OM1 Data Cloud™, which covers more than 300M patients, we first identified patients with known AAA. We then developed an algorithm to find the patients at highest risk for AAA at two health systems and invited them in for an ultrasound to assess the size of their aorta.

Through this we found:

  • A 2-fold higher detection rate than under current screening guidelines
  • This rate was observed in patients who fall outside current screening guidelines

RWD brings substantial value to understanding HFpEF. It allows us to stratify patients with HFpEF and identify new phenotypes; assess similarities and differences between HFpEF, HFmrEF, and HFrEF; evaluate treatment effectiveness; develop risk stratification models that support proactive intervention to prevent important medical events; expand the number of patients with disease activity scores using sophisticated imputation methods; and much more. Together, these capabilities will help improve the care and outcomes of patients living with HFpEF.