Mapping Existing Biobank Data To EHR Common Data Model Not Clean Or Straightforward


The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) is a data harmonization approach that transforms data held in electronic health records (EHRs) and other medical databases into a common format with a shared terminology, vocabulary, and coding scheme. Because every source then has the same database structure, disparate observational databases can be analyzed systematically. No studies have yet mapped biospecimen data into the OMOP CDM format.

To address this gap, Thomas R. Campion, Jr. of Weill Cornell Medicine, New York City, and colleagues evaluated the feasibility of transforming their biospecimen database into the OMOP CDM format so that it could be integrated with EHR data to support acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS) research. The results were published in the AMIA Joint Summits on Translational Science proceedings.

Of the 5,453 AML and MDS records with local unique medical record numbers, 1,397 (26%) were transformed to the OMOP CDM format. The remaining records could not be mapped because they lacked the required Disease Status field.
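To illustrate the kind of rule described above, here is a minimal Python sketch that separates records with a populated required field from those without one, and checks the reported percentage. The field name `disease_status`, the record layout, and both function names are assumptions made for illustration; the article does not describe the authors' actual code or schema.

```python
def split_mappable(records, required_field="disease_status"):
    """Split records by whether a required field is populated.

    `required_field` is a hypothetical source column name.
    """
    mappable, unmappable = [], []
    for record in records:
        value = record.get(required_field)
        # Treat missing keys, None, and blank strings as absent.
        if value is None or str(value).strip() == "":
            unmappable.append(record)
        else:
            mappable.append(record)
    return mappable, unmappable


def mapped_share(mapped_count, total_count):
    """Return the mapped share as a whole-number percentage."""
    return round(100 * mapped_count / total_count)


# Illustrative records only, not data from the study.
sample = [
    {"mrn": "A1", "disease_status": "relapse"},
    {"mrn": "A2", "disease_status": ""},
    {"mrn": "A3"},
]
ok, dropped = split_mappable(sample)
print(len(ok), len(dropped))

# Counts reported in the article: 1,397 of 5,453 records.
print(mapped_share(1397, 5453))
```

With the counts reported in the article, the share works out to roughly 25.6%, which rounds to the stated 26%.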


The mapping was not clean or straightforward. Some granularity was lost because several source data fields had to be combined to fill a single OMOP field. Conversely, some data was diluted because the same source field was used to populate more than one OMOP field.
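The two tradeoffs can be sketched in a few lines of Python. The target keys below follow column names from the OMOP CDM Specimen table, but the source field names (`sample_type`, `preparation`, `status`), the function name, and the choice of which source field feeds which target are hypothetical; the article does not say which fields the authors combined or reused.

```python
def to_omop_specimen(src):
    """Map one hypothetical biobank record to OMOP-style specimen fields."""
    # Many-to-one: two source fields collapse into a single target value,
    # so the original separation between them is lost.
    parts = [src.get("sample_type"), src.get("preparation")]
    combined = " / ".join(p for p in parts if p)

    return {
        "person_id": src["person_id"],
        "specimen_source_value": combined,
        # One-to-many: the same source field is reused for a second target,
        # although it only partly describes the anatomic site.
        "anatomic_site_source_value": src.get("sample_type"),
        "disease_status_source_value": src.get("status"),
    }


# Illustrative record only, not data from the study.
record = {
    "person_id": 1,
    "sample_type": "bone marrow",
    "preparation": "frozen",
    "status": "relapse",
}
row = to_omop_specimen(record)
print(row)
```

In this sketch, a downstream user can no longer tell from `specimen_source_value` alone where the sample type ends and the preparation begins, and `anatomic_site_source_value` carries a value that was never recorded as an anatomic site.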

The study was limited by implementation challenges inherent to biospecimen research data, which may be incomplete or use local codes that do not align well with the common data model vocabularies. 

“Our OMOP Sample table implementation demonstrates challenges within curating clinical research data in general, and for biospecimens in particular. We expect other biobank informatics teams to face similar challenges and tradeoff decisions as they implement OMOP or other common data models. While imperfect and incomplete, the opportunity to combine limited data collected for biobanking with more comprehensive and standardized EHR datasets in a common data model dramatically increases the utility of samples for additional studies,” concluded the authors.