International Journal of Biomedical Informatics

Silvia P. Canelón

Heather H. Burris

Lisa D. Levine

Mary Regina Boland


November 17, 2020

Publication 2020 AMIA Poster medRxiv Preprint


Objective. To develop an algorithm that infers patient delivery dates (PDDs) and delivery-specific details from Electronic Health Records (EHRs) with high accuracy, enabling pregnancy-level outcome studies in women’s health.

Materials and Methods. We obtained EHR data from 1,060,100 female patients treated at Penn Medicine hospitals or outpatient clinics between 2010 and 2017. We developed an algorithm called MADDIE (Method to Acquire Delivery Date Information from Electronic Health Records), which infers a PDD for each distinct delivery based on the EHR encounter dates assigned a delivery code, the frequency of code usage, and the time differential between code assignments. We validated MADDIE’s PDDs against a birth log independently maintained by the Department of Obstetrics and Gynecology.
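
As a rough illustration of how these three signals could combine, the following Python sketch groups a patient's delivery-coded encounter dates into episodes and picks one date per episode. It is not MADDIE's actual implementation: the function name `infer_delivery_dates`, the 210-day `EPISODE_GAP` threshold, and the rule of choosing the most frequently coded date (earliest on ties) are all assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical threshold (not from the paper): coded encounters more than
# this far apart are treated as belonging to different deliveries.
EPISODE_GAP = timedelta(days=210)


def infer_delivery_dates(coded_dates, gap=EPISODE_GAP):
    """Return one inferred delivery date per episode for a single patient.

    coded_dates holds one entry per delivery code assignment, so a date
    appears several times if several delivery codes were recorded that day.
    """
    if not coded_dates:
        return []

    ordered = sorted(coded_dates)

    # Time differential: start a new episode whenever the gap between
    # consecutive coded dates exceeds the threshold.
    episodes = [[ordered[0]]]
    for d in ordered[1:]:
        if d - episodes[-1][-1] > gap:
            episodes.append([d])
        else:
            episodes[-1].append(d)

    # Code frequency: within each episode, pick the date carrying the most
    # delivery codes, breaking ties in favor of the earliest date.
    inferred = []
    for episode in episodes:
        counts = Counter(episode)
        top = max(counts.values())
        inferred.append(min(d for d, c in counts.items() if c == top))
    return inferred


# Made-up example: two deliveries, each with codes spread over a few days.
example = [
    date(2012, 3, 1),
    date(2012, 3, 2), date(2012, 3, 2), date(2012, 3, 2),
    date(2012, 3, 4),
    date(2014, 6, 10), date(2014, 6, 10),
    date(2014, 6, 11),
]
print(infer_delivery_dates(example))
```

In this made-up input, the dates from 2012 and 2014 are far enough apart to form two episodes, so the sketch returns two inferred dates for the same patient.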

Results. MADDIE identified 50,560 patients having 63,334 distinct deliveries. MADDIE was 98.6% accurate (F1-score 92.1%) when compared to the birth log. On average, the PDD was 0.68 days (± 1.43 days) earlier than the true delivery date for patients with only one delivery and 0.52 days (± 1.11 days) earlier for patients with more than one delivery episode.
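
The day offsets reported above can be understood as signed differences between each inferred date and its birth-log counterpart. The sketch below shows one way such a comparison might be computed. It is illustrative only: the `(patient_id, episode_index)` keys, the function name `signed_day_offsets`, and the sample dates are hypothetical and do not come from the study data.

```python
from datetime import date
from statistics import mean


def signed_day_offsets(inferred, reference):
    """Return inferred-minus-reference differences in days.

    Both arguments map a (patient_id, episode_index) key to a date. Only
    keys present in both mappings are compared. A negative value means the
    inferred date falls before the reference date.
    """
    shared = sorted(inferred.keys() & reference.keys())
    return [(inferred[key] - reference[key]).days for key in shared]


# Hypothetical inferred dates and birth-log dates.
inferred_dates = {
    ("p1", 1): date(2015, 5, 1),
    ("p2", 1): date(2016, 1, 10),
    ("p2", 2): date(2017, 8, 3),
}
birth_log = {
    ("p1", 1): date(2015, 5, 2),
    ("p2", 1): date(2016, 1, 10),
    ("p2", 2): date(2017, 8, 5),
    ("p3", 1): date(2012, 1, 1),  # no inferred date, so it is not compared
}

offsets = signed_day_offsets(inferred_dates, birth_log)
print(offsets, mean(offsets))
```

Deliveries that appear in only one of the two sources are left out of the offset calculation here; counting them as misses would belong to a separate accuracy or F1 computation, which this sketch does not attempt.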

Discussion. MADDIE is the first algorithm to successfully infer PDD information using only structured delivery codes and identify multiple deliveries per patient. MADDIE is also the first to validate the accuracy of the PDD using an external gold standard of known delivery dates as opposed to manual chart review of a sample.

Conclusion. MADDIE augments the EHR with delivery-specific details extracted with high accuracy and relies only on structured EHR elements while harnessing temporal information and the frequency of code usage to identify accurate PDDs.

Poster presented at the 2020 AMIA Annual Symposium