Manara - Qatar Research Repository
10.1186_s13040-022-00298-7.pdf (1.43 MB)

Analysis of risk factors progression of preterm delivery using electronic health records

journal contribution
submitted on 2024-04-01, 09:33 and posted on 2024-04-01, 09:34 authored by Zeineb Safi, Neethu Venugopal, Haytham Ali, Michel Makhlouf, Faisal Farooq, Sabri Boughorbel


Preterm deliveries have many negative health implications for both mother and child. Identifying the population-level factors that increase the risk of preterm delivery is an important step toward mitigating its impact and reducing its frequency. The purpose of this work is to identify preterm delivery risk factors and their progression throughout pregnancy from a large collection of Electronic Health Records (EHR).


The study cohort includes about 60,000 deliveries in the USA, with the complete medical history from EHR for diagnoses, medications and procedures. We propose a temporal analysis of risk factors by estimating and comparing risk ratios and variable importance at different time points prior to the delivery event. We selected the following time points before delivery: 0, 12 and 24 weeks of gestation. We did so by conducting a retrospective cohort study of patient history for a selected set of mothers who delivered preterm and a control group of mothers who delivered full-term. We analyzed the extracted data using logistic regression and random forest models. The results of our analyses showed that the highest risk ratio and variable importance correspond to a history of previous preterm delivery. Other risk factors were identified; some are consistent with those reported in the literature, while others need further investigation.
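As a rough illustration of the risk ratio estimate mentioned above, the following minimal sketch computes a risk ratio with an approximate 95% confidence interval from a 2x2 table, using the standard log-scale standard error. It is not the authors' code. The function name `risk_ratio`, the `RiskRatio` tuple and all counts are hypothetical and invented for illustration; none of the numbers come from the study.

```python
import math
from typing import NamedTuple


class RiskRatio(NamedTuple):
    estimate: float
    ci_low: float
    ci_high: float


def risk_ratio(exposed_cases: int, exposed_total: int,
               unexposed_cases: int, unexposed_total: int,
               z: float = 1.96) -> RiskRatio:
    """Risk ratio of an outcome for exposed vs. unexposed patients.

    The confidence interval uses the usual large-sample standard error
    of log(RR); z = 1.96 gives an approximate 95% interval.
    """
    if exposed_cases <= 0 or unexposed_cases <= 0:
        raise ValueError("both groups need at least one case")
    if exposed_cases > exposed_total or unexposed_cases > unexposed_total:
        raise ValueError("cases cannot exceed group totals")

    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) for a 2x2 table.
    se = math.sqrt(1 / exposed_cases - 1 / exposed_total
                   + 1 / unexposed_cases - 1 / unexposed_total)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return RiskRatio(rr, low, high)


# Hypothetical counts for one risk factor at each time point (weeks).
# Tuple layout: (exposed preterm, exposed total,
#                unexposed preterm, unexposed total).
# These values are made up for illustration only.
hypothetical_counts = {
    0: (30, 100, 100, 1000),
    12: (25, 100, 100, 1000),
    24: (20, 100, 100, 1000),
}

for week, counts in hypothetical_counts.items():
    rr = risk_ratio(*counts)
    print(f"week {week}: RR={rr.estimate:.2f} "
          f"(95% CI {rr.ci_low:.2f} to {rr.ci_high:.2f})")
```

Repeating the same calculation per risk factor and per time point gives one simple way to compare how a factor's estimated risk changes over the course of pregnancy. The model-based variable importance from logistic regression and random forests is a separate step not shown here.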


The comparative analysis of the risk factors at different time points showed that risk factors in early pregnancy are related to patient history and chronic conditions, while risk factors in late pregnancy are specific to the current pregnancy. Our analysis unifies several previously reported studies on preterm risk factors. It also gives important insights into how risk factors change over the course of pregnancy. The code used for data analysis will be made available on GitHub.

Other Information

Published in: BioData Mining

Language

  • English


Publisher

  • Springer Nature

Publication Year

  • 2022

License statement

This Item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Sidra Medicine
  • Hamad Bin Khalifa University
  • Qatar Computing Research Institute - HBKU
