SMU Data Science Review


Identifying which genes are early indicators of preterm birth, using cell-free ribonucleic acid (cfRNA) from non-invasive blood tests of pregnant women, can improve prenatal care. Currently, routine checkups for pregnant women include no medical test for early detection of preterm birth risk. Recent studies have identified genes with the potential to predict preterm birth. We apply machine learning techniques to determine whether the Area Under the Curve (AUC) can be improved when evaluating the prediction accuracy of chosen gene sequences and concentrations. Using cfRNA data from non-invasive blood tests in conjunction with machine learning, we improve upon the current methodology in an effort to identify and provide evidence of a relationship between gene expression data and preterm birth. In our analysis, expanding the feature space of cfRNA sequence counts improves model performance, increasing model AUC from 81% to 100%. These results are intended to provide additional evidence of model validity as an early indicator of preterm birth.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License