Authors

Zeyu Lu

Subject Area

Statistics

Abstract

Transcriptional regulators (TRs), proteins controlling gene expression, play a critical role in health and disease. Dysregulated TRs have been implicated in various diseases, including cancers, which has driven the development of computational methods that use next-generation sequencing (NGS) data to identify TRs. However, a systematic evaluation of these NGS-based methods has been lacking. This dissertation presents a comprehensive review of thirteen existing methods and benchmarks their performance using gene sets derived from TR perturbations.

We further address the emerging challenge of identifying TRs from epigenomic regions (peaks) using NGS techniques. Existing methods often rely on motif-based analysis or user-provided gene lists, both of which have severe limitations. In addition, many methods lack interpretability and confidence measures due to the naïve use of statistical tests. To overcome these challenges, we introduce BIT (Bayesian Identification of Transcriptional Regulators from Epigenomics-Based Query Region Sets), a novel Bayesian hierarchical approach. BIT leverages extensive TR ChIP-seq data to estimate the consistency between user-provided epigenomic regions and TR binding profiles while quantifying uncertainty, thereby enhancing accuracy and interpretability. We demonstrate BIT's superior performance across various applications, offering deeper biological insights into transcriptional regulation.

Degree Date

Fall 12-21-2024

Document Type

Dissertation

Degree Name

Ph.D.

Department

Statistics and Data Science

Advisor

Xinlei Wang

Second Advisor

Lin Xu

Number of Pages

105

Format

.pdf

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License
