ORCID
0009-0005-9846-6056
Subject Area
Computer Science
Abstract
Inspired by Dr. David Snowdon's Nun Study, which linked early-life Propositional Idea Density (PID) to later-life Alzheimer's disease, this thesis investigates two questions: whether fine-tuned Transformer-based large language models (LLMs) can detect cognitive decline from patient speech transcripts with meaningful feature attribution, and whether longitudinal PID trends are observable across large-scale internet and academic text corpora. We evaluate dementia prediction on the DementiaBank Pitt Corpus and conduct an exploratory longitudinal PID analysis across seven diverse datasets spanning up to 29 years and over 12.6 million documents. This work suggests that linguistic ability metrics, both traditional PID metrics and novel LLM-based analysis, should be applied further to longitudinal text datasets, paired with ML architectures that generalize better and with more granular analytical methods, to discover meaningful trends and to evaluate the impact of technology use and societal influence on future cognitive health.
Degree Date
Spring 5-16-2026
Document Type
Thesis
Degree Name
M.S.
Department
Computer Science
Advisor
Jennifer Dworak
Second Advisor
Zhang Jia
Third Advisor
Eric Larson
Acknowledgements
I thank my committee members, Dr. Dworak, Dr. Zhang and Dr. Larson, for their extensive advising, support and encouragement of my work.
Number of Pages
114
Format
Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License
Recommended Citation
Ma, Zerui, "Predictive Natural Language Metrics of Alzheimer's Disease and Cognitive Decline Trend Analysis" (2026). Computer Science and Engineering Theses and Dissertations. 54.
https://scholar.smu.edu/engineering_compsci_etds/54