Authors

ORCID

0009-0005-9846-6056

Subject Area

Computer Science

Abstract

Inspired by Dr. David Snowdon's Nun Study, which linked early-life Propositional Idea Density (PID) to later-life Alzheimer's disease, this thesis investigates two questions: whether fine-tuned Transformer-based large language models (LLMs) can detect cognitive decline from patient speech transcripts with meaningful feature attribution, and whether longitudinal PID trends are observable across large-scale internet and academic text corpora. We evaluate dementia prediction on the DementiaBank Pitt Corpus and conduct an exploratory longitudinal PID analysis across seven diverse datasets spanning up to 29 years and over 12.6 million documents. This work suggests that linguistic ability metrics, both traditional PID metrics and novel LLM-based analyses, should be applied further to longitudinal text datasets, paired with better-generalized ML architectures and more granular analytical methods, to discover meaningful trends and evaluate the impact of technology use and societal influence on future cognitive health.

Degree Date

Spring 5-16-2026

Document Type

Thesis

Degree Name

M.S.

Department

Computer Science

Advisor

Jennifer Dworak

Second Advisor

Zhang Jia

Third Advisor

Eric Larson

Acknowledgements

I thank my committee members, Dr. Dworak, Dr. Zhang, and Dr. Larson, for their extensive advising, support, and encouragement of my work.

Number of Pages

114

Format

.pdf

Creative Commons License

Creative Commons Attribution-Noncommercial 4.0 License
