Subject Area
Computer Science
Abstract
In recent years, interdisciplinary applications of machine learning and artificial intelligence (ML/AI) have transformed fields ranging from weather forecasting and drug development to medical diagnostics, energy, and sustainability. Computational chemistry uses computational tools to model, predict, analyze, and explain chemical phenomena, while quantum chemistry specifically uses techniques based on quantum mechanics (as opposed to classical mechanics or empirical models). Over the past decade, both fields have seen growing adoption of ML techniques, which significantly accelerate calculations and provide valuable insights into vast datasets, often surpassing traditional methods.
This dissertation explores the integration of machine learning to enhance the efficiency and accuracy of computational chemistry methods. The primary focus is minimizing the errors introduced when the complex tensor hyper-contraction (THC) approximation technique is applied to third-order Møller-Plesset perturbation theory (MP3).
We began by comparing different levels of THC approximation, where the value of $\delta$ indicates the level of approximation, across datasets and amounts of correction. We then sought to reduce the errors in both molecular and reaction energies.
The research then systematically applies machine learning methods, ranging from linear models to more complex neural networks. Part of this research has been published in the Journal of Computational Chemistry (JCC).
Multiple Linear Regression (MLR): Although a simpler technique, MLR yielded good results, with up to 84\% improvement in the calculated energy values over the MP3b baseline.
Kernel Ridge Regression (KRR): This model yielded even better results than MLR, with up to 89\% improvement in the calculated energy values. This strongly suggests non-linearity in the datasets.
Multi-Layer Perceptron Artificial Neural Network Model (MLP-ANN): The research evaluated the MLP architecture as a viable candidate for the THC-MP3 dataset. While the model offered high learning capacity, it was more sensitive to training procedures and data splits than KRR.
Tabular Prior-data Fitted Network (TabPFN): Because the THC-MP3 data is tabular, a pre-trained TabPFN model was also applied to evaluate the performance of transformer-based models. Its results varied with the $\delta$ of the THC-MP3 dataset.
Hybrid Stacking Technique: A major contribution of the research was the development of a two-stage stacking framework that appended KRR predictions to the input features of a secondary MLP or TabPFN model. This successfully combined regression stability with neural network capacity to reach improvements of up to 90\%.
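The two-stage stacking idea described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the dissertation's implementation: the data is synthetic (not the THC-MP3 dataset), the hyperparameters are arbitrary, scikit-learn is assumed to be installed, and the use of out-of-fold KRR predictions for the training rows is an added assumption that the abstract does not confirm.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data (NOT the THC-MP3 dataset): 4 made-up features and a
# nonlinear target playing the role of the energy error to be corrected.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Stage 1: kernel ridge regression. Hyperparameters are illustrative guesses.
krr = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.1)

# Assumption: out-of-fold predictions are used for the training rows so that
# stage 2 never sees a KRR prediction fitted on that row's own target.
krr_oof = cross_val_predict(krr, X_train, y_train, cv=5)
krr.fit(X_train, y_train)
krr_test = krr.predict(X_test)

# Append the KRR prediction as one extra input feature.
X_train_aug = np.column_stack([X_train, krr_oof])
X_test_aug = np.column_stack([X_test, krr_test])

# Stage 2: a small MLP trained on the augmented features.
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
mlp.fit(X_train_aug, y_train)
stacked_test = mlp.predict(X_test_aug)

mae_krr = np.mean(np.abs(krr_test - y_test))
mae_stacked = np.mean(np.abs(stacked_test - y_test))
print(f"KRR MAE: {mae_krr:.4f}  stacked MAE: {mae_stacked:.4f}")
```

On synthetic data like this, the stacked model is not guaranteed to beat KRR alone. The sketch only shows how the first-stage prediction enters the second stage as an additional column.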
The results also indicate that while molecular energy corrections were highly successful, improving reaction energies remained more challenging due to the limited ability of statistical models to exploit the physical error cancellation inherent in reactions.
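A small worked example may help show why smaller molecular errors do not automatically give smaller reaction errors. The numbers below are invented purely for illustration and are not results from the dissertation; the reaction A + B -> C and the helper names `reaction_error` and `mean_abs` are likewise hypothetical.

```python
# Hypothetical per-molecule errors (predicted minus reference energy),
# in arbitrary energy units. These values are made up for illustration.
baseline_err = {"A": 1.0, "B": 2.0, "C": 2.9}     # large but systematic
corrected_err = {"A": 0.2, "B": -0.3, "C": 0.3}   # small but of mixed sign


def reaction_error(err, reactants, products):
    """Error in a reaction energy: product errors minus reactant errors."""
    return sum(err[p] for p in products) - sum(err[r] for r in reactants)


def mean_abs(err):
    """Mean absolute per-molecule error."""
    return sum(abs(v) for v in err.values()) / len(err)


reactants, products = ["A", "B"], ["C"]

for label, err in (("baseline", baseline_err), ("corrected", corrected_err)):
    mol = mean_abs(err)
    rxn = abs(reaction_error(err, reactants, products))
    print(f"{label}: mean |molecular error| = {mol:.2f}, |reaction error| = {rxn:.2f}")
```

In this toy case the mean absolute molecular error falls from about 1.97 to about 0.27, yet the reaction error grows from about 0.1 to about 0.4. The baseline errors share a sign and largely cancel between reactants and products, while the corrected errors are smaller but no longer correlated, so less of them cancels.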
By focusing on improving speed and accuracy, this dissertation contributes to making quantum chemistry processes more efficient and cost-effective. This work can deepen our understanding of molecular ground states and reaction dynamics, paving the way for advancements in drug design, vaccine development, polymer processing techniques, climate modeling, artificial photosynthesis, and sustainable energy solutions.
Degree Date
Spring 5-16-2026
Document Type
Dissertation
Degree Name
Ph.D.
Department
Lyle School of Engineering
Advisor
Dr. Devin Matthews
Second Advisor
Dr. Eric Larson
Number of Pages
162
Format
Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License
Recommended Citation
Satyarth, Ishna, "MACHINE LEARNING AND FORMAL METHODS IN QUANTUM CHEMISTRY: THEORY AND APPLICATION" (2026). Computer Science and Engineering Theses and Dissertations. 57.
https://scholar.smu.edu/engineering_compsci_etds/57