Subject Area



Allostery refers to changes in protein function caused by external perturbations and is ubiquitous in living cells. Investigating allosteric mechanisms is essential for drug development and protein engineering. Recently, the population shift model has been proposed to explain the allosteric mechanism. This model emphasizes the importance of conformational distributions in allostery. Over the past several years, I have worked on developing computational methods to explore and quantify protein allosteric mechanisms. Drawing on protein simulation results, several machine learning related methods have been applied to protein systems to investigate the conformational changes underlying different allosteric states. These methods include Rigid Residue Scan (RRS), machine learning based classification, a relative entropy based dynamic allosteric network, Markov State Models (MSMs), and Directed Kinetic Transition Networks (DKTNs). These models investigate allostery from three perspectives: thermodynamics, kinetics, and structure.

Rigid Residue Scan (RRS) was applied as an initial attempt to identify important residues. The RRS method emphasizes the importance of configurational entropy. Key allosteric residues are identified by comparing unperturbed and perturbed simulations. After RRS, several machine learning models, including artificial neural networks, random forests, and support vector machines, were applied to classify various conformational ensembles. Among them, neural network models perform best in distinguishing different states. The random forest models align best with the population shift hypothesis, demonstrating that allosteric states can shift from one distribution to another upon an external perturbation. Random forest models also provide an importance score for each feature. The most important features agree well with experimental studies.
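The population shift idea can be illustrated with a minimal sketch. It assumes each simulation frame has already been assigned a conformational state label, for example by a trained classifier. The state names, frame counts, and the helper state_populations are hypothetical and are not taken from the dissertation.

```python
from collections import Counter

def state_populations(labels, states):
    """Return the fraction of frames assigned to each state, in the order of `states`."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[s] / total for s in states]

# Hypothetical per-frame state assignments from two simulations.
states = ["A", "B", "C"]
unperturbed = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
perturbed = ["A"] * 2 + ["B"] * 3 + ["C"] * 5

p_unperturbed = state_populations(unperturbed, states)
p_perturbed = state_populations(perturbed, states)

# Population shift per state: a positive value means the perturbation
# enriches that state, a negative value means it depletes it.
shift = [q - p for p, q in zip(p_unperturbed, p_perturbed)]
```

In this toy case the perturbation depletes state A and enriches state C while the total population stays normalized, which is the kind of redistribution the population shift hypothesis describes.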

To incorporate both the structural and thermodynamic views of allostery, an allosteric network model was developed that represents the protein as a network in which each residue is a vertex. The significance of the allosteric effect upon a perturbation is used as the weight of each edge connecting residues (vertices). Shortest path algorithms and community detection algorithms were applied to identify possible communication pathways and to partition the protein into several allosteric communities. Based on these results, possible allosteric mechanisms can be unraveled.
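The pathway search on such a residue network can be sketched with Dijkstra's algorithm. This is an illustration under stated assumptions, not the dissertation's implementation: the edge weights are assumed to be normalized to the interval (0, 1], the conversion of a weight w to a path length -log(w) is an assumed choice so that strongly coupled residues count as close, and the residue indices and weights are hypothetical. Community detection is not shown.

```python
import heapq
import math

def shortest_path(edges, source, target):
    """Dijkstra's algorithm on an undirected weighted residue graph.

    `edges` maps (u, v) residue pairs to a coupling weight in (0, 1].
    Each weight w is converted to a non-negative length -log(w)
    (an assumed conversion, not taken from the source).
    Returns the list of residues on the shortest path, or None.
    """
    graph = {}
    for (u, v), w in edges.items():
        d = -math.log(w)
        graph.setdefault(u, []).append((v, d))
        graph.setdefault(v, []).append((u, d))

    best = {source: 0.0}   # shortest known distance to each residue
    prev = {}              # predecessor of each residue on that path
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nbr, d in graph.get(node, []):
            nd = dist + d
            if nd < best.get(nbr, math.inf):
                best[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))

    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical residue pairs and coupling weights.
edges = {(1, 2): 0.9, (2, 3): 0.8, (1, 3): 0.3, (3, 4): 0.7, (2, 4): 0.1}
path = shortest_path(edges, 1, 4)
```

With these weights the route through residues 2 and 3 is preferred over the direct but weakly coupled alternatives, because the product of the weights along it is the largest.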

Furthermore, the kinetics of allostery was investigated through Markov State Models (MSMs). MSMs represent protein structures as individual Markov states. Following the kinetic theory of protein folding, each state represents a different local minimum on the free energy surface. By estimating the transition probability matrix, the steady-state (equilibrium) distribution over the Markov states can be quantified. The Directed Kinetic Transition Network (DKTN) model is a further improvement on MSMs. Its underlying assumption is rate theory rather than the Markovian property. An analogy between DKTN and the Continuous-Time Markov Chain (CTMC) demonstrates that the MSM is a special case of DKTN and that the DKTN model can evolve to the true equilibrium.
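The step from a transition probability matrix to a steady-state distribution can be sketched as follows. The example assumes a simple row-normalized count estimator and power iteration, which are common textbook choices rather than the estimator used in the dissertation. The three-state count matrix is hypothetical, and the DKTN model is not shown.

```python
def transition_matrix(counts):
    """Row-normalize observed transition counts into probabilities."""
    return [[c / sum(row) for c in row] for row in counts]

def stationary_distribution(T, tol=1e-12, max_iter=10000):
    """Power iteration for the distribution pi that satisfies pi = pi T."""
    n = len(T)
    pi = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(max_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical counts of transitions between three states at a fixed lag time.
counts = [[90, 10, 0],
          [10, 80, 10],
          [0, 20, 80]]

T = transition_matrix(counts)
pi = stationary_distribution(T)
```

For this matrix the iteration converges to populations close to 0.4, 0.4, and 0.2, which can be checked by hand from the balance of flux between neighboring states.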

Overall, allosteric mechanisms can be thoroughly studied with the above models from three perspectives: thermodynamics, kinetics, and structure. These models have been applied to various proteins. The good agreement between our results and experimental results supports the efficiency and validity of our models. A possible allosteric mechanism for the Light-Oxygen-Voltage (LOV) domain protein VIVID has been proposed. In summary, these models enhance the understanding of the allosteric mechanism in terms of conformational distributions and could serve as a standard toolbox for studying allosteric mechanisms.

Degree Date

Spring 5-2019

Document Type


Degree Name





Peng Tao

Number of Pages




Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License