Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Computer Science

First Advisor's Name

Giri Narasimhan

First Advisor's Committee Title

Committee Chair

Second Advisor's Name

Prem Chapagain

Second Advisor's Committee Title

Committee Member

Third Advisor's Name

Kalai Mathee

Third Advisor's Committee Title

Committee Member

Fourth Advisor's Name

Trevor Cickovski

Fourth Advisor's Committee Title

Committee Member

Fifth Advisor's Name

Ananda Mondal

Fifth Advisor's Committee Title

Committee Member

Sixth Advisor's Name

Cuong Nguyen

Sixth Advisor's Committee Title

Committee Member

Keywords

Deep learning, structural bioinformatics, transformer networks, image analysis, vision AI, molecular mimicry, pandemic preparedness, protein interface features, contrastive learning, affinity prediction, protein flexibility, protein complex conformations

Date of Defense

3-30-2023

Abstract

The binding of proteins plays an essential role in most critical processes in a cell. Methods for investigating protein binding are of great practical importance for understanding biological processes and for developing modern vaccines, drugs, and therapeutics. Existing tools have been unable to assess protein binding accurately and cost-effectively, and experimental techniques such as X-ray crystallography are time-consuming, labor-intensive, and expensive. This dissertation presents state-of-the-art, scalable, and interpretable deep learning solutions to investigate protein binding for the study of molecular mimicry, protein docking, and binding affinity estimation.

In the first part, we developed a computational pipeline called EMoMiS that predicts cross-reactivity events induced by molecular mimicry, with particular emphasis on antibody-antigen binding. The pipeline uses sequence similarity search and structural alignment to identify similar proteins, then applies a deep learning model to evaluate the cross-reactive binding between an antibody (or antigen) and a mimicking protein. EMoMiS can be used as a tool for pandemic preparedness. When applied to the SARS-CoV-2 Spike protein and its antibodies, the pipeline identified many examples of molecular mimicry that can explain COVID-19-related side effects.

In the second part, a deep learning approach called PIsToN was developed to classify and rank protein interfaces. PIsToN can identify viable protein complexes from a large set of candidate complexes. It introduces a novel way to extract features that represents protein interfaces as 2D multi-channel images. PIsToN was designed as a hybrid multi-attention transformer network endowed with explainability, and it significantly outperformed current state-of-the-art methods at classifying and ranking protein docking models.

In the last part, we present a deep learning model called PI-KD that incorporates protein molecular dynamics to predict the binding strength of protein complexes. PI-KD achieved state-of-the-art performance in predicting the binding affinity of two macromolecules. Overall, this dissertation is a significant step toward the use of deep learning to investigate the complex world of protein binding in an efficient and accurate manner.

Identifier

FIDC011058

ORCID

https://orcid.org/0000-0001-9377-0263

Previously Published In

Stebliankin, V., Baral, P., Balbin, C., Nunez-Castilla, J., Sobhan, M., Cickovski, T., Mondal, A.M., Siltberg-Liberles, J., Chapagain, P., Mathee, K. and Narasimhan, G., 2022. EMoMiS: A pipeline for epitope-based molecular mimicry search in protein structures with applications to SARS-CoV-2. bioRxiv, pp.2022-02.

Stebliankin, V., Shirali, A., Baral, P., Chapagain, P. and Narasimhan, G., 2023. PIsToN: Evaluating Protein Binding Interfaces with Transformer Networks. bioRxiv, pp.2023-01.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).