Learning Electrostatic Complementarity of Protein-Peptide Interactions Using Inception Networks
The ability to accurately identify peptide ligands for a given major histocompatibility complex class I (MHC-I) molecule has immense value for targeted anticancer and antiviral therapeutics. However, the highly polymorphic nature of the MHC-I protein makes universal prediction of peptide ligands challenging, because experimental data are lacking for most MHC-I variants. To address this challenge, we have developed a deep convolutional neural network, HLA-Inception, capable of predicting MHC-I peptide binding motifs from biophysical properties of the MHC-I binding pocket. By approaching this problem from a three-dimensional perspective, we can fully consider the impact of sidechain arrangement and topology on peptide binding, a feature not inherently captured by the popular protein sequence-based MHC-I prediction methods. Through a combination of molecular modeling and simulation, 5,821 MHC-I alleles were modeled, providing extensive coverage of all human populations. The topology and interaction forces within the MHC-I binding pocket were accounted for by solving for the electrostatic potential near the surface of the protein. HLA-Inception was then trained on all MHC-I alleles with known peptide binding motifs and applied to the full set of MHC-I models. Predicted peptide binding motifs fell into distinct, well-defined clusters that maintained disease associations. We demonstrate that the predicted MHC-I binding motifs can be used for MHC-I ligand prediction and are more generalizable than sequence-based methods. The scores generated by HLA-Inception are strongly correlated with quantitative MHC-I binding data, indicating that predicted peptides can be ranked. Finally, we show that HLA-Inception achieves higher precision than the current state-of-the-art models when predicting naturally presented MHC-I ligands.
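To illustrate how a binding motif can be turned into a ranking of candidate ligands, the sketch below scores peptides against a per-position amino acid frequency matrix using summed log-odds. This is a generic position-weight-matrix approach, not the scoring scheme used by HLA-Inception, which the abstract does not specify. The toy three-position motif, the uniform background, the pseudocount, and the function names `make_position`, `score_peptide`, and `rank_peptides` are all assumptions made for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequency over the 20 standard residues.
BACKGROUND = 1.0 / len(AMINO_ACIDS)


def make_position(preferred, pseudocount=0.01):
    """Build one motif column from {residue: weight}, smoothed and normalized."""
    raw = {aa: preferred.get(aa, 0.0) + pseudocount for aa in AMINO_ACIDS}
    total = sum(raw.values())
    return {aa: value / total for aa, value in raw.items()}


def score_peptide(peptide, motif):
    """Return the sum of per-position log2 odds of the peptide under the motif."""
    if len(peptide) != len(motif):
        raise ValueError("peptide length must match motif length")
    return sum(math.log2(column[aa] / BACKGROUND)
               for aa, column in zip(peptide, motif))


def rank_peptides(peptides, motif):
    """Sort peptides from highest to lowest motif score."""
    return sorted(peptides, key=lambda p: score_peptide(p, motif), reverse=True)


# Hypothetical toy motif with two anchor-like positions and one neutral position.
# Real MHC-I motifs usually span 8 to 11 residues.
toy_motif = [
    make_position({"L": 0.6, "M": 0.3}),
    make_position({}),  # no residue preference at this position
    make_position({"V": 0.7, "L": 0.2}),
]

ranked = rank_peptides(["LAV", "GAG", "MAL"], toy_motif)
print(ranked)
```

In this toy setting, peptides that match the preferred residues at both anchor-like positions score above zero, while peptides matching neither score below zero, so sorting by score gives a ranked candidate list.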
2022 BIOVIA Conference @PG @AG @TL
