Machine Learning Prediction of Early and Clinical Stage Antibody Aggregation and Viscosity at High Concentration
Machine learning has recently been applied to predict antibody aggregation rates and viscosity. In this study, we measured accelerated aggregation rates at 45 °C and viscosity at 150 mg/mL for 20 clinical-stage antibodies. Features obtained from molecular dynamics simulations and from sequences were used to construct the machine learning models. For aggregation rates, a k-nearest neighbors regression model with two features, the spatial positive charge map on CDRH2 and the solvent-accessible surface area of hydrophobic residues on the Fv, gave the best performance, yielding a correlation coefficient of 0.89. For viscosity classification, the best model was a logistic regression model with two features, the spatial negative charge map on the heavy chain variable region and the spatial negative charge map on the light chain variable region. This classification model achieved an accuracy of 0.86 and an area under the precision-recall curve of 0.7, compared with 0.7 and 0.3, respectively, for the baseline model, a clear improvement on both metrics. The aggregation rate and viscosity models can be applied to predict the stability of early- to clinical-stage antibodies and thereby facilitate pharmaceutical development.
2022 BIOVIA Conference
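The two-feature k-nearest neighbors regression described above can be illustrated with a minimal sketch. The abstract does not give the data, the value of k, the feature scaling, or the validation scheme, so everything below is an assumption made for illustration: the feature values and aggregation rates are synthetic, k = 3 is arbitrary, and leave-one-out cross-validation is one plausible choice for a set of 20 antibodies, not necessarily the one used in the study.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict the target for x as the mean target of its k nearest neighbors."""
    # Euclidean distance in the two-feature space; features are assumed to be
    # on comparable scales (standardize real descriptors before using this).
    by_distance = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    nearest = by_distance[:k]
    return sum(yi for _, yi in nearest) / len(nearest)


def leave_one_out_predictions(X, y, k=3):
    """Predict each sample from a model built on all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(knn_predict(train_X, train_y, X[i], k))
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic stand-ins (NOT data from the study): 20 "antibodies", each with two
# hypothetical descriptors playing the role of the CDRH2 positive charge map
# and the Fv hydrophobic solvent-accessible surface area.
random.seed(0)
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(20)]
y = [0.8 * f1 + 0.5 * f2 + random.gauss(0, 0.1) for f1, f2 in X]

preds = leave_one_out_predictions(X, y, k=3)
r = pearson_r(y, preds)
print(f"Leave-one-out correlation on synthetic data: {r:.2f}")
```

The printed correlation describes only the synthetic data and is not comparable to the 0.89 reported above. With so few samples, held-out evaluation of this kind matters because a k-nearest neighbors model scored on its own training points would look better than it is.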