TY - JOUR
T1 - Shared Consensus Machine Learning Models for Predicting Blood Stage Malaria Inhibition
AU - Verras, Andreas
AU - Waller, Chris L.
AU - Gedeck, Peter
AU - Green, Darren V.S.
AU - Kogej, Thierry
AU - Raichurkar, Anandkumar
AU - Panda, Manoranjan
AU - Shelat, Anang A.
AU - Clark, Julie
AU - Guy, R. Kiplin
AU - Papadatos, George
AU - Burrows, Jeremy
N1 - Publisher Copyright: © 2017 American Chemical Society.
PY - 2017/3/27
Y1 - 2017/3/27
AB - The development of new antimalarial therapies is essential, and lowering the barrier to entry for the screening and discovery of new lead compound classes can spur drug development at organizations that may not have large compound screening libraries or the resources to conduct high-throughput screens. Machine learning models have long been established to be more robust and to have a larger domain of applicability when built on larger training sets. Screens over multiple data sets to find compounds with potential malaria blood stage inhibitory activity have been used to generate multiple Bayesian models. Here we describe a method by which Bayesian quantitative structure-activity relationship models, which contain information on thousands to millions of proprietary compounds, can be shared between collaborators at both for-profit and not-for-profit institutions. This model-sharing paradigm allows for the development of consensus models that have increased predictive power over any single model, yet it does not reveal the identity of any compounds in the training sets.
UR - http://www.scopus.com/inward/record.url?scp=85025075857&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85025075857&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.6b00572
DO - 10.1021/acs.jcim.6b00572
M3 - Article
C2 - 28257198
AN - SCOPUS:85025075857
SN - 1549-9596
VL - 57
SP - 445
EP - 453
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 3
ER -