Virtual Winter School on Computational Chemistry (VWSCC)

Deep Learning Imputation – Using AI to Derive Valuable Insights from Drug Discovery Data

06-10 February 2023

Hulinks, Inc.
Tokyo, Japan
Optibrium Limited
Cambridge, UK
09:00 CET 07-Feb-23

Dr. Sumie Tajima and Dr. Matt Segall

(Dr. S.T.) Hulinks, Inc., Tokyo, Japan; (Dr. M.S.) Optibrium Limited, Cambridge, UK

It’s impossible to experimentally measure all of the data we want for all compounds in a drug discovery project. Furthermore, the limited data we have are noisy because of experimental variability and error.
We will describe a method that uses deep learning to impute these sparse and noisy data. Imputation is the process of filling in missing data in a dataset using the data that are present, which appears simple when only a few datapoints are missing, but is challenging when more than half, or even 99% of the data are missing.
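To make the idea of imputation concrete, here is a deliberately simple sketch. It is not the deep learning method described in this talk: it only illustrates filling in a missing endpoint from another endpoint that was measured, using a straight-line fit on the compounds where both values exist. The function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b to paired values."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def impute_from_endpoint(rows, src, dst):
    """Fill missing values in column `dst` from column `src`.

    The line is fitted only on rows where both endpoints were measured.
    Rows where `src` is also missing are left unchanged.
    """
    paired = [
        (r[src], r[dst])
        for r in rows
        if r[src] is not None and r[dst] is not None
    ]
    a, b = fit_line([p[0] for p in paired], [p[1] for p in paired])
    filled = []
    for r in rows:
        r = list(r)  # copy so the input data are not modified
        if r[dst] is None and r[src] is not None:
            r[dst] = a * r[src] + b
        filled.append(r)
    return filled


# Hypothetical toy data: one row per compound, one column per endpoint.
# None marks a value that was never measured.
data = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, None],
    [None, 5.0],
]
completed = impute_from_endpoint(data, src=0, dst=1)
```

With only two endpoints and a linear relationship this is trivial. The difficulty described above arises when there are many endpoints, the relationships are non-linear and noisy, and most entries are missing.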
Deep learning imputation learns both from structure-activity relationships (SAR) and directly from the relationships between experimental endpoints, even when the data are sparse [1]. The resulting models can proactively highlight high-quality compounds by ‘filling in’ missing data more accurately than conventional quantitative structure-activity relationship (QSAR) models. They can also identify hidden opportunities caused by missing, uncertain or inaccurate data, and help to prioritise experimental resources by focussing on measuring the most valuable data to inform decisions about compound progression.
We will describe practical applications of deep learning imputation and compare the results with those from conventional predictive modelling methods. We will demonstrate the application in the context of a drug discovery project, in which deep learning imputation achieved an average R² of 0.72, compared with 0.50 for the best QSAR method, across 18 heterogeneous endpoints, including compound activities and ADME properties [2]. We will also present an application in combination with generative chemistry methods to identify a novel, active antimalarial compound that revealed new SAR, previously unknown to the project team [3]. Finally, we will show an application to the prediction of particularly challenging sensory properties, assessed in panels of human subjects, and compare the results with other methods, including multi-target deep neural networks [4].
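For readers unfamiliar with the metric, the sketch below shows one common way an average R² across several endpoints can be computed. It assumes R² means the coefficient of determination, 1 - SS_res / SS_tot, evaluated separately for each endpoint on held-out measurements and then averaged. The abstract does not state the exact definition or evaluation protocol used in [2], and the numbers below are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_r_squared(per_endpoint):
    """Average R² over a list of (observed, predicted) pairs,
    one pair per endpoint."""
    scores = [r_squared(obs, pred) for obs, pred in per_endpoint]
    return sum(scores) / len(scores)


# Hypothetical toy values for two endpoints.
endpoints = [
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),  # perfect predictions, R² = 1
    ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),  # predicting the mean, R² = 0
]
average = mean_r_squared(endpoints)
```

Under this definition, a model that always predicts the mean of the observed values scores 0, and a perfect model scores 1, so the toy average here is 0.5.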


References

[1] - Irwin et al. Appl. AI Lett. (2021) DOI: 10.1002/ail2.31
[2] - Irwin et al. J. Chem. Inf. Model. (2020) 60(6), pp. 2848–2857
[3] - Tse et al. J. Med. Chem. (2021) 64(22), pp. 16450–16463
[4] - Mahmoud et al. J. Comput. Aided Mol. Des. (2021) 35(11), pp. 1125–1140
 
 

Financial Support

The Cooper Union for the Advancement of Science and Art is pleased to provide support for the 2024 VWSCC through a generous donation from Alan Fortier.

We thank Leibniz Institute for Catalysis (LIKAT) and CECAM for their support.