Machine Learning in Drug Discovery

Abstract

Advancements in machine learning technology, the exponential growth of drug-related data, and the widespread availability of user-friendly machine learning frameworks in popular programming languages ¹ ² are making machine learning methodologies increasingly prevalent throughout all stages of the drug discovery and development process. ³

Data quality and representation significantly impact the performance of machine learning-based predictive models, as both are crucial for effective model training. As a result, there has been a surge of research interest in molecular representation. This research encompasses pre-computed or fixed molecular representations, such as molecular graph representations, linear notations (e.g., SMILES), and molecular fingerprints, ⁴ ⁵ as well as learned molecular representations. ⁶
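To make the distinction between these fixed representations concrete, here is a minimal sketch that writes ethanol as a SMILES string, as a molecular graph, and as a toy key-based fingerprint. It uses only plain Python; the helper names (`adjacency_matrix`, `toy_fingerprint`) and the three fingerprint keys are assumptions made for illustration and do not correspond to MACCS keys or to any cheminformatics library.

```python
# Ethanol in three fixed representations. All names here are illustrative.

smiles = "CCO"  # linear notation (SMILES)

# Molecular graph: heavy atoms as nodes, bonds as undirected edges.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]


def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def toy_fingerprint(atoms, bonds):
    """One bit per hand-picked structural question (invented keys, not MACCS)."""
    has_oxygen = int("O" in atoms)
    has_nitrogen = int("N" in atoms)
    has_c_o_bond = int(
        any({atoms[i], atoms[j]} == {"C", "O"} for i, j in bonds)
    )
    return [has_oxygen, has_nitrogen, has_c_o_bond]


adjacency = adjacency_matrix(len(atoms), bonds)
fingerprint = toy_fingerprint(atoms, bonds)

print(adjacency)    # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(fingerprint)  # [1, 0, 1]
```

The graph keeps the full connectivity of the molecule, while the fingerprint compresses it into a fixed-length bit vector that is easy to feed to a model but cannot be decoded back into the structure.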

This literature review provides an overview of the various molecular representation approaches used in machine learning-based drug development and explores their applications in conjunction with machine learning models for predicting molecular properties and reactions.

Contents

1. Molecular Representations/Descriptors in Machine Learning-Based Drug Development
1.1 Molecular Graph Theory
    1.1.1 Introduction to the Molecular Graph Representation
    1.1.2 Mathematical Definition of a Graph
    1.1.3 Graph Traversal Algorithms
    1.1.4 Molecular Graph Representations
    1.1.5 Advantages of Molecular Graph Representations
    1.1.6 Disadvantages of Molecular Graph Representations
    1.1.7 Molecular Graphs in AI-Driven Small Molecule Drug Discovery
    1.1.8 References
1.2 Molecular Descriptors
    1.2.1 Introduction to Molecular Descriptors
    1.2.2 Molecular Fingerprints
    1.2.3 Key-Based Molecular Fingerprints - MACCS Keys
    1.2.4 Hash-Based Molecular Fingerprints - Daylight Fingerprint & ECFPs
    1.2.5 Advantages & Applications of Molecular Fingerprints
    1.2.6 Molecular Fingerprints in Machine Learning
    1.2.7 References
2. Machine Learning-Based Drug Development
2.1 Introduction to Machine Learning
    2.1.1 How does Machine Learning Work?
    2.1.2 Machine Learning Methods
    2.1.3 Machine Learning Notation
    2.1.4 References
2.2 Supervised Learning
    2.2.1 Classification Algorithms in Supervised Learning
    2.2.2 Regression Algorithms in Supervised Learning

References

[1] Abadi, M. et al. (2015) ‘TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems’, https://www.tensorflow.org/, Software available from tensorflow.org.

[2] Paszke, A. et al. (2017) ‘Automatic differentiation in PyTorch’, NIPS 2017 Autodiff Workshop.

[3] Kim, J. et al. (2021) ‘Comprehensive survey of recent drug discovery using Deep Learning’, International Journal of Molecular Sciences, 22(18), p. 9983.

[4] Rifaioglu, A.S. et al. (2020) ‘DEEPScreen: High performance drug–target interaction prediction with convolutional neural networks using 2-D structural compound representations’, Chemical Science, 11(9), pp. 2531–2557.

[5] David, L. et al. (2020) ‘Molecular representations in AI-Driven Drug Discovery: A review and practical guide’, Journal of Cheminformatics, 12(1).

[6] Yang, K. et al. (2019) ‘Analyzing learned molecular representations for property prediction’, Journal of Chemical Information and Modeling, 59(8), pp. 3370–3388.
