Molecular Docking

Machine learning (ML) and artificial intelligence applications have seen significant gains in performance and attention in both academic research and industry. Virtual screening is a computational technique used in drug discovery to search libraries of small molecules and identify the structures most likely to bind to a drug target, typically a protein receptor or enzyme. Molecular docking of small molecules into protein binding sites is among the most widely used computational techniques in modern structure-based drug discovery. CD ComputaBio applies state-of-the-art ML techniques to computational docking. Computational docking is the process of predicting the best pose (orientation + conformation) of a small molecule (drug candidate) when bound to a larger target receptor molecule (protein) to form a stable complex.

Step 1. Docking
Step 2. Scoring
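
The two steps above can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration: the function names (rotate_z, make_pose, toy_score, dock), the random rigid-body sampling, and the toy distance-based score are hypothetical and do not represent CD ComputaBio's method or any real docking program. Real docking software also samples ligand conformations and uses far richer scoring functions.

```python
import math
import random


def rotate_z(point, angle):
    """Rotate a 3-D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def make_pose(ligand, angle, shift):
    """Rigid-body transform: rotate every atom about z, then translate."""
    return [
        tuple(a + b for a, b in zip(rotate_z(p, angle), shift))
        for p in ligand
    ]


def toy_score(pose, receptor, r0=4.0):
    """Toy score (lower is better): squared deviation of every
    ligand-receptor atom distance from an assumed ideal contact
    distance r0. This is NOT a physical force field."""
    total = 0.0
    for lp in pose:
        for rp in receptor:
            total += (math.dist(lp, rp) - r0) ** 2
    return total


def dock(ligand, receptor, n_poses=500, seed=0):
    """Sample random rigid-body poses and keep the best-scoring one."""
    rng = random.Random(seed)
    best_pose, best_score = None, float("inf")
    for _ in range(n_poses):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        shift = tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
        pose = make_pose(ligand, angle, shift)  # Step 1: docking (pose generation)
        score = toy_score(pose, receptor)       # Step 2: scoring
        if score < best_score:
            best_pose, best_score = pose, score
    return best_pose, best_score


# Hypothetical two-atom ligand and two-atom receptor site (coordinates in angstroms).
receptor = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose, score = dock(ligand, receptor, n_poses=200, seed=1)
print(pose, score)
```

Separating pose generation from scoring in this way mirrors the Step 1 / Step 2 split: either half can be swapped out (for example, replacing toy_score with a learned scoring function) without touching the other.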


Computer-Aided Drug Design Technologies (Physics-based)
Artificial Intelligence (Experience-based)


  • Directory of Useful Decoys – Enhanced (DUD-E)
    A benchmark dataset for structure-based virtual screening methods. It covers 102 targets; for each active, 50 decoys were selected from ZINC to have similar physicochemical properties but dissimilar 2-D topology. The targets span classical protein families, including kinases, proteases, nuclear receptors, GPCRs, and others.
  • Maximum Unbiased Validation (MUV)
    A dataset designed for unbiased assessment of virtual screening methods. Each target has ~30 actives and 15,000 decoys, with the decoys selected from a primary screen (PubChem). The targets span classical protein families, including kinases, proteases, nuclear receptors, PPIs, and others.
  • Datasets of Proteins with Their Possible Ligands (PDB, PDBbind, BindingDB, DUD, etc.)
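
Benchmarks built from actives and decoys, such as those listed above, are commonly summarized with an enrichment factor: how much more often actives appear in the top-ranked fraction of a screened library than random selection would give. The sketch below is a minimal illustration; the function name enrichment_factor, the convention that a higher score means a more likely binder, and the example scores are assumptions, not part of any dataset's official tooling.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of a ranked library.

    scores: predicted scores, higher = more likely to bind (assumed convention)
    labels: 1 for an active, 0 for a decoy
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Number of compounds in the top fraction (at least one).
    n_top = max(1, int(round(n_total * fraction)))
    # Rank compounds from best to worst predicted score.
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])
    # Hit rate in the top fraction divided by the hit rate of the whole library.
    return (hits / n_top) / (n_actives / n_total)


# Made-up example: 2 actives and 100 decoys (a 50:1 decoy-to-active ratio).
scores = [0.9, 0.8] + [0.1] * 100
labels = [1, 1] + [0] * 100
print(enrichment_factor(scores, labels, fraction=0.02))
```

A value near 1 means the ranking is no better than random selection; larger values mean actives are concentrated at the top of the list.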

Docking and Scoring Software Programs

Deep learning systems, such as convolutional neural network (CNN) implementations, have previously been used to build functions that predict the free energy of molecular binding (a score) from the structural information generated by docking software. Our molecular dynamics (MD)-based protocols are capable of estimating the free energy of binding between the ligand and the target protein.

  • AutoDock, eHiTS, iDock, etc.
  • Smina (CADD), Glide SP, AtomNet, etc.

Model Input


  • Atom type
  • Atomic partial charges
  • Amino acid types
  • Distances from neighbors to the reference atom
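
As a rough illustration of how the four inputs listed above might be combined into a numeric vector, the sketch below encodes the nearest neighbors of a reference atom. The vocabularies (ATOM_TYPES, RESIDUE_TYPES), the dictionary layout of an atom, the "LIG" label for ligand atoms, and the function names are all hypothetical; the actual model input encoding is not specified in this text.

```python
import math

# Illustrative vocabularies (assumptions, not the real model's categories).
ATOM_TYPES = ["C", "N", "O", "S", "other"]
RESIDUE_TYPES = ["ALA", "ASP", "LYS", "SER", "LIG", "other"]


def one_hot(value, vocabulary):
    """One-hot encode `value`; unknown values fall into the 'other' slot."""
    idx = vocabulary.index(value) if value in vocabulary else vocabulary.index("other")
    return [1.0 if i == idx else 0.0 for i in range(len(vocabulary))]


def featurize(reference, atoms, k=2):
    """Build a flat feature vector from the k atoms nearest to `reference`.

    Each neighbor contributes: atom type (one-hot), partial charge,
    amino acid type (one-hot), and distance to the reference atom.
    """
    neighbors = sorted(
        (a for a in atoms if a is not reference),
        key=lambda a: math.dist(a["xyz"], reference["xyz"]),
    )[:k]
    features = []
    for atom in neighbors:
        features.extend(one_hot(atom["element"], ATOM_TYPES))
        features.append(atom["charge"])
        features.extend(one_hot(atom["residue"], RESIDUE_TYPES))
        features.append(math.dist(atom["xyz"], reference["xyz"]))
    return features


# Made-up atoms: one ligand carbon and three surrounding atoms.
ref = {"element": "C", "charge": 0.1, "residue": "LIG", "xyz": (0.0, 0.0, 0.0)}
a1 = {"element": "O", "charge": -0.5, "residue": "ASP", "xyz": (3.0, 0.0, 0.0)}
a2 = {"element": "N", "charge": 0.3, "residue": "LYS", "xyz": (0.0, 4.0, 0.0)}
a3 = {"element": "Zn", "charge": 2.0, "residue": "HOH", "xyz": (0.0, 0.0, 10.0)}
vector = featurize(ref, [ref, a1, a2, a3], k=2)
print(len(vector), vector)
```

Sorting neighbors by distance gives the vector a fixed, reproducible layout, which is one simple way to hand a variable-sized atomic environment to a fixed-size model input.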

Model Structure


* A sigmoid function is a type of activation function, more specifically a squashing function. Squashing functions limit the output to a range between 0 and 1, which makes them useful for predicting probabilities. Sigmoid functions are frequently used in machine learning, particularly in artificial neural networks, to map the raw output of a node or "neuron" to a value that can be interpreted as a probability.
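
For reference, the most common sigmoid is the logistic function, sigma(x) = 1 / (1 + e^(-x)). The sketch below is a minimal, numerically stable version in plain Python; the function name and the example inputs are chosen here for illustration and are not taken from the model described on this page.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, so this form cannot overflow.
    z = math.exp(x)
    return z / (1.0 + z)


# A raw node output (any real number) is squashed into the range 0 to 1.
for raw in (-4.0, 0.0, 4.0):
    print(raw, sigmoid(raw))
```

An input of 0 maps to exactly 0.5, large positive inputs approach 1, and large negative inputs approach 0, which is why the output can be read as a binding probability.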


  • Structure-based virtual screening is an important tool for compound prioritization. Experience-based scoring functions work well in this field and can replace the traditional physics-based approach (CADD).
  • The deep learning-based method developed by CD ComputaBio's experts provides a promising alternative way to prioritize compounds during screening.
  • The universal docking model (trained on mixed data such as DUD-E) works well in cross-validation.
  • Target-specific docking models perform much better than the universal model in independent validation.
  • CD ComputaBio provides a new solution to the generalization issue of deep learning-based docking systems for virtual screening.



CD ComputaBio

Copyright © 2024 CD ComputaBio Inc. All Rights Reserved.