Curated Compendium of Drug Discovery

 



Drug discovery is a multidisciplinary process that integrates biology, chemistry, pharmacology, and computational methods to identify and develop new therapeutic agents. Each stage, from target identification through lead optimization to clinical evaluation, relies on its own data sources and tools. This curated list gives researchers, students, and professionals a structured starting point: databases, software, learning resources, and research groups, grouped by the part of the workflow they support.

Databases and Chemical Libraries

General Compound Libraries

  • DrugBank - Comprehensive data on approved and investigational drugs.
  • ZINC - Free database of commercially available compounds for virtual screening.
  • ChemSpider - Chemical structures and data.
  • DrugSpaceX - Chemical and biological spaces.
  • Mcule - Virtual screening platform with purchasable compounds.
  • Otava Chemicals - Screening compounds and building blocks.
  • Vitas-M Laboratory - Chemical libraries for HTS and lead discovery.
  • Eximed - 60k+ compounds for virtual screening.
  • OTAVA NP-like Library - Natural product-like screening compounds available for prompt delivery.
  • Ambinter - 40M+ compounds for HTS, building blocks, and a wide selection of fragments and natural products.

Natural Product Libraries

Bioactivity Databases


Target and Protein Data

Protein Structures

  • RCSB PDB - Repository for macromolecular structures.
  • PDBe - European counterpart to RCSB PDB.
  • OPM - Orientation of proteins in membranes.
  • UniProt - Protein sequences and functional annotation.
  • InterPro - Protein classification and domain prediction.
  • AlphaFold DB - Predicted structures from AlphaFold.
  • Proteopedia - Interactive protein visualizations.

Binding Site and Pocket Detection

  • PrankWeb - Pocket prediction and analysis.
  • CASTp - Pocket geometry and volume analysis.
  • CavityPlus - Pocket detection and druggability.
  • CaverWeb - Tunnel and channel detection.
  • PASSer - Allosteric site prediction.

Protein Engineering and Modeling

  • DynaMut - Predicts mutation-induced stability changes.
  • SWISS-MODEL - A fully automated protein structure homology-modeling server.
  • MODELLER - Software for homology (comparative) modeling of protein structures.
  • PDBFixer - Repairs PDB files by adding missing atoms, residues, and hydrogens for MD simulations.

Network Pharmacology

  • GeneCards - Human gene database with genomic, proteomic, and clinical data.
  • SwissTargetPrediction - Predicts targets of small molecules via similarity-based screening.
  • STITCH - Integrates chemical–protein interactions across organisms.
  • STRING - A database of known and predicted protein–protein interactions.
  • Cytoscape - Visualizes and analyzes molecular interaction networks.
  • Open Targets - Integrative platform for therapeutic target identification.
  • OmicsNet - Builds multi-omics networks for systems biology.
  • DisGeNET - Curated gene–disease associations for network analysis.
  • PharmMapper - Identifies potential targets via reverse pharmacophore mapping.
  • ChEA3 - Transcription factor enrichment tool integrating ChIP-seq, co-expression, and perturbation datasets.
  • miRDB - Predicts functional microRNA targets using machine learning and high-throughput data.
  • Venny 2.1 - A web tool for comparing lists using Venn diagrams.
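
A common step in network pharmacology is to intersect the predicted targets of a compound with the genes associated with a disease, which is the kind of list comparison Venny performs. The sketch below does this with plain Python sets. The gene lists are hypothetical placeholders, not results from any of the tools above.

```python
# Hypothetical gene lists; in practice these would be exported from a
# target-prediction tool and a disease-gene database.
compound_targets = {"EGFR", "PTGS2", "TNF", "AKT1", "ESR1"}
disease_genes = {"TNF", "IL6", "AKT1", "VEGFA", "PTGS2"}


def compare_lists(a, b):
    """Return the three regions of a two-set Venn diagram as sorted lists."""
    return {
        "only_a": sorted(a - b),
        "shared": sorted(a & b),
        "only_b": sorted(b - a),
    }


regions = compare_lists(compound_targets, disease_genes)
print(regions["shared"])  # ['AKT1', 'PTGS2', 'TNF']
```

The shared genes are the usual candidates to carry forward into an interaction network.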

Ligand Design and Optimization

Pharmacophore Modeling

  • ZINCPharmer - Pharmacophore screening.
  • Pharmit - Interactive pharmacophore modeling.
  • AnchorQuery - Pharmacophore-based search engine specialized in protein–protein interaction sites.

QSAR and Descriptor Tools

  • QSAR Toolbox - Hazard assessment and QSAR.
  • OCHEM - QSAR model building and prediction.
  • ChemMaster - QSAR and cheminformatics suite.
  • 3D-QSAR - 3D-QSAR modeling resources.
  • QSAR-Co - Robust multitarget QSAR modeling.
  • DataWarrior - Free software for chemical analysis, QSAR, and visualization.
  • KNIME - Workflow platform for cheminformatics and ML integration.
  • pyADA - Assesses the applicability domain of molecular fingerprints via similarity-based thresholds for QSAR validation.
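
The idea of a similarity-based applicability domain can be sketched in a few lines: a query compound counts as in-domain if its nearest training compound is similar enough. This is a minimal illustration, not pyADA's API. Fingerprints are represented as sets of on-bit indices, and the 0.35 threshold is an arbitrary assumption that would need tuning per dataset.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def in_domain(query_fp, training_fps, threshold=0.35):
    """True if the most similar training compound reaches the threshold."""
    best = max((tanimoto(query_fp, fp) for fp in training_fps), default=0.0)
    return best >= threshold


# Toy fingerprints (hypothetical bit indices).
training = [{1, 4, 7, 9}, {2, 4, 8, 9, 12}]
print(in_domain({1, 4, 7, 10}, training))  # True: nearest neighbour similarity is 0.6
print(in_domain({20, 21, 22}, training))   # False: no shared bits
```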

Descriptor and Featurization Tools

  • RDKit - Open-source cheminformatics toolkit with descriptor, fingerprint, and molecular manipulation support.
  • PaDEL-Descriptor - Java tool for calculating molecular descriptors and fingerprints.
  • Mordred - Python library with 1800+ molecular descriptors.
  • CDK - Java cheminformatics library with descriptor calculators.
  • alvaDesc - Commercial software for molecular descriptors and fingerprints.
  • MolFeat - Python package for molecular featurization and embeddings.
  • Dragon - Commercial molecular descriptor calculator (widely cited).

Molecular Property Prediction

  • SwissADME - Drug-likeness and PK.
  • pkCSM - ADMET property prediction.
  • DeepPK - DL-based pharmacokinetics.
  • admetSAR 2.0 - Comprehensive ADMET.
  • ADMETlab 2.0 - PK, toxicity and drug-likeness.
  • ProTox-II - Toxicity predictions.
  • PreADMET - PK property predictions.
  • FAF-Drugs - ADMET filtering.
  • Admetboost - ML-based ADMET prediction.
  • MetaPredict - Predict molecular properties from structure.
  • ADMET-AI - A web-based tool for predicting ADMET properties based on Chemprop-RDKit models trained on datasets from the TDC.
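
Several of these tools report drug-likeness filters alongside ADMET predictions. As a rough illustration of one such filter, the sketch below applies Lipinski's rule of five to properties that are assumed to be already computed elsewhere. The class and function names are hypothetical, and the example values are illustrative rather than measured data.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    mol_weight: float      # g/mol
    logp: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p):
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(p, max_violations=1):
    """The rule is usually read as allowing at most one violation."""
    return lipinski_violations(p) <= max_violations


small = Properties(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
bulky = Properties(mol_weight=720.0, logp=6.3, h_bond_donors=2, h_bond_acceptors=12)
print(passes_rule_of_five(small), passes_rule_of_five(bulky))  # True False
```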

Fragment-Based Drug Design

  • SwissSidechain - Database of non-natural amino acid sidechains for peptide and protein design.
  • BoBER - Bioisosteric replacements for lead optimization.
  • FragBuilder - Python API for building peptide-like and small molecule fragments.
  • SeeSAR - Fragment growing and linking software (free academic version).
  • Enamine Fragment Libraries - Large curated collection of diverse fragments for FBDD.

Virtual Screening and Docking

  • OpenBabel - Format conversion and ligand prep.
  • Meeko - Prepares ligands/receptors for AutoDock by assigning partial charges and atom types.
  • MolScrub - Enumerates tautomers, pH states, and conformers for docking and structure-based modeling.
  • MGLTools - Structure preparation.
  • AutoDockTools - AutoDock GUI.
  • AutoDock Vina - Popular docking software.
  • AutoDock-GPU - GPU-accelerated version of AutoDock for faster ligand-receptor docking.
  • DiffDock - Deep learning-based docking tool that predicts ligand poses directly from protein structures using diffusion models.
  • EasyDockVina2 - Vina automation.
  • Webina - Web-based Vina.
  • Smina - Vina fork with extra features.
  • Gnina - CNN-scoring docking.
  • EasyDock - Vina/Smina pipeline.
  • HADDOCK - Information-driven flexible docking of biomolecular complexes.
  • PandaDock - Python docking tool.
  • ZDOCK - Protein-protein docking.
  • ClusPro - Protein-protein docking server.
  • pyDockWEB - Electrostatics-based docking.
  • SwissDock - Web docking for beginners.
  • MzDOCK - GUI docking pipeline.
  • Uni-Mol Docking V2 - AI-assisted docking.
  • Vina on Colab - Run Vina in Google Colab.
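
For the Vina-family tools above, a docking run needs a search box. One common way to place it is around a reference ligand, padded on each side. The sketch below computes such a box from a list of atom coordinates and writes it out using Vina's configuration-file keys (receptor, ligand, center_x/y/z, size_x/y/z, exhaustiveness). The file names, coordinates, and the 5 Å padding are assumptions for illustration.

```python
def box_from_coords(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box around the coordinates, in angstroms."""
    axes = list(zip(*coords))  # one tuple each for x, y and z values
    center = tuple((max(a) + min(a)) / 2 for a in axes)
    size = tuple((max(a) - min(a)) + 2 * padding for a in axes)
    return center, size


def vina_config(receptor, ligand, center, size, exhaustiveness=8):
    """Build the text of a Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, c, s in zip("xyz", center, size):
        lines.append(f"center_{axis} = {c:.3f}")
        lines.append(f"size_{axis} = {s:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"


# Hypothetical reference-ligand atom positions.
reference_atoms = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0), (5.0, 1.0, 1.0)]
center, size = box_from_coords(reference_atoms)
config_text = vina_config("receptor.pdbqt", "ligand.pdbqt", center, size)
print(config_text)
```

The resulting text can be saved to a file and passed to Vina with its `--config` option.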

Interaction Analysis and Visualization

  • PLIP - Protein-ligand interaction profiling.
  • LigPlot+ - 2D interaction diagrams.
  • Discovery Studio Visualizer - Advanced visualization.
  • PyMOL - Python-based molecular visualization software.
  • UCSF ChimeraX - A molecular visualization program with emphasis on structural biology.
  • Avogadro - Cross-platform molecular editor and visualizer featuring an extensible plugin system.

Molecular Dynamics and Simulation

Engines

  • GROMACS - Fast, scalable MD engine optimized for biomolecular simulations and energy minimization.
  • OpenMM - Flexible MD toolkit with GPU acceleration and Python bindings.
  • LAMMPS - Classical MD simulator for materials science and soft matter.
  • NAMD - Highly parallel MD engine tailored for large biomolecular systems.
  • AMBER - Suite for biomolecular simulations and free energy calculations.
  • Desmond - GPU-accelerated MD engine for high-performance simulations.

Topology and Force Field Tools

  • CGenFF - CHARMM force field parametrization of drug-like molecules.
  • SwissParam - Rapid generation of CHARMM-compatible parameters for small organic molecules.
  • ATB - Automated topology builder and repository for classical force field parameters.
  • CHARMM-GUI - Web-based interface for building complex biomolecular systems and generating MD input files.
  • LigParGen - Automated OPLS-AA parameter generator for organic ligands.

Analysis Tools

  • MD DaVis - Interactive visualization and analysis of MD trajectories.
  • iMod - Normal Mode Analysis toolkit using internal coordinates.
  • MolAiCal - Web-based platform for binding free energy calculations using MM/PBSA and MM/GBSA methods.
  • gmx_MMPBSA - Port of AMBER MMPBSA.py for GROMACS.
  • VMD - Large biomolecular systems visualization and analysis using 3D graphics and scripting.
  • Grace - 2D plotting tool for Unix-like systems with advanced graphing, fitting, and analysis features.
  • CPPTRAJ - Fast, parallelizable trajectory analysis from AMBER.
  • MDAnalysis - Open-source Python library for analyzing MD simulations.
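
For orientation, the end-state estimate behind the MM/PBSA and MM/GBSA tools above is usually written as follows, with averages taken over trajectory snapshots. This is the standard textbook form, not the exact expression any one tool implements.

```latex
\begin{align}
\Delta G_{\text{bind}} &= \langle G_{\text{complex}} \rangle
  - \langle G_{\text{receptor}} \rangle - \langle G_{\text{ligand}} \rangle \\
G_{x} &= E_{\text{MM}} + G_{\text{solv}} - T S
\end{align}
```

Here E_MM is the molecular mechanics energy (bonded, electrostatic, and van der Waals terms), G_solv is the solvation free energy (a polar part from Poisson-Boltzmann or Generalized Born plus a nonpolar part), and the entropy term TS is often omitted when only relative rankings are needed.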

Synthesis and Retrosynthesis Planning

  • Spaya - AI-driven retrosynthesis engine with route ranking and synthetic feasibility scoring.
  • AiZynthFinder - Monte Carlo tree search-based retrosynthesis using trained neural networks.
  • ASKCOS - Synthesis route prediction with ML, developed by MIT.
  • IBM RoboRXN - Automated reaction prediction using transformer models.
  • MANIFOLD - Search engine for synthetically accessible molecules and building blocks.

Specialized Modalities

PROTACs and Ternary Complexes

  • PROTAC-db - Curated database of PROTAC molecules, targets, and linkers for degrader design.
  • PRosettaC - Rosetta-based structure modeling of PROTAC ternary complexes for targeted protein degradation.

Peptide Design

  • PepDraw - Peptide visualization with annotated physicochemical properties.
  • PepSite - Predict peptide binding sites on protein surfaces using structural data.
  • PeptiMap - Maps peptide binding hotspots on protein surfaces.

Machine Learning and AI

Core Libraries

  • scikit-learn - General-purpose ML library for classification, regression, clustering, and model evaluation.
  • PyTorch - Deep learning framework with extensive support for neural network modeling.
  • TensorFlow - End-to-end ML platform for scalable model development and deployment.
  • Keras - High-level neural network API running on top of TensorFlow, designed for fast experimentation.
  • NumPy - Core library for numerical computing with support for arrays, matrices, and linear algebra.
  • Pandas - Data manipulation and analysis toolkit built on top of NumPy.
  • Matplotlib - Comprehensive library for creating static, animated, and interactive visualizations in Python.
  • seaborn - Statistical data visualization library built on top of Matplotlib.

Chemistry-focused ML Frameworks

  • DeepChem - Open-source deep learning framework for chemistry and biology.
  • scikit-mol - Scikit-learn compatible cheminformatics extensions for molecular ML workflows.
  • Chemprop - Directed message passing neural networks for molecular property prediction.
  • ChemML - Machine learning and informatics suite for analyzing, mining, and modeling chemical and materials data.
  • Oloren ChemEngine - Unified API for molecular property prediction with uncertainty quantification, interpretability, and model tuning.
  • TorchDrug - A machine learning library for drug discovery with support for GNNs and molecular datasets.
  • DGL-LifeSci - Graph deep learning toolkit for life sciences using the Deep Graph Library.

Pretrained Models

  • MolBERT - Transformer-based molecular representation learning.
  • ChemBERTa - Pretrained BERT-like models for molecules from SMILES.
  • Uni-Mol - 3D molecular representation learning framework.
  • Boltz-2 - A foundation model that jointly predicts structure and binding affinity; its authors report accuracy approaching that of physics-based FEP methods.

AutoML and Optimization

  • Auto-sklearn - Automated machine learning for scikit-learn.
  • TPOT - Genetic programming-based AutoML for optimizing ML pipelines.
  • Optuna - Hyperparameter optimization framework for machine learning.

Molecule Standardization

  • MolVS - Molecule validation and standardization library based on RDKit.

Utility and Workflow Tools

  • ProteinsPlus - A web-based platform designed to assist life scientists in analyzing and working with protein structures.
  • OPSIN - Convert IUPAC names to chemical structures.
  • OSRA - Extract chemical structures from images.
  • ChemPlot - Chemical space visualization.
  • ChemDB - Chemoinformatics portal with compound data and tools.
  • Screening Explorer - Analyze screening datasets and hit distributions.
  • LigRMSD - Calculate RMSD between ligand poses.
  • NERDD - Curated drug discovery resources.
  • LigBuilder3 - De novo ligand design.
  • ChemMine Tools - Web-based cheminformatics toolkit for compound analysis.
  • MayaChemTools - Perl/Python scripts for cheminformatics.
  • Click2Drug - CADD software and databases directory.
  • Galaxy Europe - Galaxy instance for cheminformatics.
  • CADD Vault - CADD resources repository.
  • BioMoDes - Biomolecular structure prediction and modeling tools.
  • PlayMolecule - Interactive molecular modeling and simulation platform.
  • Ertl Molecular - Cheminformatics tools for medicinal chemists, including scaffold analysis, ring replacement, and property calculators.
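
Comparing ligand poses, as LigRMSD does, comes down to a root-mean-square deviation over matched atoms. The sketch below is a simplified version that assumes both poses list the same atoms in the same order and share the receptor's coordinate frame, so no alignment or atom matching is performed. Dedicated tools handle atom matching and symmetry, which this sketch does not.

```python
import math


def pose_rmsd(pose_a, pose_b):
    """RMSD in the input units between two poses given as lists of (x, y, z) tuples."""
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


# Hypothetical three-atom poses; the second is the first shifted by 1 along x.
docked = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
reference = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]
print(pose_rmsd(docked, reference))  # 1.0
```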

Learning Resources

Free Courses

Blogs

Instructional Notebooks

Labs and Research Groups

  • Carlsson Lab - GPCR modeling, receptor-ligand interactions, MD, docking, and AI for drug discovery. (Uppsala University, Sweden)
  • InSiliChem - Computational chemobiology and metalloenzyme design. (Universitat Autònoma de Barcelona, Spain)
  • LCBC - Molecular dynamics, free energy calculations, retrosynthesis using machine learning. (Seoul National University, Korea)
  • Angelo Raymond Rossi - High-performance computing for computational chemistry and cheminformatics. (University of Connecticut, USA)
  • Laboratory of Chemoinformatics - QSAR/QSPR, chemical similarity, and virtual screening. (Université de Strasbourg / CNRS, France)
  • Erastova Lab - Molecular modeling of soft matter and biomolecular simulations. (University of Edinburgh, UK)
  • The Ballester Group - Developing ML/AI methods for structure-based scoring and virtual screening. (Imperial College London, UK)
  • Meiler Lab - Rosetta software, protein design, and ML-based protein engineering. (Vanderbilt / Leipzig University, USA / Germany)
  • COMP3D - Develops and applies AI methods to design safe, effective pharmaceuticals and agrochemicals. (University of Vienna, Austria)
  • Bonvin Lab - Computational structural biology, HADDOCK, and integrative modeling. (Utrecht University, Netherlands)
  • Volkamer Lab - Binding site analysis and AI-powered virtual screening. (Saarland University, Germany)
  • AI Laboratory for Molecular Engineering - PROTACs, molecular glues, and ML for chemistry and life sciences. (Chalmers University, Sweden)
  • Loschmidt Labs - PEG - Protein and enzyme engineering, AI-assisted enzyme design. (Masaryk University, Czechia)
  • QSAR4U - Cheminformatics tools, QSAR modeling, CReM, and EasyDock. (Palacky University, Czechia)



