Curated Compendium of Drug Discovery
Drug discovery is a multidisciplinary process that integrates biology, chemistry, pharmacology, and computational technologies to identify and develop new therapeutic agents. Each stage, from target identification through lead optimization to clinical evaluation, requires precision and collaboration. This curated list gives researchers, students, and professionals a structured starting point. It covers databases and compound libraries, protein structure and target-prediction tools, QSAR and ADMET predictors, docking and molecular dynamics software, retrosynthesis and machine learning libraries, and finally courses, blogs, notebooks, and research groups.
Databases and Chemical Libraries
- DrugBank - Comprehensive data on approved and investigational drugs.
- ZINC - Free compounds for screening.
- ChemSpider - Chemical structures and data.
- DrugSpaceX - Chemical and biological spaces.
- Mcule - Virtual screening platform with purchasable compounds.
- Otava Chemicals - Screening compounds and building blocks.
- Vitas-M Laboratory - Chemical libraries for HTS and lead discovery.
- Eximed - 60k+ compounds for virtual screening.
- OTAVA NP-like Library - Screening compounds for prompt delivery.
- Ambinter - 40M+ compounds for HTS, building blocks, and a wide selection of fragments and natural products.
- ZINC15 Natural Products - 200k+ natural compounds.
- COCONUT - 400k+ natural products.
- LOTUS - Annotated molecular data with sourcing organisms.
- NPASS - 94k activity-species links.
- ANPDB - 27k+ African medicinal plant compounds.
- SANCDB - Natural compounds from the plant and marine life in and around South Africa.
- CMNPD - 31k+ marine natural products.
- SistematX - 8k+ secondary metabolites.
- CoumarinDB - A manually curated database on coumarins from plants.
- ArtemisiaDB - Artemisia genus compounds.
- BIAdb - A database for benzylisoquinoline alkaloids.
- IMPPAT - Phytochemicals from Indian medicinal plants.
- NP-MRD - 280k+ NMR-based NP studies.
- IBS Natural Compounds - 60k+ compounds.
- PhytoHub - Dietary phytochemicals and metabolites.
- Dr. Duke's Phytochemical DB - Plant compounds and uses.
- CyanoMetDB - Over 3,000 cyanobacterial metabolites.
- Seaweed Metabolite DB - Marine algae compounds.
- FooDB - A comprehensive resource on food constituents.
- ChEMBL - Bioactivity and ADMET data.
- SureChEMBL - Patent chemistry search.
- BindingDB - Binding affinities for biomolecules.
- PubChem - Structures, properties, and bioassays.
- PDBbind - Protein-ligand affinity data.
- BRENDA - Enzyme properties and functions.
- ExCAPE-DB - A large-scale chemogenomics database.
- Therapeutics Data Commons - AI/ML-ready datasets and learning tasks for therapeutics.
- Therapeutic Target Database (TTD) - Drug targets with linked diseases and compounds.
- RCSB PDB - Repository for macromolecular structures.
- PDBe - European counterpart to RCSB PDB.
- OPM - Orientation of proteins in membranes.
- UniProt - Protein sequences, structures, and functions.
- InterPro - Protein classification and domain prediction.
- AlphaFold DB - Predicted structures from AlphaFold.
- Proteopedia - Interactive protein visualizations.
- PrankWeb - Pocket prediction and analysis.
- CASTp - Pocket geometry and volume analysis.
- CavityPlus - Pocket detection and druggability.
- CaverWeb - Tunnel and channel detection.
- PASSer - Allosteric site prediction.
- DynaMut - Predicts mutation-induced stability changes.
- SWISS-MODEL - A fully automated protein structure homology-modeling server.
- MODELLER - Software for homology (comparative) modeling of protein structures.
- PDBFixer - Repairs PDB files by adding missing atoms, residues, and hydrogens for MD simulations.
- GeneCards - Human gene database with genomic, proteomic, and clinical data.
- SwissTargetPrediction - Predicts targets of small molecules via similarity-based screening.
- STITCH - Integrates chemical–protein interactions across organisms.
- STRING - A database of known and predicted protein–protein interactions.
- Cytoscape - Visualizes and analyzes molecular interaction networks.
- Open Targets - Integrative platform for therapeutic target identification.
- OmicsNet - Builds multi-omics networks for systems biology.
- DisGeNET - Curated gene–disease associations for network analysis.
- PharmMapper - Identifies potential targets via reverse pharmacophore mapping.
- ChEA3 - Transcription factor enrichment tool integrating ChIP-seq, co-expression, and perturbation datasets.
- miRDB - Predicts functional microRNA targets using machine learning and high-throughput data.
- Venny 2.1 - A web tool for comparing lists using Venn diagrams.
- ZINCPharmer - Pharmacophore screening.
- Pharmit - Interactive pharmacophore modeling.
- AnchorQuery - Pharmacophore-based search engine specialized in protein–protein interaction sites.
- QSAR Toolbox - Hazard assessment and QSAR.
- OCHEM - QSAR model building and prediction.
- ChemMaster - QSAR and cheminformatics suite.
- 3D-QSAR - 3D-QSAR modeling resources.
- QSAR-Co - Robust multitarget QSAR modeling.
- DataWarrior - Free software for chemical analysis, QSAR, and visualization.
- KNIME - Workflow platform for cheminformatics and ML integration.
- pyADA - Assesses the applicability domain of molecular fingerprints via similarity-based thresholds for QSAR validation.
- RDKit - Open-source cheminformatics toolkit with descriptor, fingerprint, and molecular manipulation support.
- PaDEL-Descriptor - Java tool for calculating molecular descriptors and fingerprints.
- Mordred - Python library with 1800+ molecular descriptors.
- CDK - Java cheminformatics library with descriptor calculators.
- alvaDesc - Commercial software for molecular descriptors and fingerprints.
- MolFeat - Python package for molecular featurization and embeddings.
- Dragon - Commercial molecular descriptor calculator (widely cited).
- SwissADME - Drug-likeness and PK.
- pkCSM - ADMET property prediction.
- DeepPK - DL-based pharmacokinetics.
- admetSAR 2.0 - Comprehensive ADMET.
- ADMETlab 2.0 - PK, toxicity and drug-likeness.
- ProTox-II - Toxicity predictions.
- PreADMET - PK property predictions.
- FAF-Drugs - ADMET filtering.
- Admetboost - ML-based ADMET prediction.
- MetaPredict - Predict molecular properties from structure.
- ADMET-AI - A web-based tool for predicting ADMET properties based on Chemprop-RDKit models trained on datasets from the TDC.
- SwissSidechain - Fragment and linker library for small molecule design.
- BoBER - Bioisosteric replacements for lead optimization.
- FragBuilder - Python API for building peptide-like and small molecule fragments.
- SeeSAR - Fragment growing and linking software (free academic version).
- Enamine Fragment Libraries - Large curated collection of diverse fragments for FBDD.
- OpenBabel - Format conversion and ligand prep.
- Meeko - Prepares ligands/receptors for AutoDock by assigning partial charges and atom types.
- MolScrub - Enumerates tautomers, pH states, and conformers for docking and structure-based modeling.
- MGLTools - Structure preparation.
- AutoDockTools - AutoDock GUI.
- AutoDock Vina - Popular docking software.
- AutoDock-GPU - GPU-accelerated version of AutoDock for faster ligand-receptor docking.
- DiffDock - Deep learning-based docking tool that predicts ligand poses directly from protein structures using diffusion models.
- EasyDockVina2 - Vina automation.
- Webina - Web-based Vina.
- Smina - Vina fork with extra features.
- Gnina - CNN-scoring docking.
- EasyDock - Vina/Smina pipeline.
- HADDOCK - Flexible docking suite.
- PandaDock - Python docking tool.
- ZDOCK - Protein-protein docking.
- ClusPro - Protein-protein docking server.
- pyDockWEB - Electrostatics-based docking.
- SwissDock - Web docking for beginners.
- MzDOCK - GUI docking pipeline.
- Uni-Mol Docking V2 - AI-assisted docking.
- Vina on Colab - Run Vina in Google Colab.
- PLIP - Protein-ligand interaction profiling.
- LigPlot+ - 2D interaction diagrams.
- Discovery Studio Visualizer - Advanced visualization.
- PyMOL - Python-based molecular visualization software.
- UCSF ChimeraX - A molecular visualization program with emphasis on structural biology.
- Avogadro - Cross-platform molecular editor and visualizer featuring an extensible plugin system.
- GROMACS - Fast, scalable MD engine optimized for biomolecular simulations and energy minimization.
- OpenMM - Flexible MD toolkit with GPU acceleration and Python bindings.
- LAMMPS - Classical MD simulator for materials science and soft matter.
- NAMD - Highly parallel MD engine tailored for large biomolecular systems.
- AMBER - Suite for biomolecular simulations and free energy calculations.
- Desmond - GPU-accelerated MD engine for high-performance simulations.
- CGenFF - CHARMM force field parametrization of drug-like molecules.
- SwissParam - Rapid generation of CHARMM-compatible parameters for small organic molecules.
- ATB - Automated topology builder and repository for classical force field parameters.
- CHARMM-GUI - Web-based interface for building complex biomolecular systems and generating MD input files.
- LigParGen - Automated OPLS-AA parameter generator for organic ligands.
- MD DaVis - Interactive visualization and analysis of MD trajectories.
- iMod - Normal Mode Analysis toolkit using internal coordinates.
- MolAiCal - Web-based platform for binding free energy calculations using MM/PBSA and MM/GBSA methods.
- gmx_MMPBSA - Port of AMBER MMPBSA.py for GROMACS.
- VMD - Large biomolecular systems visualization and analysis using 3D graphics and scripting.
- Grace - 2D plotting tool for Unix-like systems with advanced graphing, fitting, and analysis features.
- CPPTRAJ - Fast, parallelizable trajectory analysis from AMBER.
- MDAnalysis - Open-source Python library for analyzing MD simulations.
- Spaya - AI-driven retrosynthesis engine with route ranking and synthetic feasibility scoring.
- AiZynthFinder - Monte Carlo tree search-based retrosynthesis using trained neural networks.
- ASKCOS - Synthesis route prediction with ML, developed by MIT.
- IBM RoboRXN - Automated reaction prediction using transformer models.
- MANIFOLD - Search engine for synthetically accessible molecules and building blocks.
- PROTAC-db - Curated database of PROTAC molecules, targets, and linkers for degrader design.
- PROsettaC - Structure-based modeling of ternary complexes for targeted protein degradation.
- PepDraw - Peptide visualization with annotated physicochemical properties.
- PepSite - Predict peptide binding sites on protein surfaces using structural data.
- Peptimap - Peptide mapping and binding hotspots identification.
- scikit-learn - General-purpose ML library for classification, regression, clustering, and model evaluation.
- PyTorch - Deep learning framework with extensive support for neural network modeling.
- TensorFlow - End-to-end ML platform for scalable model development and deployment.
- Keras - High-level neural network API designed for fast experimentation; long bundled with TensorFlow, and since Keras 3 also able to run on JAX or PyTorch backends.
- NumPy - Core library for numerical computing with support for arrays, matrices, and linear algebra.
- Pandas - Data manipulation and analysis toolkit built on top of NumPy.
- Matplotlib - Comprehensive library for creating static, animated, and interactive visualizations in Python.
- seaborn - Statistical data visualization library built on top of Matplotlib.
- DeepChem - Open-source deep learning framework for chemistry and biology.
- scikit-mol - Scikit-learn compatible cheminformatics extensions for molecular ML workflows.
- Chemprop - Directed message passing neural networks for molecular property prediction.
- ChemML - Machine learning and informatics suite for analyzing, mining, and modeling chemical and materials data.
- Oloren ChemEngine - Unified API for molecular property prediction with uncertainty quantification, interpretability, and model tuning.
- TorchDrug - A machine learning library for drug discovery with support for GNNs and molecular datasets.
- DGL-LifeSci - Graph deep learning toolkit for life sciences using the Deep Graph Library.
- MolBERT - Transformer-based molecular representation learning.
- ChemBERTa - Pretrained BERT-like models for molecules from SMILES.
- Uni-Mol - 3D molecular representation learning framework.
- Boltz-2 - A foundation model that jointly predicts structure and binding affinity; its authors report accuracy approaching that of physics-based FEP methods.
- Auto-sklearn - Automated machine learning for scikit-learn.
- TPOT - Genetic programming-based AutoML for optimizing ML pipelines.
- Optuna - Hyperparameter optimization framework for machine learning.
- MolVS - Molecule validation and standardization library based on RDKit.
- ProteinsPlus - A web-based platform designed to assist life scientists in analyzing and working with protein structures.
- OPSIN - Convert IUPAC names to chemical structures.
- OSRA - Extract chemical structures from images.
- ChemPlot - Chemical space visualization.
- ChemDB - Chemoinformatics portal with compound data and tools.
- Screening Explorer - Analyze screening datasets and hit distributions.
- LigRMSD - Calculate RMSD between ligand poses.
- NERDD - Curated drug discovery resources.
- LigBuilder3 - De novo ligand design.
- ChemMine Tools - Web-based cheminformatics toolkit for compound analysis.
- MayaChemTools - Perl/Python scripts for cheminformatics.
- Click2Drug - CADD software and databases directory.
- Galaxy Europe - Galaxy instance for cheminformatics.
- CADD Vault - CADD resources repository.
- BioMoDes - Biomolecular structure prediction and modeling tools.
- PlayMolecule - Interactive molecular modeling and simulation platform.
- Ertl Molecular - Cheminformatics tools for medicinal chemists, including scaffold analysis, ring replacement, and property calculators.
- TMP Chem Lectures - Recorded lectures from a leading cheminformatics summer school.
- Strasbourg Summer School in Chemoinformatics - Summer school lectures.
- BIGCHEM - Online course on big data applications in chemistry.
- Drug Discovery Course - Foundations of drug discovery and development.
- drugdesign.org - Free courses on drug design and cheminformatics.
- Cheminformatics OLCC - Intercollegiate course on cheminformatics theory and coding.
- Python For Cheminformatics Docking - Python tutorials for molecular docking via RCSB.
- DDA CDD Workshop - Workshop on generative and computational drug design.
- MDTutorials - Step-by-step tutorials for MD simulations using GROMACS.
- Practical Fragments - Insights into fragment-based drug discovery.
- Practical Cheminformatics - Tools and tips for cheminformatics workflows.
- Neovarsity - Deep-tech blog on cheminformatics and drug discovery applications.
- Cheminformania - Cheminformatics meets deep learning and molecular modeling.
- Daily Dose of Data Science - Digestible data science tutorials and concepts.
- Machine Learning Mastery - Practical ML guides for developers and scientists.
- Chem-Workflows - Jupyter-based chemistry workflows and tutorials.
- Structural Bioinformatics - Guide to structure-based drug design and protein modeling.
- McConnellsMedChem - Medicinal chemistry insights and commentary.
- DrugDiscovery.NET - AI-powered approaches to drug discovery.
- MacinChem - Computational chemistry tools for macOS users.
- Jeremy Monat - Cheminformatics research and academic resources.
- RDKit blog - A rich collection of tutorials, technical tips, and experimental insights from Greg Landrum.
- DeepMedChem - AI-powered insights, tool reviews, and workflows for modern drug discovery.
- TeachOpenCADD - Modular Jupyter tutorials for CADD workflows and concepts.
- intro_pharma_ai - Notebook-based introduction to AI applications in pharma.
- Practical Cheminformatics Tutorials - Hands-on Jupyter tutorials for RDKit, SAR, clustering, generative models, and ML pipelines.
- AI/DL for Life Sciences - Interactive notebooks showcasing AI/DL use cases in life sciences.
- Carlsson Lab - GPCR modeling, receptor-ligand interactions, MD, docking, and AI for drug discovery. (Uppsala University, Sweden)
- InSiliChem - Computational chemobiology and metalloenzyme design. (Universitat Autònoma de Barcelona, Spain)
- LCBC - Molecular dynamics, free energy calculations, retrosynthesis using machine learning. (Seoul National University, Korea)
- Angelo Raymond Rossi - High-performance computing for computational chemistry and cheminformatics. (University of Connecticut, USA)
- Laboratory of Chemoinformatics - QSAR/QSPR, chemical similarity, and virtual screening. (Université de Strasbourg / CNRS, France)
- Erastova Lab - Molecular modeling of soft matter and biomolecular simulations. (University of Edinburgh, UK)
- The Ballester Group - Developing ML/AI methods for structure-based scoring and virtual screening. (Imperial College London, UK)
- Meiler Lab - Rosetta software, protein design, and ML-based protein engineering. (Vanderbilt / Leipzig University, USA / Germany)
- COMP3D - Develops and applies AI methods to design safe, effective pharmaceuticals and agrochemicals. (University of Vienna, Austria)
- Bonvin Lab - Computational structural biology, HADDOCK, and integrative modeling. (Utrecht University, Netherlands)
- Volkamer Lab - Binding site analysis and AI-powered virtual screening. (Saarland University, Germany)
- AI Laboratory for Molecular Engineering - PROTACs, molecular glues, and ML for chemistry and life sciences. (Chalmers University, Sweden)
- Loschmidt Labs - PEG - Protein and enzyme engineering, AI-assisted enzyme design. (Masaryk University, Czechia)
- QSAR4U - Cheminformatics tools, QSAR modeling, CReM, and EasyDock. (Palacký University, Czechia)
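
Several entries above (pyADA in particular) rely on a similarity-based applicability domain: a QSAR prediction is trusted only if the query molecule is close enough to something in the training set. The sketch below illustrates that idea in plain Python. It is not pyADA's API. The function names, the toy fingerprints (sets of "on" bit indices), and the 0.3 threshold are all assumptions chosen for illustration; in practice the fingerprints would come from a toolkit such as RDKit and the threshold would be tuned on validation data.

```python
from typing import FrozenSet, Iterable

Fingerprint = FrozenSet[int]  # indices of the bits set in a binary fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def in_domain(query: Fingerprint,
              training: Iterable[Fingerprint],
              threshold: float = 0.3) -> bool:
    """True if the query's nearest training neighbour reaches the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    best = max((tanimoto(query, fp) for fp in training), default=0.0)
    return best >= threshold


# Toy, hand-written fingerprints (hypothetical data, not real molecules).
training_fps = [
    frozenset({1, 4, 7, 9}),
    frozenset({2, 4, 8}),
    frozenset({1, 2, 3, 4}),
]
similar_query = frozenset({1, 4, 7, 10})   # shares 3 of 5 bits with the first entry
distant_query = frozenset({20, 21, 22})    # shares no bits with any entry

print(in_domain(similar_query, training_fps))  # True: nearest similarity is 3/5
print(in_domain(distant_query, training_fps))  # False: nearest similarity is 0
```

Using the single nearest neighbour is only one possible rule; averaging over the k nearest neighbours is a common alternative, and it trades sensitivity to lone outliers for a stricter notion of coverage.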

