Awesome Python Chemistry
A curated list of awesome Python frameworks, libraries, software and resources related to Chemistry.
Inspired by awesome-python.
General Chemistry
Packages and tools for general chemistry.
-
AQME - Ensemble of automated QM workflows that can be run through jupyter notebooks, command lines and yaml files.
-
aizynthfinder - A tool for retrosynthetic planning.
-
batchcalculator - A GUI app based on wxPython for calculating the correct amount of reactants (batch) for a particular composition given by the molar ratio of its components.
-
cctbx - The Computational Crystallography Toolbox.
-
ChemFormula - ChemFormula provides a class for working with chemical formulas. It allows parsing chemical formulas, calculating formula weights, and generating formatted output strings (e.g. in HTML, LaTeX, or Unicode).
-
chemlib - A robust and easy-to-use package that solves a variety of chemistry problems.
-
chempy - ChemPy is a package useful for chemistry (mainly physical/inorganic/analytical chemistry).
-
datamol - Molecular Manipulation Made Easy. A light wrapper built on top of RDKit.
-
GoodVibes - A Python program to compute quasi-harmonic thermochemical data from Gaussian frequency calculations.
-
hgraph2graph - Hierarchical Generation of Molecular Graphs using Structural Motifs.
-
ionize - Calculates the properties of individual ionic species in aqueous solution, as well as aqueous solutions containing arbitrary sets of ions.
-
LModeA-nano - Calculates the intrinsic chemical bond strength based on local vibrational mode theory in solids and molecules.
-
mendeleev - A package that provides a Python API for accessing various properties of elements from the periodic table of elements.
-
nmrglue - A package for working with nuclear magnetic resonance (NMR) data including functions for reading common binary file formats and processing NMR data.
-
Open Babel - A chemical toolbox designed to speak the many languages of chemical data.
-
periodictable - This package provides a periodic table of the elements with support for mass, density and X-ray/neutron scattering information.
-
propka - Predicts the pKa values of ionizable groups in proteins and protein-ligand complexes based on the 3D structure.
-
pybel - Pybel provides convenience functions and classes that make it simpler to use the Open Babel libraries from Python.
-
pycroscopy - Scientific analysis of nanoscale materials imaging data.
-
pyEQL - A set of tools for conventional calculations involving solutions (mixtures) and electrolytes.
-
pyiron - An integrated development environment (IDE) for computational materials science.
-
pymatgen - Python Materials Genomics is a robust, open-source library for materials analysis.
-
pymatviz - A toolkit for visualizations in materials informatics.
-
symfit - A curve-fitting library ideally suited to chemistry problems, including fitting experimental kinetics data.
-
symmetry - Symmetry is a library for materials symmetry analysis.
-
stk - A library for building, manipulating, analyzing and automatic design of molecules, including a genetic algorithm.
-
spectrochempy - A library for processing, analyzing and modeling spectroscopic data.
Machine Learning
Packages and tools for employing machine learning and data science in chemistry.
-
amp - An open-source package designed to easily bring machine learning to atomistic calculations.
-
atom3d - Enables machine learning on three-dimensional molecular structure.
-
chainer-chemistry - A deep learning framework (based on Chainer) with applications in Biology and Chemistry.
-
chemml - A machine learning and informatics program suite for the analysis, mining, and modeling of chemical and materials data.
-
chemprop - Message Passing Neural Networks for Molecule Property Prediction.
-
cgcnn - Crystal graph convolutional neural networks for predicting material properties.
-
deepchem - Deep-learning models for Drug Discovery and Quantum Chemistry.
-
DeepPurpose - A deep learning library for compound and protein modeling: DTI, drug property, PPI, DDI, and protein function prediction.
-
DescriptaStorus - Descriptor computation (chemistry) and (optional) storage for machine learning.
-
DScribe - Descriptor library containing a variety of fingerprinting techniques, including the Smooth Overlap of Atomic Positions (SOAP).
-
graphein - Provides functionality for producing geometric representations of protein and RNA structures, and biological interaction networks.
-
Matminer - Library of descriptors to aid in the data-mining of materials properties, created by the Lawrence Berkeley National Laboratory.
-
MoleOOD - A robust molecular representation learning framework against distribution shifts.
-
megnet - Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals.
-
MAML - Aims to provide useful high-level interfaces that make ML for materials science as easy as possible.
-
MORFEUS - Library for fast calculations of molecular features from 3D structures for machine learning with a focus on steric descriptors.
-
olorenchemengine - Molecular property prediction with a unified API for diverse models and representations, with integrated uncertainty quantification, interpretability, and hyperparameter/architecture tuning.
-
ROBERT - Ensemble of automated machine learning protocols that can be run sequentially through a single command line. The program works for regression and classification problems.
-
schnetpack - Deep Neural Networks for Atomistic Systems.
-
selfies - Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation.
-
Summit - Package for optimizing chemical reactions using machine learning (contains 10 algorithms + several benchmarks).
-
TDC - Therapeutics Data Commons (TDC) is the first unifying framework to systematically access and evaluate machine learning across the entire range of therapeutics.
-
XenonPy - Library with several compositional and structural material descriptors, along with a few pre-trained neural network models of material properties.
Generative Molecular Design
Packages and tools for generating molecular species.
-
GraphINVENT - A platform for graph-based molecular generation using graph neural networks.
-
GuacaMol - A package for benchmarking of models for de novo molecular design.
-
moses - A benchmarking platform for molecular generation models.
-
perses - Experiments with expanded ensembles to explore chemical space.
Simulations
Packages for atomistic simulations and computational chemistry.
-
alchemlyb - Makes alchemical free energy calculations easier by leveraging the full power and flexibility of the PyData stack.
-
atomate2 - atomate2 is a library of computational materials science workflows.
-
Atomic Simulation Environment (ASE) - A set of tools and modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.
-
basis_set_exchange - A library containing basis sets for use in quantum chemistry calculations. In addition, this library has functionality for manipulation of basis set data.
-
CACTVS - Cactvs is a universal, scriptable cheminformatics toolkit, with a large collection of modules for property computation, chemistry data file I/O and other tasks.
-
CalcUS - Quantum chemistry web platform that brings all the necessary tools to perform quantum chemistry in a user-friendly web interface.
-
cantera - A collection of object-oriented software tools for problems involving chemical kinetics, thermodynamics, and transport processes.
-
CatKit - General purpose tools for high-throughput catalysis.
-
ccinput - A tool and library for creating quantum chemistry input files.
-
cclib - A library for parsing output files from various quantum chemical programs.
-
cinfony - A common API to several cheminformatics toolkits (Open Babel, RDKit, the CDK, Indigo, JChem, OPSIN and cheminformatics webservices).
-
chemlab - Is a library that can help the user with chemistry-relevant calculations.
-
emmet - A package to 'build' collections of materials properties from the output of computational materials calculations.
-
fromage - The "FRamewOrk for Molecular AGgregate Excitations" enables localised QM/QM' excited state calculations in a solid state environment.
-
GPAW - Is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE).
-
horton - Helpful Open-source Research TOol for N-fermion systems, a quantum-chemistry program that can perform computations involving model Hamiltonians.
-
HTMD - High-Throughput Molecular Dynamics: Programming Environment for Molecular Discovery.
-
Indigo - Universal cheminformatics libraries, utilities and database search tools.
-
Jarvis-tools - An open-access software package for atomistic data-driven materials design.
-
mathchem - Is a free open source package for calculating topological indices and other invariants of molecular graphs.
-
MDAnalysis - Is an object-oriented library to analyze trajectories from molecular dynamics (MD) simulations in many popular formats.
-
MDTraj - Package for manipulating molecular dynamics trajectories with support for multiple formats.
-
MMTK - The Molecular Modeling Toolkit is an Open Source program library for molecular simulation applications.
-
MolMod - A library with many components that are useful to write molecular modeling programs.
-
oddt - Open Drug Discovery Toolkit, a modular and comprehensive toolkit for use in cheminformatics, molecular modeling etc.
-
OPEM - Open source PEM (Proton Exchange Membrane) fuel cell simulation tool.
-
openmmtools - A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
-
overreact - A library and command-line tool for building and analyzing complex homogeneous microkinetic models from quantum chemistry calculations, with support for quasi-harmonic thermochemistry, quantum tunnelling corrections, molecular symmetries and more.
-
ParmEd - Parameter/topology editor and molecular simulator with visualization capability.
-
pGrAdd - A library for estimating thermochemical properties of molecules and adsorbates using group additivity.
-
phonopy - An open source package for phonon calculations at harmonic and quasi-harmonic levels.
-
PLAMS - Python Library for Automating Molecular Simulation: input preparation, job execution, file management, output processing and building data workflows.
-
pMuTT - A library for ab-initio thermodynamic and kinetic parameter estimation.
-
PorePy - A Simulation Tool for Fractured and Deformable Porous Media.
-
ProDy - An open source package for protein structural dynamics analysis with a flexible and responsive API.
-
ProLIF - Interaction Fingerprints for protein-ligand complexes and more.
-
Psi4 - A hybrid Python/C++ open-source package for quantum chemistry.
-
Psi4NumPy - Psi4-based reference implementations and Jupyter notebook-based tutorials for foundational quantum chemistry methods.
-
PyEMMA - Library for the estimation, validation and analysis of Markov models of molecular kinetics and other kinetic and thermodynamic models from molecular dynamics data.
-
pygauss - An interactive tool for supporting the life cycle of a computational molecular chemistry investigation.
-
PyQuante - Is an open-source suite of programs for developing quantum chemistry methods.
-
pysic - A calculator incorporating various empirical pair and many-body potentials.
-
PySCF - A quantum chemistry package written in Python.
-
pyvib2 - A program for analyzing vibrational motion and vibrational spectra.
-
RDKit - Open-Source Cheminformatics Software.
-
ReNView - A program to visualize reaction networks.
-
stk - A library for building, manipulating, analyzing and automatic design of molecules.
-
QMsolve - A module for solving and visualizing the Schrödinger equation.
-
QUIP - A collection of software tools to carry out molecular dynamics simulations.
-
torchmd - End-To-End Molecular Dynamics (MD) Engine using PyTorch.
-
tsase - A library that builds on ASE to tackle transition state calculations.
-
yank - An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.
Force Fields
Packages related to force fields.
-
CHGNet - Pretrained universal neural network potential for charge-informed atomistic modeling.
-
FitSNAP - A Package For Training SNAP Interatomic Potentials for use in the LAMMPS molecular dynamics package.
-
fftool - Tool to build force field input files for molecular simulation.
-
FLARE - A package for creating fast and accurate interatomic potentials.
-
global-chem - A Chemical Knowledge Graph and Toolkit, written in IUPAC/SMILES/SMARTS, for common small molecules from diverse communities to aid users in selecting compounds for force field parameterization.
-
matbench-discovery - A benchmark for ML-guided high-throughput materials discovery.
-
NeuralForceField - Neural Network Force Field based on PyTorch.
-
openff-toolkit - The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools.
Molecular Visualization
Packages for viewing molecular structures.
-
ase-gui - The graphical user-interface allows users to visualize, manipulate, and render molecular systems and atoms objects.
-
chemiscope - An interactive structure/property explorer for materials and molecules.
-
chemview - An interactive molecular viewer designed for the IPython notebook.
-
imolecule - An embeddable webGL molecule viewer and file format converter.
-
moleculekit - A molecule manipulation library.
-
nglview - A Jupyter widget to interactively view molecular structures and trajectories.
-
PyMOL - A user-sponsored molecular visualization system on an open-source foundation, maintained and distributed by Schrödinger.
-
pymoldyn - A viewer for atomic clusters, crystalline and amorphous materials in a unit cell corresponding to one of the seven 3D Bravais lattices.
-
sumo - A toolkit for plotting and analysis of ab initio solid-state calculation data.
-
surfinpy - A library for the analysis, plotting and visualisation of ab initio surface calculation data.
-
trident-chemwidgets - Jupyter Widgets to interact with molecular datasets.
Database Wrappers
Packages providing a Python layer for accessing chemical databases.
-
ccdc - An API for the Cambridge Structural Database System.
-
ChemSpiPy - ChemSpider wrapper, that allows chemical searches, chemical file downloads, depiction and retrieval of chemical properties.
-
CIRpy - An interface for the Chemical Identifier Resolver (CIR) by the CADD Group at the NCI/NIH.
-
pubchempy - PubChemPy provides a way to interact with PubChem in Python.
-
chembl-downloader - Automate downloading and querying the latest (or a given) version of ChEMBL.
-
drugbank-downloader - Automate downloading, opening, and parsing DrugBank.
Learning Resources
Resources for learning to apply Python to chemistry.
-
An Introduction to Applied Bioinformatics - A Jupyter book demonstrating working with biochemical data using the scikit-bio library for tasks such as sequence alignment and calculating Hamming distances.
-
Computational Thermodynamics - This collection of Jupyter notebooks demonstrates solutions to a range of thermodynamic problems including solving chemical equilibria, comparing real versus ideal gas behavior, and calculating the temperature and composition of a combustion reaction.
-
SciCompforChemists - Scientific Computing for Chemists with Python is a Jupyter book that teaches basic Python skills for chemistry, including relevant libraries, and applies them to solving chemical problems.
Miscellaneous Awesome
-
Colorful Nuclide Chart - A beautiful, interactive visualization of nuclides that gives access to a variety of nuclear properties and allows saving high-quality images for publications, presentations and outreach.
See Also
-
awesome-cheminformatics - Another list focused on cheminformatics, including tools not only in Python.
-
awesome-small-molecule-ml - A collection of papers, datasets, and packages for small-molecule drug discovery. Most links to code are in Python.
-
awesome-molecular-docking - A curated list of molecular docking software, datasets, and papers.
-
jarvis - Joint Automated Repository for Various Integrated Simulations is a repository designed to automate materials discovery and optimization using classical force fields, density functional theory, machine learning calculations and experiments.
-
polypharmacy-ddi-synergy-survey - A collection of research papers (with Python implementations) focusing on drug-drug interactions, synergy and polypharmacy.