An overview of the RDKit

What is it?

Open source toolkit for cheminformatics

  • Business-friendly BSD license

  • Core data structures and algorithms in C++

  • Python 3.x wrappers generated using Boost.Python

  • Java and C# wrappers generated with SWIG

  • JavaScript wrappers for the most important functionality

  • 2D and 3D molecular operations

  • Descriptor generation for machine learning

  • Molecular database cartridge for PostgreSQL

  • Cheminformatics nodes for KNIME (distributed from the KNIME community site: https://www.knime.com/rdkit)
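
As a rough illustration of the Python wrappers, the minimal sketch below touches three items from the list above: a 2D operation (substructure search), descriptor calculation, and a 3D operation (conformer embedding). It assumes the RDKit Python package is installed. The aspirin SMILES, the carboxylic-acid SMARTS pattern, the random seed, and the variable names are chosen only for illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors

# Parse a SMILES string into a molecule (aspirin, used here only as an example).
mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

# 2D operation: substructure search with a SMARTS pattern for a carboxylic acid.
acid_pattern = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")
has_acid = mol.HasSubstructMatch(acid_pattern)

# Descriptors that could serve as machine-learning features.
mol_wt = Descriptors.MolWt(mol)
num_heavy = mol.GetNumHeavyAtoms()

# 3D operation: add explicit hydrogens, then embed a single conformer.
# EmbedMolecule returns the new conformer id, or -1 if embedding fails.
mol_h = Chem.AddHs(mol)
conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=42)

print(has_acid, round(mol_wt, 2), num_heavy, conf_id)
```

The fixed `randomSeed` only makes the embedding reproducible between runs; any integer seed would do.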

Operational:

  • http://www.rdkit.org

  • Supports Mac/Windows/Linux

  • Releases every 6 months

  • Web presence:

    • Homepage (http://www.rdkit.org): documentation, links

    • GitHub (https://github.com/rdkit): downloads, discussion, bug tracker, git repository

    • SourceForge (http://sourceforge.net/projects/rdkit): mailing lists

    • Blog (https://greglandrum.github.io/rdkit-blog/): tips, tricks, random stuff

    • KNIME integration (https://github.com/rdkit/knime-rdkit): RDKit nodes for KNIME

  • Mailing lists: https://sourceforge.net/p/rdkit/mailman/ (searchable archives are available for rdkit-discuss and rdkit-devel)

  • Social media:

    • Twitter: @RDKit_org

    • LinkedIn: https://www.linkedin.com/groups/8192558

    • Slack: https://rdkit.slack.com (invite required, contact Greg)

History:

  • 2000-2006: Developed and used at Rational Discovery for building predictive models for ADME, toxicity, and biological activity

  • June 2006: Open-source (BSD license) release of the software; Rational Discovery shuts down

  • To present: Open-source development continues; the toolkit is used within Novartis, which contributes back to the open-source version

Citing the RDKit

There is still no official RDKit publication. Our recommended citation is:

RDKit: Open-source cheminformatics. https://www.rdkit.org

We also recommend that you include the DOI for the version of the RDKit you used in the work. You can look these up here: https://doi.org/10.5281/zenodo.591637

Powered by RDKit

RDKit badge

If you use RDKit in one of your projects, you can show your support and help us track it by adding our badge. Simply copy the code from one of the markup languages below and paste it in your README file:

Markdown
reStructuredText
HTML

Integration with other open-source projects

  • KNIME: Workflow and analytics tool

  • PostgreSQL: Extensible relational database

  • Django: “The web framework for perfectionists with deadlines”

  • SQLite: “The most used database engine in the world”

  • Lucene: Text-search engine [1]

Usage by other open-source projects

This will, inevitably, be out of date. If you know of others, please let us know or submit a pull request!

  • Datamol (docs, repo) - A Python library to intuitively manipulate molecules.

  • DockOnSurf (docs, paper) - A high-throughput Python code to automatically find the most stable geometry for molecules adsorbed on surfaces.

  • Scopy (docs, paper) - An integrated negative design Python library for desirable HTS/VS database design

  • Open Force Field Toolkit - A parametrization engine for force fields based on direct chemical perception.

  • stk (docs, paper) - A Python library for building, manipulating, analyzing, and automatically designing molecules.

  • gpusimilarity - A CUDA/Thrust implementation of fingerprint similarity searching

  • Samson Connect - Software for adaptive modeling and simulation of nanosystems

  • mol_frame - Chemical Structure Handling for Dask and Pandas DataFrames

  • RDKit.js - The official JavaScript release of RDKit

  • DeepChem - Python library for deep learning for chemistry

  • mmpdb - Matched molecular pair database generation and analysis

  • CheTo (paper) - Chemical topic modeling

  • OCEAN (paper) - Optimized cross reactivity estimation

  • ChEMBL Beaker - standalone web server wrapper for RDKit and OSRA

  • myChEMBL (blog post, paper) - A virtual machine implementation of open data and cheminformatics tools

  • ZINC - Free database of commercially-available compounds for virtual screening

  • sdf_viewer.py - An interactive SDF viewer

  • sdf2ppt - Reads an SDFile and displays molecules as an image grid in a PowerPoint/OpenOffice presentation.

  • MolGears - A cheminformatics tool for bioactive molecules

  • PYPL - Simple cartridge that lets you call Python scripts from Oracle PL/SQL.

  • shape-it-rdkit - Gaussian molecular overlap code shape-it (from Silicos-it) ported to the RDKit backend

  • WONKA - Tool for analysis and interrogation of protein-ligand crystal structures

  • OOMMPPAA - Tool for directed synthesis and data analysis based on protein-ligand crystal structures

  • OCEAN - Web tool for target prediction of chemical structures which uses ChEMBL as its data source

  • chemfp - very fast fingerprint searching

  • rdkit_ipynb_tools - RDKit Tools for the IPython Notebook

  • Vernalis KNIME nodes

  • Erlwood KNIME nodes

  • AZOrange

The Contrib Directory

The Contrib directory, part of the standard RDKit distribution, includes code that has been contributed by members of the community.

License

This document is copyright (C) 2013-2022 by Greg Landrum

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

The intent of this license is similar to that of the RDKit itself. In simple words: “Do whatever you want with it, but please give us some credit.”