rdkit.Chem.rdRGroupDecomposition module

Module containing RGroupDecomposition classes and functions.

class rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment

Bases: enum

MCS = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS
NoAlignment = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment
None = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None
names = {'MCS': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS, 'NoAlignment': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 'None': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None}
values = {0: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 1: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS}
rdkit.Chem.rdRGroupDecomposition.RGroupDecompose((object)cores, (object)mols[, (bool)asSmiles=False[, (bool)asRows=True[, (RGroupDecompositionParameters)options=RGroupDecompositionParameters()]]]) -> object :
Decompose a collection of molecules into their Rgroups
ARGUMENTS:
  • cores: a set of cores from most to least specific.

    See RGroupDecompositionParameters for more details on how the cores can be labelled

  • mols: the molecules to be decomposed

  • asSmiles: if True return smiles strings, otherwise return molecules [default: False]

  • asRows: if True (default) return the results as rows, otherwise return them as columns

  • options: RGroupDecompositionParameters object that defines the parameters for the decomposition.

    See RGroupDecompositionParameters for defaults

RETURNS: row_or_column_results, unmatched

Row structure:

rows[idx] = {rgroup_label: molecule_or_smiles}

Column structure:

columns[rgroup_label] = [ mols_or_smiles ]

unmatched is a vector of indices in the input mols that were not matched.

C++ signature :

boost::python::api::object RGroupDecompose(boost::python::api::object,boost::python::api::object [,bool=False [,bool=True [,RDKit::RGroupDecompositionParameters=RGroupDecompositionParameters()]]])
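
The row and column structures described above can be sketched as follows. This is a minimal illustration, not taken from the RDKit sources: the benzene core and the three input SMILES are hypothetical, and the exact R-group SMILES that come back may differ between RDKit versions, so none are shown.

```python
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import RGroupDecompose

# Hypothetical inputs: a benzene core with one labelled attachment point.
core = Chem.MolFromSmarts("[*:1]c1ccccc1")
smiles = ["Cc1ccccc1", "OCc1ccccc1", "CCO"]  # the last one lacks the core
mols = [Chem.MolFromSmiles(s) for s in smiles]

# Row form: one dict per matched molecule, keyed by label ('Core', 'R1', ...).
rows, unmatched = RGroupDecompose([core], mols, asSmiles=True)
for row in rows:
    print(row)
print("unmatched input indices:", list(unmatched))

# Column form: one list per label, all lists in the same molecule order.
columns, _ = RGroupDecompose([core], mols, asSmiles=True, asRows=False)
for label, values in columns.items():
    print(label, values)
```

Indices in unmatched refer to positions in the input mols, so they can be used to report which inputs were dropped.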

class rdkit.Chem.rdRGroupDecomposition.RGroupDecomposition((object)self, (object)cores) -> None :

Bases: instance

Construct from a molecule or sequence of molecules

C++ signature :

void __init__(_object*,boost::python::api::object)

__init__( (object)self, (object)cores, (RGroupDecompositionParameters)params) -> None :

Construct from a molecule or sequence of molecules and a parameters object

C++ signature :

void __init__(_object*,boost::python::api::object,RDKit::RGroupDecompositionParameters)

Add((RGroupDecomposition)self, (rdkit.Chem.rdchem.Mol)mol) int :
C++ signature :

int Add(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol)

GetMatchingCoreIdx((RGroupDecomposition)self, (rdkit.Chem.rdchem.Mol)mol[, (object)matches=None]) -> int :
C++ signature :

int GetMatchingCoreIdx(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol [,boost::python::api::object {lvalue}=None])

GetRGroupLabels((RGroupDecomposition)self) list :

Return the current list of found rgroup labels. Note: Process() should be called first.

C++ signature :

boost::python::list GetRGroupLabels(RDKit::RGroupDecompositionHelper {lvalue})

GetRGroupsAsColumns((RGroupDecomposition)self[, (bool)asSmiles=False]) dict :
Return the rgroups as columns (note: can be fed directly into a pandas DataFrame)
ARGUMENTS:
  • asSmiles: if True return smiles strings, otherwise return molecules [default: False]

Column structure:

columns[rgroup_label] = [ mols_or_smiles ]

C++ signature :

boost::python::dict GetRGroupsAsColumns(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])

GetRGroupsAsRows((RGroupDecomposition)self[, (bool)asSmiles=False]) list :
Return the rgroups as rows (note: can be fed directly into a pandas DataFrame)
ARGUMENTS:
  • asSmiles: if True return smiles strings, otherwise return molecules [default: False]

Row structure:

rows[idx] = {rgroup_label: molecule_or_smiles}

C++ signature :

boost::python::list GetRGroupsAsRows(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])

Process((RGroupDecomposition)self) bool :

Process the rgroups (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)

C++ signature :

bool Process(RDKit::RGroupDecompositionHelper {lvalue})

ProcessAndScore((RGroupDecomposition)self) tuple :

Process the rgroups and return the score (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)

C++ signature :

boost::python::tuple ProcessAndScore(RDKit::RGroupDecompositionHelper {lvalue})
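
The call order implied by the methods above (Add, then Process, then the getters) can be sketched like this. The core and molecules are hypothetical. Treating -1 from Add() as "no core matched" is an assumption based on how unmatched inputs are reported elsewhere in this module, not something the docstrings above state.

```python
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import RGroupDecomposition

# Hypothetical core with a single labelled attachment point.
core = Chem.MolFromSmarts("[*:1]c1ccccc1")
rgd = RGroupDecomposition(core)

# Add molecules one at a time and keep the integer each call returns.
added = []
for smi in ("Cc1ccccc1", "CCc1ccccc1", "CCO"):
    added.append(rgd.Add(Chem.MolFromSmiles(smi)))
print("Add() return values:", added)  # assumption: -1 marks a non-matching input

# Process() must run before labels, rows or columns are requested.
rgd.Process()
labels = rgd.GetRGroupLabels()
rows = rgd.GetRGroupsAsRows(asSmiles=True)
columns = rgd.GetRGroupsAsColumns(asSmiles=True)

print(labels)
print(rows)
print(columns)
```

Working with the class directly, rather than RGroupDecompose, is useful when molecules arrive incrementally or when the return value of each Add() call matters.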

class rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters((object)self) None :

Bases: instance

RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.

OPTIONS:

  • RGroupCoreAlignment: can be one of RGroupCoreAlignment.None_ or RGroupCoreAlignment.MCS

    If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.

  • RGroupLabels: optionally set where the rgroup labels to use are encoded.

    RGroupLabels.IsotopeLabels - labels are stored on isotopes
    RGroupLabels.AtomMapLabels - labels are stored on atommaps
    RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
    RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
    RGroupLabels.AtomIndexLabels - use the atom index as the label
    RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
    RGroupLabels.AutoDetect - auto detect the label [default]

    Note: in all cases, any rgroups found on unlabelled atoms will be automatically labelled.

  • RGroupLabelling: choose where the rlabels are stored on the decomposition

    RGroupLabelling.AtomMap - store rgroups as atom maps (for smiles)
    RGroupLabelling.Isotope - store rgroups on the isotope
    RGroupLabelling.MDLRGroup - store rgroups as mdl rgroups (for molblocks)

    default: AtomMap | MDLRGroup

  • onlyMatchAtRGroups: only allow rgroup decomposition at the specified rgroups

  • removeAllHydrogenRGroups: remove all user-defined rgroups that only have hydrogens

  • removeAllHydrogenRGroupsAndLabels: remove all user-defined rgroups that only have hydrogens, and also remove the corresponding labels from the core

  • removeHydrogensPostMatch: remove all hydrogens from the output molecules

  • allowNonTerminalRGroups: allow labelled Rgroups of degree 2 or more

  • doTautomers: match all tautomers of a core against each input structure

  • doEnumeration: expand input cores into enumerated mol bundles

  • allowMultipleRGroupsOnUnlabelled: permit more than one rgroup to be attached to an unlabelled core atom

  • allowMultipleCoresInSameMol: permit a core to match more than once in the same molecule if the sets of matched atoms are not equal (default=False)

Constructor, takes no arguments

C++ signature :

void __init__(_object*)
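
As a rough illustration of the options listed above, the sketch below builds a parameters object, sets a few of the documented attributes, and passes it to RGroupDecompose. The core and molecules are hypothetical. p-Xylene carries a second methyl on an unlabelled ring atom, so under the documented meaning of onlyMatchAtRGroups it is expected to be rejected; the code prints the outcome rather than asserting it.

```python
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupCoreAlignment,
    RGroupDecompose,
    RGroupDecompositionParameters,
    RGroupLabels,
)

params = RGroupDecompositionParameters()
params.labels = RGroupLabels.AtomMapLabels    # read labels from atom maps
params.alignment = RGroupCoreAlignment.MCS    # align core labels via MCS
params.onlyMatchAtRGroups = True              # decompose only at labelled positions
params.removeHydrogensPostMatch = True        # strip hydrogens from the output

# Hypothetical inputs: toluene and p-xylene against a mono-labelled benzene.
core = Chem.MolFromSmarts("[*:1]c1ccccc1")
mols = [Chem.MolFromSmiles(s) for s in ("Cc1ccccc1", "Cc1ccc(C)cc1")]

rows, unmatched = RGroupDecompose([core], mols, asSmiles=True, options=params)
print(rows)
print("unmatched input indices:", list(unmatched))
```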

property alignment

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

unsigned int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property allowMultipleCoresInSameMol

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property allowMultipleRGroupsOnUnlabelled

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property allowNonTerminalRGroups

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property chunkSize

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

unsigned int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property doEnumeration

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property doTautomers

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property gaMaximumOperations

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property gaNumberOperationsWithoutImprovement

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property gaNumberRuns

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property gaParallelRuns

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property gaPopulationSize

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property gaRandomSeed

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property includeTargetMolInResults

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property labels

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

unsigned int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property matchingStrategy

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

unsigned int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property onlyMatchAtRGroups

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property removeAllHydrogenRGroups

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property removeAllHydrogenRGroupsAndLabels

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property removeHydrogensPostMatch

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> bool:

C++ signature :

bool {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property rgroupLabelling

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

unsigned int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property scoreMethod

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> int:

C++ signature :

unsigned int {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property substructMatchParams

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> rdkit.Chem.rdchem.SubstructMatchParameters:

C++ signature :

RDKit::SubstructMatchParameters {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

property timeout

None( (rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters)arg1) -> float:

C++ signature :

double {lvalue} None(RDKit::RGroupDecompositionParameters {lvalue})

class rdkit.Chem.rdRGroupDecomposition.RGroupLabelling

Bases: enum

AtomMap = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap
Isotope = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope
MDLRGroup = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup
names = {'AtomMap': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 'Isotope': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 'MDLRGroup': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
class rdkit.Chem.rdRGroupDecomposition.RGroupLabels

Bases: enum

AtomIndexLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels
AtomMapLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels
AutoDetect = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect
DummyAtomLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels
IsotopeLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels
MDLRGroupLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels
RelabelDuplicateLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels
names = {'AtomIndexLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 'AtomMapLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 'AutoDetect': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect, 'DummyAtomLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 'IsotopeLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 'MDLRGroupLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 'RelabelDuplicateLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 8: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels, 16: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 32: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 255: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect}
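
The integer values listed for RGroupLabelling and RGroupLabels are powers of two, which suggests they are meant to be combined as bit flags (the parameters documentation gives "AtomMap | MDLRGroup" as a default). A minimal sketch of that, assuming the labels and rgroupLabelling properties accept plain integers, which their unsigned int signatures suggest:

```python
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecompositionParameters,
    RGroupLabelling,
    RGroupLabels,
)

params = RGroupDecompositionParameters()

# Output labelling: atom maps (1) plus MDL R-groups (4).
labelling = int(RGroupLabelling.AtomMap) | int(RGroupLabelling.MDLRGroup)
params.rgroupLabelling = labelling

# Input labels: read atom maps (2) and fix duplicate labels (8).
input_labels = int(RGroupLabels.AtomMapLabels) | int(RGroupLabels.RelabelDuplicateLabels)
params.labels = input_labels

print(params.rgroupLabelling, params.labels)
```

Converting each member with int() before combining keeps the arithmetic explicit and avoids relying on how the enum type implements the | operator.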
class rdkit.Chem.rdRGroupDecomposition.RGroupMatching

Bases: enum

Exhaustive = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive
GA = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA
Greedy = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy
GreedyChunks = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks
NoSymmetrization = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization
names = {'Exhaustive': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 'GA': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA, 'Greedy': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 'GreedyChunks': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 'NoSymmetrization': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 2: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 4: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 8: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization, 16: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA}
class rdkit.Chem.rdRGroupDecomposition.RGroupScore

Bases: enum

FingerprintVariance = rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance
Match = rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match
names = {'FingerprintVariance': rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance, 'Match': rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match}
values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match, 4: rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance}
rdkit.Chem.rdRGroupDecomposition.RelabelMappedDummies((rdkit.Chem.rdchem.Mol)mol[, (int)inputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7)[, (int)outputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]]) -> None :

Relabel dummy atoms bearing an R-group mapping (as atom map number, isotope or MDLRGroup label) so that the rendering code displays them as R# rather than #*, :#, #:#, etc. By default, only the MDLRGroup label is retained on output; this can be changed through the outputLabels parameter. If there are multiple potential R-group mappings, the priority on input is atom map number > isotope > MDLRGroup. The inputLabels parameter controls which mappings are taken into consideration.

C++ signature :

void RelabelMappedDummies(RDKit::ROMol {lvalue} [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7) [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]])