
RDKit::AtomPairs Namespace Reference

Functions

boost::uint32_t getAtomCode (const Atom *atom, unsigned int branchSubtract=0)
boost::uint32_t getAtomPairCode (boost::uint32_t codeI, boost::uint32_t codeJ, unsigned int dist)
SparseIntVect< boost::int32_t > * getAtomPairFingerprint (const ROMol &mol, unsigned int minLength, unsigned int maxLength, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0)
 returns the atom-pair fingerprint for a molecule
SparseIntVect< boost::int32_t > * getAtomPairFingerprint (const ROMol &mol, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0)
SparseIntVect< boost::int32_t > * getHashedAtomPairFingerprint (const ROMol &mol, unsigned int nBits=2048, unsigned int minLength=1, unsigned int maxLength=maxPathLen-1, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0)
 returns the hashed atom-pair fingerprint for a molecule
ExplicitBitVect * getHashedAtomPairFingerprintAsBitVect (const ROMol &mol, unsigned int nBits=2048, unsigned int minLength=1, unsigned int maxLength=maxPathLen-1, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0, unsigned int nBitsPerEntry=4)
 returns the hashed atom-pair fingerprint for a molecule as a bit vector
boost::uint64_t getTopologicalTorsionCode (const std::vector< boost::uint32_t > &atomCodes)
SparseIntVect< boost::int64_t > * getTopologicalTorsionFingerprint (const ROMol &mol, unsigned int targetSize=4, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0)
 returns the topological-torsion fingerprint for a molecule
SparseIntVect< boost::int64_t > * getHashedTopologicalTorsionFingerprint (const ROMol &mol, unsigned int nBits=2048, unsigned int targetSize=4, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0)
 returns a hashed topological-torsion fingerprint for a molecule
ExplicitBitVect * getHashedTopologicalTorsionFingerprintAsBitVect (const ROMol &mol, unsigned int nBits=2048, unsigned int targetSize=4, const std::vector< boost::uint32_t > *fromAtoms=0, const std::vector< boost::uint32_t > *ignoreAtoms=0, unsigned int nBitsPerEntry=4)
 returns a hashed topological-torsion fingerprint for a molecule as a bit vector

Variables

const std::string atomPairsVersion = "1.1.0"
const unsigned int numTypeBits = 4
const unsigned int atomNumberTypes [1<< numTypeBits] = {5,6,7,8,9,14,15,16,17,33,34,35,51,52,43}
const unsigned int numPiBits = 2
const unsigned int maxNumPi = (1<<numPiBits)-1
const unsigned int numBranchBits = 3
const unsigned int maxNumBranches = (1<<numBranchBits)-1
const unsigned int codeSize = numTypeBits+numPiBits+numBranchBits
const unsigned int numPathBits = 5
const unsigned int maxPathLen = (1<<numPathBits)-1
const unsigned int numAtomPairFingerprintBits = numPathBits+2*codeSize
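The derived constants above follow from simple bit arithmetic. The sketch below re-declares them locally (the namespace name constants_sketch is an assumption for illustration, not part of the library) and lets the compiler check the resulting values.

```cpp
#include <cstdint>

// Local re-declarations that mirror the listing above.
// These are illustrative copies, not the library's own symbols.
namespace constants_sketch {
constexpr unsigned int numTypeBits = 4;
constexpr unsigned int numPiBits = 2;
constexpr unsigned int maxNumPi = (1u << numPiBits) - 1;
constexpr unsigned int numBranchBits = 3;
constexpr unsigned int maxNumBranches = (1u << numBranchBits) - 1;
constexpr unsigned int codeSize = numTypeBits + numPiBits + numBranchBits;
constexpr unsigned int numPathBits = 5;
constexpr unsigned int maxPathLen = (1u << numPathBits) - 1;
constexpr unsigned int numAtomPairFingerprintBits = numPathBits + 2 * codeSize;

// Compile-time checks of the arithmetic.
static_assert(maxNumPi == 3, "2 pi bits give a maximum of 3");
static_assert(maxNumBranches == 7, "3 branch bits give a maximum of 7");
static_assert(codeSize == 9, "4 + 2 + 3 bits per atom code");
static_assert(maxPathLen == 31, "5 path bits give a maximum of 31");
static_assert(numAtomPairFingerprintBits == 23, "5 + 2 * 9 bits per pair");
}  // namespace constants_sketch
```

An atom code therefore fits in 9 bits and an atom-pair code in 23 bits, which is why the unhashed atom-pair fingerprint can use 32-bit indices.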

Function Documentation

boost::uint32_t RDKit::AtomPairs::getAtomCode ( const Atom *  atom,
unsigned int  branchSubtract = 0 
)

returns a numeric code for the atom (the atom's hash in the atom-pair scheme)

Parameters:
atom the atom to be considered
branchSubtract (optional) a constant to subtract from the number of neighbors when the hash is calculated (used in the topological torsions code)
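To illustrate how an atom code of codeSize bits could be assembled from a type index, a pi-electron count, and a branch count, here is a minimal sketch. The AtomInfo struct, the field order, the clamping of counts, and the mapping of unlisted elements to the last type slot are all assumptions made for this example. The library's actual encoding may differ.

```cpp
#include <algorithm>
#include <cstdint>

namespace atom_code_sketch {
constexpr unsigned int numTypeBits = 4, numPiBits = 2, numBranchBits = 3;
constexpr unsigned int maxNumPi = (1u << numPiBits) - 1;
constexpr unsigned int maxNumBranches = (1u << numBranchBits) - 1;
constexpr unsigned int nTypes = 1u << numTypeBits;
constexpr unsigned int atomNumberTypes[nTypes] = {5,  6,  7,  8,  9,  14, 15, 16,
                                                  17, 33, 34, 35, 51, 52, 43};

// Hypothetical stand-in for the properties a real Atom would provide.
struct AtomInfo {
  unsigned int atomicNum;
  unsigned int degree;          // number of neighbors
  unsigned int numPiElectrons;  // pi-electron count
};

// Packs branches, pi electrons, and type index into one code (assumed layout).
inline std::uint32_t atomCode(const AtomInfo &a, unsigned int branchSubtract = 0) {
  unsigned int branches = a.degree > branchSubtract ? a.degree - branchSubtract : 0;
  branches = std::min(branches, maxNumBranches);  // assumption: clamp to field width
  unsigned int pi = std::min(a.numPiElectrons, maxNumPi);

  unsigned int typeIdx = nTypes - 1;  // assumption: unlisted elements share the last slot
  for (unsigned int i = 0; i < nTypes; ++i) {
    if (atomNumberTypes[i] == a.atomicNum) {
      typeIdx = i;
      break;
    }
  }
  std::uint32_t code = branches;
  code |= pi << numBranchBits;
  code |= typeIdx << (numBranchBits + numPiBits);
  return code;
}
}  // namespace atom_code_sketch
```

With this layout a carbon with three neighbors and one pi electron gets the code 3 | (1 << 3) | (1 << 5) = 43, and passing branchSubtract = 1 lowers it to 42.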
boost::uint32_t RDKit::AtomPairs::getAtomPairCode ( boost::uint32_t  codeI,
boost::uint32_t  codeJ,
unsigned int  dist 
)

returns an atom pair hash based on two atom hashes and the distance between the atoms.

Parameters:
codeI the hash for the first atom
codeJ the hash for the second atom
dist the distance (number of bonds) between the two atoms
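One way to combine two atom codes and a distance into a single value of numPathBits + 2*codeSize bits is sketched below. The bit layout and the ordering of the two codes (smaller first, so the result does not depend on argument order) are assumptions for illustration, not a statement of the library's implementation.

```cpp
#include <algorithm>
#include <cstdint>

namespace pair_code_sketch {
constexpr unsigned int codeSize = 9;     // numTypeBits + numPiBits + numBranchBits
constexpr unsigned int numPathBits = 5;  // distance field width

// Packs (distance, smaller code, larger code) into one 23-bit value.
inline std::uint32_t pairCode(std::uint32_t codeI, std::uint32_t codeJ,
                              unsigned int dist) {
  const std::uint32_t lo = std::min(codeI, codeJ);
  const std::uint32_t hi = std::max(codeI, codeJ);
  return dist | (lo << numPathBits) | (hi << (numPathBits + codeSize));
}
}  // namespace pair_code_sketch
```

Sorting the two codes before packing means pairCode(a, b, d) and pairCode(b, a, d) yield the same value, which is the property an atom-pair descriptor needs.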
SparseIntVect<boost::int32_t>* RDKit::AtomPairs::getAtomPairFingerprint ( const ROMol &  mol,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0 
)

This is an overloaded function, provided for convenience. It differs from the other getAtomPairFingerprint overload only in the arguments it accepts: minLength and maxLength are omitted.

SparseIntVect<boost::int32_t>* RDKit::AtomPairs::getAtomPairFingerprint ( const ROMol &  mol,
unsigned int  minLength,
unsigned int  maxLength,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0 
)

returns the atom-pair fingerprint for a molecule

The algorithm used is described here: R.E. Carhart, D.H. Smith, R. Venkataraghavan; "Atom Pairs as Molecular Features in Structure-Activity Studies: Definition and Applications" JCICS 25, 64-73 (1985).

Parameters:
mol the molecule to be fingerprinted
minLength minimum distance between atoms to be considered in a pair. Default is 1 bond.
maxLength maximum distance between atoms to be considered in a pair. Default is maxPathLen-1 bonds.
fromAtoms if provided, only atom pairs that involve the specified atoms will be included in the fingerprint
ignoreAtoms if provided, any atom pairs that include the specified atoms will not be included in the fingerprint
Returns:
a pointer to the fingerprint. The client is responsible for calling delete on this.
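The fromAtoms and ignoreAtoms descriptions above amount to a per-pair filter. The sketch below expresses that filter as a stand-alone predicate. The names keepPair and contains are hypothetical helpers, and the precedence (ignoreAtoms checked first) is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

namespace filter_sketch {
using AtomList = std::vector<std::uint32_t>;

inline bool contains(const AtomList &list, std::uint32_t idx) {
  return std::find(list.begin(), list.end(), idx) != list.end();
}

// Returns true if the pair (i, j) should contribute to the fingerprint.
// A null pointer means "no restriction", matching the =0 defaults above.
inline bool keepPair(std::uint32_t i, std::uint32_t j,
                     const AtomList *fromAtoms = nullptr,
                     const AtomList *ignoreAtoms = nullptr) {
  if (ignoreAtoms && (contains(*ignoreAtoms, i) || contains(*ignoreAtoms, j))) {
    return false;  // pair includes an ignored atom
  }
  if (fromAtoms && !contains(*fromAtoms, i) && !contains(*fromAtoms, j)) {
    return false;  // pair involves none of the requested atoms
  }
  return true;
}
}  // namespace filter_sketch
```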
SparseIntVect<boost::int32_t>* RDKit::AtomPairs::getHashedAtomPairFingerprint ( const ROMol &  mol,
unsigned int  nBits = 2048,
unsigned int  minLength = 1,
unsigned int  maxLength = maxPathLen-1,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0 
)

returns the hashed atom-pair fingerprint for a molecule

Parameters:
mol the molecule to be fingerprinted
nBits the length of the fingerprint to generate
minLength minimum distance between atoms to be considered in a pair. Default is 1 bond.
maxLength maximum distance between atoms to be considered in a pair. Default is maxPathLen-1 bonds.
fromAtoms if provided, only atom pairs that involve the specified atoms will be included in the fingerprint
ignoreAtoms if provided, any atom pairs that include the specified atoms will not be included in the fingerprint
Returns:
a pointer to the fingerprint. The client is responsible for calling delete on this.
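A hashed fingerprint limits the index space to nBits entries. As a rough illustration of that idea, the sketch below folds a sparse count map into nBits bins by taking each index modulo nBits. The modulo fold and the SparseCounts alias are assumptions for this example. The library may hash the pair codes differently before reducing them.

```cpp
#include <cstdint>
#include <map>

namespace fold_sketch {
// Stand-in for a sparse count vector: index -> count.
using SparseCounts = std::map<std::uint32_t, int>;

// Folds arbitrary indices into the range [0, nBits), summing colliding counts.
inline SparseCounts foldCounts(const SparseCounts &full, unsigned int nBits) {
  SparseCounts folded;
  for (const auto &entry : full) {
    folded[entry.first % nBits] += entry.second;
  }
  return folded;
}
}  // namespace fold_sketch
```

Indices that collide after folding have their counts added together, so the total count is preserved while the index range shrinks.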
ExplicitBitVect* RDKit::AtomPairs::getHashedAtomPairFingerprintAsBitVect ( const ROMol &  mol,
unsigned int  nBits = 2048,
unsigned int  minLength = 1,
unsigned int  maxLength = maxPathLen-1,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0,
unsigned int  nBitsPerEntry = 4 
)

returns the hashed atom-pair fingerprint for a molecule as a bit vector

Parameters:
mol the molecule to be fingerprinted
nBits the length of the fingerprint to generate
minLength minimum distance between atoms to be considered in a pair. Default is 1 bond.
maxLength maximum distance between atoms to be considered in a pair. Default is maxPathLen-1 bonds.
fromAtoms if provided, only atom pairs that involve the specified atoms will be included in the fingerprint
ignoreAtoms if provided, any atom pairs that include the specified atoms will not be included in the fingerprint
nBitsPerEntry number of bits to use in simulating counts
Returns:
a pointer to the fingerprint. The client is responsible for calling delete on this.
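"Simulating counts" in a bit vector can be pictured as giving each count bin several bits, each switched on once the count reaches a threshold. The following sketch shows that idea with assumed thresholds of 1, 2, 4, 8 and so on. The thresholds, the function name countsToBits, and the way bins map onto the output are illustrative assumptions, not the documented behavior.

```cpp
#include <cstddef>
#include <vector>

namespace count_sim_sketch {
// Expands each count into nBitsPerEntry bits; bit k of a bin is set when the
// count reaches the k-th threshold (assumed here to be 1, 2, 4, 8, ...).
inline std::vector<bool> countsToBits(const std::vector<int> &counts,
                                      unsigned int nBitsPerEntry = 4) {
  std::vector<bool> bits(counts.size() * nBitsPerEntry, false);
  for (std::size_t bin = 0; bin < counts.size(); ++bin) {
    int threshold = 1;
    for (unsigned int k = 0; k < nBitsPerEntry; ++k, threshold *= 2) {
      if (counts[bin] >= threshold) {
        bits[bin * nBitsPerEntry + k] = true;
      }
    }
  }
  return bits;
}
}  // namespace count_sim_sketch
```

Under these assumptions a count of 3 sets the first two bits of its bin, while a count of 9 sets all four, so bit-based similarity measures still see some of the count information.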
SparseIntVect<boost::int64_t >* RDKit::AtomPairs::getHashedTopologicalTorsionFingerprint ( const ROMol &  mol,
unsigned int  nBits = 2048,
unsigned int  targetSize = 4,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0 
)

returns a hashed topological-torsion fingerprint for a molecule

The algorithm used is described here: R. Nilakantan, N. Bauman, J. S. Dixon, R. Venkataraghavan; "Topological Torsion: A New Molecular Descriptor for SAR Applications. Comparison with Other Descriptors" JCICS 27, 82-85 (1987).

Parameters:
mol the molecule to be fingerprinted
nBits number of bits to include in the fingerprint
targetSize the number of atoms to include in the "torsions"
fromAtoms if provided, only torsions that start or end at the specified atoms will be included in the fingerprint
ignoreAtoms if provided, any torsions that include the specified atoms will not be included in the fingerprint
Returns:
a pointer to the fingerprint. The client is responsible for calling delete on this.
ExplicitBitVect* RDKit::AtomPairs::getHashedTopologicalTorsionFingerprintAsBitVect ( const ROMol &  mol,
unsigned int  nBits = 2048,
unsigned int  targetSize = 4,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0,
unsigned int  nBitsPerEntry = 4 
)

returns a hashed topological-torsion fingerprint for a molecule as a bit vector

Parameters:
mol the molecule to be fingerprinted
nBits number of bits to include in the fingerprint
targetSize the number of atoms to include in the "torsions"
fromAtoms if provided, only torsions that start or end at the specified atoms will be included in the fingerprint
ignoreAtoms if provided, any torsions that include the specified atoms will not be included in the fingerprint
nBitsPerEntry number of bits to use in simulating counts
Returns:
a pointer to the fingerprint. The client is responsible for calling delete on this.
boost::uint64_t RDKit::AtomPairs::getTopologicalTorsionCode ( const std::vector< boost::uint32_t > &  atomCodes  ) 

returns a topological torsion hash based on the atom hashes passed in

Parameters:
atomCodes the vector of atom hashes
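A torsion hash built from a path of atom codes should not depend on the direction in which the path is read. The sketch below packs codeSize bits per atom into a 64-bit value after choosing a canonical direction. The canonicalization rule and the bit layout are assumptions for illustration and may not match the library's.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace torsion_code_sketch {
constexpr unsigned int codeSize = 9;  // bits per atom code

inline std::uint64_t torsionCode(const std::vector<std::uint32_t> &atomCodes) {
  const std::size_t n = atomCodes.size();
  if (n * codeSize > 64) {
    throw std::invalid_argument("too many atom codes for a 64-bit hash");
  }
  // Pick the reading direction whose sequence is lexicographically smaller.
  bool reverse = false;
  for (std::size_t i = 0, j = n; i + 1 < j; ++i, --j) {
    if (atomCodes[i] != atomCodes[j - 1]) {
      reverse = atomCodes[i] > atomCodes[j - 1];
      break;
    }
  }
  // Pack the codes, codeSize bits each, in the chosen direction.
  std::uint64_t res = 0;
  for (std::size_t k = 0; k < n; ++k) {
    const std::uint64_t c = reverse ? atomCodes[n - 1 - k] : atomCodes[k];
    res |= c << (codeSize * k);
  }
  return res;
}
}  // namespace torsion_code_sketch
```

With the default targetSize of 4 the packed value uses 36 bits, which is consistent with the torsion fingerprints using 64-bit indices.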
SparseIntVect<boost::int64_t >* RDKit::AtomPairs::getTopologicalTorsionFingerprint ( const ROMol &  mol,
unsigned int  targetSize = 4,
const std::vector< boost::uint32_t > *  fromAtoms = 0,
const std::vector< boost::uint32_t > *  ignoreAtoms = 0 
)

returns the topological-torsion fingerprint for a molecule

The algorithm used is described here: R. Nilakantan, N. Bauman, J. S. Dixon, R. Venkataraghavan; "Topological Torsion: A New Molecular Descriptor for SAR Applications. Comparison with Other Descriptors" JCICS 27, 82-85 (1987).

Parameters:
mol the molecule to be fingerprinted
targetSize the number of atoms to include in the "torsions"
fromAtoms if provided, only torsions that start or end at the specified atoms will be included in the fingerprint
ignoreAtoms if provided, any torsions that include the specified atoms will not be included in the fingerprint
Returns:
a pointer to the fingerprint. The client is responsible for calling delete on this.
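Every fingerprint function above returns a raw pointer that the caller must delete. One common way to meet that obligation is to hand the pointer to a std::unique_ptr immediately. The sketch below shows the pattern with a stand-in type and factory (StandInFingerprint and makeStandInFingerprint are hypothetical and only mimic the ownership contract), so it compiles without the library.

```cpp
#include <memory>

namespace ownership_sketch {
// Hypothetical stand-in for a fingerprint object.
struct StandInFingerprint {
  int length = 0;
};

// Hypothetical factory with the same contract as the functions above:
// it returns a heap-allocated object that the caller owns.
inline StandInFingerprint *makeStandInFingerprint(int length) {
  StandInFingerprint *fp = new StandInFingerprint;
  fp->length = length;
  return fp;
}

inline int useFingerprint() {
  // Taking ownership right away means delete is called automatically,
  // including on early returns and exceptions.
  std::unique_ptr<StandInFingerprint> fp(makeStandInFingerprint(2048));
  return fp->length;
}
}  // namespace ownership_sketch
```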

Variable Documentation

const unsigned int RDKit::AtomPairs::atomNumberTypes[1<< numTypeBits] = {5,6,7,8,9,14,15,16,17,33,34,35,51,52,43}

Definition at line 25 of file AtomPairs.h.

const std::string RDKit::AtomPairs::atomPairsVersion = "1.1.0"

Definition at line 23 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::codeSize = numTypeBits+numPiBits+numBranchBits

Definition at line 30 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::maxNumBranches = (1<<numBranchBits)-1

Definition at line 29 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::maxNumPi = (1<<numPiBits)-1

Definition at line 27 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::maxPathLen = (1<<numPathBits)-1

Definition at line 32 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::numAtomPairFingerprintBits = numPathBits+2*codeSize

Definition at line 33 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::numBranchBits = 3

Definition at line 28 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::numPathBits = 5

Definition at line 31 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::numPiBits = 2

Definition at line 26 of file AtomPairs.h.

const unsigned int RDKit::AtomPairs::numTypeBits = 4

Definition at line 24 of file AtomPairs.h.