Package Chem :: Package AtomPairs :: Module Utils

Module Utils


Functions
 
ExplainAtomCode(code, branchSubtract=0)
**Arguments**:

  - the code to be considered

  - branchSubtract: (optional) the constant that was subtracted off
    the number of neighbors before integrating it into the code.
 
GetAtomCode(...)
GetAtomPairAtomCode( (Atom)atom [, (int)branchSubtract=0]) -> int :...
 
NumPiElectrons(atom)
Returns the number of electrons an atom is using for pi bonding

>>> m = Chem.MolFromSmiles('C=C')
>>> NumPiElectrons(m.GetAtomWithIdx(0))
1
>>> m = Chem.MolFromSmiles('C#CC')
>>> NumPiElectrons(m.GetAtomWithIdx(0))
2
>>> NumPiElectrons(m.GetAtomWithIdx(1))
2
>>> m = Chem.MolFromSmiles('O=C=CC')
>>> NumPiElectrons(m.GetAtomWithIdx(0))
1
>>> NumPiElectrons(m.GetAtomWithIdx(1))
2
>>> NumPiElectrons(m.GetAtomWithIdx(2))
1
>>> NumPiElectrons(m.GetAtomWithIdx(3))
0

FIX: this behaves oddly in these cases:
>>> m = Chem.MolFromSmiles('S(=O)(=O)')
>>> NumPiElectrons(m.GetAtomWithIdx(0))
2
>>> m = Chem.MolFromSmiles('S(=O)(=O)(O)O')
>>> NumPiElectrons(m.GetAtomWithIdx(0))
0

In the second case, the S atom is tagged as sp3 hybridized.
 
BitsInCommon(v1, v2)
Returns the number of bits in common between two vectors...
 
DiceSimilarity(v1, v2, bounds=None)
Implements the DICE similarity metric.
 
Dot(v1, v2)
Returns the Dot product between two vectors:...
 
CosineSimilarity(v1, v2)
Implements the Cosine similarity metric.
 
_test()

Function Details

ExplainAtomCode(code, branchSubtract=0)



**Arguments**:

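  - code: the atom code to be explained

  - branchSubtract: (optional) the constant that was subtracted off
    the number of neighbors before integrating it into the code.
    This is used by the topological torsions code.

**Returns**: a tuple of (atomic symbol, number of neighbors encoded in
the code, number of pi electrons), as in the examples below.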
    

>>> m = Chem.MolFromSmiles('C=CC(=O)O')
>>> code = GetAtomCode(m.GetAtomWithIdx(0))
>>> ExplainAtomCode(code)
('C', 1, 1)
>>> code = GetAtomCode(m.GetAtomWithIdx(1))
>>> ExplainAtomCode(code)
('C', 2, 1)
>>> code = GetAtomCode(m.GetAtomWithIdx(2))
>>> ExplainAtomCode(code)
('C', 3, 1)
>>> code = GetAtomCode(m.GetAtomWithIdx(3))
>>> ExplainAtomCode(code)
('O', 1, 1)
>>> code = GetAtomCode(m.GetAtomWithIdx(4))
>>> ExplainAtomCode(code)
('O', 1, 0)

GetAtomCode(...)


GetAtomPairAtomCode( (Atom)atom [, (int)branchSubtract=0]) -> int :
    Returns the atom code (hash) for an atom

    C++ signature :
        unsigned int GetAtomPairAtomCode(RDKit::Atom const* [,unsigned int=0])

BitsInCommon(v1, v2)

Returns the number of bits in common between two vectors

**Arguments**:

  - two vectors (sequences of bit ids)

**Returns**: an integer

**Notes**

  - the vectors must be sorted

  - duplicate bit IDs are counted more than once

>>> BitsInCommon( (1,2,3,4,10), (2,4,6) )
2

Here's how duplicates are handled:
>>> BitsInCommon( (1,2,2,3,4), (2,2,4,5,6) )
3
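
The requirement that both vectors be sorted suggests a single merge-style pass over them. The following is a minimal sketch of that idea, written to reproduce the doctests above; it is not the module's source, and `bits_in_common_sketch` is a hypothetical name used only for illustration.

```python
def bits_in_common_sketch(v1, v2):
    """Count matching bit IDs in two sorted sequences (illustrative only)."""
    count = 0
    j = 0  # current position in v2
    for val in v1:
        # Skip entries of v2 that are smaller than the current v1 value.
        while j < len(v2) and v2[j] < val:
            j += 1
        if j >= len(v2):
            break  # v2 is exhausted, so no further matches are possible
        if v2[j] == val:
            # Consume the matched entry: a duplicate in v1 only counts
            # again if v2 holds another copy of the same ID.
            count += 1
            j += 1
    return count
```

Because the pass never moves backwards in `v2`, unsorted input would silently miss matches, which is presumably why the sorted-input note matters.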
 

DiceSimilarity(v1, v2, bounds=None)

Implements the DICE similarity metric.
This is the recommended metric in both the topological torsions
and atom pairs papers.

**Arguments**:

  - two vectors (sequences of bit ids)

**Returns**: a float.

**Notes**

  - the vectors must be sorted

  
>>> DiceSimilarity( (1,2,3), (1,2,3) )
1.0
>>> DiceSimilarity( (1,2,3), (5,6) )
0.0
>>> DiceSimilarity( (1,2,3,4), (1,3,5,7) )
0.5
>>> DiceSimilarity( (1,2,3,4,5,6), (1,3) )
0.5

Note that duplicate bit IDs count multiple times:
>>> DiceSimilarity( (1,1,3,4,5,6), (1,1) )
0.5

but only if they are duplicated in both vectors:
>>> DiceSimilarity( (1,1,3,4,5,6), (1,) )==2./7
True
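
The doctests above are consistent with the usual Dice formula, 2c / (len(v1) + len(v2)), where c is the number of bits in common. Below is a minimal sketch under that assumption. `dice_sketch` is a hypothetical name, the `bounds` argument is not described in this section and is left out, and returning 0.0 for two empty vectors is an assumption rather than documented behavior.

```python
from collections import Counter


def dice_sketch(v1, v2):
    """Dice similarity of two sequences of bit IDs (illustrative only)."""
    total = len(v1) + len(v2)
    if total == 0:
        return 0.0  # assumption: two empty vectors are treated as dissimilar
    # Multiset intersection: a duplicated ID counts once per matched pair,
    # so it only counts multiple times if duplicated in both vectors.
    common = sum((Counter(v1) & Counter(v2)).values())
    return 2.0 * common / total
```

Unlike a merge-style pass, this version does not rely on the inputs being sorted, at the cost of building two counters.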

Dot(v1, v2)

Returns the dot product between two vectors.

**Arguments**:

  - two vectors (sequences of bit ids)

**Returns**: an integer

**Notes**

  - the vectors must be sorted

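  - duplicate bit IDs are counted more than once: in the examples
    below, a bit ID present in both vectors contributes the square
    of the smaller of its two occurrence counts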

>>> Dot( (1,2,3,4,10), (2,4,6) )
2

Here's how duplicates are handled:
>>> Dot( (1,2,2,3,4), (2,2,4,5,6) )
5
>>> Dot( (1,2,2,3,4), (2,4,5,6) )
2
>>> Dot( (1,2,2,3,4), (5,6) )
0
>>> Dot( (), (5,6) )
0
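
The duplicate handling in the doctests above is consistent with each shared bit ID contributing the square of the smaller of its two occurrence counts. The sketch below assumes exactly that rule; it is inferred from the examples, not taken from the module's source, and `dot_sketch` is a hypothetical name.

```python
from collections import Counter


def dot_sketch(v1, v2):
    """Dot product of two sequences of bit IDs (illustrative only)."""
    c1, c2 = Counter(v1), Counter(v2)
    # Only IDs present in both vectors contribute; each contributes
    # min(count in v1, count in v2) squared.
    shared = c1.keys() & c2.keys()
    return sum(min(c1[bit], c2[bit]) ** 2 for bit in shared)
```

With this rule, the product of a vector with itself is the sum of its squared occurrence counts, which is what a cosine normalization needs.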

CosineSimilarity(v1, v2)

Implements the Cosine similarity metric.
This is the recommended metric in the LaSSI paper.

**Arguments**:

  - two vectors (sequences of bit ids)

**Returns**: a float.

**Notes**

  - the vectors must be sorted

>>> print('%.3f' % CosineSimilarity( (1,2,3,4,10), (2,4,6) ))
0.516
>>> print('%.3f' % CosineSimilarity( (1,2,2,3,4), (2,2,4,5,6) ))
0.714
>>> print('%.3f' % CosineSimilarity( (1,2,2,3,4), (1,2,2,3,4) ))
1.000
>>> print('%.3f' % CosineSimilarity( (1,2,2,3,4), (5,6,7) ))
0.000
>>> print('%.3f' % CosineSimilarity( (1,2,2,3,4), () ))
0.000
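
The values above are consistent with dividing the dot product of the two vectors by the square root of the product of each vector's dot product with itself. The sketch below assumes that form and a dot product in which each shared bit ID contributes the square of its smaller occurrence count. Both helper names are hypothetical, and the zero-denominator fallback is inferred from the last doctest.

```python
import math
from collections import Counter


def _dot_sketch(v1, v2):
    """Assumed dot product: min occurrence count squared, per shared ID."""
    c1, c2 = Counter(v1), Counter(v2)
    return sum(min(c1[bit], c2[bit]) ** 2 for bit in c1.keys() & c2.keys())


def cosine_sketch(v1, v2):
    """Cosine similarity of two sequences of bit IDs (illustrative only)."""
    denom = math.sqrt(_dot_sketch(v1, v1) * _dot_sketch(v2, v2))
    if denom == 0:
        return 0.0  # an empty vector has zero norm; avoid dividing by zero
    return _dot_sketch(v1, v2) / denom
```

For example, `'%.3f' % cosine_sketch((1, 2, 3, 4, 10), (2, 4, 6))` evaluates to `'0.516'`, since the numerator is 2 and the denominator is the square root of 5 * 3.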