rdkit.ML.InfoTheory.BitClusterer module

class rdkit.ML.InfoTheory.BitClusterer.BitClusterer(idList, nCluster, type=rdkit.SimDivFilters.rdSimDivPickers.ClusterMethod.WARD)

Bases: object

Class to cluster a set of bits based on their correlation

The correlation matrix is first built by reading the fingerprints from a database or a list of fingerprints

ClusterBits(corrMat)
GetClusters()
MapToClusterFP(fp)

Map the fingerprint to a smaller sized (= number of clusters) fingerprint

Each cluster gets a bit in the new fingerprint, and that bit is turned on if any of the bits in the cluster are turned on in the original fingerprint
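
The mapping described above can be illustrated with a minimal pure-Python sketch. It does not call RDKit and is not the library's implementation: the helper name `map_to_cluster_fp`, the representation of clusters as a list of bit-id lists, and the representation of the fingerprint as a set of on-bit ids are all assumptions made for illustration.

```python
def map_to_cluster_fp(on_bits, clusters):
    """Return one 0/1 flag per cluster (hypothetical helper, not RDKit API).

    A flag is 1 if at least one bit id in that cluster is on in the
    original fingerprint, otherwise 0.
    """
    on_bits = set(on_bits)
    return [int(any(bit in on_bits for bit in cluster)) for cluster in clusters]


# Assumed toy data: three clusters covering bit ids 0..7.
clusters = [[0, 3, 7], [1, 2], [4, 5, 6]]
# Bits 2, 3 and 7 are on in the original (8-bit) fingerprint.
on_bits = {2, 3, 7}

cluster_fp = map_to_cluster_fp(on_bits, clusters)
print(cluster_fp)
```

The result has one entry per cluster, so its length equals the number of clusters rather than the length of the original fingerprint.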

MapToClusterScores(fp)

Map the fingerprint to a real-valued vector of scores based on the bit clusters

The dimension of the vector is the same as the number of clusters. Each value in the vector is the number of bits in the corresponding cluster that are turned on in the fingerprint

ARGUMENTS:
  • fp : the fingerprint
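
The scoring rule above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not RDKit code: the helper name `map_to_cluster_scores`, the clusters given as a list of bit-id lists, and the fingerprint given as a set of on-bit ids are all hypothetical.

```python
def map_to_cluster_scores(on_bits, clusters):
    """Return one count per cluster (hypothetical helper, not RDKit API).

    Each count is the number of bit ids in that cluster that are on
    in the original fingerprint.
    """
    on_bits = set(on_bits)
    return [sum(1 for bit in cluster if bit in on_bits) for cluster in clusters]


# Assumed toy data: three clusters covering bit ids 0..7.
clusters = [[0, 3, 7], [1, 2], [4, 5, 6]]
# Bits 2, 3 and 7 are on in the original (8-bit) fingerprint.
on_bits = {2, 3, 7}

scores = map_to_cluster_scores(on_bits, clusters)
print(scores)
```

Unlike the one-bit-per-cluster mapping, the scores keep track of how many bits of each cluster are set, so two fingerprints that light up the same clusters can still be told apart.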

SetClusters(clusters)