rdkit.ML.Cluster.Clusters module

Contains the Cluster class for representing hierarchical cluster trees.

class rdkit.ML.Cluster.Clusters.Cluster(metric=0.0, children=None, position=None, index=-1, name=None, data=None)

Bases: object

A class for storing clusters and their data.

General Remarks

  • It is assumed that the bottom of any cluster hierarchy tree is composed of the individual data points which were clustered.

  • Cluster objects store the following pieces of data, most of which are accessible via standard setters/getters:

    • Children: Not Settable, the list of children. You can add children with the AddChild() and AddChildren() methods.

      Note that this list can be of arbitrary length, but the current algorithms only produce trees with two children per cluster.

    • Metric: the metric for this cluster (i.e. how far apart its children are)

    • Index: the order in which this cluster was generated

    • Points: Not Settable, the list of original points in this cluster (calculated recursively from the children)

    • PointsPositions: Not Settable, the list of positions of the original points in this cluster (calculated recursively from the children)

    • Position: the location of the cluster. Note that for a cluster this probably means the average location of all the Points which are its children.

    • Data: a data field. This is used with the original points to store their data value (i.e. the value we’re using to classify)

    • Name: the name of this cluster
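The remarks above describe a tree whose leaves are the original data points and whose Points and PointsPositions are collected recursively from the children. The following is a minimal sketch of that idea, not RDKit's implementation: `SketchCluster`, its snake_case methods, and `mean_position` are hypothetical names introduced only for illustration, and the averaging of leaf positions is an assumption based on the "probably means" wording for Position.

```python
import math


class SketchCluster:
    """Hypothetical stand-in; this is NOT rdkit.ML.Cluster.Clusters.Cluster."""

    def __init__(self, metric=0.0, children=None, position=None,
                 index=-1, name=None, data=None):
        self.metric = metric
        self.children = list(children) if children else []
        self.position = position
        self.index = index
        self.name = name
        self.data = data

    def is_terminal(self):
        # A leaf is an original data point: it has no children.
        return not self.children

    def get_points(self):
        # Leaves return themselves; inner nodes gather leaves recursively.
        if self.is_terminal():
            return [self]
        points = []
        for child in self.children:
            points.extend(child.get_points())
        return points

    def get_points_positions(self):
        return [p.position for p in self.get_points()]


def mean_position(positions):
    # Assumption: a cluster's position is the average of its leaf positions.
    n = len(positions)
    return [sum(coords) / n for coords in zip(*positions)]


# Three original data points (the bottom of the hierarchy).
a = SketchCluster(position=[0.0, 0.0], index=1, name="a", data=0)
b = SketchCluster(position=[2.0, 0.0], index=2, name="b", data=1)
c = SketchCluster(position=[4.0, 3.0], index=3, name="c", data=1)

# Merge a and b; the metric records how far apart the children are.
ab = SketchCluster(metric=math.dist(a.position, b.position),
                   children=[a, b], index=4, name="ab")
ab.position = mean_position(ab.get_points_positions())

# Merge the (a, b) cluster with c to form the root.
root = SketchCluster(metric=math.dist(ab.position, c.position),
                     children=[ab, c], index=5, name="root")
root.position = mean_position(root.get_points_positions())

print([p.name for p in root.get_points()])
print(root.position)
```

Each cluster here has exactly two children, matching the note about the current algorithms, but nothing in the sketch depends on that.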

Constructor

Arguments

See the class documentation above for the meanings of these arguments.

AddChild(child)

Adds a child to this cluster's list of children.

Arguments

  • child: a Cluster

AddChildren(children)

Adds several children to this cluster's list of children.

Arguments

  • children: a list of Clusters

Compare(other, ignoreExtras=1)

A less strict comparison than self == other.

FindSubtree(index)

Finds and returns the subtree with the given index.
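A lookup by index over a cluster tree can be sketched as a depth-first search. This is only an illustration of the technique, not RDKit's code: `Node` and `find_subtree` are hypothetical names, and returning `None` when no cluster has the requested index is an assumption, since the documentation does not say what happens in that case.

```python
class Node:
    """Hypothetical stand-in carrying only what the search needs."""

    def __init__(self, index, children=None):
        self.index = index
        self.children = list(children) if children else []


def find_subtree(node, index):
    """Return the first node (depth-first, pre-order) with a matching index.

    Returns None if no node in the tree has that index (an assumption).
    """
    if node.index == index:
        return node
    for child in node.children:
        found = find_subtree(child, index)
        if found is not None:
            return found
    return None


# Leaves 1..3 are data points; 4 and 5 are clusters in generation order.
leaf1, leaf2, leaf3 = Node(1), Node(2), Node(3)
inner = Node(4, [leaf1, leaf2])
root = Node(5, [inner, leaf3])

sub = find_subtree(root, 4)
print([child.index for child in sub.children])
print(find_subtree(root, 99))
```

The explicit `is not None` check avoids relying on the truthiness of a node object, which matters if a cluster class defines its own length or boolean behaviour.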

GetChildren()
GetData()
GetIndex()
GetMetric()
GetName()
GetPoints()
GetPointsPositions()
GetPosition()
IsTerminal()
Print(level=0, showData=0, offset='\t')
RemoveChild(child)

Removes a child from this cluster's list of children.

Arguments

  • child: a Cluster

SetData(data)
SetIndex(index)
SetMetric(metric)
SetName(name)
SetPosition(pos)
rdkit.ML.Cluster.Clusters.cmp(t1, t2)