RDInfoTheory::InfoBitRanker Class Reference

#include <InfoBitRanker.h>


Public Types

enum  InfoType { ENTROPY = 1, BIASENTROPY = 2, CHISQUARE = 3, BIASCHISQUARE = 4 }
 The type of information measure.

Public Member Functions

 InfoBitRanker (unsigned int nBits, unsigned int nClasses, InfoType infoType=InfoBitRanker::ENTROPY)
 Constructor.
 ~InfoBitRanker ()
void accumulateVotes (const ExplicitBitVect &bv, unsigned int label)
 Accumulate the votes for all the bits turned on in a bit vector.
void accumulateVotes (const SparseBitVect &bv, unsigned int label)
double * getTopN (unsigned int num)
 Returns the top n bits ranked by the information metric.
unsigned int getNumInstances () const
 Returns the number of labelled instances (examples) or fingerprints seen so far.
unsigned int getNumClasses () const
 Returns the number of classes.
void setBiasList (RDKit::INT_VECT &classList)
 Set the classes to which the entropy calculation should be biased.
void setMaskBits (RDKit::INT_VECT &maskBits)
 Set the bits to be used as a mask.
void writeTopBitsToStream (std::ostream *outStream) const
 Write the top N bits to a stream.
void writeTopBitsToFile (std::string fileName) const
 Write the top bits to a file.

Detailed Description

Definition at line 78 of file InfoBitRanker.h.


Member Enumeration Documentation

The type of information measure.

Enumerator:
ENTROPY 
BIASENTROPY 
CHISQUARE 
BIASCHISQUARE 

Definition at line 84 of file InfoBitRanker.h.


Constructor & Destructor Documentation

RDInfoTheory::InfoBitRanker::InfoBitRanker ( unsigned int  nBits,
unsigned int  nClasses,
InfoType  infoType = InfoBitRanker::ENTROPY 
) [inline]

Constructor.

ARGUMENTS:

  • nBits: the dimension of the bit vectors or the fingerprint length
  • nClasses: the number of classes used in the classification problem (e.g. active, moderately active, inactive). The classes are assumed to be numbered from 0 to (nClasses - 1).
  • infoType: the type of information metric

Definition at line 101 of file InfoBitRanker.h.

RDInfoTheory::InfoBitRanker::~InfoBitRanker (  )  [inline]

Definition at line 117 of file InfoBitRanker.h.


Member Function Documentation

void RDInfoTheory::InfoBitRanker::accumulateVotes ( const SparseBitVect & bv,
unsigned int  label 
)
void RDInfoTheory::InfoBitRanker::accumulateVotes ( const ExplicitBitVect & bv,
unsigned int  label 
)

Accumulate the votes for all the bits turned on in a bit vector.

ARGUMENTS:

  • bv: a bit vector that supports the [] operator
  • label: the class label for the bit vector. It is assumed that 0 <= label < nClasses
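
The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the RDKit implementation: the VoteTable struct, its members, and the flat counts layout are all hypothetical names chosen for this example. It only assumes what the text states, namely that the bit vector supports operator[] and that labels lie in [0, nClasses).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the ranker's vote bookkeeping (not RDKit code).
struct VoteTable {
  unsigned int nBits;
  unsigned int nClasses;
  std::vector<unsigned int> counts;       // counts[bit * nClasses + label]
  std::vector<unsigned int> classTotals;  // bit vectors seen per class

  VoteTable(unsigned int bits, unsigned int classes)
      : nBits(bits),
        nClasses(classes),
        counts(static_cast<std::size_t>(bits) * classes),
        classTotals(classes) {}

  // Works with any type whose operator[] yields something testable as bool.
  template <typename BitVect>
  void accumulateVotes(const BitVect &bv, unsigned int label) {
    assert(label < nClasses);  // labels are assumed to be 0 .. nClasses - 1
    for (unsigned int i = 0; i < nBits; ++i) {
      if (bv[i]) {
        ++counts[static_cast<std::size_t>(i) * nClasses + label];
      }
    }
    ++classTotals[label];
  }

  // Number of vectors of class `label` that had `bit` turned on.
  unsigned int votes(unsigned int bit, unsigned int label) const {
    return counts[static_cast<std::size_t>(bit) * nClasses + label];
  }

  // Total number of labelled bit vectors seen so far.
  unsigned int numInstances() const {
    unsigned int total = 0;
    for (std::size_t c = 0; c < classTotals.size(); ++c) total += classTotals[c];
    return total;
  }
};
```

In this sketch each call adds one vote per set bit to the (bit, label) cell, so the per-class hit counts needed for ranking are available without storing the bit vectors themselves.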

unsigned int RDInfoTheory::InfoBitRanker::getNumClasses (  )  const [inline]

Returns the number of classes.

Definition at line 155 of file InfoBitRanker.h.

unsigned int RDInfoTheory::InfoBitRanker::getNumInstances (  )  const [inline]

Returns the number of labelled instances (examples) or fingerprints seen so far.

Definition at line 148 of file InfoBitRanker.h.

double* RDInfoTheory::InfoBitRanker::getTopN ( unsigned int  num  ) 

Returns the top n bits ranked by the information metric.

This function does most of the ranking work.

Parameters:

  • num: the number of top-ranked bits required

Returns:

  • a pointer to an information array. The client should *not* delete this.
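
To make the idea of ranking bits by an information metric concrete, here is a minimal sketch of an entropy-based (information gain) ranking. It is an illustration under stated assumptions, not RDKit's code: the function names entropy, infoGain and topN, the BitScore typedef, and the input layout (per-bit "on" counts per class plus per-class totals) are all hypothetical, and the actual layout of the array returned by getTopN is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<unsigned int, double> BitScore;  // (bit id, information gain)

// Shannon entropy (in bits) of a class-count distribution.
double entropy(const std::vector<double> &counts) {
  double total = 0.0;
  for (std::size_t i = 0; i < counts.size(); ++i) total += counts[i];
  if (total <= 0.0) return 0.0;
  double h = 0.0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    if (counts[i] > 0.0) {
      double p = counts[i] / total;
      h -= p * std::log2(p);
    }
  }
  return h;
}

// Gain from splitting the examples on one bit: H(all) minus the
// size-weighted entropies of the "bit on" and "bit off" subsets.
double infoGain(const std::vector<double> &on,
                const std::vector<double> &classTotals) {
  assert(on.size() == classTotals.size());
  std::vector<double> off(on.size());
  double nOn = 0.0, n = 0.0;
  for (std::size_t c = 0; c < on.size(); ++c) {
    off[c] = classTotals[c] - on[c];
    nOn += on[c];
    n += classTotals[c];
  }
  if (n <= 0.0) return 0.0;
  return entropy(classTotals) - (nOn / n) * entropy(on) -
         ((n - nOn) / n) * entropy(off);
}

// Returns the `num` best bits, highest gain first.
std::vector<BitScore> topN(const std::vector<std::vector<double> > &onCounts,
                           const std::vector<double> &classTotals,
                           std::size_t num) {
  std::vector<BitScore> ranked;
  for (std::size_t bit = 0; bit < onCounts.size(); ++bit) {
    ranked.push_back(BitScore(static_cast<unsigned int>(bit),
                              infoGain(onCounts[bit], classTotals)));
  }
  num = std::min(num, ranked.size());
  std::partial_sort(ranked.begin(), ranked.begin() + num, ranked.end(),
                    [](const BitScore &a, const BitScore &b) {
                      return a.second > b.second;
                    });
  ranked.resize(num);
  return ranked;
}
```

The sketch returns its results by value, which sidesteps the ownership question noted above: with getTopN the returned pointer refers to memory the ranker manages, so the caller must not delete it.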

void RDInfoTheory::InfoBitRanker::setBiasList ( RDKit::INT_VECT & classList  ) 

Set the classes to which the entropy calculation should be biased.

This list contains the class ids used in the BIASENTROPY mode of ranking bits. In this mode, a bit must be correlated more strongly with one of the biased classes than with all the other classes. For example, in a two-class problem with actives and inactives, the fraction of actives that hit the bit must be greater than the fraction of inactives that hit the bit.

ARGUMENTS:

  • classList: list of class ids that we want a bias towards
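
The bias condition can be illustrated with a small stand-alone check. This is a sketch of one reading of the rule, not the RDKit implementation: the passesBias function and its parameters are hypothetical, and it assumes that "the other classes" means the classes not in the bias list. The bias list is typed as std::vector<int>, which is what RDKit::INT_VECT is a typedef for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: does some biased class hit the bit with a higher
// fraction than every class outside the bias list?
bool passesBias(const std::vector<unsigned int> &onCounts,
                const std::vector<unsigned int> &classTotals,
                const std::vector<int> &biasList) {
  assert(onCounts.size() == classTotals.size());
  double bestBiased = -1.0;  // highest hit fraction among biased classes
  double bestOther = -1.0;   // highest hit fraction among the remaining classes
  for (std::size_t c = 0; c < onCounts.size(); ++c) {
    // Fraction of this class's examples that have the bit set.
    double frac = classTotals[c]
                      ? static_cast<double>(onCounts[c]) / classTotals[c]
                      : 0.0;
    bool biased = std::find(biasList.begin(), biasList.end(),
                            static_cast<int>(c)) != biasList.end();
    if (biased) {
      bestBiased = std::max(bestBiased, frac);
    } else {
      bestOther = std::max(bestOther, frac);
    }
  }
  // With an empty bias list this sketch rejects every bit.
  return bestBiased > bestOther;
}
```

For the two-class example in the text, with class 1 as the actives, a bit set in 10 of 20 actives and 2 of 10 inactives passes (0.5 > 0.2), while a bit set in 4 of 20 actives and 5 of 10 inactives does not.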

void RDInfoTheory::InfoBitRanker::setMaskBits ( RDKit::INT_VECT & maskBits  ) 

Set the bits to be used as a mask.

If this function is called, only the bits which are present in the maskBits list will be used.

ARGUMENTS:

  • maskBits: the bits to be considered
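
The effect of a mask can be sketched as a restriction of the candidate bit ids. The candidateBits helper below is hypothetical and is not part of RDKit; in particular, treating an empty mask as "no mask set, use every bit" is an assumption of this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the bit ids a ranker would consider, given a mask.
std::vector<unsigned int> candidateBits(unsigned int nBits,
                                        const std::vector<int> &maskBits) {
  std::vector<unsigned int> bits;
  if (maskBits.empty()) {
    // Assumption: no mask means every bit is a candidate.
    for (unsigned int i = 0; i < nBits; ++i) bits.push_back(i);
    return bits;
  }
  for (std::size_t k = 0; k < maskBits.size(); ++k) {
    int b = maskBits[k];
    assert(b >= 0 && static_cast<unsigned int>(b) < nBits);  // must be a valid bit id
    bits.push_back(static_cast<unsigned int>(b));
  }
  return bits;
}
```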

void RDInfoTheory::InfoBitRanker::writeTopBitsToFile ( std::string  fileName  )  const

Write the top bits to a file.

void RDInfoTheory::InfoBitRanker::writeTopBitsToStream ( std::ostream *  outStream  )  const

Write the top N bits to a stream.


The documentation for this class was generated from the following file:

  • InfoBitRanker.h

Generated on Wed Jun 30 07:07:32 2010 for RDCode by  doxygen 1.6.3