ModErn Text Analysis
META Enumerates Textual Applications
meta::classify::winnow Class Reference

Implements the Winnow classifier, a simplistic linear classifier for linearly-separable data.

#include <winnow.h>

Inherits meta::classify::classifier.

Public Member Functions

 winnow (std::shared_ptr< index::forward_index > idx, double m=default_m, double gamma=default_gamma, size_t max_iter=default_max_iter)
 Constructs a winnow classifier with the given multiplier, error threshold, and maximum iterations.
 
void train (const std::vector< doc_id > &docs) override
 Trains the winnow on the given training documents.
 
class_label classify (doc_id d_id) override
 Classifies the given document.
 
void reset () override
 Resets all learned information for this winnow so it may be re-learned.
 
- Public Member Functions inherited from meta::classify::classifier
 classifier (std::shared_ptr< index::forward_index > idx)
 
virtual confusion_matrix test (const std::vector< doc_id > &docs)
 Classifies a collection of documents into specific groups, as determined by training data; this function will make repeated calls to classify().
 
virtual confusion_matrix cross_validate (const std::vector< doc_id > &input_docs, size_t k, bool even_split=false, int seed=1)
 Performs k-fold cross-validation on a set of documents.
 

Static Public Attributes

static const constexpr double default_m = 1.5
 The default \(m\) parameter.
 
static const constexpr double default_gamma = 0.05
 The default \(\gamma\) parameter.
 
static const constexpr size_t default_max_iter = 100
 The default number of allowed iterations.
 
static const std::string id = "winnow"
 The identifier for this classifier.
 

Private Member Functions

double get_weight (const class_label &label, const term_id &term) const
 
void zero_weights (const std::vector< doc_id > &docs)
 Initializes the weight vectors to zero for every class label.
 

Private Attributes

std::unordered_map< class_label, std::unordered_map< term_id, double > > weights_
 The weight vectors for each class label.
 
const double m_
 \(m\), the multiplicative learning rate.
 
const double gamma_
 \(\gamma\), the error threshold.
 
const size_t max_iter_
 The maximum number of iterations for training.
 

Additional Inherited Members

- Protected Attributes inherited from meta::classify::classifier
std::shared_ptr< index::forward_index > idx_
 The index that the classifier is run on.
 

Detailed Description

Implements the Winnow classifier, a simplistic linear classifier for linearly-separable data.

As opposed to the perceptron (which uses an additive update rule), winnow uses a multiplicative update rule.
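
The difference between the two update rules can be sketched as follows. This is a minimal, self-contained illustration and not MeTA's implementation: the helper names `additive_update` and `multiplicative_update`, the dense `std::vector<double>` representation, and the promote/demote convention are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: perceptron-style additive update.
// Every weight moves by a multiple of the corresponding feature value.
void additive_update(std::vector<double>& w, const std::vector<double>& x,
                     double rate, bool promote)
{
    const double sign = promote ? 1.0 : -1.0;
    for (std::size_t i = 0; i < w.size(); ++i)
        w[i] += sign * rate * x[i];
}

// Hypothetical helper: winnow-style multiplicative update.
// The weight of every active feature is scaled by m (promotion)
// or by 1/m (demotion); inactive features are left untouched.
void multiplicative_update(std::vector<double>& w, const std::vector<double>& x,
                           double m, bool promote)
{
    for (std::size_t i = 0; i < w.size(); ++i)
    {
        if (x[i] > 0.0)
            w[i] = promote ? w[i] * m : w[i] / m;
    }
}
```

With a multiplicative rule, weights that start positive stay positive, which is one reason a winnow-style learner is usually initialized with nonzero weights in this kind of sketch.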

Constructor & Destructor Documentation

meta::classify::winnow::winnow ( std::shared_ptr< index::forward_index >  idx,
double  m = default_m,
double  gamma = default_gamma,
size_t  max_iter = default_max_iter 
)

Constructs a winnow classifier with the given multiplier, error threshold, and maximum iterations.

Parameters
idx  The index to run the classifier on
m  \(m\), the multiplicative learning rate
gamma  \(\gamma\), the error threshold
max_iter  The maximum number of iterations for training

Member Function Documentation

void meta::classify::winnow::train ( const std::vector< doc_id > &  docs)
override virtual

Trains the winnow on the given training documents.

Maintains a set of weight vectors \(w_1,\ldots,w_K\) where \(K\) is the number of classes and updates them for each training document seen in each iteration. This continues until the error threshold is met or the maximum number of iterations is completed.

Parameters
docs  The training set

Implements meta::classify::classifier.
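
The training procedure described above can be sketched in a self-contained form. This is an illustration under stated assumptions, not the library's code: the type aliases (`label_t`, `term_t`, `sparse_doc`, `weight_map`), the helpers `weight_of`, `predict`, and `train_sketch`, the default weight of 1.0 for unseen terms, and the choice to update only on misclassified documents (promoting the true label and demoting the wrong guess) are all hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library's types, for illustration only.
using label_t = std::string;
using term_t = std::size_t;
using sparse_doc = std::vector<std::pair<term_t, double>>; // (term, count)
using weight_map
    = std::unordered_map<label_t, std::unordered_map<term_t, double>>;

// Weight lookup; unseen terms default to 1.0 (an assumption of this sketch).
double weight_of(const weight_map& w, const label_t& label, term_t term)
{
    auto lit = w.find(label);
    if (lit == w.end())
        return 1.0;
    auto tit = lit->second.find(term);
    return tit == lit->second.end() ? 1.0 : tit->second;
}

// Scores the document against every known label and returns the best one.
label_t predict(const weight_map& w, const sparse_doc& doc)
{
    label_t best;
    double best_score = 0.0;
    bool first = true;
    for (const auto& entry : w)
    {
        double score = 0.0;
        for (const auto& tc : doc)
            score += weight_of(w, entry.first, tc.first) * tc.second;
        if (first || score > best_score)
        {
            best = entry.first;
            best_score = score;
            first = false;
        }
    }
    return best;
}

// Trains until the per-iteration error rate drops below gamma or max_iter
// passes have been made. Returns the number of passes performed.
std::size_t train_sketch(
    weight_map& w, const std::vector<std::pair<sparse_doc, label_t>>& data,
    double m, double gamma, std::size_t max_iter)
{
    if (data.empty())
        return 0;
    for (const auto& ex : data)
        w[ex.second]; // make sure every label has a weight vector

    std::size_t iter = 0;
    while (iter < max_iter)
    {
        ++iter;
        std::size_t errors = 0;
        for (const auto& ex : data)
        {
            const label_t guess = predict(w, ex.first);
            if (guess == ex.second)
                continue;
            ++errors;
            for (const auto& tc : ex.first)
            {
                // Promote the true label, demote the wrong guess.
                const double up = weight_of(w, ex.second, tc.first) * m;
                const double down = weight_of(w, guess, tc.first) / m;
                w[ex.second][tc.first] = up;
                w[guess][tc.first] = down;
            }
        }
        const double rate
            = static_cast<double>(errors) / static_cast<double>(data.size());
        if (rate < gamma)
            break;
    }
    return iter;
}
```

The early exit on the error rate corresponds to the role of \(\gamma\), and the loop bound corresponds to `max_iter`.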

class_label meta::classify::winnow::classify ( doc_id  d_id)
override virtual

Classifies the given document.

The class label returned is \(\argmax_k(w_k^\intercal x_n + b)\); in other words, the class whose associated weight vector gives the highest result.

Parameters
d_id  The document to be classified
Returns
the class label determined for the document

Implements meta::classify::classifier.
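
The decision rule \(\argmax_k(w_k^\intercal x_n + b)\) can be sketched with dense vectors. This is a minimal illustration, not the library's code: the function name `argmax_class` and the dense `std::vector` layout are assumptions, and classes are identified by index rather than by `class_label`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index k of the class whose weight vector
// gives the highest score w_k . x + b. Assumes `weights` is non-empty and
// that every weight vector has at least as many entries as x.
std::size_t argmax_class(const std::vector<std::vector<double>>& weights,
                         const std::vector<double>& x, double b)
{
    std::size_t best = 0;
    double best_score = 0.0;
    for (std::size_t k = 0; k < weights.size(); ++k)
    {
        double score = b;
        for (std::size_t i = 0; i < x.size(); ++i)
            score += weights[k][i] * x[i];
        if (k == 0 || score > best_score)
        {
            best = k;
            best_score = score;
        }
    }
    return best;
}
```

Because the same \(b\) is added to every class score, it does not change which class wins; only the dot products \(w_k^\intercal x_n\) decide the result.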

double meta::classify::winnow::get_weight ( const class_label &  label,
const term_id &  term 
) const
private

Parameters
label  The class label for the weight vector we want
term  The term whose weight should be returned
Returns
the given term's weight in the weight vector for the given class

void meta::classify::winnow::zero_weights ( const std::vector< doc_id > &  docs)
private

Initializes the weight vectors to zero for every class label.

Parameters
docs  The set of documents to collect class labels from.

The documentation for this class was generated from the following files: