ModErn Text Analysis
META Enumerates Textual Applications
meta::classify Namespace Reference

Algorithms for feature selection, KNN search, and confusion matrices.

Namespaces

 kernel
 Kernel functions for linear classifiers.
 
 loss
 Loss functions for sgd.
 

Classes

class  binary_classifier
 A classifier which classifies documents as "positive" or "negative".
 
class  binary_classifier_factory
 Factory that is responsible for creating binary classifiers from configuration files.
 
class  classifier
 A classifier uses a document's feature space to identify which group it belongs to.
 
class  classifier_factory
 Factory that is responsible for creating classifiers from configuration files.
 
class  confusion_matrix
 Allows interpretation of classification errors.
 
class  dual_perceptron
 Implements a perceptron classifier, but using the dual formulation of the problem.
 
class  knn
 Implements the k-Nearest Neighbor lazy learning classification algorithm.
 
class  linear_model
 A storage class for multiclass linear classifier models.
 
class  linear_model_exception
 Exception thrown during interactions with linear_models.
 
class  logistic_regression
 Multinomial logistic regression.
 
class  naive_bayes
 Implements the Naive Bayes classifier, a simplistic probabilistic classifier that uses Bayes' theorem with strong feature independence assumptions.
 
class  nearest_centroid
 Implements the nearest centroid classification algorithm.
 
class  one_vs_all
 Generalizes binary classifiers to operate over multiclass types using the one vs all method.
 
class  one_vs_one
 Ensemble method adaptor for extending binary_classifiers to the multi-class classification case by using a one-vs-one strategy.
 
class  sgd
 Implements stochastic gradient descent for learning binary linear classifiers.
 
class  svm_wrapper
 Wrapper class for liblinear (http://www.csie.ntu.edu.tw/~cjlin/liblinear/) and libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) implementation of support vector machine classification.
 
class  winnow
 Implements the Winnow classifier, a simplistic linear classifier for linearly-separable data.
 

Functions

template<class Index , class Classifier >
void batch_train (Index &idx, Classifier &cls, const std::vector< doc_id > &training_set, uint64_t batch_size)
 This trains a classifier in an online fashion, using batches of size batch_size from the training_set.
 
std::unique_ptr< binary_classifier > make_binary_classifier (const cpptoml::table &config, std::shared_ptr< index::forward_index > idx, class_label positive, class_label negative)
 (Non-template): Convenience method for creating a binary classifier using the factory.
 
template<class Classifier >
void register_binary_classifier ()
 Registration method for binary classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< dual_perceptron > (const cpptoml::table &, std::shared_ptr< index::forward_index >)
 Specialization of the factory function used to create dual_perceptrons.
 
template<>
std::unique_ptr< classifier > make_multi_index_classifier< knn > (const cpptoml::table &, std::shared_ptr< index::forward_index >, std::shared_ptr< index::inverted_index >)
 Specialization of the factory method used to create knn classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< logistic_regression > (const cpptoml::table &, std::shared_ptr< index::forward_index >)
 Specialization of the factory method used for creating logistic_regression classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< naive_bayes > (const cpptoml::table &config, std::shared_ptr< index::forward_index > idx)
 Specialization of the factory method used for creating naive bayes classifiers.
 
template<>
std::unique_ptr< classifier > make_multi_index_classifier< nearest_centroid > (const cpptoml::table &, std::shared_ptr< index::forward_index >, std::shared_ptr< index::inverted_index >)
 Specialization of the factory method used to create nearest_centroid classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< one_vs_all > (const cpptoml::table &, std::shared_ptr< index::forward_index >)
 Specialization of the factory method used to create one_vs_all classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< one_vs_one > (const cpptoml::table &, std::shared_ptr< index::forward_index >)
 Specialization of the factory method used to create one_vs_one classifiers.
 
template<>
std::unique_ptr< binary_classifier > make_binary_classifier< sgd > (const cpptoml::table &config, std::shared_ptr< index::forward_index > idx, class_label positive, class_label negative)
 Specialization of the factory method used to create sgd classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< svm_wrapper > (const cpptoml::table &, std::shared_ptr< index::forward_index >)
 Specialization of the factory method used for creating svm_wrapper classifiers.
 
template<>
std::unique_ptr< classifier > make_classifier< winnow > (const cpptoml::table &config, std::shared_ptr< index::forward_index > idx)
 Specialization of the factory method used for creating winnow classifiers.
 
std::unique_ptr< classifier > make_classifier (const cpptoml::table &config, std::shared_ptr< index::forward_index > idx, std::shared_ptr< index::inverted_index > inv_idx=nullptr)
 Convenience method for creating a classifier using the factory.
 
template<class Classifier >
std::unique_ptr< classifier > make_classifier (const cpptoml::table &, std::shared_ptr< index::forward_index > idx)
 Factory method for creating a classifier.
 
template<class Classifier >
std::unique_ptr< classifier > make_multi_index_classifier (const cpptoml::table &, std::shared_ptr< index::forward_index > idx, std::shared_ptr< index::inverted_index > inv_idx)
 Factory method for creating a classifier that takes both index types.
 
template<class Classifier >
void register_classifier ()
 Registration method for classifiers.
 
template<class Classifier >
void register_multi_index_classifier ()
 Registration method for multi-index classifiers.
 

Detailed Description

Algorithms for feature selection, KNN search, and confusion matrices.

Function Documentation

template<class Index , class Classifier >
void meta::classify::batch_train ( Index &  idx,
Classifier &  cls,
const std::vector< doc_id > &  training_set,
uint64_t  batch_size 
)

This trains a classifier in an online fashion, using batches of size batch_size from the training_set.

Parameters
idx           The index the classifier is using (so the cache may be dropped between batches)
cls           The classifier to train. This must be a classifier supporting online learning (e.g., sgd or an ensemble of sgd)
training_set  The list of document ids that comprise the training data
batch_size    The size of the batches to use for the minibatch training
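
The batching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not MeTA's implementation: fake_index, recording_classifier, batch_train_sketch, and the train/clear_cache member names are stand-ins invented for the example. The reference text does not say whether the real function shuffles the training set, so the sketch keeps the original order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the library's document id type.
using doc_id = std::uint64_t;

// Hypothetical index: only counts how often its cache was dropped.
struct fake_index
{
    std::size_t cache_clears = 0;
    void clear_cache() { ++cache_clears; }
};

// Hypothetical online learner: records the size of every batch it sees.
struct recording_classifier
{
    std::vector<std::size_t> batch_sizes;
    void train(const std::vector<doc_id>& docs)
    {
        batch_sizes.push_back(docs.size());
    }
};

// Feeds consecutive slices of training_set to the classifier and drops the
// index cache after each slice. The last slice may be shorter.
template <class Index, class Classifier>
void batch_train_sketch(Index& idx, Classifier& cls,
                        const std::vector<doc_id>& training_set,
                        std::uint64_t batch_size)
{
    assert(batch_size > 0);
    const std::size_t total = training_set.size();
    for (std::size_t begin = 0; begin < total; begin += batch_size)
    {
        const std::size_t end
            = std::min<std::size_t>(begin + batch_size, total);
        std::vector<doc_id> batch(
            training_set.begin() + static_cast<std::ptrdiff_t>(begin),
            training_set.begin() + static_cast<std::ptrdiff_t>(end));
        cls.train(batch);
        idx.clear_cache();
    }
}
```

With seven document ids and a batch size of three, this sketch calls train three times (sizes 3, 3, 1) and clears the cache after each call.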

std::unique_ptr< binary_classifier > meta::classify::make_binary_classifier ( const cpptoml::table &  config,
std::shared_ptr< index::forward_index >  idx,
class_label  positive,
class_label  negative 
)

(Non-template): Convenience method for creating a binary classifier using the factory.

(Template): Factory method for creating a binary classifier. Specialize it if your binary classifier requires special construction behavior (e.g., reading parameters).

Parameters
config    The table that specifies the binary classifier's configuration
idx       The forward_index the binary classifier is being constructed over
positive  The class_label for positive documents
negative  The class_label for negative documents
Returns
(Non-template): a unique_ptr to a binary_classifier constructed from the given configuration
(Template): a unique_ptr to a binary_classifier (of derived type Classifier) that has been constructed from the given configuration
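
The split between a default template and a specialization for classifiers that need extra parameters might look like the sketch below. Everything here is hypothetical and self-contained: config_table stands in for cpptoml::table, binary_base for binary_classifier, and simple_binary, tuned_binary, make_binary_sketch, and the "learning-rate" key are invented for illustration. The forward_index argument is omitted to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for cpptoml::table and class_label.
using config_table = std::map<std::string, std::string>;
using class_label = std::string;

struct binary_base
{
    virtual ~binary_base() = default;
    virtual std::string describe() const = 0;
};

// Needs nothing beyond the two labels.
struct simple_binary : binary_base
{
    simple_binary(class_label pos, class_label neg)
        : pos_(std::move(pos)), neg_(std::move(neg))
    {
    }
    std::string describe() const override
    {
        return "simple:" + pos_ + "/" + neg_;
    }
    class_label pos_, neg_;
};

// Needs an extra parameter that comes from the configuration.
struct tuned_binary : binary_base
{
    tuned_binary(class_label pos, class_label neg, double rate)
        : pos_(std::move(pos)), neg_(std::move(neg)), rate_(rate)
    {
    }
    std::string describe() const override
    {
        return "tuned:" + pos_ + "/" + neg_;
    }
    class_label pos_, neg_;
    double rate_;
};

// Primary template: construct from the labels alone, ignore the config.
template <class Classifier>
std::unique_ptr<binary_base> make_binary_sketch(const config_table&,
                                                class_label pos,
                                                class_label neg)
{
    return std::make_unique<Classifier>(std::move(pos), std::move(neg));
}

// Specialization: read the hypothetical "learning-rate" key, falling back
// to a default when the key is absent.
template <>
std::unique_ptr<binary_base>
make_binary_sketch<tuned_binary>(const config_table& config,
                                 class_label pos, class_label neg)
{
    double rate = 0.5;
    auto it = config.find("learning-rate");
    if (it != config.end())
        rate = std::stod(it->second);
    return std::make_unique<tuned_binary>(std::move(pos), std::move(neg),
                                          rate);
}
```

Callers use the same spelling, make_binary_sketch<T>(config, positive, negative), for both kinds of classifier; only the specialization looks inside the configuration.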
template<class Classifier >
void meta::classify::register_binary_classifier ( )

Registration method for binary classifiers.

Clients should use this method to register any new binary classifiers they write.
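
One way such a registration call could work is an id-keyed registry of creator functions. The sketch below is an assumption about the general pattern, not the library's code: binary_registry, register_binary_sketch, my_perceptron, and the convention that each classifier exposes a static id string are all hypothetical, and constructor arguments are left out for brevity.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct binary_base
{
    virtual ~binary_base() = default;
    virtual std::string name() const = 0;
};

// Hypothetical registry: maps an identifier to a function that builds the
// corresponding classifier.
class binary_registry
{
  public:
    using creator = std::function<std::unique_ptr<binary_base>()>;

    static binary_registry& get()
    {
        static binary_registry instance;
        return instance;
    }

    void add(const std::string& id, creator fn)
    {
        if (!creators_.emplace(id, std::move(fn)).second)
            throw std::runtime_error{"classifier already registered: " + id};
    }

    std::unique_ptr<binary_base> create(const std::string& id) const
    {
        auto it = creators_.find(id);
        if (it == creators_.end())
            throw std::runtime_error{"unknown classifier: " + id};
        return it->second();
    }

  private:
    std::map<std::string, creator> creators_;
};

// Assumed convention: every registrable classifier has a static string id.
template <class Classifier>
void register_binary_sketch()
{
    binary_registry::get().add(
        Classifier::id, []() -> std::unique_ptr<binary_base> {
            return std::make_unique<Classifier>();
        });
}

// A user-written classifier that opts in to the registry.
struct my_perceptron : binary_base
{
    static const std::string id;
    std::string name() const override { return id; }
};
const std::string my_perceptron::id = "my-perceptron";
```

After register_binary_sketch<my_perceptron>() has run once, the classifier can be created by name, which is what lets a configuration file select it.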

std::unique_ptr< classifier > meta::classify::make_classifier ( const cpptoml::table &  config,
std::shared_ptr< index::forward_index >  idx,
std::shared_ptr< index::inverted_index >  inv_idx = nullptr 
)

Convenience method for creating a classifier using the factory.

Parameters
config   The configuration group that specifies the configuration for the classifier to be created
idx      The forward_index to be passed to the classifier being created
inv_idx  The inverted_index to be passed to the classifier being created (if needed)
Returns
a unique_ptr to the classifier created from the given configuration
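
The optional inv_idx argument suggests a dispatch along the following lines. This is a hedged sketch only: config_table, the empty forward_index and inverted_index structs, forward_only, two_index, make_classifier_sketch, and the "method" key with its two values are assumptions made for the example and are not taken from the reference above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for cpptoml::table and the two index types.
using config_table = std::map<std::string, std::string>;
struct forward_index {};
struct inverted_index {};

struct classifier_base
{
    virtual ~classifier_base() = default;
    virtual bool uses_inverted_index() const = 0;
};

// Operates on a forward index only (the common case).
struct forward_only : classifier_base
{
    explicit forward_only(std::shared_ptr<forward_index> f)
        : fwd(std::move(f))
    {
    }
    bool uses_inverted_index() const override { return false; }
    std::shared_ptr<forward_index> fwd;
};

// Operates on both index types (the rare case).
struct two_index : classifier_base
{
    two_index(std::shared_ptr<forward_index> f,
              std::shared_ptr<inverted_index> i)
        : fwd(std::move(f)), inv(std::move(i))
    {
    }
    bool uses_inverted_index() const override { return true; }
    std::shared_ptr<forward_index> fwd;
    std::shared_ptr<inverted_index> inv;
};

// Picks the classifier from the assumed "method" key. The inverted index
// defaults to nullptr and is only checked when the chosen method needs it.
std::unique_ptr<classifier_base>
make_classifier_sketch(const config_table& config,
                       std::shared_ptr<forward_index> idx,
                       std::shared_ptr<inverted_index> inv_idx = nullptr)
{
    auto method = config.find("method");
    if (method == config.end())
        throw std::invalid_argument{"configuration lacks a method entry"};

    if (method->second == "forward-only")
        return std::make_unique<forward_only>(std::move(idx));

    if (method->second == "two-index")
    {
        if (!inv_idx)
            throw std::invalid_argument{"method requires an inverted_index"};
        return std::make_unique<two_index>(std::move(idx),
                                           std::move(inv_idx));
    }

    throw std::invalid_argument{"unknown method: " + method->second};
}
```

Callers that only build forward-index classifiers can leave the third argument out; in this sketch, asking for a two-index method without supplying it fails with an exception rather than a null dereference later.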

template<class Classifier >
std::unique_ptr<classifier> meta::classify::make_classifier ( const cpptoml::table &  ,
std::shared_ptr< index::forward_index >  idx 
)

Factory method for creating a classifier.

This should be specialized if your given classifier requires special construction behavior (e.g., reading parameters).

Parameters
config  The configuration group that specifies the configuration for the classifier to be created
idx     The forward_index to be passed to the classifier being created
Returns
a unique_ptr to the classifier (of derived type Classifier) created from the given configuration

template<class Classifier >
std::unique_ptr<classifier> meta::classify::make_multi_index_classifier ( const cpptoml::table &  ,
std::shared_ptr< index::forward_index >  idx,
std::shared_ptr< index::inverted_index >  inv_idx 
)

Factory method for creating a classifier that takes both index types.

This should be specialized if your given classifier requires special construction behavior.

Parameters
config   The configuration group that specifies the configuration for the classifier to be created
idx      The forward_index to be passed to the classifier being created
inv_idx  The inverted_index to be passed to the classifier being created
Returns
a unique_ptr to the classifier (of derived type Classifier) created from the given configuration
template<class Classifier >
void meta::classify::register_classifier ( )

Registration method for classifiers.

Clients should use this method to register any new classifiers they write that operate on just a forward_index (this should be most of them).

template<class Classifier >
void meta::classify::register_multi_index_classifier ( )

Registration method for multi-index classifiers.

Clients should use this method to register any new classifiers they write that operate on both a forward_index and an inverted_index (this is rare).