ModErn Text Analysis
META Enumerates Textual Applications
Algorithms for feature selection, KNN search, and confusion matrices.
Namespaces
kernel
Kernel functions for linear classifiers.
loss
Loss functions for sgd.
Classes
class | binary_classifier
A classifier which classifies documents as "positive" or "negative".
class | binary_classifier_factory
Factory that is responsible for creating binary classifiers from configuration files.
class | classifier
A classifier uses a document's feature space to identify which group it belongs to.
class | classifier_factory
Factory that is responsible for creating classifiers from configuration files.
class | confusion_matrix
Allows interpretation of classification errors.
class | dual_perceptron
Implements a perceptron classifier, but using the dual formulation of the problem.
class | knn
Implements the k-Nearest Neighbor lazy learning classification algorithm.
class | linear_model
A storage class for multiclass linear classifier models.
class | linear_model_exception
Exception thrown during interactions with linear_models.
class | logistic_regression
Multinomial logistic regression.
class | naive_bayes
Implements the Naive Bayes classifier, a simplistic probabilistic classifier that uses Bayes' theorem with strong feature independence assumptions.
class | nearest_centroid
Implements the nearest centroid classification algorithm.
class | one_vs_all
Generalizes binary classifiers to operate over multiclass types using the one vs all method.
class | one_vs_one
Ensemble method adaptor for extending binary_classifiers to the multi-class classification case by using a one-vs-one strategy.
class | sgd
Implements stochastic gradient descent for learning binary linear classifiers.
class | svm_wrapper
Wrapper class for liblinear (http://www.csie.ntu.edu.tw/~cjlin/liblinear/) and libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) implementation of support vector machine classification.
class | winnow
Implements the Winnow classifier, a simplistic linear classifier for linearly-separable data.
Functions
template<class Index, class Classifier>
void | batch_train(Index &idx, Classifier &cls, const std::vector<doc_id> &training_set, uint64_t batch_size)
This trains a classifier in an online fashion, using batches of size batch_size from the training_set.
std::unique_ptr<binary_classifier> | make_binary_classifier(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx, class_label positive, class_label negative)
(Non-template): Convenience method for creating a binary classifier using the factory.
template<class Classifier>
void | register_binary_classifier()
Registration method for binary classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<dual_perceptron>(const cpptoml::table &, std::shared_ptr<index::forward_index>)
Specialization of the factory function used to create dual_perceptrons.
template<>
std::unique_ptr<classifier> | make_multi_index_classifier<knn>(const cpptoml::table &, std::shared_ptr<index::forward_index>, std::shared_ptr<index::inverted_index>)
Specialization of the factory method used to create knn classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<logistic_regression>(const cpptoml::table &, std::shared_ptr<index::forward_index>)
Specialization of the factory method used for creating logistic_regression classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<naive_bayes>(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx)
Specialization of the factory method used for creating naive bayes classifiers.
template<>
std::unique_ptr<classifier> | make_multi_index_classifier<nearest_centroid>(const cpptoml::table &, std::shared_ptr<index::forward_index>, std::shared_ptr<index::inverted_index>)
Specialization of the factory method used to create nearest_centroid classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<one_vs_all>(const cpptoml::table &, std::shared_ptr<index::forward_index>)
Specialization of the factory method used to create one_vs_all classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<one_vs_one>(const cpptoml::table &, std::shared_ptr<index::forward_index>)
Specialization of the factory method used to create one_vs_one classifiers.
template<>
std::unique_ptr<binary_classifier> | make_binary_classifier<sgd>(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx, class_label positive, class_label negative)
Specialization of the factory method used to create sgd classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<svm_wrapper>(const cpptoml::table &, std::shared_ptr<index::forward_index>)
Specialization of the factory method used for creating svm_wrapper classifiers.
template<>
std::unique_ptr<classifier> | make_classifier<winnow>(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx)
Specialization of the factory method used for creating winnow classifiers.
std::unique_ptr<classifier> | make_classifier(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx, std::shared_ptr<index::inverted_index> inv_idx=nullptr)
Convenience method for creating a classifier using the factory.
template<class Classifier>
std::unique_ptr<classifier> | make_classifier(const cpptoml::table &, std::shared_ptr<index::forward_index> idx)
Factory method for creating a classifier.
template<class Classifier>
std::unique_ptr<classifier> | make_multi_index_classifier(const cpptoml::table &, std::shared_ptr<index::forward_index> idx, std::shared_ptr<index::inverted_index> inv_idx)
Factory method for creating a classifier that takes both index types.
template<class Classifier>
void | register_classifier()
Registration method for classifiers.
template<class Classifier>
void | register_multi_index_classifier()
Registration method for multi-index classifiers.
template<class Index, class Classifier>
void meta::classify::batch_train(Index &idx, Classifier &cls, const std::vector<doc_id> &training_set, uint64_t batch_size)
This trains a classifier in an online fashion, using batches of size batch_size from the training_set.
idx | The index the classifier is using (so the cache may be dropped between batches) |
cls | The classifier to train. This must be a classifier supporting online learning (e.g., sgd or an ensemble of sgd) |
training_set | The list of document ids that comprise the training data |
batch_size | The size of the batches to use for the minibatch training |
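The batching described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `doc_id` is replaced by a plain integer alias, `counting_learner` and `fake_index` are hypothetical stand-ins, and the member names `train` and `clear_cache` are assumptions made for the example. Any shuffling of the training set is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using doc_id = std::uint64_t; // stand-in for the library's doc_id type

// Hypothetical online learner that only records what it was shown.
struct counting_learner {
    std::size_t docs_seen = 0;
    std::size_t batches_seen = 0;
    void train(const std::vector<doc_id>& batch) {
        docs_seen += batch.size();
        ++batches_seen;
    }
};

// Hypothetical index whose cache can be dropped between batches.
struct fake_index {
    std::size_t cache_drops = 0;
    void clear_cache() { ++cache_drops; }
};

// Feeds training_set to cls in consecutive slices of at most batch_size
// documents, dropping the index cache after each slice.
template <class Index, class Classifier>
void batch_train_sketch(Index& idx, Classifier& cls,
                        const std::vector<doc_id>& training_set,
                        std::uint64_t batch_size)
{
    assert(batch_size > 0);
    const std::size_t total = training_set.size();
    std::size_t start = 0;
    while (start < total) {
        // The final slice may be shorter than batch_size.
        const std::size_t count
            = std::min<std::size_t>(batch_size, total - start);
        auto first = training_set.begin() + static_cast<std::ptrdiff_t>(start);
        std::vector<doc_id> batch(first,
                                  first + static_cast<std::ptrdiff_t>(count));
        cls.train(batch);
        idx.clear_cache();
        start += count;
    }
}
```

With seven documents and a batch size of three, the sketch trains on slices of 3, 3, and 1 documents and drops the cache three times.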
std::unique_ptr<binary_classifier> meta::classify::make_binary_classifier(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx, class_label positive, class_label negative)
(Non-template): Convenience method for creating a binary classifier using the factory.
(Template): Factory method for creating a binary classifier; this should be specialized if your given binary classifier requires special construction behavior (e.g., reading parameters).
config | The table that specifies the binary classifier's configuration |
idx | The forward_index the binary classifier is being constructed over |
positive | The class_label for positive documents |
negative | The class_label for negative documents |
template<class Classifier>
void meta::classify::register_binary_classifier()
Registration method for binary classifiers.
Clients should use this method to register any new binary classifiers they write.
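To make the relationship between registration and creation concrete, here is a self-contained sketch of the general pattern: a registry maps a method name to a construction function, and a convenience function looks that name up in the configuration. It does not use the library's headers. Every name in it (`config_table`, `binary_classifier_sketch`, `toy_sgd`, the `"method"` key, and the `id()` convention) is a hypothetical stand-in chosen for the example, and the forward index parameter is omitted for brevity.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-ins for library types; all names below are hypothetical.
using class_label = std::string;
using config_table = std::map<std::string, std::string>;

struct binary_classifier_sketch {
    binary_classifier_sketch(class_label pos, class_label neg)
        : positive(std::move(pos)), negative(std::move(neg)) {}
    virtual ~binary_classifier_sketch() = default;
    virtual std::string name() const = 0;
    class_label positive;
    class_label negative;
};

using binary_creator = std::function<std::unique_ptr<binary_classifier_sketch>(
    const config_table&, class_label, class_label)>;

// One process-wide table mapping a method id to its construction function.
inline std::map<std::string, binary_creator>& binary_registry() {
    static std::map<std::string, binary_creator> registry;
    return registry;
}

// Registers Classifier under the string returned by Classifier::id().
template <class Classifier>
void register_binary_classifier_sketch() {
    binary_registry()[Classifier::id()]
        = [](const config_table&, class_label pos, class_label neg)
        -> std::unique_ptr<binary_classifier_sketch> {
        return std::unique_ptr<binary_classifier_sketch>(
            new Classifier(std::move(pos), std::move(neg)));
    };
}

// Looks up config["method"] in the registry and builds that classifier.
inline std::unique_ptr<binary_classifier_sketch> make_binary_classifier_sketch(
    const config_table& config, class_label positive, class_label negative) {
    auto method = config.find("method");
    if (method == config.end())
        throw std::runtime_error("config is missing the method key");
    auto entry = binary_registry().find(method->second);
    if (entry == binary_registry().end())
        throw std::runtime_error("unregistered classifier: " + method->second);
    return entry->second(config, std::move(positive), std::move(negative));
}

// A toy classifier that a client might write and register.
struct toy_sgd : binary_classifier_sketch {
    toy_sgd(class_label pos, class_label neg)
        : binary_classifier_sketch(std::move(pos), std::move(neg)) {}
    static std::string id() { return "toy-sgd"; }
    std::string name() const override { return id(); }
};
```

In this sketch, creation fails with an exception when the requested method was never registered, which is why registration has to happen before any configuration naming the new classifier is used.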
std::unique_ptr<classifier> meta::classify::make_classifier(const cpptoml::table &config, std::shared_ptr<index::forward_index> idx, std::shared_ptr<index::inverted_index> inv_idx = nullptr)
Convenience method for creating a classifier using the factory.
config | The configuration group that specifies the configuration for the classifier to be created |
idx | The forward_index to be passed to the classifier being created |
inv_idx | The inverted_index to be passed to the classifier being created (if needed) |
template<class Classifier>
std::unique_ptr<classifier> meta::classify::make_classifier(const cpptoml::table &, std::shared_ptr<index::forward_index> idx)
Factory method for creating a classifier.
This should be specialized if your given classifier requires special construction behavior (e.g., reading parameters).
config | The configuration group that specifies the configuration for the classifier to be created |
idx | The forward_index to be passed to the classifier being created |
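The specialization advice above can be illustrated with a small sketch. The primary template ignores the configuration and forwards only the index, while a full specialization reads a parameter before constructing. The types here (`config_table`, `forward_index_sketch`, `simple_classifier`, `tunable_classifier`) and the `"alpha"` key are hypothetical stand-ins, not part of the library.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-ins for library types; every name here is hypothetical.
using config_table = std::map<std::string, std::string>;
struct forward_index_sketch {};
using fwd_ptr = std::shared_ptr<forward_index_sketch>;

struct classifier_sketch {
    virtual ~classifier_sketch() = default;
    virtual std::string describe() const = 0;
};

// Needs nothing beyond the index, so the generic factory is enough.
struct simple_classifier : classifier_sketch {
    explicit simple_classifier(fwd_ptr idx) : idx_(std::move(idx)) {}
    std::string describe() const override { return "simple"; }
    fwd_ptr idx_;
};

// Takes a tunable parameter, so it needs a specialized factory.
struct tunable_classifier : classifier_sketch {
    tunable_classifier(fwd_ptr idx, double a)
        : idx_(std::move(idx)), alpha(a) {}
    std::string describe() const override { return "tunable"; }
    fwd_ptr idx_;
    double alpha;
};

// Primary template: the configuration is unused, only the index is forwarded.
template <class Classifier>
std::unique_ptr<classifier_sketch> make_classifier_sketch(const config_table&,
                                                          fwd_ptr idx) {
    return std::unique_ptr<classifier_sketch>(new Classifier(std::move(idx)));
}

// Specialization: reads "alpha" from the configuration, defaulting to 0.5.
template <>
inline std::unique_ptr<classifier_sketch>
make_classifier_sketch<tunable_classifier>(const config_table& config,
                                           fwd_ptr idx) {
    double alpha = 0.5;
    auto it = config.find("alpha");
    if (it != config.end())
        alpha = std::stod(it->second);
    return std::unique_ptr<classifier_sketch>(
        new tunable_classifier(std::move(idx), alpha));
}
```

The unnamed first parameter in the primary template mirrors the signature shown above: a classifier with no settings never touches its configuration, so only specializations need to name it.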
template<class Classifier>
std::unique_ptr<classifier> meta::classify::make_multi_index_classifier(const cpptoml::table &, std::shared_ptr<index::forward_index> idx, std::shared_ptr<index::inverted_index> inv_idx)
Factory method for creating a classifier that takes both index types.
This should be specialized if your given classifier requires special construction behavior.
config | The configuration group that specifies the configuration for the classifier to be created |
idx | The forward_index to be passed to the classifier being created |
inv_idx | The inverted_index to be passed to the classifier being created |
template<class Classifier>
void meta::classify::register_classifier()
Registration method for classifiers.
Clients should use this method to register any new classifiers they write that operate on just a forward_index (this should cover most classifiers).
template<class Classifier>
void meta::classify::register_multi_index_classifier()
Registration method for multi-index classifiers.
Clients should use this method to register any new classifiers they write that operate on both a forward_index and an inverted_index (this is rare).
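One way the two registration paths could coexist is sketched below: both kinds of classifier are stored behind a single creator signature that accepts both indexes, with forward-only classifiers discarding the inverted index and multi-index classifiers requiring it. This is an assumption about the design made for illustration, not a description of the library's internals; all type and function names are hypothetical, and the configuration parameter is omitted for brevity.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-ins for library types; every name here is hypothetical.
struct forward_index_sketch {};
struct inverted_index_sketch {};
using fwd_ptr = std::shared_ptr<forward_index_sketch>;
using inv_ptr = std::shared_ptr<inverted_index_sketch>;

struct classifier_sketch {
    virtual ~classifier_sketch() = default;
    virtual bool uses_inverted_index() const = 0;
};

// Every creator shares this signature, whichever indexes it actually needs.
using creator_fn
    = std::function<std::unique_ptr<classifier_sketch>(fwd_ptr, inv_ptr)>;

inline std::map<std::string, creator_fn>& classifier_registry() {
    static std::map<std::string, creator_fn> registry;
    return registry;
}

// Forward-index-only classifiers: the inverted index is accepted but dropped.
template <class Classifier>
void register_classifier_sketch() {
    classifier_registry()[Classifier::id()]
        = [](fwd_ptr fwd, inv_ptr) -> std::unique_ptr<classifier_sketch> {
        return std::unique_ptr<classifier_sketch>(
            new Classifier(std::move(fwd)));
    };
}

// Multi-index classifiers: both indexes must be supplied at creation time.
template <class Classifier>
void register_multi_index_classifier_sketch() {
    classifier_registry()[Classifier::id()]
        = [](fwd_ptr fwd, inv_ptr inv) -> std::unique_ptr<classifier_sketch> {
        if (!inv)
            throw std::invalid_argument("an inverted index is required");
        return std::unique_ptr<classifier_sketch>(
            new Classifier(std::move(fwd), std::move(inv)));
    };
}

// Convenience creator; inv defaults to nullptr for forward-only classifiers.
inline std::unique_ptr<classifier_sketch> create_classifier_sketch(
    const std::string& method, fwd_ptr fwd, inv_ptr inv = nullptr) {
    auto entry = classifier_registry().find(method);
    if (entry == classifier_registry().end())
        throw std::out_of_range("unregistered classifier: " + method);
    return entry->second(std::move(fwd), std::move(inv));
}

struct toy_bayes : classifier_sketch {
    explicit toy_bayes(fwd_ptr) {}
    static std::string id() { return "toy-bayes"; }
    bool uses_inverted_index() const override { return false; }
};

struct toy_knn : classifier_sketch {
    toy_knn(fwd_ptr, inv_ptr) {}
    static std::string id() { return "toy-knn"; }
    bool uses_inverted_index() const override { return true; }
};
```

Under this design the caller can omit the inverted index for the common forward-only case, and the rarer multi-index classifiers fail fast if it is missing.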