ModErn Text Analysis
META Enumerates Textual Applications
Implements the Naive Bayes classifier, a simplistic probabilistic classifier that uses Bayes' theorem with strong feature independence assumptions.
#include <naive_bayes.h>
Public Member Functions | |
| naive_bayes (std::shared_ptr< index::forward_index > idx, double alpha=default_alpha, double beta=default_beta) | |
| Constructor: learns class models based on a collection of training documents. | |
| void | train (const std::vector< doc_id > &docs) override |
| Creates a classification model based on training documents. | |
| class_label | classify (doc_id d_id) override |
| Classifies a document into a specific group, as determined by training data. | |
| void | reset () override |
| Resets any learning information associated with this classifier. | |
Public Member Functions inherited from meta::classify::classifier | |
| classifier (std::shared_ptr< index::forward_index > idx) | |
| virtual confusion_matrix | test (const std::vector< doc_id > &docs) |
| Classifies a collection of documents into specific groups, as determined by training data; this function makes repeated calls to classify(). | |
| virtual confusion_matrix | cross_validate (const std::vector< doc_id > &input_docs, size_t k, bool even_split=false, int seed=1) |
| Performs k-fold cross-validation on a set of documents. | |
Static Public Attributes | |
| static const constexpr double | default_alpha = 0.1 |
| The default \(\alpha\) parameter. | |
| static const constexpr double | default_beta = 0.1 |
| The default \(\beta\) parameter. | |
| static const std::string | id = "naive-bayes" |
| The identifier for this classifier. | |
Private Attributes | |
| util::sparse_vector< class_label, stats::multinomial< term_id > > | term_probs_ |
| Contains P(term|class) for each class. | |
| stats::multinomial< class_label > | class_probs_ |
| Contains the number of documents in each class. | |
Additional Inherited Members | |
Protected Attributes inherited from meta::classify::classifier | |
| std::shared_ptr< index::forward_index > | idx_ |
| The index that the classifier is run on. | |
meta::classify::naive_bayes::naive_bayes (std::shared_ptr< index::forward_index > idx, double alpha=default_alpha, double beta=default_beta)
Constructor: learns class models based on a collection of training documents.
| idx | The index to run the classifier on |
| alpha | Optional smoothing parameter for term frequencies |
| beta | Optional smoothing parameter for class frequencies |
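One plausible reading of the two smoothing parameters is additive (Lidstone) smoothing. This page does not state the estimator, so the formulas below are an assumption. The symbols \(\mathrm{count}(t, c)\) (occurrences of term \(t\) in training documents of class \(c\)), \(N_c\) (training documents in class \(c\)), \(N\) (all training documents), \(|V|\) (vocabulary size), and \(|C|\) (number of classes) are notation introduced here for illustration:

\[
P(t \mid c) = \frac{\mathrm{count}(t, c) + \alpha}{\sum_{t' \in V} \mathrm{count}(t', c) + \alpha\,|V|},
\qquad
P(c) = \frac{N_c + \beta}{N + \beta\,|C|}
\]

Under this reading, larger values of \(\alpha\) and \(\beta\) pull the estimates toward the uniform distribution, and any positive value keeps unseen terms and classes from receiving probability zero.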
void meta::classify::naive_bayes::train (const std::vector< doc_id > &docs) override virtual
Creates a classification model based on training documents.
Calculates \(P(term|class)\) and \(P(class)\) for all the training documents.
| docs | The training documents |
Implements meta::classify::classifier.
class_label meta::classify::naive_bayes::classify (doc_id d_id) override virtual
Classifies a document into a specific group, as determined by training data.
| d_id | The document to classify |
Implements meta::classify::classifier.
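This page does not spell out the decision rule. A common formulation of Naive Bayes picks the class that maximizes \(\log P(class) + \sum_{term} \log P(term|class)\) over the terms in the document. The sketch below illustrates that rule under stated assumptions and is not the MeTA implementation: `nb_probs` and `classify_sketch` are hypothetical names, the probabilities are assumed to be smoothed already (so none is zero), and every class is assumed to have an entry for the same vocabulary.

```cpp
#include <cmath>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not MeTA types.
using label = std::string;
using term = std::string;

// Precomputed, already-smoothed probabilities (assumed nonzero).
struct nb_probs
{
    std::map<label, double> class_prob;                // P(class)
    std::map<label, std::map<term, double>> term_prob; // P(term|class)
};

// Returns the class with the highest log-posterior score for the document.
// Working in log space avoids underflow from multiplying many small numbers.
label classify_sketch(const nb_probs& model, const std::vector<term>& doc)
{
    label best;
    double best_score = -std::numeric_limits<double>::infinity();
    for (const auto& entry : model.class_prob)
    {
        const label& c = entry.first;
        double score = std::log(entry.second);
        auto cit = model.term_prob.find(c);
        if (cit != model.term_prob.end())
        {
            for (const auto& t : doc)
            {
                // Terms outside the vocabulary are skipped in this sketch.
                auto tit = cit->second.find(t);
                if (tit != cit->second.end())
                    score += std::log(tit->second);
            }
        }
        if (score > best_score)
        {
            best_score = score;
            best = c;
        }
    }
    return best;
}
```

Repeated terms contribute once per occurrence, which matches a multinomial event model. When two classes tie, the one that comes first in the map's ordering is returned.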