ModErn Text Analysis
META Enumerates Textual Applications
meta::classify::naive_bayes Class Reference

Implements the Naive Bayes classifier, a simplistic probabilistic classifier that uses Bayes' theorem with strong feature independence assumptions.

#include <naive_bayes.h>

Inheritance diagram for meta::classify::naive_bayes:
meta::classify::classifier

Public Member Functions

 naive_bayes (std::shared_ptr< index::forward_index > idx, double alpha=default_alpha, double beta=default_beta)
 Constructor: learns class models based on a collection of training documents.
 
void train (const std::vector< doc_id > &docs) override
 Creates a classification model based on training documents.
 
class_label classify (doc_id d_id) override
 Classifies a document into a specific group, as determined by training data.
 
void reset () override
 Resets any learning information associated with this classifier.
 
- Public Member Functions inherited from meta::classify::classifier
 classifier (std::shared_ptr< index::forward_index > idx)
 
virtual confusion_matrix test (const std::vector< doc_id > &docs)
 Classifies a collection of documents into specific groups, as determined by training data; this function makes repeated calls to classify().
 
virtual confusion_matrix cross_validate (const std::vector< doc_id > &input_docs, size_t k, bool even_split=false, int seed=1)
 Performs k-fold cross-validation on a set of documents.
 

Static Public Attributes

static const constexpr double default_alpha = 0.1
 The default \(\alpha\) parameter.
 
static const constexpr double default_beta = 0.1
 The default \(\beta\) parameter.
 
static const std::string id = "naive-bayes"
 The identifier for this classifier.
 

Private Attributes

util::sparse_vector< class_label, stats::multinomial< term_id > > term_probs_
 Contains P(term|class) for each class.
 
stats::multinomial< class_label > class_probs_
 Contains the number of documents in each class.
 

Additional Inherited Members

- Protected Attributes inherited from meta::classify::classifier
std::shared_ptr< index::forward_index > idx_
 The index that the classifier is run on.
 

Detailed Description

Implements the Naive Bayes classifier, a simplistic probabilistic classifier that uses Bayes' theorem with strong feature independence assumptions.
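
For orientation, the decision rule usually associated with this kind of classifier can be written as follows. This assumes the standard multinomial formulation; the symbols \(\hat{c}\), \(d\), and \(f(t,d)\) (the count of term \(t\) in document \(d\)) are illustrative and not taken from the MeTA source.

\[
\hat{c} = \arg\max_{c} \; P(c) \prod_{t \in d} P(t \mid c)^{f(t,d)}
\]

Because the product of many small probabilities underflows quickly, an implementation would typically compare classes in log space instead:

\[
\hat{c} = \arg\max_{c} \left[ \log P(c) + \sum_{t \in d} f(t,d) \, \log P(t \mid c) \right]
\]

The "strong feature independence assumption" is what allows \(P(d \mid c)\) to be factored into the per-term product above.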

Constructor & Destructor Documentation

meta::classify::naive_bayes::naive_bayes ( std::shared_ptr< index::forward_index >  idx,
double  alpha = default_alpha,
double  beta = default_beta 
)

Constructor: learns class models based on a collection of training documents.

Parameters
idx    The index to run the classifier on
alpha  Optional smoothing parameter for term frequencies
beta   Optional smoothing parameter for class frequencies

Member Function Documentation

void meta::classify::naive_bayes::train ( const std::vector< doc_id > &  docs)
override virtual

Creates a classification model based on training documents.

Calculates \(P(term|class)\) and \(P(class)\) for all the training documents.

Parameters
docs  The training documents

Implements meta::classify::classifier.

class_label meta::classify::naive_bayes::classify ( doc_id  d_id)
override virtual

Classifies a document into a specific group, as determined by training data.

Parameters
d_id  The document to classify
Returns
The class label assigned to the document

Implements meta::classify::classifier.
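
As a rough illustration of the classification step, the sketch below scores each class in log space and returns the highest-scoring label. It assumes a standard multinomial Naive Bayes decision rule; `nb_model` and `classify_doc` are hypothetical names, the probabilities are assumed to be already smoothed, and MeTA's actual implementation works on doc_id values looked up in the index.

```cpp
#include <cmath>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model holding already-smoothed probabilities.
struct nb_model
{
    std::map<std::string, double> class_prob; // P(class)
    // class -> term -> P(term | class)
    std::map<std::string, std::map<std::string, double>> term_prob;
    double unseen_term_prob; // fallback for missing terms, must be > 0
};

// Returns the label with the highest log-score for the given
// (term, count) pairs; returns an empty string if the model has no classes.
std::string classify_doc(
    const nb_model& model,
    const std::vector<std::pair<std::string, double>>& doc)
{
    std::string best_label;
    double best_score = -std::numeric_limits<double>::infinity();
    for (const auto& cp : model.class_prob)
    {
        // Start from the log prior of the class.
        double score = std::log(cp.second);
        auto terms_it = model.term_prob.find(cp.first);
        for (const auto& tc : doc)
        {
            double p = model.unseen_term_prob;
            if (terms_it != model.term_prob.end())
            {
                auto it = terms_it->second.find(tc.first);
                if (it != terms_it->second.end())
                    p = it->second;
            }
            // Each occurrence of the term contributes log P(term | class).
            score += tc.second * std::log(p);
        }
        if (score > best_score)
        {
            best_score = score;
            best_label = cp.first;
        }
    }
    return best_label;
}
```

Summing logarithms instead of multiplying probabilities avoids floating-point underflow on long documents and does not change which class wins, since the logarithm is monotonic.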

