ModErn Text Analysis
META Enumerates Textual Applications
Implements the k-Nearest Neighbor lazy learning classification algorithm.
#include <knn.h>
Classes | |
class | knn_exception |
Basic exception for knn interactions. | |
Public Member Functions | |
knn (std::shared_ptr< index::inverted_index > idx, std::shared_ptr< index::forward_index > f_idx, uint16_t k, std::unique_ptr< index::ranker > ranker, bool weighted=false) | |
void | train (const std::vector< doc_id > &docs) override |
Creates a classification model based on training documents. | |
class_label | classify (doc_id d_id) override |
Classifies a document into a specific group, as determined by training data. | |
void | reset () override |
Resets any learning information associated with this classifier. | |
Public Member Functions inherited from meta::classify::classifier | |
classifier (std::shared_ptr< index::forward_index > idx) | |
virtual confusion_matrix | test (const std::vector< doc_id > &docs) |
Classifies a collection of documents into specific groups, as determined by training data; this function makes repeated calls to classify(). | |
virtual confusion_matrix | cross_validate (const std::vector< doc_id > &input_docs, size_t k, bool even_split=false, int seed=1) |
Performs k-fold cross-validation on a set of documents. | |
Static Public Attributes | |
static const std::string | id = "knn" |
Identifier for this classifier. | |
Private Member Functions | |
class_label | select_best_label (const std::vector< std::pair< doc_id, double >> &scored, const std::vector< std::pair< class_label, uint16_t >> &sorted) const |
Private Attributes | |
std::shared_ptr< index::inverted_index > | inv_idx_ |
the inverted index used for ranking | |
uint16_t | k_ |
the value of k in k-NN | |
std::unique_ptr< index::ranker > | ranker_ |
The ranker that is used to score the queries in the index. | |
std::unordered_set< doc_id > | legal_docs_ |
documents that are "legal" to be used in the results | |
const bool | weighted_ |
Whether we want the neighbors to be weighted by distance or not. | |
Additional Inherited Members | |
Protected Attributes inherited from meta::classify::classifier | |
std::shared_ptr< index::forward_index > | idx_ |
the index that the classifier is run on | |
Implements the k-Nearest Neighbor lazy learning classification algorithm.
meta::classify::knn::knn (std::shared_ptr< index::inverted_index > idx, std::shared_ptr< index::forward_index > f_idx, uint16_t k, std::unique_ptr< index::ranker > ranker, bool weighted=false)
idx | The inverted index used for ranking |
f_idx | The forward index that the classifier is run on |
k | The value of k in k-NN |
ranker | The ranker to be used internally |
weighted | Whether to weight the neighbors by distance to the query |
void meta::classify::knn::train (const std::vector< doc_id > &docs) | override, virtual |
Creates a classification model based on training documents.
docs | The training documents |
Implements meta::classify::classifier.
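Because k-NN is a lazy learner, training does not need to fit a model. The following is a minimal sketch of that idea, assuming that train() only records which documents may later appear as neighbors (in the spirit of the legal_docs_ member) and that reset() clears that record. The class name lazy_trainer, the helper is_legal(), and the doc_id alias are hypothetical stand-ins and not part of the documented API.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the toolkit's document identifier type.
using doc_id = std::uint64_t;

// Sketch of a lazy learner's bookkeeping: no model is fitted, training
// only remembers which documents are allowed to act as neighbors.
class lazy_trainer
{
  public:
    // Record every training document as a legal neighbor.
    void train(const std::vector<doc_id>& docs)
    {
        legal_docs_.insert(docs.begin(), docs.end());
    }

    // Discard everything recorded by earlier calls to train().
    void reset()
    {
        legal_docs_.clear();
    }

    // True if the document was part of the training set.
    bool is_legal(doc_id d) const
    {
        return legal_docs_.count(d) > 0;
    }

  private:
    std::unordered_set<doc_id> legal_docs_;
};
```

All of the real work is deferred to classification time, which is what makes the approach "lazy".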
class_label meta::classify::knn::classify (doc_id d_id) | override, virtual |
Classifies a document into a specific group, as determined by training data.
d_id | The document to classify |
Implements meta::classify::classifier.
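The classification step can be sketched as a vote among the k best-ranked training documents. This is an illustration under stated assumptions, not the library's implementation: the ranked list is assumed to arrive sorted by descending ranker score, the score is assumed to serve as the neighbor weight when weighting is enabled, and the function knn_vote together with its labels and legal parameters is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the toolkit's types.
using doc_id = std::uint64_t;
using class_label = std::string;

// Vote among the first k legal documents of a ranked result list.
// `ranked` is assumed to be sorted by descending score.
class_label knn_vote(const std::vector<std::pair<doc_id, double>>& ranked,
                     const std::unordered_map<doc_id, class_label>& labels,
                     const std::unordered_set<doc_id>& legal,
                     std::size_t k, bool weighted)
{
    std::map<class_label, double> votes;
    std::size_t used = 0;
    for (const auto& neighbor : ranked)
    {
        if (used == k)
            break;
        if (legal.count(neighbor.first) == 0)
            continue; // not a training document, so it cannot vote
        auto it = labels.find(neighbor.first);
        if (it == labels.end())
            continue; // unlabeled document, skip it
        // Weighted: the score counts as the vote. Unweighted: one vote each.
        votes[it->second] += weighted ? neighbor.second : 1.0;
        ++used;
    }

    // Pick the label with the largest total. Ties go to the
    // lexicographically smallest label in this sketch.
    class_label best; // stays empty if no neighbor voted
    double best_total = 0.0;
    bool first = true;
    for (const auto& v : votes)
    {
        if (first || v.second > best_total)
        {
            best = v.first;
            best_total = v.second;
            first = false;
        }
    }
    return best;
}
```

With weighting enabled, a single very similar neighbor can outvote several weaker ones, which is the practical difference the weighted flag controls.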
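class_label meta::classify::knn::select_best_label (const std::vector< std::pair< doc_id, double >> &scored, const std::vector< std::pair< class_label, uint16_t >> &sorted) const | private |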
scored | |
sorted |
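The parameters are not described here, so the following is only a plausible sketch of how such a helper could break ties. It assumes that scored holds the neighbors ordered by descending score, that sorted holds per-label neighbor counts ordered by descending count, and that a tie between top labels is resolved in favor of the label of the best-ranked neighbor. The function name pick_label and the extra labels lookup parameter are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the toolkit's types.
using doc_id = std::uint64_t;
using class_label = std::string;

// Choose a label from per-label counts, breaking ties by neighbor rank.
// `scored`: neighbors by descending score (assumption).
// `sorted`: (label, count) pairs by descending count (assumption).
class_label pick_label(
    const std::vector<std::pair<doc_id, double>>& scored,
    const std::vector<std::pair<class_label, std::uint16_t>>& sorted,
    const std::unordered_map<doc_id, class_label>& labels)
{
    if (sorted.empty())
        return class_label{};

    // Collect every label that shares the highest count.
    const std::uint16_t highest = sorted.front().second;
    std::unordered_set<class_label> tied;
    for (const auto& entry : sorted)
    {
        if (entry.second < highest)
            break;
        tied.insert(entry.first);
    }

    // A single top label is a clear winner.
    if (tied.size() == 1)
        return sorted.front().first;

    // Otherwise return the tied label of the best-ranked neighbor.
    for (const auto& neighbor : scored)
    {
        auto it = labels.find(neighbor.first);
        if (it != labels.end() && tied.count(it->second) > 0)
            return it->second;
    }
    return sorted.front().first;
}
```

Breaking ties by rank keeps the result deterministic and favors the label backed by the most similar document.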