ModErn Text Analysis
META Enumerates Textual Applications
meta::analyzers::ngram_word_analyzer Class Reference

Analyzes documents using their tokenized words.

#include <ngram_word_analyzer.h>

Inheritance diagram for meta::analyzers::ngram_word_analyzer:
meta::util::multilevel_clonable< analyzer, ngram_analyzer, ngram_word_analyzer >

Public Member Functions

 ngram_word_analyzer (uint16_t n, std::unique_ptr< token_stream > stream)
 Constructor.
 
 ngram_word_analyzer (const ngram_word_analyzer &other)
 Copy constructor.
 
virtual void tokenize (corpus::document &doc) override
 Tokenizes a file into a document.
 
Public Member Functions inherited from meta::util::multilevel_clonable< analyzer, ngram_analyzer, ngram_word_analyzer >
virtual std::unique_ptr< analyzer > clone () const
 Clones the given object.
 

Static Public Attributes

static const std::string id = "ngram-word"
 Identifier for this analyzer.
 

Private Types

using base = util::multilevel_clonable< analyzer, ngram_analyzer, ngram_word_analyzer >
 

Private Attributes

std::unique_ptr< token_stream > stream_
 The token stream to be used for extracting tokens.
 

Detailed Description

Analyzes documents using their tokenized words.

Constructor & Destructor Documentation

meta::analyzers::ngram_word_analyzer::ngram_word_analyzer ( uint16_t  n,
std::unique_ptr< token_stream > stream
)

Constructor.

Parameters
n       The value of n to use for the n-grams.
stream  The stream to read tokens from.
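
Because the stream is passed as a std::unique_ptr by value, the caller gives up ownership when constructing the analyzer. The following is a minimal sketch of that hand-over. The types toy_stream, toy_whitespace_stream, and toy_ngram_analyzer, and the helper stream_is_handed_over, are hypothetical stand-ins invented for illustration; they are not MeTA's classes and say nothing about how the real constructor is implemented beyond the signature documented above.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-ins; these are NOT MeTA's token_stream or analyzer types.
struct toy_stream {
    virtual ~toy_stream() = default;
};
struct toy_whitespace_stream : toy_stream {};

class toy_ngram_analyzer {
  public:
    // Mirrors the documented parameter list: n, then an owning stream pointer.
    toy_ngram_analyzer(std::uint16_t n, std::unique_ptr<toy_stream> stream)
        : n_{n}, stream_{std::move(stream)} {
        assert(n_ >= 1);            // assumption: n = 0 is not meaningful
        assert(stream_ != nullptr); // assumption: a stream is required
    }

    std::uint16_t n_value() const { return n_; }
    bool owns_stream() const { return stream_ != nullptr; }

  private:
    std::uint16_t n_;
    std::unique_ptr<toy_stream> stream_;
};

// Builds a bigram analyzer and reports whether ownership moved into it.
bool stream_is_handed_over() {
    std::unique_ptr<toy_stream> stream =
        std::make_unique<toy_whitespace_stream>();
    toy_ngram_analyzer analyzer(2, std::move(stream));
    // A moved-from unique_ptr is guaranteed to be null.
    return stream == nullptr && analyzer.owns_stream()
           && analyzer.n_value() == 2;
}
```

Taking the pointer by value makes the transfer explicit at the call site: the caller has to write std::move, or pass a temporary such as the result of std::make_unique.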
meta::analyzers::ngram_word_analyzer::ngram_word_analyzer ( const ngram_word_analyzer & other)

Copy constructor.

Parameters
other  The other ngram_word_analyzer to copy from.
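
A class that holds a std::unique_ptr member cannot rely on the compiler-generated copy constructor, so a user-written one has to duplicate the pointed-to object in some way. The sketch below shows one common approach: a virtual clone() on the stream, which the copy constructor calls. It assumes the stream type offers such a clone(); every name here (toy_stream, toy_whitespace_stream, toy_ngram_analyzer) is hypothetical and not taken from MeTA.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stream interface with a polymorphic clone(); not MeTA's API.
class toy_stream {
  public:
    virtual ~toy_stream() = default;
    virtual std::unique_ptr<toy_stream> clone() const = 0;
    virtual std::string name() const = 0;
};

class toy_whitespace_stream : public toy_stream {
  public:
    std::unique_ptr<toy_stream> clone() const override {
        return std::make_unique<toy_whitespace_stream>(*this);
    }
    std::string name() const override { return "whitespace"; }
};

class toy_ngram_analyzer {
  public:
    toy_ngram_analyzer(std::uint16_t n, std::unique_ptr<toy_stream> stream)
        : n_{n}, stream_{std::move(stream)} {
        assert(stream_ != nullptr);
    }

    // Deep copy: the new analyzer gets its own independent stream object.
    toy_ngram_analyzer(const toy_ngram_analyzer& other)
        : n_{other.n_}, stream_{other.stream_->clone()} {}

    // A clone() in the spirit of the inherited one: copy-construct on the heap.
    std::unique_ptr<toy_ngram_analyzer> clone() const {
        return std::make_unique<toy_ngram_analyzer>(*this);
    }

    std::uint16_t n_value() const { return n_; }
    const toy_stream* stream() const { return stream_.get(); }

  private:
    std::uint16_t n_;
    std::unique_ptr<toy_stream> stream_;
};
```

With this layout, a copy keeps the same n but owns a separate stream, so the copy and the original can be used without sharing state.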

Member Function Documentation

void meta::analyzers::ngram_word_analyzer::tokenize ( corpus::document & doc)
override virtual

Tokenizes a file into a document.

Parameters
doc  The document to store the tokenized information in.
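
To illustrate what word n-gram analysis produces in general, here is a minimal sketch that slides a window of n tokens over a token sequence and counts each resulting n-gram. It is not MeTA's tokenize implementation: the function name count_word_ngrams, the use of a std::map for the counts, and the "_" separator between words are all assumptions made for this example, and the tokens are supplied directly as a vector instead of being read from a token stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: counts word n-grams in an already-tokenized sequence.
// Words inside an n-gram are joined with "_" (an assumption for this sketch).
std::map<std::string, std::uint64_t>
count_word_ngrams(const std::vector<std::string>& tokens, std::uint16_t n) {
    assert(n >= 1); // assumption: n = 0 is not meaningful

    std::map<std::string, std::uint64_t> counts;
    // Each window starts at `start` and covers tokens[start .. start + n - 1].
    // If there are fewer than n tokens, the loop body never runs.
    for (std::size_t start = 0; start + n <= tokens.size(); ++start) {
        std::string gram = tokens[start];
        for (std::size_t k = 1; k < n; ++k) {
            gram += "_" + tokens[start + k];
        }
        ++counts[gram];
    }
    return counts;
}
```

For the tokens "the cat sat on the cat" with n = 2, this yields four distinct bigrams, with "the_cat" counted twice and the others once. An input shorter than n produces an empty result.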

The documentation for this class was generated from the following files: