Analyzes documents based on an ngram word model, where the value for n is supplied by the user.

Public Member Functions
	ngram_analyzer (uint16_t n)
	Constructor.

virtual uint16_t	n_value () const

Public Member Functions inherited from meta::analyzers::analyzer
virtual	~analyzer ()=default
	A default virtual destructor.

virtual void	tokenize (corpus::document &doc)=0
	Tokenizes a document.

virtual std::unique_ptr< analyzer >	clone () const =0
	Clones this analyzer.

Protected Member Functions
virtual std::string	wordify (const std::deque< std::string > &words) const
	Turns a list of words into an ngram string.

Private Attributes
uint16_t	n_val_
	The value of n for this ngram analyzer.

Additional Inherited Members
Static Public Member Functions inherited from meta::analyzers::analyzer
static std::unique_ptr< analyzer >	load (const cpptoml::table &config)

static std::unique_ptr< token_stream >	default_filter_chain (const cpptoml::table &config)

static std::unique_ptr< token_stream >	load_filters (const cpptoml::table &global, const cpptoml::table &config)

static std::unique_ptr< token_stream >	load_filter (std::unique_ptr< token_stream > src, const cpptoml::table &config)

static io::parser	create_parser (const corpus::document &doc, const std::string &extension, const std::string &delims)

static std::string	get_content (const corpus::document &doc)

Detailed Description

Analyzes documents based on an ngram word model, where the value for n is supplied by the user.

This class is abstract, as it only provides the framework for ngram tokenization.
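The relationship between the abstract framework and a concrete analyzer can be illustrated with a small standalone sketch. It does not use the MeTA headers: `ngram_base` and `simple_word_ngrams` are hypothetical stand-ins, the simplified `tokenize` signature (a vector of words in, a vector of ngram strings out) is an assumption made for illustration, and the `'_'` separator is likewise assumed rather than taken from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract base: it stores n and joins a
// window of words, but leaves the tokenization itself to derived classes.
class ngram_base
{
  public:
    explicit ngram_base(std::uint16_t n) : n_val_{n}
    {
        assert(n > 0); // assumption: n == 0 is treated as a usage error
    }
    virtual ~ngram_base() = default;

    virtual std::uint16_t n_value() const { return n_val_; }

    // Pure virtual: this is what keeps the base class abstract.
    virtual std::vector<std::string>
    tokenize(const std::vector<std::string>& words) const = 0;

  protected:
    // Joins the window into one string; the '_' separator is an assumption.
    virtual std::string wordify(const std::deque<std::string>& words) const
    {
        std::string result;
        bool first = true;
        for (const auto& word : words)
        {
            if (!first)
                result += '_';
            result += word;
            first = false;
        }
        return result;
    }

  private:
    std::uint16_t n_val_;
};

// Hypothetical concrete analyzer: slides a window of n words over the input.
class simple_word_ngrams : public ngram_base
{
  public:
    using ngram_base::ngram_base;

    std::vector<std::string>
    tokenize(const std::vector<std::string>& words) const override
    {
        const std::size_t n = n_value();
        std::vector<std::string> grams;
        std::deque<std::string> window;
        for (const auto& word : words)
        {
            window.push_back(word);
            if (window.size() > n)
                window.pop_front(); // drop the oldest word
            if (window.size() == n)
                grams.push_back(wordify(window));
        }
        return grams;
    }
};
```

Under these assumptions, `simple_word_ngrams(2).tokenize({"the", "quick", "brown", "fox"})` yields `the_quick`, `quick_brown`, and `brown_fox`. An input shorter than n words produces no ngrams.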

Constructor & Destructor Documentation

ngram_analyzer::ngram_analyzer ( uint16_t n )

Constructor.

Parameters

n	The value of n in ngram.

Member Function Documentation

uint16_t ngram_analyzer::n_value ( ) const

virtual

std::string ngram_analyzer::wordify ( const std::deque< std::string > & words ) const

protected virtual

Turns a list of words into an ngram string.

Parameters

words	The deque representing a list of words.
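As a rough illustration of turning a deque of words into a single ngram string, the sketch below uses a hypothetical free function, `join_ngram`, rather than the protected member itself. The default `'_'` separator and the requirement that the deque be non-empty are assumptions of this sketch, not behavior documented on this page.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical helper, not part of MeTA: concatenates the words of one
// ngram window, placing `sep` between neighbouring words.
std::string join_ngram(const std::deque<std::string>& words, char sep = '_')
{
    // Assumption: an ngram always contains at least one word.
    assert(!words.empty());

    std::string result = words.front();
    for (auto it = words.begin() + 1; it != words.end(); ++it)
    {
        result += sep;
        result += *it;
    }
    return result;
}
```

With these assumptions, a window holding `new`, `york`, `city` becomes `new_york_city`, and a single-word window (a unigram) is returned unchanged.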

The documentation for this class was generated from the following files: