ModErn Text Analysis
META Enumerates Textual Applications
ngram_analyzer.h
#ifndef META_NGRAM_ANALYZER_H_
#define META_NGRAM_ANALYZER_H_

#include <cstdint> // uint16_t
#include <deque>
#include <string>  // std::string

#include "analyzers/analyzer.h"
#include "util/clonable.h"

namespace meta
{
namespace analyzers
{

// Analyzes documents based on an ngram word model, where the value for n is
// supplied by the user.
class ngram_analyzer : public analyzer
{
  public:
    // Constructor.
    ngram_analyzer(uint16_t n);

    // Returns the value of n for this ngram analyzer.
    virtual uint16_t n_value() const;

  protected:
    // Turns a list of words into an ngram string.
    virtual std::string wordify(const std::deque<std::string>& words) const;

  private:
    // The value of n for this ngram analyzer.
    uint16_t n_val_;
};
}
}

#endif
uint16_t n_val_
The value of n for this ngram analyzer.
Definition: ngram_analyzer.h:51
virtual uint16_t n_value() const
Definition: ngram_analyzer.cpp:18
ngram_analyzer(uint16_t n)
Constructor.
Definition: ngram_analyzer.cpp:13
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:24
A class that provides a framework to produce token counts from documents.
Definition: analyzer.h:41
virtual std::string wordify(const std::deque<std::string>& words) const
Turns a list of words into an ngram string.
Definition: ngram_analyzer.cpp:23
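The body of wordify lives in ngram_analyzer.cpp and is not shown in this listing. As a rough illustration of what "turns a list of words into an ngram string" could mean, the sketch below joins the words in a deque with a separator. The free function wordify_sketch and the "_" separator are assumptions made for this example and are not taken from the library.

```cpp
#include <deque>
#include <string>

// Hypothetical stand-in for ngram_analyzer::wordify. The name and the "_"
// separator are assumptions for illustration only.
std::string wordify_sketch(const std::deque<std::string>& words,
                           const std::string& separator = "_")
{
    std::string result;
    bool first = true;
    for (const auto& word : words)
    {
        // Put the separator between words, never before the first one.
        if (!first)
            result += separator;
        result += word;
        first = false;
    }
    return result;
}
```

Under these assumptions, a deque holding "new", "york", "city" becomes the single string "new_york_city", and an empty deque becomes an empty string.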
Analyzes documents based on an ngram word model, where the value for n is supplied by the user...
Definition: ngram_analyzer.h:27
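To make the ngram word model concrete, here is a minimal sketch of counting word n-grams with a sliding std::deque window, the same container that wordify receives. It does not use the analyzer base class or any other MeTA API; count_ngrams_sketch, join_words_sketch, and the "_" separator are hypothetical names and choices introduced only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: joins the words in the window with "_".
std::string join_words_sketch(const std::deque<std::string>& words)
{
    std::string result;
    bool first = true;
    for (const auto& word : words)
    {
        if (!first)
            result += "_";
        result += word;
        first = false;
    }
    return result;
}

// Hypothetical sketch: counts every run of n consecutive tokens.
// Returns an empty map when n is 0 or there are fewer than n tokens.
std::unordered_map<std::string, std::uint64_t>
count_ngrams_sketch(const std::vector<std::string>& tokens, std::uint16_t n)
{
    std::unordered_map<std::string, std::uint64_t> counts;
    const std::size_t width = n;
    if (width == 0)
        return counts;

    std::deque<std::string> window;
    for (const auto& token : tokens)
    {
        window.push_back(token);
        // Keep at most n tokens in the window by dropping the oldest.
        if (window.size() > width)
            window.pop_front();
        // A full window is one n-gram occurrence.
        if (window.size() == width)
            ++counts[join_words_sketch(window)];
    }
    return counts;
}
```

With the tokens "a", "b", "a", "b" and n = 2, this sketch counts "a_b" twice and "b_a" once. A deque suits the window because it supports cheap insertion at the back and removal at the front.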