ModErn Text Analysis
META Enumerates Textual Applications
Contains tokenizers that start off a filter chain.
Classes
| class | character_tokenizer | Converts documents into streams of characters. |
| class | icu_tokenizer | Converts documents into streams of tokens by following the Unicode standards for sentence and word segmentation. |
| class | whitespace_tokenizer | Converts documents into streams of whitespace-delimited tokens. |
Functions
| template<class Tokenizer> std::unique_ptr<token_stream> | make_tokenizer (const cpptoml::table &) | Factory method for creating a tokenizer. |
template<class Tokenizer> std::unique_ptr<token_stream> meta::analyzers::tokenizers::make_tokenizer (const cpptoml::table &)
Factory method for creating a tokenizer.
Specialize this function template if your tokenizer requires special construction behavior, for example when its constructor takes arguments that must be read from the configuration table.
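The specialization pattern can be illustrated with a minimal, self-contained sketch. It does not use the library's real headers: `config_table` is a hypothetical stand-in for `cpptoml::table`, the `token_stream` interface here is a simplified stand-in, and `char_tokenizer`, `delim_tokenizer`, and the `"delimiter"` key are invented for illustration. The sketch also assumes that the generic factory simply default-constructs the tokenizer and ignores the configuration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for cpptoml::table: a flat string-to-string map.
using config_table = std::map<std::string, std::string>;

// Simplified stand-in for the token_stream interface (an assumption).
class token_stream
{
  public:
    virtual ~token_stream() = default;
    virtual void set_content(std::string content) = 0;
    virtual bool has_more() const = 0;
    virtual std::string next() = 0;
};

// Default-constructible tokenizer: emits one token per character.
class char_tokenizer : public token_stream
{
  public:
    void set_content(std::string content) override
    {
        content_ = std::move(content);
        pos_ = 0;
    }
    bool has_more() const override { return pos_ < content_.size(); }
    std::string next() override
    {
        assert(has_more());
        return std::string(1, content_[pos_++]);
    }

  private:
    std::string content_;
    std::size_t pos_ = 0;
};

// Tokenizer whose constructor needs an argument: splits on one delimiter.
class delim_tokenizer : public token_stream
{
  public:
    explicit delim_tokenizer(char delim) : delim_(delim) {}
    void set_content(std::string content) override
    {
        content_ = std::move(content);
        pos_ = 0;
        skip_delims();
    }
    bool has_more() const override { return pos_ < content_.size(); }
    std::string next() override
    {
        assert(has_more());
        std::size_t end = content_.find(delim_, pos_);
        if (end == std::string::npos)
            end = content_.size();
        std::string token = content_.substr(pos_, end - pos_);
        pos_ = end;
        skip_delims(); // so has_more() is false after the last token
        return token;
    }

  private:
    void skip_delims()
    {
        while (pos_ < content_.size() && content_[pos_] == delim_)
            ++pos_;
    }
    char delim_;
    std::string content_;
    std::size_t pos_ = 0;
};

// Generic factory: ignores the configuration and default-constructs.
template <class Tokenizer>
std::unique_ptr<token_stream> make_tokenizer(const config_table&)
{
    return std::unique_ptr<token_stream>(new Tokenizer());
}

// Specialization: delim_tokenizer reads its constructor argument from the
// (hypothetical) "delimiter" key, falling back to a space.
template <>
std::unique_ptr<token_stream>
make_tokenizer<delim_tokenizer>(const config_table& config)
{
    auto it = config.find("delimiter");
    char delim = (it != config.end() && !it->second.empty()) ? it->second[0] : ' ';
    return std::unique_ptr<token_stream>(new delim_tokenizer(delim));
}
```

With these stand-ins, `make_tokenizer<char_tokenizer>(cfg)` goes through the generic template, while `make_tokenizer<delim_tokenizer>(cfg)` picks up the specialization and reads its delimiter from `cfg`. Callers use the same factory call in both cases, which is the point of specializing rather than adding a second factory function.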