ModErn Text Analysis
META Enumerates Textual Applications
meta::analyzers::filters Namespace Reference

Contains filters that mutate existing token streams in a filter chain.

Classes

class  alpha_filter
 Filter that removes "non-letter" characters from tokens.
 
class  empty_sentence_filter
 Filter that removes any empty sentences from the token stream.
 
class  english_normalizer
 Filter that normalizes English-language tokens.
 
class  icu_filter
 Filter that applies an ICU transliteration to each token in the sequence.
 
class  length_filter
 Filter that retains only tokens whose length falls within a given range, inclusive.
 
class  list_filter
 Filter that either removes or keeps tokens from a given list.
 
class  lowercase_filter
 Filter that converts all tokens to lowercase.
 
class  porter2_stemmer
 Filter that stems words according to the Porter2 stemming algorithm.
 
class  ptb_normalizer
 A filter that normalizes text to match Penn Treebank conventions.
 
class  sentence_boundary
 Filter that adds sentence boundary tokens ("<s>" and "</s>") to streams of tokens.
 

Functions

template<class Filter >
std::unique_ptr< token_stream > make_filter (std::unique_ptr< token_stream > source, const cpptoml::table &)
 Factory method for creating a filter.
 
template<>
std::unique_ptr< token_stream > make_filter< icu_filter > (std::unique_ptr< token_stream >, const cpptoml::table &)
 Specialization of the factory method for creating icu_filters.
 
template<>
std::unique_ptr< token_stream > make_filter< length_filter > (std::unique_ptr< token_stream >, const cpptoml::table &)
 Specialization of the factory method for creating length_filters.
 
template<>
std::unique_ptr< token_stream > make_filter< list_filter > (std::unique_ptr< token_stream >, const cpptoml::table &)
 Specialization of the factory method used to create list_filters.
 
template<>
std::unique_ptr< token_stream > make_filter< sentence_boundary > (std::unique_ptr< token_stream >, const cpptoml::table &)
 Specialization of the factory method used to create sentence_boundary filters.
 

Detailed Description

Contains filters that mutate existing token streams in a filter chain.
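
To illustrate what a filter chain means here, the following is a minimal sketch, not the library's implementation. It assumes a simplified `token_stream` interface with `next()` and `has_more()`; that interface, along with `vector_source`, `lowercase_sketch`, `length_sketch`, and `run_chain`, are hypothetical names introduced only for this example. Each filter owns the stream it wraps and transforms or drops tokens as they pass through.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library's token_stream interface.
class token_stream
{
  public:
    virtual ~token_stream() = default;
    virtual std::string next() = 0;    // returns the next token
    virtual bool has_more() const = 0; // true while tokens remain
};

// Hypothetical source that yields tokens from a fixed list.
class vector_source : public token_stream
{
  public:
    explicit vector_source(std::vector<std::string> tokens)
        : tokens_(std::move(tokens)) {}
    std::string next() override { return tokens_[pos_++]; }
    bool has_more() const override { return pos_ < tokens_.size(); }
  private:
    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};

// Sketch of a lowercasing filter: rewrites each token from its source.
class lowercase_sketch : public token_stream
{
  public:
    explicit lowercase_sketch(std::unique_ptr<token_stream> source)
        : source_(std::move(source)) {}
    std::string next() override
    {
        std::string tok = source_->next();
        for (char& c : tok)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        return tok;
    }
    bool has_more() const override { return source_->has_more(); }
  private:
    std::unique_ptr<token_stream> source_;
};

// Sketch of a length filter: keeps tokens whose size is in [min, max].
// It buffers one token ahead so has_more() stays accurate.
class length_sketch : public token_stream
{
  public:
    length_sketch(std::unique_ptr<token_stream> source, std::size_t min, std::size_t max)
        : source_(std::move(source)), min_(min), max_(max)
    {
        assert(min_ <= max_);
        advance();
    }
    std::string next() override
    {
        std::string tok = std::move(buffer_);
        advance();
        return tok;
    }
    bool has_more() const override { return buffered_; }
  private:
    void advance()
    {
        buffered_ = false;
        while (!buffered_ && source_->has_more())
        {
            buffer_ = source_->next();
            buffered_ = buffer_.size() >= min_ && buffer_.size() <= max_;
        }
    }
    std::unique_ptr<token_stream> source_;
    std::size_t min_, max_;
    std::string buffer_;
    bool buffered_ = false;
};

// Builds source -> lowercase -> length (2 to 5 characters) and drains the chain.
std::vector<std::string> run_chain(std::vector<std::string> tokens)
{
    std::unique_ptr<token_stream> chain = std::make_unique<vector_source>(std::move(tokens));
    chain = std::make_unique<lowercase_sketch>(std::move(chain));
    chain = std::make_unique<length_sketch>(std::move(chain), 2, 5);
    std::vector<std::string> out;
    while (chain->has_more())
        out.push_back(chain->next());
    return out;
}
```

Because every stage takes ownership of the previous one through a `std::unique_ptr`, the order in which filters are wrapped determines the order in which they see tokens: here tokens are lowercased before their length is checked.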

Function Documentation

template<class Filter >
std::unique_ptr<token_stream> meta::analyzers::filters::make_filter ( std::unique_ptr< token_stream > source,
const cpptoml::table &   
)

Factory method for creating a filter.

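Specialize this template if your filter requires special behavior during construction, as the icu_filter, length_filter, list_filter, and sentence_boundary specializations do.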
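
The relationship between the generic factory and a specialization can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `token_stream` is reduced to an empty polymorphic base, `config_table` is a hypothetical `std::map` standing in for `cpptoml::table`, the `"min"` and `"max"` keys are invented for the example, and the generic factory is assumed to simply forward the source to the filter's constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins: the documented signature uses the library's
// token_stream and cpptoml::table instead.
struct token_stream
{
    virtual ~token_stream() = default;
};
using config_table = std::map<std::string, std::size_t>;

// A filter with no options: constructible from its source alone.
class lowercase_sketch : public token_stream
{
  public:
    explicit lowercase_sketch(std::unique_ptr<token_stream> source)
        : source_(std::move(source)) {}
  private:
    std::unique_ptr<token_stream> source_;
};

// A filter that needs options taken from the configuration.
class length_sketch : public token_stream
{
  public:
    length_sketch(std::unique_ptr<token_stream> source, std::size_t min, std::size_t max)
        : source_(std::move(source)), min_(min), max_(max)
    {
        assert(min_ <= max_);
    }
    std::size_t min() const { return min_; }
    std::size_t max() const { return max_; }
  private:
    std::unique_ptr<token_stream> source_;
    std::size_t min_, max_;
};

// Generic factory (assumed behavior): ignore the configuration and
// pass only the source to the filter's constructor.
template <class Filter>
std::unique_ptr<token_stream> make_filter(std::unique_ptr<token_stream> source,
                                          const config_table&)
{
    return std::make_unique<Filter>(std::move(source));
}

// Specialization: this filter reads its bounds from the configuration.
// The "min" and "max" keys are hypothetical.
template <>
std::unique_ptr<token_stream> make_filter<length_sketch>(std::unique_ptr<token_stream> source,
                                                         const config_table& config)
{
    return std::make_unique<length_sketch>(std::move(source), config.at("min"),
                                           config.at("max"));
}
```

Callers use the same expression, `make_filter<SomeFilter>(std::move(source), config)`, in both cases; the specialization is selected automatically for the filter type it names, so code that builds chains from configuration does not need to know which filters take options.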