ModErn Text Analysis
META Enumerates Textual Applications
icu_filter.h
#ifndef META_ICU_FILTER_H_
#define META_ICU_FILTER_H_

#include <memory>
#include <string>

#include "util/clonable.h"
#include "util/optional.h"
#include "utf/transformer.h"

namespace cpptoml
{
class table;
}

namespace meta
{
namespace analyzers
{
namespace filters
{

/// Filter that applies an ICU transliteration to each token in the sequence.
class icu_filter : public util::clonable<token_stream, icu_filter>
{
  public:
    /// Constructs an icu_filter which reads tokens from the given source.
    icu_filter(std::unique_ptr<token_stream> source, const std::string& id);

    /// Copy constructor.
    icu_filter(const icu_filter& other);

    /// Sets the content for the beginning of the filter chain.
    void set_content(const std::string& content) override;

    std::string next() override;

    operator bool() const override;

    /// Identifier for this filter.
    const static std::string id;

  private:
    /// Finds the next valid token for this filter.
    void next_token();

    /// The source to read tokens from.
    std::unique_ptr<token_stream> source_;

    /// The transformer to use.
    utf::transformer trans_;

    /// Current token (if available)
    util::optional<std::string> token_;
};

/// Specialization of the factory method for creating icu_filters.
template <>
std::unique_ptr<token_stream>
    make_filter<icu_filter>(std::unique_ptr<token_stream>,
                            const cpptoml::table&);
}
}
}
#endif
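The header declares a filter that wraps another token_stream, keeps one look-ahead token, and skips ahead until it finds a valid one. The sketch below illustrates that pattern in isolation. It is not MeTA's implementation: the `token_stream` interface here is a reduced stand-in, `whitespace_source`, `normalizing_filter`, and `collect_tokens` are hypothetical names, `std::optional` replaces `util::optional`, and a lowercase-and-strip-punctuation step stands in for the ICU transliteration.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in interface (assumption): only the members icu_filter overrides.
class token_stream
{
  public:
    virtual ~token_stream() = default;
    virtual void set_content(const std::string& content) = 0;
    virtual std::string next() = 0; // precondition: stream converts to true
    virtual operator bool() const = 0;
};

// Hypothetical source that splits its content on whitespace.
class whitespace_source : public token_stream
{
  public:
    void set_content(const std::string& content) override
    {
        words_.str(content);
        words_.clear(); // reset eof/fail flags from any earlier content
        read_ahead();
    }
    std::string next() override
    {
        std::string tok = std::move(*pending_);
        read_ahead();
        return tok;
    }
    operator bool() const override { return pending_.has_value(); }

  private:
    void read_ahead()
    {
        std::string word;
        if (words_ >> word)
            pending_ = std::move(word);
        else
            pending_.reset();
    }
    std::istringstream words_;
    std::optional<std::string> pending_;
};

// Hypothetical filter with the same shape as icu_filter.
class normalizing_filter : public token_stream
{
  public:
    explicit normalizing_filter(std::unique_ptr<token_stream> source)
        : source_{std::move(source)}
    {
        next_token();
    }
    void set_content(const std::string& content) override
    {
        source_->set_content(content); // forward to the start of the chain
        next_token();
    }
    std::string next() override
    {
        std::string tok = std::move(*token_);
        next_token();
        return tok;
    }
    operator bool() const override { return token_.has_value(); }

  private:
    // Pulls from the source until a token survives the transformation.
    void next_token()
    {
        token_.reset();
        while (*source_)
        {
            std::string out;
            for (unsigned char c : source_->next())
                if (std::isalnum(c))
                    out += static_cast<char>(std::tolower(c));
            if (!out.empty()) // tokens that transform to "" are skipped
            {
                token_ = std::move(out);
                return;
            }
        }
    }
    std::unique_ptr<token_stream> source_;
    std::optional<std::string> token_;
};

// Drains a stream into a vector after loading new content.
std::vector<std::string> collect_tokens(token_stream& stream,
                                        const std::string& content)
{
    std::vector<std::string> tokens;
    stream.set_content(content);
    while (stream)
        tokens.push_back(stream.next());
    return tokens;
}
```

Buffering one token ahead is what lets `operator bool` answer truthfully even when the remaining source tokens would all be discarded by the transformation.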
void next_token()
Finds the next valid token for this filter.
Definition: icu_filter.cpp:46
void set_content(const std::string &content) override
Sets the content for the beginning of the filter chain.
Definition: icu_filter.cpp:33
static const std::string id
Identifier for this filter.
Definition: icu_filter.h:67
util::optional< std::string > token_
Current token (if available)
Definition: icu_filter.h:82
utf::transformer trans_
The transformer to use.
Definition: icu_filter.h:79
util::clonable
Template class to facilitate polymorphic cloning.
Definition: clonable.h:28
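A template that facilitates polymorphic cloning is commonly written with the curiously recurring template pattern, which also explains why a filter owning a `std::unique_ptr` to its source needs an explicit copy constructor. The following is a minimal sketch of that idea, not the contents of clonable.h: the two-parameter `clonable` shown here, and the `stream`, `source`, and `wrapper` types, are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical CRTP helper: implements clone() once for every Derived type
// by invoking Derived's copy constructor.
template <class Root, class Derived>
class clonable : public Root
{
  public:
    std::unique_ptr<Root> clone() const override
    {
        return std::make_unique<Derived>(static_cast<const Derived&>(*this));
    }
};

// Hypothetical root of the hierarchy.
struct stream
{
    virtual ~stream() = default;
    virtual std::unique_ptr<stream> clone() const = 0;
    virtual std::string describe() const = 0;
};

// Leaf type: the implicit copy constructor is sufficient.
class source : public clonable<stream, source>
{
  public:
    explicit source(std::string text) : text_{std::move(text)} {}
    std::string describe() const override { return text_; }

  private:
    std::string text_;
};

// Wrapper type: std::unique_ptr is move-only, so the copy constructor must
// deep-copy the wrapped stream by asking it to clone itself.
class wrapper : public clonable<stream, wrapper>
{
  public:
    explicit wrapper(std::unique_ptr<stream> inner) : inner_{std::move(inner)}
    {
    }
    wrapper(const wrapper& other)
        : clonable<stream, wrapper>(other), inner_{other.inner_->clone()}
    {
    }
    std::string describe() const override
    {
        return "wrap(" + inner_->describe() + ")";
    }

  private:
    std::unique_ptr<stream> inner_;
};
```

With this arrangement, calling `clone()` through a pointer to the root type copies the whole chain, so two copies never share a source.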
meta
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:24
std::unique_ptr< token_stream > make_filter< icu_filter >(std::unique_ptr< token_stream >, const cpptoml::table &)
Specialization of the factory method for creating icu_filters.
Definition: icu_filter.cpp:73
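A factory specialization like this is useful when a filter cannot be built from its source alone and needs a value from the configuration. The sketch below shows the general technique under stated assumptions: `config` is a `std::map` standing in for `cpptoml::table`, the `"id"` key and the behavior on a missing key are guesses rather than documented facts, and `plain_filter`, `id_filter`, and `empty_source` are hypothetical types.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for cpptoml::table (assumption).
using config = std::map<std::string, std::string>;

struct token_stream
{
    virtual ~token_stream() = default;
    virtual std::string describe() const = 0;
};

struct empty_source : token_stream
{
    std::string describe() const override { return "source"; }
};

// A filter that only needs its source.
struct plain_filter : token_stream
{
    explicit plain_filter(std::unique_ptr<token_stream> src)
        : source_{std::move(src)}
    {
    }
    std::string describe() const override
    {
        return "plain(" + source_->describe() + ")";
    }
    std::unique_ptr<token_stream> source_;
};

// A filter that needs an extra constructor argument from the config.
struct id_filter : token_stream
{
    id_filter(std::unique_ptr<token_stream> src, std::string id)
        : source_{std::move(src)}, id_{std::move(id)}
    {
    }
    std::string describe() const override
    {
        return "id[" + id_ + "](" + source_->describe() + ")";
    }
    std::unique_ptr<token_stream> source_;
    std::string id_;
};

// Primary template: covers filters constructible from a source alone.
template <class Filter>
std::unique_ptr<token_stream> make_filter(std::unique_ptr<token_stream> src,
                                          const config&)
{
    return std::make_unique<Filter>(std::move(src));
}

// Specialization: reads the required key and forwards it to the constructor.
template <>
std::unique_ptr<token_stream>
    make_filter<id_filter>(std::unique_ptr<token_stream> src,
                           const config& cfg)
{
    auto it = cfg.find("id");
    if (it == cfg.end())
        throw std::runtime_error{"id_filter requires an \"id\" key"};
    return std::make_unique<id_filter>(std::move(src), it->second);
}
```

Callers use the same `make_filter<T>(source, config)` spelling for every filter type, and only the filters with extra requirements need a specialization.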
std::unique_ptr< token_stream > source_
The source to read tokens from.
Definition: icu_filter.h:76
icu_filter(std::unique_ptr< token_stream > source, const std::string &id)
Constructs an icu_filter which reads tokens from the given source, using a utf::transformer construct...
Definition: icu_filter.cpp:18
std::string next() override
Definition: icu_filter.cpp:39
meta::analyzers::filters::icu_filter
Filter that applies an ICU transliteration to each token in the sequence.
Definition: icu_filter.h:33
meta::utf::transformer
Class that encapsulates transliteration of Unicode strings.
Definition: transformer.h:26
Definition: analyzer.h:19