ModErn Text Analysis
META Enumerates Textual Applications
icu_tokenizer.h
    9 #ifndef META_ICU_TOKENIZER_H_
   10 #define META_ICU_TOKENIZER_H_
   11
   12 #include "analyzers/token_stream.h"
   13 #include "util/clonable.h"
   14 #include "util/pimpl.h"
   15
   16 namespace meta
   17 {
   18 namespace corpus
   19 {
   20 class document;
   21 }
   22 }
   23
   24 namespace meta
   25 {
   26 namespace analyzers
   27 {
   28 namespace tokenizers
   29 {
   30
   35 class icu_tokenizer : public util::clonable<token_stream, icu_tokenizer>
   36 {
   37   public:
   41     icu_tokenizer();
   42
   47     icu_tokenizer(const icu_tokenizer& other);
   48
   53     icu_tokenizer(icu_tokenizer&& other);
   54
   58     ~icu_tokenizer();
   59
   67     void set_content(const std::string& content) override;
   68
   75     std::string next() override;
   76
   80     operator bool() const override;
   81
   83     const static std::string id;
   84
   85   private:
   87     class impl;
   88
   90     util::pimpl<impl> impl_;
   91 };
   92 }
   93 }
   94 }
   95 #endif
meta::analyzers::tokenizers::icu_tokenizer::impl_
util::pimpl< impl > impl_
The implementation for this tokenizer.
Definition: icu_tokenizer.h:87
meta::analyzers::tokenizers::icu_tokenizer::set_content
void set_content(const std::string &content) override
Sets the content for the tokenizer to parse.
Definition: icu_tokenizer.cpp:104
meta::analyzers::tokenizers::icu_tokenizer::icu_tokenizer
icu_tokenizer()
Creates an icu_tokenizer.
meta::analyzers::tokenizers::icu_tokenizer::id
static const std::string id
Identifier for this tokenizer.
Definition: icu_tokenizer.h:83
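A static string identifier of this kind is often used as a lookup key, so that a component can be selected by name at run time. The following is a minimal sketch of that pattern, not MeTA's own factory: every name ending in `_sketch`, the identifier value `"icu-like-sketch"`, and the factory class itself are hypothetical and only illustrate how a `const static std::string id` is declared in a class, defined once out of line, and then used as a key.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class; it stands in for a real tokenizer interface.
class tokenizer_sketch
{
  public:
    virtual ~tokenizer_sketch() = default;
    virtual std::string describe() const = 0;
};

class icu_like_tokenizer_sketch : public tokenizer_sketch
{
  public:
    // Declared in the class, as in the header listing.
    const static std::string id;

    std::string describe() const override
    {
        return "tokenizer:" + id;
    }
};

// Defined exactly once, normally in the .cpp file. The value is made up.
const std::string icu_like_tokenizer_sketch::id = "icu-like-sketch";

// Hypothetical factory that maps identifiers to constructors.
class tokenizer_factory_sketch
{
  public:
    using maker = std::function<std::unique_ptr<tokenizer_sketch>()>;

    // Registers Tokenizer under its static id.
    template <class Tokenizer>
    void add()
    {
        assert(makers_.count(Tokenizer::id) == 0); // ids must be unique
        makers_[Tokenizer::id] = [] {
            return std::unique_ptr<tokenizer_sketch>(new Tokenizer());
        };
    }

    // Returns a new tokenizer, or nullptr when the id is unknown.
    std::unique_ptr<tokenizer_sketch> create(const std::string& id) const
    {
        auto it = makers_.find(id);
        if (it == makers_.end())
            return nullptr;
        return it->second();
    }

  private:
    std::map<std::string, maker> makers_;
};
```

After `factory.add<icu_like_tokenizer_sketch>()`, a call to `factory.create(icu_like_tokenizer_sketch::id)` yields a fresh instance, while an unregistered name yields `nullptr`.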
meta::util::pimpl
Class to assist in simple pointer-to-implementation classes.
Definition: pimpl.h:26
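The pointer-to-implementation idiom that this helper supports can be sketched with a plain `std::unique_ptr`. This is an illustration under stated assumptions, not the interface of `meta::util::pimpl`: `widget_sketch` and its members are hypothetical, and the single translation unit below merges what would normally be split between a header and a `.cpp` file. It also shows why a class like `icu_tokenizer` declares its copy constructor, move constructor, and destructor explicitly: they have to be defined where the implementation class is a complete type.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- what would normally live in the header ---
class widget_sketch
{
  public:
    widget_sketch();
    widget_sketch(const widget_sketch& other);
    widget_sketch(widget_sketch&& other) noexcept;
    ~widget_sketch();

    void set_content(const std::string& content);
    const std::string& content() const;

  private:
    class impl;                  // only forward-declared for users
    std::unique_ptr<impl> impl_; // the sole data member they can see
};

// --- what would normally live in the .cpp file ---
class widget_sketch::impl
{
  public:
    std::string content;
};

widget_sketch::widget_sketch() : impl_(new impl())
{
}

// Deep copy: the new object gets its own implementation instance.
widget_sketch::widget_sketch(const widget_sketch& other)
    : impl_(new impl(*other.impl_))
{
    assert(other.impl_ && "cannot copy a moved-from object");
}

// Move and destruction can be defaulted here because impl is complete.
widget_sketch::widget_sketch(widget_sketch&&) noexcept = default;
widget_sketch::~widget_sketch() = default;

void widget_sketch::set_content(const std::string& content)
{
    impl_->content = content;
}

const std::string& widget_sketch::content() const
{
    return impl_->content;
}
```

Because the copy constructor clones the implementation object, changing a copy leaves the original untouched, and moving only transfers ownership of the pointer.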
meta::analyzers::tokenizers::icu_tokenizer::~icu_tokenizer
~icu_tokenizer()
Destroys an icu_tokenizer.
meta::util::multilevel_clonable
Template class to facilitate polymorphic cloning.
Definition: clonable.h:28
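Polymorphic cloning through a template base is usually built on the curiously recurring template pattern, which matches the shape of `util::clonable<token_stream, icu_tokenizer>` in the listing. The sketch below assumes a two-parameter helper that implements `clone()` via the derived class's copy constructor. All names ending in `_sketch` are hypothetical, and the real contents of `clonable.h` may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical root interface with a virtual clone().
class stream_base_sketch
{
  public:
    virtual ~stream_base_sketch() = default;
    virtual std::unique_ptr<stream_base_sketch> clone() const = 0;
    virtual std::string name() const = 0;
};

// CRTP helper: implements clone() once for every Derived by invoking
// Derived's copy constructor.
template <class Base, class Derived>
class clonable_sketch : public Base
{
  public:
    std::unique_ptr<Base> clone() const override
    {
        return std::unique_ptr<Base>(
            new Derived(static_cast<const Derived&>(*this)));
    }
};

// A concrete class only names itself as the second template argument.
class labelled_stream_sketch
    : public clonable_sketch<stream_base_sketch, labelled_stream_sketch>
{
  public:
    explicit labelled_stream_sketch(std::string label)
        : label_(std::move(label))
    {
    }

    std::string name() const override
    {
        return label_;
    }

  private:
    std::string label_;
};

// Copies an object known only through its base-class interface.
std::unique_ptr<stream_base_sketch> duplicate(const stream_base_sketch& s)
{
    auto copy = s.clone();
    assert(copy.get() != &s); // a clone is always a distinct object
    return copy;
}
```

The benefit of this design is that each concrete class gets a correct `clone()` without writing one by hand, provided it has an accessible copy constructor.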
meta
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:24
meta::analyzers::tokenizers::icu_tokenizer::next
std::string next() override
Definition: icu_tokenizer.cpp:109
meta::analyzers::tokenizers::icu_tokenizer::impl
Implementation class for the icu_tokenizer.
Definition: icu_tokenizer.cpp:28
meta::analyzers::tokenizers::icu_tokenizer
Converts documents into streams of tokens by following the unicode standards for sentence and word se...
Definition: icu_tokenizer.h:35
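The three overridden members in the listing (`set_content`, `next`, and `operator bool`) suggest a simple consumption loop: load the text, then pull tokens while the stream converts to `true`. The sketch below shows that loop against a stand-in interface. It does not use ICU or MeTA: `token_stream_sketch`, `whitespace_tokenizer_sketch`, and `collect_tokens` are hypothetical, and the stand-in splits on ASCII whitespace only, whereas `icu_tokenizer` follows the Unicode segmentation rules.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the token_stream interface; only the three
// member functions shown in the listing are modelled.
class token_stream_sketch
{
  public:
    virtual ~token_stream_sketch() = default;
    virtual void set_content(const std::string& content) = 0;
    virtual std::string next() = 0;
    virtual operator bool() const = 0;
};

// Stand-in tokenizer that splits on ASCII whitespace.
class whitespace_tokenizer_sketch : public token_stream_sketch
{
  public:
    void set_content(const std::string& content) override
    {
        content_ = content;
        pos_ = 0;
        skip_spaces();
    }

    std::string next() override
    {
        // next() may only be called while the stream is truthy.
        assert(pos_ < content_.size());
        std::size_t start = pos_;
        while (pos_ < content_.size() && !is_space(content_[pos_]))
            ++pos_;
        std::string token = content_.substr(start, pos_ - start);
        skip_spaces();
        return token;
    }

    // True while at least one more token is available.
    operator bool() const override
    {
        return pos_ < content_.size();
    }

  private:
    static bool is_space(char c)
    {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }

    void skip_spaces()
    {
        while (pos_ < content_.size() && is_space(content_[pos_]))
            ++pos_;
    }

    std::string content_;
    std::size_t pos_ = 0;
};

// Drains a stream using the set_content / operator bool / next cycle.
std::vector<std::string> collect_tokens(token_stream_sketch& stream,
                                        const std::string& text)
{
    std::vector<std::string> tokens;
    stream.set_content(text);
    while (stream)
        tokens.push_back(stream.next());
    return tokens;
}
```

Calling `set_content` again resets the stream, so one tokenizer object can be reused across documents.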
Generated on Tue Mar 3 2015 23:20:16 for ModErn Text Analysis by Doxygen 1.8.9.1