ModErn Text Analysis
META Enumerates Textual Applications
ptb_normalizer.h
    1
    9 #ifndef META_PTB_NORMALIZER_H_
   10 #define META_PTB_NORMALIZER_H_
   11
   12 #include <deque>
   13 #include <memory>
   14 #include "analyzers/token_stream.h"
   15 #include "util/clonable.h"
   16
   17 namespace meta
   18 {
   19 namespace analyzers
   20 {
   21 namespace filters
   22 {
   23
   29 class ptb_normalizer : public util::clonable<token_stream, ptb_normalizer>
   30 {
   31   public:
   37     ptb_normalizer(std::unique_ptr<token_stream> source);
   38
   43     ptb_normalizer(const ptb_normalizer& other);
   44
   49     void set_content(const std::string& content) override;
   50
   54     std::string next() override;
   55
   59     operator bool() const override;
   60
   62     const static std::string id;
   63
   64   private:
   68     std::string current_token();
   69
   76     void parse_token(const std::string& token);
   77
   79     std::unique_ptr<token_stream> source_;
   80
   82     std::deque<std::string> tokens_;
   83 };
   84 }
   85 }
   86 }
   87 #endif
meta::analyzers::filters::ptb_normalizer::next
std::string next() override
Obtains the next token in the sequence.
Definition: ptb_normalizer.cpp:37
meta::analyzers::filters::ptb_normalizer::current_token
std::string current_token()
Definition: ptb_normalizer.cpp:90
meta::analyzers::filters::ptb_normalizer::source_
std::unique_ptr< token_stream > source_
The source to read tokens from.
Definition: ptb_normalizer.h:79
meta::analyzers::filters::ptb_normalizer::id
static const std::string id
Identifier for this filter.
Definition: ptb_normalizer.h:62
meta::analyzers::filters::ptb_normalizer
A filter that normalizes text to match Penn Treebank conventions.
Definition: ptb_normalizer.h:29
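To give a sense of what "Penn Treebank conventions" can mean in practice, here is a minimal sketch of one widely used convention: replacing bracket tokens with escape symbols such as -LRB- and -RRB-. The function name `ptb_escape_bracket` is hypothetical and is not part of MeTA; this page does not state which rules ptb_normalizer actually applies, so treat this as an illustration of the convention only.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not MeTA's API): maps a bracket token to its
// Penn Treebank escape symbol and returns any other token unchanged.
std::string ptb_escape_bracket(const std::string& token)
{
    static const std::unordered_map<std::string, std::string> escapes = {
        {"(", "-LRB-"}, {")", "-RRB-"},
        {"[", "-LSB-"}, {"]", "-RSB-"},
        {"{", "-LCB-"}, {"}", "-RCB-"},
    };
    auto it = escapes.find(token);
    return it == escapes.end() ? token : it->second;
}
```

Under these assumptions, a call such as `ptb_escape_bracket("(")` would yield the -LRB- symbol, while ordinary words pass through untouched.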
meta::analyzers::filters::ptb_normalizer::ptb_normalizer
ptb_normalizer(std::unique_ptr< token_stream > source)
Constructs a ptb_normalizer which reads tokens from the given source.
Definition: ptb_normalizer.cpp:19
meta::util::multilevel_clonable
Template class to facilitate polymorphic cloning.
Definition: clonable.h:28
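The polymorphic cloning mentioned here is commonly implemented with the curiously recurring template pattern (CRTP). The following is a minimal sketch of that general technique, not the contents of clonable.h: the names `stream_base`, `clonable_sketch`, and `labeled_stream` are hypothetical and chosen only for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical base interface (a stand-in, not MeTA's token_stream).
struct stream_base
{
    virtual ~stream_base() = default;
    virtual std::unique_ptr<stream_base> clone() const = 0;
    virtual std::string describe() const = 0;
};

// CRTP helper: implements clone() once by copy-constructing Derived,
// so each concrete class only needs a copy constructor.
template <class Base, class Derived>
struct clonable_sketch : Base
{
    std::unique_ptr<Base> clone() const override
    {
        return std::unique_ptr<Base>(
            new Derived(static_cast<const Derived&>(*this)));
    }
};

// A concrete class gets clone() for free by naming itself as Derived.
struct labeled_stream : clonable_sketch<stream_base, labeled_stream>
{
    explicit labeled_stream(std::string label) : label_(std::move(label))
    {
    }

    std::string describe() const override
    {
        return label_;
    }

    std::string label_;
};
```

With this arrangement, code holding only a `stream_base` pointer can duplicate the object without knowing its concrete type, which is presumably why ptb_normalizer declares a copy constructor alongside its clonable base.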
clonable.h
meta
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:24
meta::analyzers::filters::ptb_normalizer::parse_token
void parse_token(const std::string &token)
Performs token normalization, splitting, etc.
Definition: ptb_normalizer.cpp:97
meta::analyzers::filters::ptb_normalizer::set_content
void set_content(const std::string &content) override
Sets the content for the beginning of the filter chain.
Definition: ptb_normalizer.cpp:31
meta::analyzers::filters::ptb_normalizer::tokens_
std::deque< std::string > tokens_
Buffered tokens to return.
Definition: ptb_normalizer.h:82
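The members listed above (a wrapped source, a deque of buffered tokens, next(), operator bool(), and parse_token()) suggest a filter that may emit several tokens for each one it reads. Below is a minimal, self-contained sketch of that buffering pattern. The types `simple_stream`, `vector_stream`, and `splitting_filter` are hypothetical stand-ins rather than MeTA classes, and the single rule shown (splitting a trailing "n't" into its own token, as Penn Treebank tokenization does) is only an assumed example of what a parse step might do.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a token source (not MeTA's token_stream).
class simple_stream
{
  public:
    virtual ~simple_stream() = default;
    virtual std::string next() = 0;
    virtual operator bool() const = 0;
};

// Emits a fixed list of tokens, one per call to next().
class vector_stream : public simple_stream
{
  public:
    explicit vector_stream(std::vector<std::string> tokens)
        : tokens_(std::move(tokens))
    {
    }

    std::string next() override
    {
        return tokens_[pos_++];
    }

    operator bool() const override
    {
        return pos_ < tokens_.size();
    }

  private:
    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};

// Filter that may turn one input token into several output tokens,
// holding the extra pieces in a deque until they are requested.
class splitting_filter : public simple_stream
{
  public:
    explicit splitting_filter(std::unique_ptr<simple_stream> source)
        : source_(std::move(source))
    {
        assert(source_ != nullptr);
    }

    std::string next() override
    {
        // Refill the buffer from the source only when it is empty.
        if (tokens_.empty())
            parse_token(source_->next());
        std::string token = std::move(tokens_.front());
        tokens_.pop_front();
        return token;
    }

    // More output exists if anything is buffered or the source has more.
    operator bool() const override
    {
        return !tokens_.empty() || static_cast<bool>(*source_);
    }

  private:
    // Assumed example rule: split a trailing "n't" into its own token.
    void parse_token(const std::string& token)
    {
        const std::string suffix = "n't";
        if (token.size() > suffix.size()
            && token.compare(token.size() - suffix.size(), suffix.size(),
                             suffix) == 0)
        {
            tokens_.push_back(token.substr(0, token.size() - suffix.size()));
            tokens_.push_back(suffix);
        }
        else
        {
            tokens_.push_back(token);
        }
    }

    std::unique_ptr<simple_stream> source_;
    std::deque<std::string> tokens_;
};
```

A caller would typically drain such a filter with a loop of the form `while (filter) { auto token = filter.next(); }`. A deque fits this use because tokens are appended at the back and consumed from the front.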
token_stream.h
Generated on Tue Mar 3 2015 23:20:16 for ModErn Text Analysis by Doxygen 1.8.9.1