ModErn Text Analysis
META Enumerates Textual Applications
training_data.h
#ifndef META_PARSER_TRAINING_DATA_H_
#define META_PARSER_TRAINING_DATA_H_

#include "parser/sr_parser.h"

namespace meta
{
namespace parser
{

/// Training data for the parser.
class sr_parser::training_data
{
  public:
    training_data(std::vector<parse_tree>& trees,
                  std::default_random_engine::result_type seed);

    /// Preprocesses all of the training trees.
    transition_map preprocess();

    /// Shuffles the training data.
    void shuffle();

    size_t size() const;

    const parse_tree& tree(size_t idx) const;

    const std::vector<trans_id>& transitions(size_t idx) const;

  private:
    /// A reference to the collection of training trees.
    std::vector<parse_tree>& trees_;

    /// The gold standard transitions for each tree.
    std::vector<std::vector<trans_id>> all_transitions_;

    /// The vector of indices used for fast shuffling.
    std::vector<size_t> indices_;

    /// The random number generator used for shuffling.
    std::default_random_engine rng_;
};
}
}
#endif
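The class keeps a vector of indices alongside a reference to the trees, and the member documentation describes those indices as being "used for fast shuffling". A minimal sketch of that idea follows, assuming that shuffling permutes only the index vector and that lookups go through it. The class name shuffled_view, the member items_, and the accessor item() are hypothetical stand-ins, and std::string replaces parse_tree so the example compiles on its own. It is not the toolkit's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the class above: strings replace parse_tree.
class shuffled_view
{
  public:
    shuffled_view(std::vector<std::string>& items,
                  std::default_random_engine::result_type seed)
        : items_(items), indices_(items.size()), rng_(seed)
    {
        // Start with the identity ordering 0, 1, ..., n-1.
        std::iota(indices_.begin(), indices_.end(), std::size_t{0});
    }

    // Permute only the index vector; items_ itself is never reordered.
    void shuffle()
    {
        std::shuffle(indices_.begin(), indices_.end(), rng_);
    }

    std::size_t size() const
    {
        return indices_.size();
    }

    // Look up the idx-th element of the current ordering.
    const std::string& item(std::size_t idx) const
    {
        assert(idx < indices_.size());
        return items_[indices_[idx]];
    }

  private:
    std::vector<std::string>& items_;  // referenced, not owned or copied
    std::vector<std::size_t> indices_; // current visiting order
    std::default_random_engine rng_;   // seeded once at construction
};
```

Because only std::size_t values are swapped, the cost of a shuffle does not depend on how large each stored element is, and any data held in a parallel container (such as per-tree transitions) stays aligned with its element when both are looked up through the same index.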
size_t size() const
Definition: training_data.cpp:69
parse_tree
Represents the parse tree for a sentence.
Definition: parse_tree.h:32
const parse_tree & tree(size_t idx) const
Definition: training_data.cpp:74
std::vector< std::vector< trans_id > > all_transitions_
The gold standard transitions for each tree.
Definition: training_data.h:79
std::vector< parse_tree > & trees_
A reference to the collection of training trees.
Definition: training_data.h:74
void shuffle()
Shuffles the training data.
Definition: training_data.cpp:64
training_data(std::vector< parse_tree > &trees, std::default_random_engine::result_type seed)
Definition: training_data.cpp:23
training_data
Training data for the parser.
Definition: training_data.h:22
meta
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:24
const std::vector< trans_id > & transitions(size_t idx) const
Definition: training_data.cpp:79
transition_map
An invertible map that maps transitions to ids.
Definition: transition_map.h:23
transition_map preprocess()
Preprocesses all of the training trees.
Definition: training_data.cpp:30
std::vector< size_t > indices_
The vector of indices used for fast shuffling.
Definition: training_data.h:84
std::default_random_engine rng_
The random number generator used for shuffling.
Definition: training_data.h:89
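The reference entries describe transition_map as "an invertible map that maps transitions to ids". One common way to build such a structure is to pair a hash map (key to id) with a vector (id to key). The sketch below illustrates that general technique only. The class invertible_map and its members insert(), id_of(), and key_of() are hypothetical, strings stand in for transitions, and nothing here is taken from transition_map.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: strings stand in for parser transitions.
class invertible_map
{
  public:
    using id_type = std::size_t;

    // Return the id for key, assigning the next free id on first sight.
    id_type insert(const std::string& key)
    {
        auto it = to_id_.find(key);
        if (it != to_id_.end())
            return it->second;
        id_type id = to_key_.size();
        to_id_.emplace(key, id);
        to_key_.push_back(key);
        return id;
    }

    // Forward lookup: key -> id. Throws std::out_of_range if absent.
    id_type id_of(const std::string& key) const
    {
        return to_id_.at(key);
    }

    // Inverse lookup: id -> key.
    const std::string& key_of(id_type id) const
    {
        assert(id < to_key_.size());
        return to_key_[id];
    }

    std::size_t size() const
    {
        return to_key_.size();
    }

  private:
    std::unordered_map<std::string, id_type> to_id_; // key -> id
    std::vector<std::string> to_key_;                // id -> key
};
```

Ids are assigned densely from zero in insertion order, so the inverse direction is a plain vector index, and inserting a key that is already present returns its existing id rather than creating a duplicate.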