ModErn Text Analysis
META Enumerates Textual Applications
meta::parser::sr_parser Class Reference

A shift-reduce constituency parser.

#include <sr_parser.h>

Classes

class  exception
 Exception thrown during parser actions.

class  state_analyzer
 Analyzer responsible for converting a parser state to a feature_vector.

struct  training_batch
 A training batch.

class  training_data
 Training data for the parser.

struct  training_options
 Training options required for learning a parser model.
 

Public Types

enum  training_algorithm { EARLY_TERMINATION, BEAM_SEARCH }
 The set of training algorithms available for the parser.
 
using feature_vector = std::unordered_map< std::string, float >
 Sparse vector representation of a state's features.
 
using weight_vector = util::sparse_vector< trans_id, float >
 A single weight vector, indexed by transition id.
 
using weight_vectors = std::unordered_map< std::string, weight_vector >
 A collection of weight vectors by feature type.
 

Public Member Functions

 sr_parser ()=default
 Default constructor.

 sr_parser (const std::string &prefix)
 Loads a pre-trained parser from a prefix.

parse_tree parse (const sequence::sequence &sentence) const
 Parses a POS-tagged sentence (represented as a sequence::sequence).

void train (std::vector< parse_tree > &trees, training_options options)
 Trains a model on the given parse trees using the supplied training options.

void save (const std::string &prefix) const
 Saves the parser model to the given prefix.
 

Private Types

using scored_trans = std::pair< trans_id, float >
 

Private Member Functions

void load (const std::string &prefix)
 Loads the parser model from the given prefix.

std::tuple< weight_vectors, uint64_t, uint64_t > train_batch (training_batch batch, parallel::thread_pool &pool, const training_options &options)
 Calculates a weight update on a given batch of training trees.

std::pair< uint64_t, uint64_t > train_instance (const parse_tree &tree, const std::vector< trans_id > &transitions, const training_options &options, weight_vectors &update) const
 Calculates a weight update on a single tree.

std::pair< uint64_t, uint64_t > train_early_termination (const parse_tree &tree, const std::vector< trans_id > &transitions, weight_vectors &update) const
 Calculates a weight update on a single tree, using the greedy early termination training strategy.

std::pair< uint64_t, uint64_t > train_beam_search (const parse_tree &tree, const std::vector< trans_id > &transitions, const training_options &options, weight_vectors &update) const
 Calculates a weight update on a single tree, using beam search.

trans_id best_transition (const feature_vector &features, const state &state, bool check_legality=false) const
 Computes the most likely transition according to the current model.

std::vector< scored_trans > best_transitions (const feature_vector &features, const state &state, size_t num, bool check_legality=false) const
 Computes the \(k\) most likely transitions according to the current model.
 

Private Attributes

transition_map trans_
 Storage for the ids for each transition.
 
classify::linear_model< std::string, float, trans_id > model_
 Storage for the weights for each possible transition.
 
uint64_t beam_size_ = 1
 Beam size used during training.
 

Detailed Description

A shift-reduce constituency parser.

The model is a simple linear classifier that predicts the next parser action from the current parser state. It is trained with the generalized averaged perceptron algorithm.

See also
http://people.sutd.edu.sg/~yue_zhang/pub/acl13.muhua.pdf
http://www.aclweb.org/anthology/W09-3825
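
As a rough illustration of the model described above, the following sketch scores transitions with a sparse linear model and applies a plain perceptron update when the prediction is wrong. It is not MeTA's implementation: the type aliases only mimic those listed in this reference (util::sparse_vector is replaced here with std::unordered_map), the names score_all, predict, and perceptron_update are hypothetical, and the weight averaging of the averaged perceptron is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the types named in this reference.
using trans_id = std::uint16_t;
using feature_vector = std::unordered_map<std::string, float>;
using weight_vector = std::unordered_map<trans_id, float>;
using weight_vectors = std::unordered_map<std::string, weight_vector>;

// Score of each transition: sum over active features of value * weight.
std::unordered_map<trans_id, float> score_all(const weight_vectors& model,
                                              const feature_vector& features)
{
    std::unordered_map<trans_id, float> scores;
    for (const auto& feat : features)
    {
        auto it = model.find(feat.first);
        if (it == model.end())
            continue; // feature never seen: contributes nothing
        for (const auto& tw : it->second)
            scores[tw.first] += feat.second * tw.second;
    }
    return scores;
}

// Highest-scoring transition among ids 0..num_transitions-1.
// Ties are broken in favor of the lowest id.
trans_id predict(const weight_vectors& model, const feature_vector& features,
                 trans_id num_transitions)
{
    assert(num_transitions > 0);
    auto scores = score_all(model, features);
    trans_id best = 0;
    float best_score = scores[0]; // operator[] yields 0 for unseen ids
    for (trans_id t = 1; t < num_transitions; ++t)
    {
        float s = scores[t];
        if (s > best_score)
        {
            best = t;
            best_score = s;
        }
    }
    return best;
}

// Plain (non-averaged) perceptron update.
// Returns true when the prediction already matched the gold transition.
bool perceptron_update(weight_vectors& model, const feature_vector& features,
                       trans_id gold, trans_id num_transitions)
{
    trans_id guess = predict(model, features, num_transitions);
    if (guess == gold)
        return true;
    for (const auto& feat : features)
    {
        model[feat.first][gold] += feat.second;  // reward the gold action
        model[feat.first][guess] -= feat.second; // penalize the wrong guess
    }
    return false;
}
```

Calling perceptron_update repeatedly on (features, gold transition) pairs moves weight toward the gold action for the features that fired. An averaged variant would additionally keep a running sum of the weights and use their mean at prediction time.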

Constructor & Destructor Documentation

meta::parser::sr_parser::sr_parser ( const std::string &  prefix)

Loads a pre-trained parser from a prefix.

Parameters
prefix  The prefix to load the parser model from

Member Function Documentation

parse_tree meta::parser::sr_parser::parse ( const sequence::sequence &  sentence) const

Parses a POS-tagged sentence (represented as a sequence::sequence).

Parameters
sentence  The POS-tagged sentence to be parsed
Returns
the parse tree corresponding to the input sentence

void meta::parser::sr_parser::train ( std::vector< parse_tree > &  trees,
training_options  options
)

Trains a model on the given parse trees using the supplied training options.

Parameters
trees  The full parse trees for training
options  The options used for training

void meta::parser::sr_parser::save ( const std::string &  prefix) const

Saves the parser model to the given prefix.

Parameters
prefix  The prefix to store the model in

void meta::parser::sr_parser::load ( const std::string &  prefix)
private

Loads the parser model from the given prefix.

Parameters
prefix  The prefix to load the model from

auto meta::parser::sr_parser::train_batch ( training_batch  batch,
parallel::thread_pool &  pool,
const training_options &  options
)
private

Calculates a weight update on a given batch of training trees.

Parameters
batch  The batch to learn on
pool  The thread pool to use for parsing the batch in parallel
options  The training options
Returns
a 3-tuple (update, correct actions, incorrect actions)

std::pair< uint64_t, uint64_t > meta::parser::sr_parser::train_instance ( const parse_tree &  tree,
const std::vector< trans_id > &  transitions,
const training_options &  options,
weight_vectors &  update
) const
private

Calculates a weight update on a single tree.

Parameters
tree  The training tree
transitions  The correct transitions for parsing this tree
options  The training options
update  The weight vector to place the update in
Returns
(correct actions, incorrect actions)

std::pair< uint64_t, uint64_t > meta::parser::sr_parser::train_early_termination ( const parse_tree &  tree,
const std::vector< trans_id > &  transitions,
weight_vectors &  update
) const
private

Calculates a weight update on a single tree, using the greedy early termination training strategy.

Parameters
tree  The training tree
transitions  The correct transitions for parsing this tree
update  The weight vector to place the update in
Returns
(correct actions, incorrect actions)
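
One way to picture a greedy early termination update is sketched below: follow the gold transitions, predict at each step, and stop at the first mistake after recording a perceptron-style update. This is an illustration under stated assumptions, not the library's code. In particular, gold_features (the precomputed features of each state along the gold derivation), greedy_guess, and early_termination_sketch are hypothetical names, and the real parser derives features from its state objects rather than taking them as input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the types named in this reference.
using trans_id = std::uint16_t;
using feature_vector = std::unordered_map<std::string, float>;
using weight_vector = std::unordered_map<trans_id, float>;
using weight_vectors = std::unordered_map<std::string, weight_vector>;

// Hypothetical helper: argmax over transition ids 0..num_transitions-1.
trans_id greedy_guess(const weight_vectors& model,
                      const feature_vector& features,
                      trans_id num_transitions)
{
    std::vector<float> scores(num_transitions, 0.0f);
    for (const auto& feat : features)
    {
        auto it = model.find(feat.first);
        if (it == model.end())
            continue;
        for (const auto& tw : it->second)
            if (tw.first < num_transitions)
                scores[tw.first] += feat.second * tw.second;
    }
    trans_id best = 0;
    for (trans_id t = 1; t < num_transitions; ++t)
        if (scores[t] > scores[best])
            best = t;
    return best;
}

// gold_features[i] holds the features of the state reached after applying
// gold transitions 0..i-1 (an assumption made to keep the sketch small).
// Returns (correct actions, incorrect actions).
std::pair<std::uint64_t, std::uint64_t> early_termination_sketch(
    const weight_vectors& model,
    const std::vector<feature_vector>& gold_features,
    const std::vector<trans_id>& gold, trans_id num_transitions,
    weight_vectors& update)
{
    assert(gold_features.size() == gold.size());
    assert(num_transitions > 0);
    std::uint64_t correct = 0;
    for (std::size_t i = 0; i < gold.size(); ++i)
    {
        trans_id guess = greedy_guess(model, gold_features[i], num_transitions);
        if (guess == gold[i])
        {
            ++correct;
            continue;
        }
        // First mistake: record the update and stop processing this tree.
        for (const auto& feat : gold_features[i])
        {
            update[feat.first][gold[i]] += feat.second;
            update[feat.first][guess] -= feat.second;
        }
        return std::make_pair(correct, std::uint64_t{1});
    }
    return std::make_pair(correct, std::uint64_t{0});
}
```

Stopping at the first error means no update is ever computed from states the gold derivation would not have reached.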

std::pair< uint64_t, uint64_t > meta::parser::sr_parser::train_beam_search ( const parse_tree &  tree,
const std::vector< trans_id > &  transitions,
const training_options &  options,
weight_vectors &  update
) const
private

Calculates a weight update on a single tree, using beam search.

Parameters
tree  The training tree
transitions  The correct transitions for parsing this tree
options  The training options
update  The weight vector to place the update in
Returns
(correct actions, incorrect actions)
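
The core of a beam search step can be sketched as follows: expand every hypothesis on the beam with its scored candidate transitions, keep the highest-scoring beam_size results, and check whether the gold prefix survived. This is a generic sketch, not the parser's actual routine. The hypothesis struct, beam_step, and contains_gold are hypothetical, and a real implementation would carry parser states rather than bare transition histories.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using trans_id = std::uint16_t;
using scored_trans = std::pair<trans_id, float>;

// Hypothetical beam entry: cumulative score plus the transitions taken so far.
struct hypothesis
{
    float score = 0.0f;
    std::vector<trans_id> history;
};

// expansions[i] lists the scored candidate transitions for beam[i].
// Returns the best beam_size successors, highest score first.
std::vector<hypothesis> beam_step(
    const std::vector<hypothesis>& beam,
    const std::vector<std::vector<scored_trans>>& expansions,
    std::size_t beam_size)
{
    assert(beam.size() == expansions.size());
    std::vector<hypothesis> next;
    for (std::size_t i = 0; i < beam.size(); ++i)
    {
        for (const auto& st : expansions[i])
        {
            hypothesis h = beam[i];
            h.score += st.second;
            h.history.push_back(st.first);
            next.push_back(std::move(h));
        }
    }
    // stable_sort keeps insertion order among equal scores.
    std::stable_sort(next.begin(), next.end(),
                     [](const hypothesis& a, const hypothesis& b) {
                         return a.score > b.score;
                     });
    if (next.size() > beam_size)
        next.resize(beam_size);
    return next;
}

// True if some hypothesis on the beam followed exactly the gold prefix.
bool contains_gold(const std::vector<hypothesis>& beam,
                   const std::vector<trans_id>& gold_prefix)
{
    return std::any_of(beam.begin(), beam.end(),
                       [&](const hypothesis& h) {
                           return h.history == gold_prefix;
                       });
}
```

A trainer built on this could stop and apply an update as soon as contains_gold returns false, which is the beam analogue of the greedy early termination strategy.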

auto meta::parser::sr_parser::best_transition ( const feature_vector &  features,
const state &  state,
bool  check_legality = false
) const
private

Computes the most likely transition according to the current model.

Parameters
features  The feature vector representation for the current state
state  The current state
check_legality  Whether or not to limit the transitions to only those that are "legal" according to the constraints given for each transition

auto meta::parser::sr_parser::best_transitions ( const feature_vector &  features,
const state &  state,
size_t  num,
bool  check_legality = false
) const
private

Computes the \(k\) most likely transitions according to the current model.

Parameters
features  The feature vector representation for the current state
state  The current state
num  The number of transitions to return (the \(k\) above)
check_legality  Whether or not to limit the transitions to only those that are "legal" according to the constraints given for each transition
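
Selecting the \(k\) best transitions from a list of scored candidates can be sketched with std::partial_sort, optionally filtering out illegal transitions first. The function names top_k_transitions and top_k_legal and the is_legal callback are hypothetical. The real method works from a feature vector and a parser state, which this sketch replaces with an already-scored candidate list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using trans_id = std::uint16_t;
using scored_trans = std::pair<trans_id, float>;

// Returns the num highest-scoring entries, best first.
// Ties are broken by the smaller transition id.
std::vector<scored_trans> top_k_transitions(std::vector<scored_trans> scored,
                                            std::size_t num)
{
    num = std::min(num, scored.size());
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(num),
                      scored.end(),
                      [](const scored_trans& a, const scored_trans& b) {
                          if (a.second != b.second)
                              return a.second > b.second;
                          return a.first < b.first;
                      });
    scored.resize(num);
    return scored;
}

// Same as above, but first drops every transition the (hypothetical)
// is_legal callback rejects, mirroring the check_legality flag.
template <class LegalFn>
std::vector<scored_trans> top_k_legal(std::vector<scored_trans> scored,
                                      std::size_t num, LegalFn is_legal)
{
    scored.erase(std::remove_if(scored.begin(), scored.end(),
                                [&](const scored_trans& st) {
                                    return !is_legal(st.first);
                                }),
                 scored.end());
    return top_k_transitions(std::move(scored), num);
}
```

partial_sort only orders the first num elements, so for a small \(k\) it avoids fully sorting the candidate list.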

The documentation for this class was generated from the following files: