Linear-chain conditional random field for POS tagging and chunking applications.

#include <crf.h>

Classes
struct	parameters
	Wrapper to represent the parameters used during learning.

class	scorer
	Internal class that holds scoring information for sequences under the current model.

class	tagger

class	viterbi_scorer
	Scorer for performing Viterbi-based tagging.

Public Member Functions
	crf (const std::string &prefix)
	Constructs a new CRF, storing model parameters in the given prefix.

double	train (parameters params, const std::vector< sequence > &examples)
	Trains a new CRF model on the given examples.

tagger	make_tagger () const
	Constructs a new tagging interface that references the current model.

uint64_t	num_labels () const

Private Types
using	double_matrix = util::dense_matrix< double >
	A dense_matrix of doubles, used frequently in training and testing for holding score information under the model.

using	feature_range = util::basic_range< crf_feature_id >
	A range representing a set of feature functions (ids).

Private Member Functions
void	initialize (const std::vector< sequence > &examples)
	Initializes the CRF model based on the set of training examples.

void	load_model ()
	Loads the CRF model from the files stored on disk.

void	reset ()
	Completely resets the model weights.

double	calibrate (parameters params, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples)
	Determines a good initial setting for the learning rate.

const double &	obs_weight (crf_feature_id idx) const

double &	obs_weight (crf_feature_id idx)

const double &	trans_weight (crf_feature_id idx) const

double &	trans_weight (crf_feature_id idx)

feature_range	obs_range (feature_id fid) const

feature_range	trans_range (label_id lbl) const

label_id	observation (crf_feature_id idx) const

label_id	transition (crf_feature_id idx) const

double	epoch (parameters params, printing::progress &progress, uint64_t iter, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples, scorer &scorer)
	Performs a single epoch of training.

double	iteration (parameters params, uint64_t iter, const sequence &seq, scorer &scorer)
	Performs a single iteration within a training epoch.

void	gradient_observation_expectation (const sequence &seq, double gain)
	Updates the model parameters based on the observation expectation part of the gradient.

void	gradient_model_expectation (const sequence &seq, double gain, const scorer &scr)
	Updates the model parameters based on the model expectation part of the gradient.

double	l2norm () const

void	rescale ()
	Multiplies every weight by the current scale parameter, then resets the scale parameter to 1.

Private Attributes
friend	scorer

util::optional< util::disk_vector< crf_feature_id > >	observation_ranges_
	Represents the feature id range for a given observation: `observation_ranges_[i]` gives the start of a range of crf_feature_ids (indexing into the `observation_weights_`) that have fired for feature_id `i`, and `observation_ranges_[i + 1]` gives the end of the range.

util::optional< util::disk_vector< crf_feature_id > >	transition_ranges_
	Analogous to the observation range, but for transitions.

util::optional< util::disk_vector< label_id > >	observations_
	Represents the state that fired for a given observation feature.

util::optional< util::disk_vector< label_id > >	transitions_
	Represents the destination label for a given transition feature.

util::optional< util::disk_vector< double > >	observation_weights_
	The weights for all of the node-observation features.

util::optional< util::disk_vector< double > >	transition_weights_
	Weights for all of the transition features.

double	scale_
	The current decay factor applied to all of the weights.

uint64_t	num_labels_
	The number of allowed labels.

const std::string &	prefix_
	The prefix (folder) where model files are to be stored.

Detailed Description

Linear-chain conditional random field for POS tagging and chunking applications.

Trained using L2-regularized stochastic gradient descent.

This CRF implementation attaches observation features to nodes only: feature templates take the form \(f(o_t, s_t)\) or \(f(s_{t-1}, s_t)\), so no template combines an observation with a transition. This is done for memory efficiency and to avoid overfitting.
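For reference, a linear-chain CRF restricted to these two template types is commonly written as shown below. The symbols \(w_k\), \(w_j\) (feature weights), \(T\) (sequence length), and \(Z(o)\) (the normalizer over all label sequences) are notation assumed here for illustration; they are not names taken from the source.

```latex
\[
p(s \mid o) = \frac{1}{Z(o)} \exp\left(
    \sum_{t=1}^{T} \sum_{k} w_k \, f_k(o_t, s_t)
  + \sum_{t=2}^{T} \sum_{j} w_j \, f_j(s_{t-1}, s_t)
\right)
\]
```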

See also: http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf

Constructor & Destructor Documentation

meta::sequence::crf::crf ( const std::string & prefix )

Constructs a new CRF, storing model parameters in the given prefix.

If a CRF model already exists in the given prefix, it will be loaded; otherwise, the directory will be created.

Parameters

prefix The prefix (folder) to load/store model files

Member Function Documentation

double meta::sequence::crf::train	(	parameters	params,
		const std::vector< sequence > &	examples
	)

Trains a new CRF model on the given examples.

The examples are assumed to have been run through a sequence_analyzer first to generate features for every observation in every sequence.

Parameters

params	The parameters for the learning algorithm
examples	The labeled training examples

Returns: the loss for the last epoch during training

auto meta::sequence::crf::make_tagger ( ) const

Constructs a new tagging interface that references the current model.

Returns: a new tagging interface for this model
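The class list mentions a viterbi_scorer used for tagging. As a rough illustration of Viterbi decoding in general, the sketch below finds the best label sequence from dense score tables. The `matrix` alias, the `viterbi` function, and the dense tables are hypothetical simplifications; the actual tagger works on the sparse weight vectors documented further down.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense score tables (not the library's storage format).
using matrix = std::vector<std::vector<double>>;

// state[t][y]: score of label y at position t.
// trans[a][b]: score of moving from label a to label b.
// Returns the label sequence with the highest total score.
std::vector<std::size_t> viterbi(const matrix& state, const matrix& trans)
{
    assert(!state.empty() && !state[0].empty());
    const std::size_t T = state.size();
    const std::size_t L = state[0].size();
    assert(trans.size() == L);

    matrix best(T, std::vector<double>(L, 0.0));
    std::vector<std::vector<std::size_t>> back(
        T, std::vector<std::size_t>(L, 0));
    best[0] = state[0];

    for (std::size_t t = 1; t < T; ++t)
    {
        for (std::size_t cur = 0; cur < L; ++cur)
        {
            // Find the best previous label for ending in `cur` at time t.
            std::size_t arg = 0;
            double best_score = best[t - 1][0] + trans[0][cur];
            for (std::size_t prev = 1; prev < L; ++prev)
            {
                double cand = best[t - 1][prev] + trans[prev][cur];
                if (cand > best_score)
                {
                    best_score = cand;
                    arg = prev;
                }
            }
            best[t][cur] = best_score + state[t][cur];
            back[t][cur] = arg;
        }
    }

    // Pick the best final label, then follow the back-pointers.
    std::size_t last = 0;
    for (std::size_t y = 1; y < L; ++y)
        if (best[T - 1][y] > best[T - 1][last])
            last = y;

    std::vector<std::size_t> path(T);
    path[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t)
        path[t - 1] = back[t][path[t]];
    return path;
}
```

With strongly negative off-diagonal transition scores the decoder prefers to stay in one label even when a single position favors another; with all-zero transition scores it reduces to picking the best label at each position independently.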

uint64_t meta::sequence::crf::num_labels ( ) const

Returns: the number of labels possible under this model.

void meta::sequence::crf::initialize ( const std::vector< sequence > & examples )

private

Initializes the CRF model based on the set of training examples.

This function runs the "feature generation" portion of the training, where we try to find all state-observation and transition-observation functions that are active in the training data.

Parameters

examples The training examples
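A feature-generation pass of this kind might look roughly like the sketch below, which collects every distinct (observation feature, label) pair seen in the data and lays the pairs out as a range vector plus a parallel label vector. The `token` and `layout` types and the `build_layout` function are hypothetical. The sketch uses plain `std::vector` instead of `util::disk_vector` and appends a sentinel entry to `ranges` for simplicity, which may differ from the stored format.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for one training position: the observation
// feature ids that fired there, and the gold label.
struct token
{
    std::vector<std::uint64_t> features;
    std::uint64_t label;
};

struct layout
{
    // ranges[f] .. ranges[f + 1] index into `labels` for feature f.
    std::vector<std::uint64_t> ranges;
    // Label paired with each generated (feature, label) function.
    std::vector<std::uint64_t> labels;
};

layout build_layout(const std::vector<token>& tokens,
                    std::uint64_t num_features)
{
    // Collect each distinct (feature, label) pair seen in the data.
    // std::set keeps the pairs sorted by feature, then by label.
    std::set<std::pair<std::uint64_t, std::uint64_t>> active;
    for (const auto& tok : tokens)
    {
        for (auto f : tok.features)
        {
            assert(f < num_features);
            active.emplace(f, tok.label);
        }
    }

    layout out;
    out.ranges.assign(num_features + 1, 0);
    for (const auto& p : active)
    {
        ++out.ranges[p.first + 1]; // count the pairs for each feature
        out.labels.push_back(p.second);
    }
    // A prefix sum turns the per-feature counts into start offsets.
    for (std::uint64_t f = 0; f < num_features; ++f)
        out.ranges[f + 1] += out.ranges[f];
    return out;
}
```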

double meta::sequence::crf::calibrate	(	parameters	params,
		const std::vector< uint64_t > &	indices,
		const std::vector< sequence > &	examples
	)

private

Determines a good initial setting for the learning rate.

Based on Leon Bottou's SGD implementation.

Parameters

params	The parameters for learning
indices	The vector of shuffled indices for the random sampling
examples	The (unshuffled) training examples

Returns: The optimal t0 found by calibration, which determines the initial learning rate \(\eta\).
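The link between t0 and \(\eta\) can be sketched as follows, assuming the schedule \(\eta_t = 1 / (\lambda (t_0 + t))\) found in Bottou's SGD code. Whether this class uses exactly that schedule is not stated on this page, and the names `learning_rate`, `t0_from_eta`, and `lambda` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed schedule: eta_t = 1 / (lambda * (t0 + t)).
double learning_rate(double lambda, double t0, std::uint64_t t)
{
    assert(lambda > 0.0 && t0 > 0.0);
    return 1.0 / (lambda * (t0 + static_cast<double>(t)));
}

// Inverts the schedule at t = 0: given the initial rate eta0 chosen by
// a calibration pass, returns the t0 that reproduces it.
double t0_from_eta(double lambda, double eta0)
{
    assert(lambda > 0.0 && eta0 > 0.0);
    return 1.0 / (lambda * eta0);
}
```

Under this assumption, a larger t0 means a smaller initial learning rate and a slower decay.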

const double & meta::sequence::crf::obs_weight ( crf_feature_id idx ) const

private

Parameters

idx	The internal crf model feature id

Returns: a const reference to the weight associated with this feature

double & meta::sequence::crf::obs_weight ( crf_feature_id idx )

private

Parameters

idx	The internal crf model feature id

Returns: a reference to the weight associated with this feature

const double & meta::sequence::crf::trans_weight ( crf_feature_id idx ) const

private

Parameters

idx	The internal crf model feature id

Returns: a const reference to the weight associated with this feature

double & meta::sequence::crf::trans_weight ( crf_feature_id idx )

private

Parameters

idx	The internal crf model feature id

Returns: a reference to the weight associated with this feature

auto meta::sequence::crf::obs_range ( feature_id fid ) const

private

Parameters

fid	The external observation feature id

Returns: a range of internal crf model feature ids for state features that are active for this observation

auto meta::sequence::crf::trans_range ( label_id lbl ) const

private

Parameters

lbl The label

Returns: a range of internal crf model feature ids for transitions that are active for this state

label_id meta::sequence::crf::observation ( crf_feature_id idx ) const

private

Parameters

idx	The internal crf model feature id

Returns: the label associated with this state-based feature id

label_id meta::sequence::crf::transition ( crf_feature_id idx ) const

private

Parameters

idx	The internal crf model feature id

Returns: the destination label associated with this transition feature id

double meta::sequence::crf::epoch	(	parameters	params,
		printing::progress &	progress,
		uint64_t	iter,
		const std::vector< uint64_t > &	indices,
		const std::vector< sequence > &	examples,
		scorer &	scorer
	)

private

Performs a single epoch of training.

Parameters

params	The learning parameters
progress	The progress logger to use
iter	The current epoch
indices	The shuffled indices for the random sampling
examples	The (not shuffled) training examples
scorer	The scorer to re-use

Returns: the loss for this training epoch

double meta::sequence::crf::iteration	(	parameters	params,
		uint64_t	iter,
		const sequence &	seq,
		scorer &	scorer
	)

private

Performs a single iteration within a training epoch.

Parameters

params	The learning parameters
iter	The current number of total iterations ( \(t\))
seq	The sequence to use to update model parameters
scorer	The scorer to re-use

Returns: the loss associated with this single iteration within the epoch
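A common way to make L2-regularized SGD cheap is to store weights relative to a shared scale factor, so that weight decay touches one number instead of every weight. The `scale_` attribute, the `gain` parameters, and `rescale()` on this page suggest a scheme of this kind, but the sketch below illustrates the general technique under that assumption and is not a copy of this class's code. The `scaled_weights` type and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: the effective weight i is scale * raw[i].
struct scaled_weights
{
    std::vector<double> raw;
    double scale = 1.0;

    explicit scaled_weights(std::size_t n) : raw(n, 0.0) {}

    // Applies L2 decay (1 - lr * lambda) to every weight in O(1).
    void decay(double lr, double lambda)
    {
        scale *= (1.0 - lr * lambda);
        assert(scale > 0.0);
    }

    // Adds lr * grad to effective weight i. Dividing by the scale
    // yields the "gain" that is applied to the raw stored value.
    void update(std::size_t i, double lr, double grad)
    {
        double gain = lr / scale;
        raw[i] += gain * grad;
    }

    double value(std::size_t i) const { return scale * raw[i]; }

    // Folds the scale into the stored weights and resets it to 1,
    // which keeps the scale from drifting toward zero.
    void rescale()
    {
        for (auto& w : raw)
            w *= scale;
        scale = 1.0;
    }
};
```

Rescaling leaves every effective weight unchanged; it only moves the factor from `scale` into the stored values.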

void meta::sequence::crf::gradient_observation_expectation	(	const sequence &	seq,
		double	gain
	)

private

Updates the model parameters based on the observation expectation part of the gradient.

Parameters

seq	The sequence to use
gain	The amount to scale the weight updates by

void meta::sequence::crf::gradient_model_expectation	(	const sequence &	seq,
		double	gain,
		const scorer &	scr
	)

private

Updates the model parameters based on the model expectation part of the gradient.

Parameters

seq	The sequence to use
gain	The amount to scale the weight updates by
scr	The scorer to re-use for computing the marginal probabilities

double meta::sequence::crf::l2norm ( ) const

private

Returns: the current squared l2 norm of the weights ( \(w^T w\))
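Because \(w^T w\) is a sum of squared weights, a shared scale factor enters the result squared. The sketch below shows that computation over two weight vectors; the function name and the assumption that stored weights are multiplied by a single `scale` are illustrative only.

```cpp
#include <cassert>
#include <vector>

// Computes w^T w where each effective weight is scale * stored value.
double squared_l2norm(const std::vector<double>& obs,
                      const std::vector<double>& trans, double scale)
{
    assert(scale > 0.0);
    double norm = 0.0;
    for (double w : obs)
        norm += w * w;
    for (double w : trans)
        norm += w * w;
    // (scale * w)^2 = scale^2 * w^2, so the scale is applied once.
    return norm * scale * scale;
}
```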

Member Data Documentation

util::optional<util::disk_vector<crf_feature_id> > meta::sequence::crf::observation_ranges_

private

Represents the feature id range for a given observation: observation_ranges_[i] gives the start of a range of crf_feature_ids (indexing into the observation_weights_) that have fired for feature_id i, and observation_ranges_[i + 1] gives the end of the range.

(If i is the last feature_id, then the size of observation_weights_ gives the end of the range.)
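The lookup described here can be sketched as a small helper that returns a half-open index range and falls back to the number of weights for the last entry. The `range_for` function and the use of `std::vector` and `std::pair` are hypothetical stand-ins for the library's own range and disk-backed vector types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns the half-open range [begin, end) of weight indices for
// entry i. There is no sentinel: the last entry's range ends at
// num_weights.
std::pair<std::uint64_t, std::uint64_t>
range_for(const std::vector<std::uint64_t>& ranges,
          std::uint64_t num_weights, std::size_t i)
{
    assert(i < ranges.size());
    std::uint64_t begin = ranges[i];
    std::uint64_t end
        = (i + 1 < ranges.size()) ? ranges[i + 1] : num_weights;
    return {begin, end};
}
```

An entry whose start equals the next entry's start has an empty range, meaning no feature function fired for it.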

util::optional<util::disk_vector<crf_feature_id> > meta::sequence::crf::transition_ranges_

private

Analogous to the observation range, but for transitions.

transition_ranges_[i] gives the start of a range of crf_feature_ids (indexing into transition_weights_) that have fired for label_id i, and transition_ranges_[i + 1] gives the end of the range. (If i is the last label_id, then the size of transition_weights_ gives the end of the range.)

util::optional<util::disk_vector<label_id> > meta::sequence::crf::observations_

private

Represents the state that fired for a given observation feature.

This is a parallel vector with observation_weights_, where observations_[f] gives the label_id for the observation feature f.
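To illustrate how two parallel vectors of this kind work together, the sketch below walks a range of feature ids and adds each weight to the score of the label stored at the same index. The function `add_state_scores` and the dense `scores` vector are hypothetical; this is not the library's scorer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For every feature id f in [begin, end), adds weights[f] to the
// score of the label stored at labels[f].
void add_state_scores(const std::vector<std::size_t>& labels,
                      const std::vector<double>& weights,
                      std::size_t begin, std::size_t end,
                      std::vector<double>& scores)
{
    assert(labels.size() == weights.size());
    assert(begin <= end && end <= weights.size());
    for (std::size_t f = begin; f < end; ++f)
    {
        assert(labels[f] < scores.size());
        scores[labels[f]] += weights[f];
    }
}
```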

util::optional<util::disk_vector<label_id> > meta::sequence::crf::transitions_

private

Represents the destination label for a given transition feature.

This is a parallel vector with transition_weights_, where transitions_[f] gives the destination for transition feature f.

util::optional<util::disk_vector<double> > meta::sequence::crf::observation_weights_

private

The weights for all of the node-observation features.

Indexes must be taken from the observation_ranges_ vector.

util::optional<util::disk_vector<double> > meta::sequence::crf::transition_weights_

private

Weights for all of the transition features.

Indexes must be taken from the transition_ranges_ vector.

The documentation for this class was generated from the following files:

/home/chase/projects/meta/include/sequence/crf/crf.h
/home/chase/projects/meta/src/sequence/crf/crf.cpp
/home/chase/projects/meta/src/sequence/crf/tagger.cpp
