Linear-chain conditional random field for POS tagging and chunking applications.
#include <crf.h>
| crf (const std::string &prefix) | Constructs a new CRF, storing model parameters in the given prefix. |
double | train (parameters params, const std::vector< sequence > &examples) | Trains a new CRF model on the given examples. |
tagger | make_tagger () const | Constructs a new tagging interface that references the current model. |
uint64_t | num_labels () const |
void | initialize (const std::vector< sequence > &examples) | Initializes the CRF model based on the set of training examples. |
void | load_model () | Loads the CRF model from the files stored on disk. |
void | reset () | Completely resets the model weights. |
double | calibrate (parameters params, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples) | Determines a good initial setting for the learning rate. |
const double & | obs_weight (crf_feature_id idx) const |
double & | obs_weight (crf_feature_id idx) |
const double & | trans_weight (crf_feature_id idx) const |
double & | trans_weight (crf_feature_id idx) |
feature_range | obs_range (feature_id fid) const |
feature_range | trans_range (label_id lbl) const |
label_id | observation (crf_feature_id idx) const |
label_id | transition (crf_feature_id idx) const |
double | epoch (parameters params, printing::progress &progress, uint64_t iter, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples, scorer &scorer) | Performs a single epoch of training. |
double | iteration (parameters params, uint64_t iter, const sequence &seq, scorer &scorer) | Performs a single iteration within a training epoch. |
void | gradient_observation_expectation (const sequence &seq, double gain) | Updates the model parameters based on the observation expectation part of the gradient. |
void | gradient_model_expectation (const sequence &seq, double gain, const scorer &scr) | Updates the model parameters based on the model expectation part of the gradient. |
double | l2norm () const |
void | rescale () | Updates all of the weights by re-scaling by the current scale parameter, and sets the scale parameter to 1 after doing so. |
Linear-chain conditional random field for POS tagging and chunking applications, trained using L2-regularized stochastic gradient descent.
This CRF implementation attaches observations to nodes only: feature templates take the form \(f(o_t, s_t)\) (observation features) and \(f(s_{t-1}, s_t)\) (transition features), and no template combines an observation with a transition. This is done for memory efficiency and to avoid overfitting.
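To make the two feature templates concrete, the sketch below computes the unnormalized log-score of one labeling as the sum of the weights of every feature that fires. It is a minimal illustration, not this class's implementation: `toy_crf`, `score`, and the dense weight layout are hypothetical names and assumptions chosen for readability (the class itself stores weights sparsely).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense layout, for illustration only.
struct toy_crf
{
    std::size_t num_labels;
    // obs[f * num_labels + s]: weight of f(o_t, s_t) for observation feature f
    std::vector<double> obs;
    // trans[p * num_labels + s]: weight of f(s_{t-1} = p, s_t = s)
    std::vector<double> trans;
};

// Unnormalized log-score of one labeling: the sum of the weights of every
// observation and transition feature that fires along the sequence.
double score(const toy_crf& m,
             const std::vector<std::vector<std::size_t>>& active,
             const std::vector<std::size_t>& labels)
{
    double total = 0.0;
    for (std::size_t t = 0; t < labels.size(); ++t)
    {
        // f(o_t, s_t): one term per observation feature active at position t
        for (std::size_t f : active[t])
            total += m.obs[f * m.num_labels + labels[t]];
        // f(s_{t-1}, s_t): one term per adjacent label pair
        if (t > 0)
            total += m.trans[labels[t - 1] * m.num_labels + labels[t]];
    }
    return total;
}
```

Because no template looks at an observation and a label pair together, the number of weights grows with (features × labels) plus (labels × labels) rather than (features × labels × labels).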
- See also
- http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf
meta::sequence::crf::crf (const std::string &prefix)
Constructs a new CRF, storing model parameters in the given prefix.
If a crf model already exists in the given prefix, it will be loaded; otherwise, the directory will be created.
- Parameters
-
prefix | The prefix (folder) to load/store model files |
double meta::sequence::crf::train (parameters params, const std::vector< sequence > &examples)
Trains a new CRF model on the given examples.
The examples are assumed to have been run through a sequence_analyzer
first to generate features for every observation in every sequence.
- Parameters
-
params | The parameters for the learning algorithm |
examples | The labeled training examples |
- Returns
- the loss for the last epoch during training
auto meta::sequence::crf::make_tagger () const
Constructs a new tagging interface that references the current model.
- Returns
- a new tagging interface for this model
uint64_t meta::sequence::crf::num_labels () const
- Returns
- the number of labels possible under this model.
void meta::sequence::crf::initialize (const std::vector< sequence > &examples)
private |
Initializes the CRF model based on the set of training examples.
This function runs the "feature generation" portion of training, which finds all state-observation and transition feature functions that are active in the training data.
- Parameters
-
examples | The training examples |
double meta::sequence::crf::calibrate (parameters params, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples)
private |
Determines a good initial setting for the learning rate.
Based on Leon Bottou's SGD implementation.
- Parameters
-
params | The parameters for learning |
indices | The vector of shuffled indices for the random sampling |
examples | The (unshuffled) training examples |
- Returns
- The optimal t0 found by calibration, which determines the initial learning rate \(\eta\).
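As a rough illustration of how t0 can determine the initial learning rate, the sketch below uses the decaying schedule \(\eta_t = 1 / (\lambda (t_0 + t))\) from Bottou's SGD code. Whether this class uses exactly this formula is an assumption, and `learning_rate`, `t0_for`, and `lambda` are hypothetical names that do not come from this page.

```cpp
// Decaying step size eta_t = 1 / (lambda * (t0 + t)), where lambda is the
// regularization strength and t counts the updates performed so far.
double learning_rate(double lambda, double t0, double t)
{
    return 1.0 / (lambda * (t0 + t));
}

// The offset t0 that makes the step size at t = 0 equal to a chosen eta0.
// Calibration can then be viewed as a search over eta0.
double t0_for(double lambda, double eta0)
{
    return 1.0 / (lambda * eta0);
}
```

Under this schedule a larger t0 means a smaller first step and a slower decay, which is why a single calibrated value is enough to fix the whole learning-rate curve.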
const double & meta::sequence::crf::obs_weight (crf_feature_id idx) const
private |
- Parameters
-
idx | The internal crf model feature id |
- Returns
- a const reference to the weight associated with this feature
double & meta::sequence::crf::obs_weight (crf_feature_id idx)
private |
- Parameters
-
idx | The internal crf model feature id |
- Returns
- a reference to the weight associated with this feature
const double & meta::sequence::crf::trans_weight (crf_feature_id idx) const
private |
- Parameters
-
idx | The internal crf model feature id |
- Returns
- a const reference to the weight associated with this feature
double & meta::sequence::crf::trans_weight (crf_feature_id idx)
private |
- Parameters
-
idx | The internal crf model feature id |
- Returns
- a reference to the weight associated with this feature
auto meta::sequence::crf::obs_range (feature_id fid) const
private |
- Parameters
-
fid | The external observation feature id |
- Returns
- a range of internal crf model feature ids for state features that are active for this observation
auto meta::sequence::crf::trans_range (label_id lbl) const
private |
- Parameters
-
lbl | The label id of the state |
- Returns
- a range of internal crf model feature ids for transitions that are active for this state
label_id meta::sequence::crf::observation (crf_feature_id idx) const
private |
- Parameters
-
idx | The internal crf model feature id |
- Returns
- the label associated with this state-based feature id
label_id meta::sequence::crf::transition (crf_feature_id idx) const
private |
- Parameters
-
idx | The internal crf model feature id |
- Returns
- the destination label associated with this transition feature id
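double meta::sequence::crf::epoch (parameters params, printing::progress &progress, uint64_t iter, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples, scorer &scorer)
Performs a single epoch of training.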
- Parameters
-
params | The learning parameters |
progress | The progress logger to use |
iter | The current epoch |
indices | The shuffled indices for the random sampling |
examples | The (not shuffled) training examples |
scorer | The scorer to re-use |
- Returns
- the loss for this training epoch
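double meta::sequence::crf::iteration (parameters params, uint64_t iter, const sequence &seq, scorer &scorer)
Performs a single iteration within a training epoch.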
- Parameters
-
params | The learning parameters |
iter | The total number of iterations performed so far (\(t\)) |
seq | The sequence to use to update model parameters |
scorer | The scorer to re-use |
- Returns
- the loss associated with this single iteration within the epoch
void meta::sequence::crf::gradient_observation_expectation (const sequence &seq, double gain)
private |
Updates the model parameters based on the observation expectation part of the gradient.
- Parameters
-
seq | The sequence to use |
gain | The amount to scale the weight updates by |
void meta::sequence::crf::gradient_model_expectation (const sequence &seq, double gain, const scorer &scr)
private |
Updates the model parameters based on the model expectation part of the gradient.
- Parameters
-
seq | The sequence to use |
gain | The amount to scale the weight updates by |
scr | The scorer to re-use for computing the marginal probabilities |
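The two gradient functions correspond to the two halves of the log-likelihood gradient for a CRF: feature counts observed in the labeled data minus feature counts expected under the current model. The sketch below shows both halves for the observation features at a single position. It is a simplified illustration under assumed conventions (gradient ascent, a dense weight layout, precomputed marginals); `observed_update` and `model_update` are hypothetical names, not members of this class.

```cpp
#include <cstddef>
#include <vector>

// Toy dense weights: w[f * num_labels + l]. Hypothetical layout.

// Observed part: raise the weight of every (active feature, gold label) pair
// seen in the training data.
void observed_update(std::vector<double>& w, std::size_t num_labels,
                     const std::vector<std::size_t>& active,
                     std::size_t gold, double gain)
{
    for (std::size_t f : active)
        w[f * num_labels + gold] += gain;
}

// Model part: lower the weight of every (active feature, label) pair in
// proportion to the marginal probability the model assigns to that label.
void model_update(std::vector<double>& w, std::size_t num_labels,
                  const std::vector<std::size_t>& active,
                  const std::vector<double>& marginal, double gain)
{
    for (std::size_t f : active)
        for (std::size_t l = 0; l < num_labels; ++l)
            w[f * num_labels + l] -= gain * marginal[l];
}
```

When the model already assigns probability 1 to the gold label, the two updates cancel exactly, so confidently correct positions leave the weights unchanged.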
double meta::sequence::crf::l2norm () const
private |
- Returns
- the current squared L2 norm of the weights (\(w^T w\))
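The quantity \(w^T w\) interacts with the scale parameter mentioned in the description of `rescale()`. A common way to implement L2-regularized SGD is to store each weight as a raw value times a shared scale, so that shrinking all weights costs one multiplication. The sketch below illustrates that idea; `scaled_weights` and `decay` are hypothetical, and whether this class's `l2norm()` folds in the scale in the same way is an assumption.

```cpp
#include <vector>

// Hypothetical container: effective weight i is scale * raw[i].
struct scaled_weights
{
    std::vector<double> raw;
    double scale = 1.0;

    // L2 shrinkage touches only the shared scalar, not every weight.
    void decay(double factor) { scale *= factor; }

    // w^T w of the effective weights (the squared L2 norm).
    double l2norm() const
    {
        double sum = 0.0;
        for (double w : raw)
            sum += w * w;
        return sum * scale * scale;
    }

    // Fold the scale into the stored weights and reset it to 1, which keeps
    // the scalar from drifting toward zero over many updates.
    void rescale()
    {
        for (double& w : raw)
            w *= scale;
        scale = 1.0;
    }
};
```

Rescaling changes the representation but not the effective weights, so the norm is the same before and after the call.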
Represents the feature id range for a given observation: observation_ranges_[i] gives the start of a range of crf_feature_ids (indexing into observation_weights_) that have fired for feature_id i, and observation_ranges_[i + 1] gives the end of the range. (If i is the last feature_id, the size of observation_weights_ gives the end of the range.)
Analogous to the observation range, but for transitions. transition_ranges_[i] gives the start of a range of crf_feature_ids (indexing into transition_weights_) that have fired for label_id i, and transition_ranges_[i + 1] gives the end of the range. (If i is the last label_id, the size of transition_weights_ gives the end of the range.)
Represents the state that fired for a given observation feature. This is a parallel vector with observation_weights_, where observations_[f] gives the label_id for the observation feature f.
Represents the destination label for a given transition feature. This is a parallel vector with transition_weights_, where transitions_[f] gives the destination for transition feature f.
The weights for all of the node-observation features. Indexes must be taken from the observation_ranges_ vector.
Weights for all of the transition features. Indexes must be taken from the transition_ranges_ vector.
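The range vectors, parallel label vectors, and weight vectors described above form a compact sparse layout similar to compressed sparse row storage. The sketch below shows how a lookup over that layout might work. It is a simplified stand-in using plain `std::size_t` ids; `range_for` and `weight_for` are hypothetical helpers, not members of this class.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// starts[i] is the first id for key i. The range ends where the next key
// starts, or at `total` (the size of the weight vector) for the last key.
std::pair<std::size_t, std::size_t>
range_for(const std::vector<std::size_t>& starts, std::size_t total,
          std::size_t i)
{
    std::size_t begin = starts[i];
    std::size_t end = (i + 1 < starts.size()) ? starts[i + 1] : total;
    return {begin, end};
}

// Sum the weights in key i's range whose parallel label entry matches
// `label`. labels[id] plays the role of observations_ or transitions_.
double weight_for(const std::vector<std::size_t>& starts,
                  const std::vector<std::size_t>& labels,
                  const std::vector<double>& weights,
                  std::size_t i, std::size_t label)
{
    auto r = range_for(starts, weights.size(), i);
    double total = 0.0;
    for (std::size_t id = r.first; id < r.second; ++id)
        if (labels[id] == label)
            total += weights[id];
    return total;
}
```

Only (key, label) pairs that actually occurred in training get a slot, so a pair that never fired contributes a weight of zero without occupying memory.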
The documentation for this class was generated from the following files:
- /home/chase/projects/meta/include/sequence/crf/crf.h
- /home/chase/projects/meta/src/sequence/crf/crf.cpp
- /home/chase/projects/meta/src/sequence/crf/tagger.cpp