Implements a perceptron classifier using the dual formulation of the problem.

#include <dual_perceptron.h>

Inherits from meta::classify::classifier.

Public Member Functions
template<class Kernel >
	dual_perceptron (std::shared_ptr< index::forward_index > idx, Kernel &&kernel_fn=kernel::polynomial{}, double alpha=default_alpha, double gamma=default_gamma, double bias=default_bias, uint64_t max_iter=default_max_iter)
	Constructs a dual_perceptron classifier over the given index with the given parameters.

void	train (const std::vector< doc_id > &docs) override
	Trains the perceptron on the given training documents.

class_label	classify (doc_id d_id) override
	Classifies the given document.

void	reset () override
	Resets all learned information for this perceptron so it may be re-learned.

Public Member Functions inherited from meta::classify::classifier
	classifier (std::shared_ptr< index::forward_index > idx)

virtual confusion_matrix	test (const std::vector< doc_id > &docs)
	Classifies a collection of documents into groups determined by the training data; this function makes repeated calls to classify().

virtual confusion_matrix	cross_validate (const std::vector< doc_id > &input_docs, size_t k, bool even_split=false, int seed=1)
	Performs k-fold cross-validation on a set of documents.

Static Public Attributes
static const constexpr double	default_alpha = 0.1
	The default \(\alpha\) parameter.

static const constexpr double	default_gamma = 0.05
	The default \(\gamma\) parameter.

static const constexpr double	default_bias = 0
	The default \(b\) parameter.

static const constexpr uint64_t	default_max_iter = 100
	The default number of allowed iterations.

static const std::string	id = "dual-perceptron"
	The identifier for this classifier.

Private Types
using	pdata = decltype(idx_->search_primary(doc_id{}))
	Convenience typedef for the postings data type.

Private Member Functions
void	decrease_weight (const class_label &label, const doc_id &id)
	Decreases the "weight" (mistake count) for a given class label and document.

Private Attributes
std::unordered_map< class_label, std::unordered_map< doc_id, uint64_t > >	weights_
	The "weight" (mistake count) vectors for each class label.

std::function< double(pdata, pdata)>	kernel_
	The kernel function to be used in lieu of a dot product.

const double	alpha_
	\(\alpha\), the learning rate

const double	gamma_
	\(\gamma\), the error threshold (in terms of percentage of mistakes on the training data in one iteration of training).

const double	bias_
	\(b\), the bias factor.

const uint64_t	max_iter_
	The maximum number of iterations for training.

Additional Inherited Members
Protected Attributes inherited from meta::classify::classifier
std::shared_ptr< index::forward_index >	idx_
	The index that the classifier is run on.

Detailed Description

Implements a perceptron classifier using the dual formulation of the problem.

Because the dual formulation replaces the dot product with a kernel function, the perceptron can be used on data that is not necessarily linearly separable.
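
To illustrate what "a kernel in lieu of a dot product" means, the following is a minimal sketch of a polynomial kernel over sparse term vectors. It is not the library's `kernel::polynomial`; the names `sparse_vec`, `dot`, and `poly_kernel`, as well as the default `power` and `c` values, are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a document's postings data: term id -> weight.
using sparse_vec = std::unordered_map<std::uint64_t, double>;

// Plain dot product over the terms that both documents contain.
double dot(const sparse_vec& a, const sparse_vec& b)
{
    // Iterate over the smaller map and look terms up in the larger one.
    const sparse_vec& small = a.size() < b.size() ? a : b;
    const sparse_vec& large = a.size() < b.size() ? b : a;
    double sum = 0.0;
    for (const auto& entry : small)
    {
        auto it = large.find(entry.first);
        if (it != large.end())
            sum += entry.second * it->second;
    }
    return sum;
}

// Polynomial kernel (a . b + c)^power. The defaults are illustrative
// assumptions, not values taken from the library.
double poly_kernel(const sparse_vec& a, const sparse_vec& b,
                   unsigned power = 2, double c = 1.0)
{
    assert(power >= 1); // a degree-0 kernel would ignore the data entirely
    return std::pow(dot(a, b) + c, static_cast<double>(power));
}
```

With `power = 1` and `c = 0` the kernel reduces to the ordinary dot product, which corresponds to the linear case.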

Constructor & Destructor Documentation

template<class Kernel >

meta::classify::dual_perceptron::dual_perceptron	(	std::shared_ptr< index::forward_index >	idx,
		Kernel &&	kernel_fn = `kernel::polynomial{}`,
		double	alpha = `default_alpha`,
		double	gamma = `default_gamma`,
		double	bias = `default_bias`,
		uint64_t	max_iter = `default_max_iter`
	)

inline

Constructs a dual_perceptron classifier over the given index with the given parameters.

Parameters

idx	The index to run the classifier on
kernel_fn	The kernel function to be used
alpha	\(\alpha\), the learning rate
gamma	\(\gamma\), the error threshold (in terms of percentage of mistakes on the training data in one iteration of training)
bias	\(b\), the bias
max_iter	The maximum allowed iterations for training.

Member Function Documentation

void meta::classify::dual_perceptron::train ( const std::vector< doc_id > & docs )

override virtual

Trains the perceptron on the given training documents.

Maintains a set of weight vectors \(w_1,\ldots,w_K\), where \(K\) is the number of classes, and updates them for each training document seen in each iteration. Training continues until the error threshold is met or the maximum number of iterations is completed.

Unlike in the regular perceptron, the vectors in this dual formulation are "mistake vectors" that keep track of how often a given training instance was misclassified.

Parameters

docs	The training set

Implements meta::classify::classifier.
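
The training procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `label_t`, `doc_t`, `mistake_counts`, and `train_sketch` are hypothetical names, the true label and the current prediction are supplied as callbacks, and the exact update rule (increment the count for the true class, decrement it for the wrongly predicted class, never dropping below zero) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using label_t = std::string;  // stand-in for class_label
using doc_t = std::uint64_t;  // stand-in for doc_id
// One "mistake vector" per class label: document -> mistake count.
using mistake_counts
    = std::unordered_map<label_t, std::unordered_map<doc_t, std::uint64_t>>;

// Runs at most max_iter passes over docs and returns the number of passes
// performed. Stops early once the fraction of mistakes in a pass is below
// gamma.
std::size_t train_sketch(
    const std::vector<doc_t>& docs,
    const std::function<label_t(doc_t)>& truth,
    const std::function<label_t(const mistake_counts&, doc_t)>& predict,
    mistake_counts& weights, double gamma, std::size_t max_iter)
{
    assert(!docs.empty());
    std::size_t iter = 0;
    while (iter < max_iter)
    {
        ++iter;
        std::size_t errors = 0;
        for (doc_t d : docs)
        {
            const label_t guess = predict(weights, d);
            const label_t actual = truth(d);
            if (guess == actual)
                continue;
            ++errors;
            ++weights[actual][d]; // record the mistake for the true class
            // Assumed counterpart of decrease_weight(): lower the count for
            // the wrongly predicted class, erasing it when it reaches zero.
            auto& wrong = weights[guess];
            auto it = wrong.find(d);
            if (it != wrong.end() && --(it->second) == 0)
                wrong.erase(it);
        }
        if (static_cast<double>(errors) / docs.size() < gamma)
            break;
    }
    return iter;
}
```

Passing the predictor in as a callback keeps the sketch independent of any particular kernel or scoring function.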

class_label meta::classify::dual_perceptron::classify ( doc_id d_id )

override virtual

Classifies the given document.

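The class label returned is \(\arg\max_k \sum_d w_k^d \, (K(d,x) + b)\), where \(d\) ranges over the training documents, \(x\) is the document being classified, and \(K(d,x)\) is the kernel function; in other words, the class whose associated weight vector gives the highest result.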

Parameters

d_id	The document to be classified

Returns: the class label determined for the document

Implements meta::classify::classifier.
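
The decision rule above can be sketched as a short scoring loop. This is a minimal illustration, not the library's code: `label_t`, `doc_t`, `mistake_counts`, and `classify_sketch` are hypothetical names, and the kernel is assumed to be callable on a pair of document ids rather than on postings data.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <unordered_map>

using label_t = std::string;  // stand-in for class_label
using doc_t = std::uint64_t;  // stand-in for doc_id
// One "mistake vector" per class label: document -> mistake count.
using mistake_counts
    = std::unordered_map<label_t, std::unordered_map<doc_t, std::uint64_t>>;

// Returns the label k that maximizes sum_d w_k^d * (K(d, x) + b).
label_t classify_sketch(const mistake_counts& weights,
                        const std::function<double(doc_t, doc_t)>& kernel,
                        doc_t x, double bias)
{
    assert(!weights.empty());
    label_t best;
    double best_score = std::numeric_limits<double>::lowest();
    for (const auto& per_label : weights)
    {
        double score = 0.0;
        // Only documents with a nonzero mistake count contribute.
        for (const auto& entry : per_label.second)
            score += entry.second * (kernel(entry.first, x) + bias);
        if (score > best_score)
        {
            best_score = score;
            best = per_label.first;
        }
    }
    return best;
}
```

Because the maps are unordered, ties between labels are broken arbitrarily in this sketch.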

void meta::classify::dual_perceptron::decrease_weight	(	const class_label &	label,
		const doc_id &	id
	)

private

Decreases the "weight" (mistake count) for a given class label and document.

Parameters

label	The class label
id	The document

The documentation for this class was generated from the following files:

/home/chase/projects/meta/include/classify/classifier/dual_perceptron.h
/home/chase/projects/meta/src/classify/classifier/dual_perceptron.cpp
