Implementation of an inverted_index.

Public Member Functions
	impl (inverted_index *parent, const cpptoml::table &config)
	Constructs an inverted_index impl.

void	tokenize_docs (corpus::corpus *docs, chunk_handler< inverted_index > &handler)

void	create_lexicon (const std::string &postings_file, const std::string &lexicon_file)
	Creates the lexicon file (or "dictionary"), which holds pointers into the large postings file.

void	compress (const std::string &filename, uint64_t num_unique_terms)
	Compresses the large postings file.

Public Attributes
std::unique_ptr< analyzers::analyzer >	analyzer_
	The analyzer used to tokenize documents.

util::optional< util::disk_vector< uint64_t > >	term_bit_locations_
	PrimaryKey -> postings location.

uint64_t	total_corpus_terms_
	The total number of term occurrences in the entire corpus.

Private Attributes
inverted_index *	idx_
	Pointer to the inverted_index this is an implementation of.

Detailed Description

Implementation of an inverted_index.

Constructor & Destructor Documentation

meta::index::inverted_index::impl::impl	(	inverted_index *	parent,
		const cpptoml::table &	config
	)

Constructs an inverted_index impl.

Parameters

parent	The parent of this impl
config	The config group

void meta::index::inverted_index::impl::tokenize_docs	(	corpus::corpus *	docs,
		chunk_handler< inverted_index > &	handler
	)

Parameters

docs	The documents to be tokenized
handler	The chunk handler for this index

void meta::index::inverted_index::impl::create_lexicon	(	const std::string &	postings_file,
		const std::string &	lexicon_file
	)

Creates the lexicon file (or "dictionary"), which holds pointers into the large postings file.

Parameters

postings_file	The large postings file that the lexicon points into
lexicon_file	The lexicon file to create
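
The idea of a lexicon that points into a postings file can be illustrated with a minimal sketch. It is not the implementation of create_lexicon: the one-line-per-term text format, the use of streams instead of file names, and the names build_lexicon and read_postings are all assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical postings format (NOT the real on-disk format): one line per
// term, with the line number acting as the term's PrimaryKey.
// Returns, for each term, the byte offset where its postings line starts.
std::vector<uint64_t> build_lexicon(std::istream& postings)
{
    std::vector<uint64_t> offsets;
    std::string line;
    uint64_t pos = 0;
    while (std::getline(postings, line))
    {
        offsets.push_back(pos);
        pos += line.size() + 1; // +1 accounts for the newline separator
    }
    return offsets;
}

// Uses the lexicon to jump straight to one term's postings without
// scanning the postings that precede it.
std::string read_postings(std::istream& postings,
                          const std::vector<uint64_t>& lexicon,
                          uint64_t term_id)
{
    assert(term_id < lexicon.size());
    postings.clear(); // reset EOF/fail state left by an earlier full scan
    postings.seekg(static_cast<std::streamoff>(lexicon[term_id]));
    std::string line;
    std::getline(postings, line);
    return line;
}
```

In this sketch the lexicon stays small (one integer per term) while the postings data can be arbitrarily large, which is the usual reason for keeping the two in separate files.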

util::optional<util::disk_vector<uint64_t> > meta::index::inverted_index::impl::term_bit_locations_

PrimaryKey -> postings location.

Each index corresponds to a PrimaryKey (uint64_t).
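
A rough sketch of how such a key-indexed location table might be queried follows. It uses std::optional (C++17) and std::vector as stand-ins for util::optional and util::disk_vector, and it assumes, based only on the member's name, that each stored value is a bit offset. The types locations_sketch and bit_position and the functions load and locate are hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical result type: a bit offset split into a byte and a bit within it.
struct bit_position
{
    uint64_t byte;
    unsigned bit;
};

struct locations_sketch
{
    // Empty until the locations are loaded; the vector index is the PrimaryKey.
    std::optional<std::vector<uint64_t>> term_bit_locations;

    void load(std::vector<uint64_t> locs)
    {
        term_bit_locations = std::move(locs);
    }

    // Looks up a term's location; assumes the stored values are bit offsets.
    bit_position locate(uint64_t term_id) const
    {
        if (!term_bit_locations)
            throw std::logic_error{"term locations have not been loaded"};
        uint64_t bits = term_bit_locations->at(term_id); // bounds-checked
        return {bits / 8, static_cast<unsigned>(bits % 8)};
    }
};
```

Wrapping the table in an optional lets the object exist before the table is available, so a lookup made too early fails loudly instead of reading from an empty table.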

The documentation for this class was generated from the following file: