ModErn Text Analysis
META Enumerates Textual Applications
meta::index::vocabulary_map Class Reference

A read-only view of a B+-tree-like structure that stores the vocabulary for an index.

#include <vocabulary_map.h>

Public Member Functions

 vocabulary_map (const std::string &path, uint16_t block_size=4096)
 Creates a vocabulary map reading the file in the given path with the given block size.
 
 vocabulary_map (vocabulary_map &&)=default
 Move constructs a vocabulary_map.
 
vocabulary_map & operator= (vocabulary_map &&)=default
 Move assigns a vocabulary_map.
 
util::optional< term_id > find (const std::string &term) const
 Finds the given term in the tree, if it exists.
 
std::string find_term (term_id t_id) const
 Finds the term associated with the given id.
 
uint64_t size () const
 The number of terms in the map.
 

Private Member Functions

int compare (const std::string &term, const char *other) const
 Convenience wrapper for comparing the term with strings in the tree.
 

Private Attributes

io::mmap_file file_
 The file containing the tree.
 
util::disk_vector< uint64_t > inverse_
 Byte positions for each term in the leaves, allowing reverse lookup of the string associated with a given id.
 
uint64_t block_size_
 The size of the nodes in the tree.
 
uint64_t leaf_end_pos_
 The ending position of the leaf nodes.
 
uint64_t initial_seek_pos_
 The position of the first internal node that is not the root.
 

Detailed Description

A read-only view of a B+-tree-like structure that stores the vocabulary for an index.

It reads the file format that is written by the vocabulary_map_writer class (see the documentation for the writer for information about the file format).

Constructor & Destructor Documentation

meta::index::vocabulary_map::vocabulary_map ( const std::string &  path,
uint16_t  block_size = 4096 
)

Creates a vocabulary map reading the file in the given path with the given block size.

Changing the block size is not recommended: it should always match the block size used by the vocabulary_map_writer that created the tree.
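To illustrate why the reader and writer block sizes should agree, here is a minimal sketch. It assumes, purely for illustration, that nodes are stored contiguously as fixed-size blocks so that node k starts at byte k * block_size; the actual layout is described in the vocabulary_map_writer documentation. The helper name node_offset is hypothetical and not part of the class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of vocabulary_map.
// Assumption: node k starts at byte k * block_size (contiguous, fixed-size nodes).
inline std::uint64_t node_offset(std::uint64_t node_index, std::uint16_t block_size)
{
    assert(block_size > 0);
    return node_index * static_cast<std::uint64_t>(block_size);
}
```

Under that assumption, a reader configured with a block size of 2048 would compute a different offset for node 3 than a writer that used 4096, and would therefore start reading in the middle of some other node.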

Parameters
path        the location of the tree file
block_size  the size of the nodes in the tree

Member Function Documentation

int meta::index::vocabulary_map::compare ( const std::string &  term,
const char *  other 
) const
private

Convenience wrapper for comparing the term with strings in the tree.
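A wrapper of this kind might look like the sketch below. This is an assumption about its shape, not the actual implementation: the free function compare_term is a hypothetical stand-in, and it assumes that `other` points at a null-terminated string and that the result follows the usual negative, zero, positive convention.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the private compare() member.
// Returns a negative value if term sorts before other, zero if they are
// equal, and a positive value if term sorts after other.
// Assumption: other points at a null-terminated string.
inline int compare_term(const std::string& term, const char* other)
{
    assert(other != nullptr);
    return term.compare(other);
}
```

A search over sorted strings can then branch on the sign of the result to decide which direction to continue in.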

Parameters
term   the term we are looking for
other  the string in the tree we are considering

util::optional< term_id > meta::index::vocabulary_map::find ( const std::string &  term) const

Finds the given term in the tree, if it exists.
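Because the result is optional, callers should check it before using the id. The sketch below shows that calling pattern with stand-ins only: maybe_term_id plays the role of util::optional< term_id >, toy_vocabulary replaces the on-disk tree with a sorted in-memory vector, and assigning ids by sorted position is an assumption made for the example. None of these names are part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using term_id = std::uint64_t; // assumption: stand-in for the library's term_id

// Minimal stand-in for util::optional<term_id>: empty when the term is absent.
struct maybe_term_id
{
    bool has_value;
    term_id value;
    explicit operator bool() const { return has_value; }
};

// Hypothetical in-memory vocabulary: a sorted vector plays the role of the
// tree, and a term's id is assumed to be its index in sorted order.
struct toy_vocabulary
{
    std::vector<std::string> sorted_terms;

    maybe_term_id find(const std::string& term) const
    {
        auto it = std::lower_bound(sorted_terms.begin(), sorted_terms.end(), term);
        if (it == sorted_terms.end() || *it != term)
            return {false, 0};
        return {true, static_cast<term_id>(it - sorted_terms.begin())};
    }
};

// Test the result before touching the id; fall back when the term is unknown.
inline term_id id_or(const toy_vocabulary& vocab, const std::string& term,
                     term_id fallback)
{
    auto result = vocab.find(term);
    return result ? result.value : fallback;
}
```

The point of the pattern is that an absent term is reported through the empty result rather than through a sentinel id.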

Parameters
term  the term to find an id for

std::string meta::index::vocabulary_map::find_term ( term_id  t_id) const

Finds the term associated with the given id.

No bounds checking is performed—accessing beyond the maximum assigned term_id is undefined behavior.
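Since the lookup is unchecked, a caller that cannot guarantee a valid id may want to compare it against size() first. The sketch below illustrates this with a hypothetical toy_reverse_lookup type that mirrors the role of the inverse_ attribute: it assumes terms are stored as consecutive null-terminated strings and that each id maps to the byte position where its string starts. The real leaf layout may differ, and checked_find_term is not a library function.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using term_id = std::uint64_t; // assumption: stand-in for the library's term_id

// Hypothetical stand-in for the reverse lookup. `leaves` holds the terms as
// consecutive null-terminated strings; `inverse[id]` is the byte position at
// which the string for that id starts.
struct toy_reverse_lookup
{
    std::string leaves;
    std::vector<std::uint64_t> inverse;

    void add(const std::string& term)
    {
        inverse.push_back(leaves.size());
        leaves += term;
        leaves.push_back('\0');
    }

    std::uint64_t size() const { return inverse.size(); }

    // Unchecked, like find_term(): the caller must guarantee t_id < size().
    std::string find_term(term_id t_id) const
    {
        return std::string(leaves.data() + inverse[t_id]);
    }
};

// Checked wrapper: validate the id against size() before the unchecked call.
inline std::string checked_find_term(const toy_reverse_lookup& vocab, term_id t_id)
{
    if (t_id >= vocab.size())
        throw std::out_of_range("term_id is not assigned");
    return vocab.find_term(t_id);
}
```

Keeping the check in a separate wrapper leaves the unchecked path cheap for callers that already know their ids are valid.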

Parameters
t_id  the term id to find the string representation of

Member Data Documentation

io::mmap_file meta::index::vocabulary_map::file_
private

The file containing the tree.

The file is memory-mapped (mmapped) for performance.

uint64_t meta::index::vocabulary_map::leaf_end_pos_
private

The ending position of the leaf nodes.

Used to determine when to stop a search.

uint64_t meta::index::vocabulary_map::initial_seek_pos_
private

The position of the first internal node that is not the root.

Used to seek to the first level during search.

