ModErn Text Analysis
META Enumerates Textual Applications
vocabulary_map_writer.h
#ifndef META_VOCABULARY_MAP_WRITER_H_
#define META_VOCABULARY_MAP_WRITER_H_

#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

namespace meta
{
namespace index
{

class vocabulary_map_writer
{
  public:
    vocabulary_map_writer(const std::string& path, uint16_t block_size = 4096);

    ~vocabulary_map_writer();

    void insert(const std::string& term);

    class vocabulary_map_writer_exception : public std::runtime_error
    {
        using std::runtime_error::runtime_error;
    };

  private:
    void write_padding();

    void flush();

    std::ofstream file_;

    uint64_t file_write_pos_;

    std::ofstream inverse_file_;

    std::string path_;

    uint16_t block_size_;

    uint64_t num_terms_;

    uint16_t remaining_block_space_;

    uint64_t written_nodes_;
};
}
}
#endif
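
The members block_size_, remaining_block_space_, write_padding() and flush() suggest that terms are packed into fixed-size blocks, and that a block is filled with null bytes once the next entry no longer fits. The following is a minimal in-memory sketch of that idea, not the code from vocabulary_map_writer.cpp. The helper name pack_terms and the record layout (term bytes, a null terminator, then an 8-byte id in host byte order) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of META): packs terms into fixed-size
// blocks, padding with null bytes whenever the next record would not fit.
// Assumed record layout: term bytes, '\0', then an 8-byte term id.
std::vector<char> pack_terms(const std::vector<std::string>& terms,
                             uint16_t block_size)
{
    assert(block_size > 0);
    std::vector<char> out;
    uint16_t remaining = block_size;
    uint64_t id = 0;
    for (const auto& term : terms)
    {
        const std::size_t length = term.size() + 1 + sizeof(uint64_t);
        if (length > block_size)
            throw std::runtime_error{"term too large for one block"};
        if (length > remaining)
        {
            // Fill the rest of the current block with null bytes.
            out.insert(out.end(), remaining, '\0');
            remaining = block_size;
        }
        out.insert(out.end(), term.begin(), term.end());
        out.push_back('\0');
        // Append the id as raw bytes in host byte order.
        const char* bytes = reinterpret_cast<const char*>(&id);
        out.insert(out.end(), bytes, bytes + sizeof(uint64_t));
        remaining = static_cast<uint16_t>(remaining - length);
        ++id;
    }
    // Pad the last, partially filled block so the output is block-aligned.
    if (remaining != block_size)
        out.insert(out.end(), remaining, '\0');
    return out;
}
```

With a block size of 16, the terms "a" and "bb" do not fit in one block together under this assumed layout, so the second record starts at the beginning of the second block.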
meta::index::vocabulary_map_writer::insert
void insert(const std::string &term)
Inserts this term into the map.
Definition: vocabulary_map_writer.cpp:33
meta::index::vocabulary_map_writer::vocabulary_map_writer_exception
An exception that can be thrown during the building of the tree.
Definition: vocabulary_map_writer.h:92
meta::index::vocabulary_map_writer::inverse_file_
std::ofstream inverse_file_
The file containing the reverse mapping.
Definition: vocabulary_map_writer.h:119
meta::index::vocabulary_map_writer::written_nodes_
uint64_t written_nodes_
Number of written nodes to be "merged" when writing the next level.
Definition: vocabulary_map_writer.h:134
meta::index::vocabulary_map_writer::path_
std::string path_
The path to the tree file.
Definition: vocabulary_map_writer.h:122
meta::index::vocabulary_map_writer::flush
void flush()
Flushes a node to disk after writing the padding bytes.
Definition: vocabulary_map_writer.cpp:73
meta::index::vocabulary_map_writer::~vocabulary_map_writer
~vocabulary_map_writer()
The destructor for a vocabulary_map_writer flushes the last leaf node and builds the internal structu...
Definition: vocabulary_map_writer.cpp:80
meta::index::vocabulary_map_writer::vocabulary_map_writer
vocabulary_map_writer(const std::string &path, uint16_t block_size=4096)
Creates a writer for a tree at the given path and block_size.
Definition: vocabulary_map_writer.cpp:17
meta::index::vocabulary_map_writer::write_padding
void write_padding()
Writes null bytes to fill up the current block.
Definition: vocabulary_map_writer.cpp:61
meta::index::vocabulary_map_writer::block_size_
uint16_t block_size_
The block size of every node in the tree, in bytes.
Definition: vocabulary_map_writer.h:125
meta::index::vocabulary_map_writer::file_write_pos_
uint64_t file_write_pos_
The current write position in the forward mapping tree file.
Definition: vocabulary_map_writer.h:116
meta
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:24
meta::index::vocabulary_map_writer::file_
std::ofstream file_
The file containing the forward mapping tree.
Definition: vocabulary_map_writer.h:109
meta::index::vocabulary_map_writer::remaining_block_space_
uint16_t remaining_block_space_
The remaining space in the block currently being written.
Definition: vocabulary_map_writer.h:131
meta::index::vocabulary_map_writer::num_terms_
uint64_t num_terms_
The total number of terms inserted so far.
Definition: vocabulary_map_writer.h:128
meta::index::vocabulary_map_writer
A class that writes the B+-tree-like data structure used for storing the term id mapping in an index...
Definition: vocabulary_map_writer.h:57
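
The descriptions of written_nodes_ and of the destructor indicate that the tree is built bottom-up: leaf nodes are written first, and groups of already written nodes are then "merged" under parent nodes, level by level. The sketch below shows one common way to do this for a B+-tree-like structure. It is an illustration under assumptions, not META's implementation: the function name build_levels, the fanout parameter, and the choice to represent each node only by its first key are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of META): given the first key of every leaf
// node in sorted order, repeatedly groups `fanout` nodes under one parent
// until a single root remains. Each level is represented by the first key
// of each of its nodes; levels[0] is the leaf level.
std::vector<std::vector<std::string>>
build_levels(std::vector<std::string> leaf_first_keys, std::size_t fanout)
{
    assert(fanout >= 2); // a fanout of 1 would never shrink a level
    std::vector<std::vector<std::string>> levels;
    levels.push_back(std::move(leaf_first_keys));
    while (levels.back().size() > 1)
    {
        const auto& below = levels.back();
        std::vector<std::string> parents;
        // Every group of `fanout` nodes below is covered by one parent,
        // keyed by the first key of the group.
        for (std::size_t i = 0; i < below.size(); i += fanout)
            parents.push_back(below[i]);
        levels.push_back(std::move(parents));
    }
    return levels;
}
```

Because each level is derived only from the level beneath it, a writer following this scheme needs to remember just how many nodes the previous level produced, which is consistent with the role the documentation gives to written_nodes_.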
Generated on Tue Mar 3 2015 23:20:16 for ModErn Text Analysis by Doxygen 1.8.9.1