ModErn Text Analysis
META Enumerates Textual Applications
A class to represent the per-PrimaryKey data in an index's postings file.
#include <postings_data.h>
Public Types | |
using | primary_key_type = PrimaryKey |
using | secondary_key_type = SecondaryKey |
using | pair_t = std::pair< SecondaryKey, double > |
using | count_t = std::vector< pair_t > |
Public Member Functions | |
postings_data ()=default | |
PrimaryKeys may only be integral types or strings; SecondaryKeys may only be integral types. | |
postings_data (PrimaryKey p_id) | |
Creates an empty postings_data for a given PrimaryKey. | |
void | merge_with (postings_data &other) |
void | increase_count (SecondaryKey s_id, double amount) |
double | count (SecondaryKey s_id) const |
const count_t & | counts () const |
void | set_counts (const count_t &counts) |
bool | operator< (const postings_data &other) const |
void | write_compressed (io::compressed_file_writer &writer) const |
Writes this postings_data to a compressed file. | |
void | read_compressed (io::compressed_file_reader &reader) |
Reads compressed postings_data into this object. | |
void | write_libsvm (std::ofstream &out) const |
PrimaryKey | primary_key () const |
void | set_primary_key (PrimaryKey new_key) |
uint64_t | inverse_frequency () const |
uint64_t | bytes_used () const |
Private Attributes | |
PrimaryKey | p_id_ |
Primary id this postings_data represents. | |
util::sparse_vector< SecondaryKey, double > | counts_ |
The (secondary_key_type, count) pairs. | |
Static Private Attributes | |
static const uint64_t | delimiter_ = std::numeric_limits<uint64_t>::max() |
delimiter used when writing to compressed files | |
Friends | |
void | stream_helper (io::compressed_file_reader &in, postings_data< PrimaryKey, SecondaryKey > &pd) |
Helper function used by istream operator. | |
io::compressed_file_reader & | operator>> (io::compressed_file_reader &in, postings_data< PrimaryKey, SecondaryKey > &pd) |
Reads semi-compressed postings data from a compressed file. | |
io::compressed_file_writer & | operator<< (io::compressed_file_writer &out, const postings_data< PrimaryKey, SecondaryKey > &pd) |
Writes semi-compressed postings data to a compressed file. | |
A class to represent the per-PrimaryKey data in an index's postings file.
For a given PrimaryKey, a mapping of SecondaryKey -> count information is stored.
For example, for an inverted index, PrimaryKey = term_id, SecondaryKey = doc_id. For a forward_index, PrimaryKey = doc_id, SecondaryKey = term_id.
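The mapping described above can be illustrated with a small stand-in class. This is a minimal sketch, not the MeTA implementation: the name `simple_postings` is hypothetical, the counts are assumed to be kept sorted by SecondaryKey, and a missing key is assumed to have a count of 0.0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for postings_data (not the MeTA class).
template <class PrimaryKey, class SecondaryKey>
class simple_postings
{
  public:
    using pair_t = std::pair<SecondaryKey, double>;
    using count_t = std::vector<pair_t>;

    explicit simple_postings(PrimaryKey p_id) : p_id_{p_id} {}

    // Add `amount` to the count for s_id, inserting the key if it is new.
    // Assumption: counts_ stays sorted by SecondaryKey.
    void increase_count(SecondaryKey s_id, double amount)
    {
        auto it = std::lower_bound(counts_.begin(), counts_.end(), s_id,
                                   key_less);
        if (it != counts_.end() && it->first == s_id)
            it->second += amount;
        else
            counts_.insert(it, pair_t{s_id, amount});
    }

    // Return the count for s_id; assumed to be 0.0 for an unseen key.
    double count(SecondaryKey s_id) const
    {
        auto it = std::lower_bound(counts_.begin(), counts_.end(), s_id,
                                   key_less);
        return (it != counts_.end() && it->first == s_id) ? it->second : 0.0;
    }

    const count_t& counts() const { return counts_; }
    PrimaryKey primary_key() const { return p_id_; }

  private:
    // Orders a stored pair against a bare key for std::lower_bound.
    static bool key_less(const pair_t& p, const SecondaryKey& k)
    {
        return p.first < k;
    }

    PrimaryKey p_id_;
    count_t counts_;
};
```

For an inverted index, `simple_postings<std::uint64_t, std::uint64_t> pd{7}` would stand for term 7, and `pd.increase_count(3, 1.0)` would record one more occurrence of that term in document 3.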
meta::index::postings_data< PrimaryKey, SecondaryKey >::postings_data | ( | ) | default |
PrimaryKeys may only be integral types or strings; SecondaryKeys may only be integral types.
uint64_t and double must occupy the same number of bytes because they are cast to each other when compressing. postings_data is default-constructible.
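The size requirement can be checked at compile time, and the conversion between the two types can be done by copying bits. The following is a sketch of one portable way to do this; the helper names `double_to_bits` and `bits_to_double` are hypothetical, and the documentation does not say which mechanism the library itself uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Fails to compile on a platform where the two types differ in size.
static_assert(sizeof(std::uint64_t) == sizeof(double),
              "uint64_t and double must be the same size");

// Reinterpret the bits of a double as a uint64_t (no numeric conversion).
inline std::uint64_t double_to_bits(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Inverse of double_to_bits: recover the double from its stored bits.
inline double bits_to_double(std::uint64_t bits)
{
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Copying with `std::memcpy` avoids the undefined behavior of reading a `double` through a `uint64_t` pointer, and the round trip preserves the value exactly.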
meta::index::postings_data< PrimaryKey, SecondaryKey >::postings_data | ( | PrimaryKey | p_id | ) |
Creates an empty postings_data for a given PrimaryKey.
p_id | The PrimaryKey to be associated with this postings_data |
void meta::index::postings_data< PrimaryKey, SecondaryKey >::merge_with | ( | postings_data< PrimaryKey, SecondaryKey > & | other | ) |
Adds the parameter's data to this object's data.
other | The other postings_data object to consume |
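One plausible way to combine two sets of counts is a linear merge of two key-sorted lists. The sketch below assumes that both lists are sorted by SecondaryKey and that counts for a shared key are summed; neither assumption is stated in this documentation, and the names `count_list` and `merge_counts` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using count_pair = std::pair<std::uint64_t, double>;
using count_list = std::vector<count_pair>;

// Merge two key-sorted count lists; counts for a shared key are summed
// (an assumption about merge semantics, not confirmed by the source).
count_list merge_counts(const count_list& a, const count_list& b)
{
    count_list merged;
    merged.reserve(a.size() + b.size());
    auto i = a.begin();
    auto j = b.begin();
    while (i != a.end() && j != b.end())
    {
        if (i->first < j->first)
            merged.push_back(*i++);
        else if (j->first < i->first)
            merged.push_back(*j++);
        else
        {
            merged.emplace_back(i->first, i->second + j->second);
            ++i;
            ++j;
        }
    }
    // At most one of these ranges is non-empty.
    merged.insert(merged.end(), i, a.end());
    merged.insert(merged.end(), j, b.end());
    return merged;
}
```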
void meta::index::postings_data< PrimaryKey, SecondaryKey >::increase_count | ( | SecondaryKey | s_id, |
double | amount | ||
) |
s_id | The SecondaryKey's id to add counts for |
amount | The amount by which to increase the count for the given SecondaryKey |
double meta::index::postings_data< PrimaryKey, SecondaryKey >::count | ( | SecondaryKey | s_id | ) | const |
s_id | The SecondaryKey id to query |
const std::vector< std::pair< SecondaryKey, double > > & meta::index::postings_data< PrimaryKey, SecondaryKey >::counts | ( | ) | const |
void meta::index::postings_data< PrimaryKey, SecondaryKey >::set_counts | ( | const count_t & | counts | ) |
counts | The (SecondaryKey, count) pairs to assign into this postings_data |
bool meta::index::postings_data< PrimaryKey, SecondaryKey >::operator< | ( | const postings_data< PrimaryKey, SecondaryKey > & | other | ) | const |
other | The postings_data to compare with |
void meta::index::postings_data< PrimaryKey, SecondaryKey >::write_compressed | ( | io::compressed_file_writer & | writer | ) | const |
Writes this postings_data to a compressed file.
The mapping for the compressed file is already set, so we don't have to worry about it. We can also assume that we are already in the correct location of the file.
writer | The compressed file to write to |
void meta::index::postings_data< PrimaryKey, SecondaryKey >::read_compressed | ( | io::compressed_file_reader & | reader | ) |
Reads compressed postings_data into this object.
The mapping for the compressed file is already set, so we don't have to worry about it. We can also assume that we are already in the correct location of the file.
reader | The compressed file to read from |
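The write and read functions can be pictured as a delimiter-terminated record of 64-bit values. The sketch below is illustrative only: it uses a plain `std::vector<std::uint64_t>` in place of the compressed file classes, applies no compression, and assumes a layout of (key, count bits) pairs followed by the delimiter. The names `write_record` and `read_record` are hypothetical, and the real on-disk layout is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <utility>
#include <vector>

static_assert(sizeof(std::uint64_t) == sizeof(double),
              "counts are stored as their 64-bit patterns");

using record_counts = std::vector<std::pair<std::uint64_t, double>>;

// Same sentinel value as delimiter_ in the class documentation.
constexpr std::uint64_t record_delimiter
    = std::numeric_limits<std::uint64_t>::max();

// Append one record: (key, count bits) pairs, then the delimiter.
void write_record(std::vector<std::uint64_t>& out, const record_counts& counts)
{
    for (const auto& p : counts)
    {
        std::uint64_t bits;
        std::memcpy(&bits, &p.second, sizeof(bits));
        out.push_back(p.first);
        out.push_back(bits);
    }
    out.push_back(record_delimiter);
}

// Read one record starting at `pos`, which the caller has already placed
// at the start of a record; `pos` ends just past the delimiter.
record_counts read_record(const std::vector<std::uint64_t>& in,
                          std::size_t& pos)
{
    record_counts counts;
    while (pos < in.size() && in[pos] != record_delimiter)
    {
        assert(pos + 1 < in.size()); // a key must be followed by its count
        double value;
        std::memcpy(&value, &in[pos + 1], sizeof(value));
        counts.emplace_back(in[pos], value);
        pos += 2;
    }
    if (pos < in.size())
        ++pos; // skip the delimiter
    return counts;
}
```

As in the description above, the reader does not seek: it trusts that `pos` already points at the right record.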
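void meta::index::postings_data< PrimaryKey, SecondaryKey >::write_libsvm | ( | std::ofstream & | out | ) | const inline |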
out | The output stream to write to |
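The general libsvm text format is one line per instance: a label followed by space-separated `index:value` pairs. The sketch below writes such a line to any `std::ostream`. The function name `write_libsvm_line` is hypothetical, and the choice of label and the use of unmodified keys as indices are assumptions; this documentation does not say what MeTA writes as the label or whether it offsets the indices.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <utility>
#include <vector>

// Write one libsvm-style line: "<label> <index>:<value> ...\n".
// Assumption: keys are written unchanged as the feature indices.
void write_libsvm_line(
    std::ostream& out, std::uint64_t label,
    const std::vector<std::pair<std::uint64_t, double>>& counts)
{
    out << label;
    for (const auto& p : counts)
        out << ' ' << p.first << ':' << p.second;
    out << '\n';
}
```

Taking a `std::ostream&` rather than a `std::ofstream&` lets the same function write to a file or to an in-memory `std::ostringstream`.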
PrimaryKey meta::index::postings_data< PrimaryKey, SecondaryKey >::primary_key | ( | ) | const |
void meta::index::postings_data< PrimaryKey, SecondaryKey >::set_primary_key | ( | PrimaryKey | new_key | ) |
new_key | The new PrimaryKey for this postings_data |
uint64_t meta::index::postings_data< PrimaryKey, SecondaryKey >::inverse_frequency | ( | ) | const |
uint64_t meta::index::postings_data< PrimaryKey, SecondaryKey >::bytes_used | ( | ) | const |
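void stream_helper | ( | io::compressed_file_reader & | in, |
postings_data< PrimaryKey, SecondaryKey > & | pd | ||
) | friend |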
Helper function used by istream operator.
in | The stream to read from |
pd | The postings data object to write the stream info to |
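io::compressed_file_reader & operator>> | ( | io::compressed_file_reader & | in, |
postings_data< PrimaryKey, SecondaryKey > & | pd | ||
) | friend |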
Reads semi-compressed postings data from a compressed file.
in | The stream to read from |
pd | The postings data object to write the stream info to |
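io::compressed_file_writer & operator<< | ( | io::compressed_file_writer & | out, |
const postings_data< PrimaryKey, SecondaryKey > & | pd | ||
) | friend |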
Writes semi-compressed postings data to a compressed file.
out | The stream to write to |
pd | The postings data object to write to the stream |