ModErn Text Analysis
META Enumerates Textual Applications
meta::index::postings_data< PrimaryKey, SecondaryKey > Class Template Reference

A class to represent the per-PrimaryKey data in an index's postings file.

#include <postings_data.h>

Public Types

using primary_key_type = PrimaryKey
 
using secondary_key_type = SecondaryKey
 
using pair_t = std::pair< SecondaryKey, double >
 
using count_t = std::vector< pair_t >
 

Public Member Functions

 postings_data ()=default
 PrimaryKeys may only be integral types or strings; SecondaryKeys may only be integral types.
 
 postings_data (PrimaryKey p_id)
 Creates an empty postings_data for a given PrimaryKey.
 
void merge_with (postings_data &other)
 
void increase_count (SecondaryKey s_id, double amount)
 
double count (SecondaryKey s_id) const
 
const count_t & counts () const
 
void set_counts (const count_t &counts)
 
bool operator< (const postings_data &other) const
 
void write_compressed (io::compressed_file_writer &writer) const
 Writes this postings_data to a compressed file.
 
void read_compressed (io::compressed_file_reader &reader)
 Reads compressed postings_data into this object.
 
void write_libsvm (std::ofstream &out) const
 
PrimaryKey primary_key () const
 
void set_primary_key (PrimaryKey new_key)
 
uint64_t inverse_frequency () const
 
uint64_t bytes_used () const
 

Private Attributes

PrimaryKey p_id_
 Primary id this postings_data represents.
 
util::sparse_vector< SecondaryKey, double > counts_
 The (secondary_key_type, count) pairs.
 

Static Private Attributes

static const uint64_t delimiter_ = std::numeric_limits<uint64_t>::max()
 delimiter used when writing to compressed files
 

Friends

void stream_helper (io::compressed_file_reader &in, postings_data< PrimaryKey, SecondaryKey > &pd)
 Helper function used by istream operator.
 
io::compressed_file_reader & operator>> (io::compressed_file_reader &in, postings_data< PrimaryKey, SecondaryKey > &pd)
 Reads semi-compressed postings data from a compressed file.
 
io::compressed_file_writer & operator<< (io::compressed_file_writer &out, const postings_data< PrimaryKey, SecondaryKey > &pd)
 Writes semi-compressed postings data to a compressed file.
 

Detailed Description

template<class PrimaryKey, class SecondaryKey>
class meta::index::postings_data< PrimaryKey, SecondaryKey >

A class to represent the per-PrimaryKey data in an index's postings file.

For a given PrimaryKey, a mapping of SecondaryKey -> count information is stored.

For example, for an inverted index, PrimaryKey = term_id, SecondaryKey = doc_id. For a forward_index, PrimaryKey = doc_id, SecondaryKey = term_id.
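The mapping described above can be illustrated with a minimal sketch. The class `toy_postings` below is a hypothetical, simplified stand-in written for this example only; it is not MeTA's implementation, and it stores counts in a key-sorted std::vector rather than util::sparse_vector. It mirrors only the member names listed on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for postings_data (not MeTA's code).
template <class PrimaryKey, class SecondaryKey>
class toy_postings
{
  public:
    using pair_t = std::pair<SecondaryKey, double>;
    using count_t = std::vector<pair_t>;

    explicit toy_postings(PrimaryKey p_id) : p_id_(p_id) {}

    // Adds `amount` to the count for s_id, inserting the key if it is absent.
    void increase_count(SecondaryKey s_id, double amount)
    {
        auto it = std::lower_bound(counts_.begin(), counts_.end(), s_id,
                                   key_less);
        if (it != counts_.end() && it->first == s_id)
            it->second += amount;
        else
            counts_.emplace(it, s_id, amount);
    }

    // Returns the stored count for s_id, or 0.0 if the key was never seen.
    double count(SecondaryKey s_id) const
    {
        auto it = std::lower_bound(counts_.begin(), counts_.end(), s_id,
                                   key_less);
        return (it != counts_.end() && it->first == s_id) ? it->second : 0.0;
    }

    const count_t& counts() const { return counts_; }
    PrimaryKey primary_key() const { return p_id_; }

    // Number of distinct SecondaryKeys stored for this PrimaryKey.
    std::uint64_t inverse_frequency() const { return counts_.size(); }

  private:
    // Orders a stored (key, count) entry against a bare key.
    static bool key_less(const pair_t& entry, const SecondaryKey& key)
    {
        return entry.first < key;
    }

    PrimaryKey p_id_;
    count_t counts_; // kept sorted by SecondaryKey
};

// Inverted-index view: term 42 occurs in documents 3 and 7.
inline double example_term_count()
{
    toy_postings<std::uint64_t, std::uint64_t> term(42);
    term.increase_count(7, 1.0);
    term.increase_count(3, 2.0);
    term.increase_count(7, 2.0);
    assert(term.inverse_frequency() == 2);
    return term.count(7);
}
```

For a forward index the same structure would be keyed the other way round: the PrimaryKey is a doc_id and each entry is a (term_id, count) pair.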

Constructor & Destructor Documentation

template<class PrimaryKey , class SecondaryKey >
meta::index::postings_data< PrimaryKey, SecondaryKey >::postings_data ( )
default

PrimaryKeys may only be integral types or strings; SecondaryKeys may only be integral types.

uint64_t and double must take up the same number of bytes, since they are cast to each other when compressing. postings_data is default-constructible.
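The size requirement can be checked at compile time. The sketch below shows one way to reinterpret a double as a uint64_t and back without losing bits. The helper names `double_to_bits` and `bits_to_double` are hypothetical, and whether MeTA converts values in exactly this way is an assumption; only the same-size requirement comes from the documentation above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Fails to compile on a platform where the two types differ in size.
static_assert(sizeof(std::uint64_t) == sizeof(double),
              "uint64_t and double must have the same size");

// Hypothetical helper: copies the bit pattern of a double into a uint64_t.
inline std::uint64_t double_to_bits(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Hypothetical helper: the inverse of double_to_bits.
inline double bits_to_double(std::uint64_t bits)
{
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Checks that a (non-NaN) value survives the round trip unchanged.
inline void check_round_trip(double value)
{
    assert(bits_to_double(double_to_bits(value)) == value);
}
```

std::memcpy is used here because it is a well-defined way to copy an object representation between two types of the same size.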

template<class PrimaryKey , class SecondaryKey >
meta::index::postings_data< PrimaryKey, SecondaryKey >::postings_data ( PrimaryKey  p_id)

Creates an empty postings_data for a given PrimaryKey.

Parameters
p_id  The PrimaryKey to be associated with this postings_data

Member Function Documentation

template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::merge_with ( postings_data< PrimaryKey, SecondaryKey > &  other)
Adds the parameter's data to this object's data.

Parameters
other  The other postings_data object to consume
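As an illustration of what adding one object's data to another can look like, the sketch below merges two key-sorted count lists. The free function `merge_counts` is hypothetical and not part of MeTA, and summing the counts of keys present in both lists is an assumption about the intended semantics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using pair_t = std::pair<std::uint64_t, double>;
using count_t = std::vector<pair_t>;

// Hypothetical helper: merges two count lists that are sorted by key.
// Counts for keys present in both inputs are summed (an assumption).
inline count_t merge_counts(const count_t& lhs, const count_t& rhs)
{
    assert(std::is_sorted(lhs.begin(), lhs.end()));
    assert(std::is_sorted(rhs.begin(), rhs.end()));

    count_t merged;
    merged.reserve(lhs.size() + rhs.size());

    std::size_t i = 0;
    std::size_t j = 0;
    while (i < lhs.size() && j < rhs.size())
    {
        if (lhs[i].first < rhs[j].first)
            merged.push_back(lhs[i++]);
        else if (rhs[j].first < lhs[i].first)
            merged.push_back(rhs[j++]);
        else
        {
            // Same key on both sides: keep one entry with the summed count.
            merged.emplace_back(lhs[i].first, lhs[i].second + rhs[j].second);
            ++i;
            ++j;
        }
    }
    // Append whatever remains of the longer input.
    for (; i < lhs.size(); ++i)
        merged.push_back(lhs[i]);
    for (; j < rhs.size(); ++j)
        merged.push_back(rhs[j]);
    return merged;
}
```

Because both inputs are already sorted, the merge runs in a single linear pass and the result stays sorted by key.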
template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::increase_count ( SecondaryKey  s_id,
double  amount 
)
Parameters
s_id  The id of the SecondaryKey to add counts for
amount  The amount by which to increase the count for the given SecondaryKey
template<class PrimaryKey , class SecondaryKey >
double meta::index::postings_data< PrimaryKey, SecondaryKey >::count ( SecondaryKey  s_id) const
Parameters
s_id  The SecondaryKey id to query
Returns
the number of times SecondaryKey occurred in this postings_data
template<class PrimaryKey , class SecondaryKey >
const std::vector< std::pair< SecondaryKey, double > > & meta::index::postings_data< PrimaryKey, SecondaryKey >::counts ( ) const
Returns
the per-SecondaryKey frequency information for this PrimaryKey
template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::set_counts ( const count_t &  counts)
Parameters
counts  The (SecondaryKey, count) pairs to assign into this postings_data
template<class PrimaryKey , class SecondaryKey >
bool meta::index::postings_data< PrimaryKey, SecondaryKey >::operator< ( const postings_data< PrimaryKey, SecondaryKey > &  other) const
Parameters
other  The postings_data to compare with
Returns
whether this postings_data is less than (has a smaller PrimaryKey than) the parameter
template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::write_compressed ( io::compressed_file_writer &  writer) const

Writes this postings_data to a compressed file.

This function assumes that the mapping for the compressed file has already been set and that the file is already positioned at the correct location.

Parameters
writer  The compressed file to write to
template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::read_compressed ( io::compressed_file_reader &  reader)

Reads compressed postings_data into this object.

This function assumes that the mapping for the compressed file has already been set and that the file is already positioned at the correct location.

Parameters
reader  The compressed file to read from
template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::write_libsvm ( std::ofstream &  out) const
inline
Parameters
out  The output stream to write to
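For orientation, the libsvm text format stores one record per line as a label followed by space-separated index:value pairs. The sketch below writes such a line from a list of counts. The function `write_libsvm_line` is hypothetical; treating the PrimaryKey as the label and shifting indices to be 1-based are assumptions made for this example, not confirmed details of write_libsvm.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using pair_t = std::pair<std::uint64_t, double>;
using count_t = std::vector<pair_t>;

// Hypothetical helper: writes "<label> <index>:<value> ..." plus a newline.
// Indices are shifted by one so that they start at 1 (an assumption).
inline void write_libsvm_line(std::ostream& out, std::uint64_t label,
                              const count_t& counts)
{
    out << label;
    for (const auto& entry : counts)
        out << ' ' << (entry.first + 1) << ':' << entry.second;
    out << '\n';
}

// Formats a single record into a string for inspection.
inline std::string libsvm_example()
{
    count_t counts;
    counts.emplace_back(0, 2.0);
    counts.emplace_back(5, 1.5);

    std::ostringstream oss;
    write_libsvm_line(oss, 3, counts);
    assert(!oss.fail());
    return oss.str();
}
```

The sketch takes a std::ostream rather than a std::ofstream so that it can be exercised against an in-memory stream.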
template<class PrimaryKey , class SecondaryKey >
PrimaryKey meta::index::postings_data< PrimaryKey, SecondaryKey >::primary_key ( ) const
Returns
the PrimaryKey for this postings_data (the term_id in an inverted index)
template<class PrimaryKey , class SecondaryKey >
void meta::index::postings_data< PrimaryKey, SecondaryKey >::set_primary_key ( PrimaryKey  new_key)
Parameters
new_key  The new PrimaryKey for this postings_data
template<class PrimaryKey , class SecondaryKey >
uint64_t meta::index::postings_data< PrimaryKey, SecondaryKey >::inverse_frequency ( ) const
Returns
the number of SecondaryKeys that this PrimaryKey occurs with
template<class PrimaryKey , class SecondaryKey >
uint64_t meta::index::postings_data< PrimaryKey, SecondaryKey >::bytes_used ( ) const
Returns
the number of bytes used for this postings_data

Friends And Related Function Documentation

template<class PrimaryKey , class SecondaryKey >
void stream_helper ( io::compressed_file_reader &  in,
postings_data< PrimaryKey, SecondaryKey > &  pd 
)
friend

Helper function used by istream operator.

Parameters
in  The stream to read from
pd  The postings data object to write the stream info to
template<class PrimaryKey , class SecondaryKey >
io::compressed_file_reader& operator>> ( io::compressed_file_reader &  in,
postings_data< PrimaryKey, SecondaryKey > &  pd 
)
friend

Reads semi-compressed postings data from a compressed file.

Parameters
in  The stream to read from
pd  The postings data object to write the stream info to
Returns
the input stream
template<class PrimaryKey , class SecondaryKey >
io::compressed_file_writer& operator<< ( io::compressed_file_writer &  out,
const postings_data< PrimaryKey, SecondaryKey > &  pd 
)
friend

Writes semi-compressed postings data to a compressed file.

Parameters
out  The stream to write to
pd  The postings data object to write to the stream
Returns
the output stream

The documentation for this class was generated from the following files: