ModErn Text Analysis
META Enumerates Textual Applications
meta::corpus::file_corpus Class Reference

Creates document objects from individual files, each representing a single document.

#include <file_corpus.h>

Inherits meta::corpus::corpus.

Public Member Functions

 file_corpus (const std::string &prefix, const std::string &doc_list, std::string encoding)
 
bool has_next () const override
 
document next () override
 
uint64_t size () const override
 
Public Member Functions inherited from meta::corpus::corpus
 corpus (std::string encoding)
 Constructs a new corpus with the given encoding.
 
virtual ~corpus ()=default
 Destructor.
 
const std::string & encoding () const
 

Private Attributes

uint64_t cur_
 Index of the current document
 
std::string prefix_
 the path to all the documents
 
std::vector< std::pair< std::string, class_label > > docs_
 Path and class label of each document
 

Additional Inherited Members

Static Public Member Functions inherited from meta::corpus::corpus
static std::unique_ptr< corpus > load (const std::string &config_file)
 

Detailed Description

Creates document objects from individual files, each representing a single document.

Constructor & Destructor Documentation

meta::corpus::file_corpus::file_corpus ( const std::string & prefix, const std::string & doc_list, std::string encoding )
Parameters
prefix    The path where the files are located
doc_list  A file listing the path to each document in the corpus
encoding  The encoding of the corpus
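
The relationship between prefix, doc_list, and the docs_ attribute can be illustrated with a small sketch. This is not MeTA's implementation: the one-entry-per-line "label path" layout, the "/" used to join prefix and path, the read_doc_list helper, and the std::string stand-in for class_label are all assumptions made for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in; MeTA defines its own class_label type.
using class_label = std::string;

// Reads a document list, assuming one "<label> <path>" entry per line,
// and returns (full path, label) pairs in the same order as docs_ stores
// them. Lines that lack either field (including blank lines) are skipped.
// Paths containing whitespace are not handled by this sketch.
std::vector<std::pair<std::string, class_label>>
read_doc_list(std::istream& in, const std::string& prefix)
{
    std::vector<std::pair<std::string, class_label>> docs;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields{line};
        class_label label;
        std::string path;
        if (fields >> label >> path)
            docs.emplace_back(prefix + "/" + path, label); // assumed join rule
    }
    return docs;
}
```

Taking a std::istream rather than a file name keeps the sketch easy to exercise with a std::istringstream; a real caller would pass a std::ifstream opened on doc_list.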

Member Function Documentation

bool meta::corpus::file_corpus::has_next ( ) const
override virtual
Returns
whether there is another document in this corpus

Implements meta::corpus::corpus.

document meta::corpus::file_corpus::next ( )
override virtual
Returns
the next document from this corpus

Implements meta::corpus::corpus.

uint64_t meta::corpus::file_corpus::size ( ) const
override virtual
Returns
the number of documents in this corpus

Implements meta::corpus::corpus.
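
The three members above suggest a simple forward-iteration protocol: check has_next(), then call next(), while size() stays constant. The sketch below shows that pattern with a self-contained stand-in. The list_corpus class, the two-field document struct, the std::string class_label, and the count_documents helper are hypothetical and only mirror the documented signatures; they are not MeTA's types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; MeTA's own document and class_label are richer.
using class_label = std::string;
struct document
{
    std::string path;
    class_label label;
};

// Stand-in exposing the same three members as file_corpus.
class list_corpus
{
  public:
    explicit list_corpus(std::vector<std::pair<std::string, class_label>> docs)
        : cur_{0}, docs_{std::move(docs)}
    {
    }

    // True while at least one document has not been returned yet.
    bool has_next() const { return cur_ < docs_.size(); }

    // Returns the current document and advances the cursor.
    document next()
    {
        assert(has_next()); // in this sketch, reading past the end is a caller error
        const auto& entry = docs_[cur_++];
        return {entry.first, entry.second};
    }

    // Total number of documents, independent of the cursor position.
    std::uint64_t size() const { return docs_.size(); }

  private:
    std::uint64_t cur_;
    std::vector<std::pair<std::string, class_label>> docs_;
};

// Drains any object with has_next()/next() and counts what it produced.
template <class Corpus>
std::uint64_t count_documents(Corpus& corpus)
{
    std::uint64_t seen = 0;
    while (corpus.has_next())
    {
        corpus.next();
        ++seen;
    }
    return seen;
}
```

Because count_documents is a template over the has_next()/next() shape, the same loop should apply to any corpus-like type that offers those two members.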


The documentation for this class was generated from the following files: