ModErn Text Analysis
META Enumerates Textual Applications
meta::analyzers::tokenizers::character_tokenizer Class Reference

Converts documents into streams of characters.

#include <character_tokenizer.h>

Inherits meta::util::multilevel_clonable< Root, Base, Derived >.

Public Member Functions

 character_tokenizer ()
 Creates a character_tokenizer.
 
void set_content (const std::string &content) override
 Sets the content for the tokenizer.
 
std::string next () override
 Returns the next token in the document; each token is a single character.
 
 operator bool () const override
 Determines if there are more tokens in the document.
 
- Public Member Functions inherited from meta::util::multilevel_clonable< Root, Base, Derived >
virtual std::unique_ptr< Root > clone () const
 Clones the given object.
 

Static Public Attributes

static const std::string id = "character-tokenizer"
 Identifier for this tokenizer.
 

Private Attributes

std::string content_
 The buffered string content for this tokenizer.
 
uint64_t idx_
 Character index into the current buffer.
 

Detailed Description

Converts documents into streams of characters.

This is the simplest tokenizer: every token it produces is a single character of the content.
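
The interface listed above (set_content(), next(), operator bool()) suggests a simple pull-style loop. The following is a minimal, self-contained sketch of that pattern, not MeTA's actual implementation: char_tokenizer_sketch and tokenize_chars are hypothetical names introduced here for illustration. The sketch also assumes that set_content() rewinds the index, that next() throws std::out_of_range once the content is exhausted, and that a "character" is one char (so a multi-byte UTF-8 sequence would be split); none of these details are stated in this reference.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in that mirrors the documented members.
// Illustration only; this is not the MeTA class.
class char_tokenizer_sketch
{
  public:
    // Stores the text to tokenize. Rewinding to the first character
    // is an assumption; the reference only says the content is set.
    void set_content(const std::string& content)
    {
        content_ = content;
        idx_ = 0;
    }

    // Returns the next token, which holds exactly one char.
    // Throwing on exhaustion is an assumption of this sketch.
    std::string next()
    {
        if (!*this)
            throw std::out_of_range{"next() called with no tokens left"};
        return std::string(1, content_[idx_++]);
    }

    // True while there are more tokens in the document.
    operator bool() const
    {
        return idx_ < content_.size();
    }

  private:
    std::string content_;   // buffered string content
    std::uint64_t idx_ = 0; // character index into content_
};

// Hypothetical helper: drains the tokenizer into a vector,
// yielding one token per character of the input.
std::vector<std::string> tokenize_chars(const std::string& text)
{
    char_tokenizer_sketch tok;
    tok.set_content(text);

    std::vector<std::string> tokens;
    while (tok)
        tokens.push_back(tok.next());
    return tokens;
}
```

Under these assumptions, tokenize_chars("abc") yields the three tokens "a", "b", and "c", and an empty string yields no tokens. Testing the tokenizer in the loop condition before each call to next() avoids ever calling next() on an exhausted stream.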

Member Function Documentation

void meta::analyzers::tokenizers::character_tokenizer::set_content (const std::string &content) override

Sets the content for the tokenizer.

Parameters
content  The string content to set

std::string meta::analyzers::tokenizers::character_tokenizer::next () override

Returns
The next token in the document. This token will contain a single character.

The documentation for this class was generated from the following files: