ModErn Text Analysis
META Enumerates Textual Applications
utf.cpp File Reference
#include <array>
#include <stdexcept>
#include <unicode/brkiter.h>
#include <unicode/uchar.h>
#include <unicode/uclean.h>
#include <unicode/unistr.h>
#include <unicode/translit.h>
#include "util/pimpl.tcc"
#include "utf/utf.h"
#include "detail.h"

Namespaces

 meta
 The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
 
 meta::utf
 Functions for converting to and from various character sets.
 

Functions

std::string meta::utf::to_utf8 (const std::string &str, const std::string &charset)
 Converts a string from the given charset to UTF-8.
 
std::u16string meta::utf::to_utf16 (const std::string &str, const std::string &charset)
 Converts a string from the given charset to UTF-16.
 
std::string meta::utf::to_utf8 (const std::u16string &str)
 Converts a string from UTF-16 to UTF-8.
 
std::u16string meta::utf::to_utf16 (const std::string &str)
 Converts a string from UTF-8 to UTF-16.
 
std::string meta::utf::tolower (const std::string &str)
 Lowercases a UTF-8 string.
 
std::string meta::utf::toupper (const std::string &str)
 Uppercases a UTF-8 string.
 
std::string meta::utf::foldcase (const std::string &str)
 Folds the case of a UTF-8 string.
 
std::string meta::utf::remove_if (const std::string &str, std::function< bool(uint32_t)> pred)
 Removes the code points (passed to the predicate as UTF-32 values) for which the given function returns true.
 
bool meta::utf::isalpha (uint32_t codepoint)
 
bool meta::utf::isblank (uint32_t codepoint)
 
uint64_t meta::utf::length (const std::string &str)
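
Several of the functions listed above operate on code points rather than bytes. The following is a minimal standalone sketch of that idea, using only the standard library. It is not MeTA's implementation (the include list suggests the real code delegates to ICU). The names sequence_length, decode_at, count_code_points, and remove_code_points_if are hypothetical, the input is assumed to be well-formed UTF-8, and it is an assumption that meta::utf::length counts code points.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical helper (not part of MeTA): byte length of the UTF-8 sequence
// that begins with the given lead byte. Assumes well-formed input.
inline std::size_t sequence_length(unsigned char lead)
{
    if (lead < 0x80)
        return 1; // 0xxxxxxx: ASCII
    if ((lead >> 5) == 0x06)
        return 2; // 110xxxxx
    if ((lead >> 4) == 0x0E)
        return 3; // 1110xxxx
    return 4;     // 11110xxx
}

// Hypothetical helper: decodes the len-byte sequence starting at offset pos.
inline std::uint32_t decode_at(const std::string& str, std::size_t pos,
                               std::size_t len)
{
    auto lead = static_cast<unsigned char>(str[pos]);
    if (len == 1)
        return lead;
    // Keep only the payload bits of the lead byte, then fold in six bits
    // from each continuation byte.
    std::uint32_t cp = lead & (0xFFu >> (len + 1));
    for (std::size_t i = 1; i < len; ++i)
        cp = (cp << 6) | (static_cast<unsigned char>(str[pos + i]) & 0x3Fu);
    return cp;
}

// Counts code points (not bytes) in a UTF-8 string.
inline std::uint64_t count_code_points(const std::string& str)
{
    std::uint64_t count = 0;
    std::size_t pos = 0;
    while (pos < str.size())
    {
        pos += sequence_length(static_cast<unsigned char>(str[pos]));
        ++count;
    }
    return count;
}

// Copies str, skipping every code point for which pred returns true.
inline std::string
remove_code_points_if(const std::string& str,
                      const std::function<bool(std::uint32_t)>& pred)
{
    std::string result;
    std::size_t pos = 0;
    while (pos < str.size())
    {
        std::size_t len
            = sequence_length(static_cast<unsigned char>(str[pos]));
        if (len > str.size() - pos)
            len = str.size() - pos; // clamp a truncated trailing sequence
        if (!pred(decode_at(str, pos, len)))
            result.append(str, pos, len);
        pos += len;
    }
    return result;
}
```

With these helpers, the six-byte string "h\xC3\xA9llo" (the word "héllo") counts as five code points, and a predicate testing for 0xE9 removes the two-byte "é" as a single unit instead of splitting it.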
 

Detailed Description
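
The conversions between UTF-8 and UTF-16 documented in this file can be illustrated with a small standalone sketch. It is not the code in utf.cpp, which judging by its includes relies on ICU. The function names utf8_to_utf16_sketch, utf16_to_utf8_sketch, and append_utf16 are hypothetical, and the input is assumed to be well-formed (no validation or replacement characters).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: appends one code point to a UTF-16 string, using a
// surrogate pair for code points above U+FFFF.
inline void append_utf16(std::u16string& out, std::uint32_t cp)
{
    if (cp < 0x10000)
    {
        out.push_back(static_cast<char16_t>(cp));
        return;
    }
    cp -= 0x10000;
    out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));   // high
    out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF))); // low
}

// Sketch of a UTF-8 to UTF-16 conversion; assumes well-formed input.
inline std::u16string utf8_to_utf16_sketch(const std::string& str)
{
    std::u16string out;
    std::size_t pos = 0;
    while (pos < str.size())
    {
        auto lead = static_cast<unsigned char>(str[pos]);
        std::size_t len = lead < 0x80            ? 1
                          : (lead >> 5) == 0x06 ? 2
                          : (lead >> 4) == 0x0E ? 3
                                                : 4;
        std::uint32_t cp = (len == 1) ? lead : (lead & (0xFFu >> (len + 1)));
        for (std::size_t i = 1; i < len && pos + i < str.size(); ++i)
            cp = (cp << 6)
                 | (static_cast<unsigned char>(str[pos + i]) & 0x3Fu);
        append_utf16(out, cp);
        pos += len;
    }
    return out;
}

// Sketch of the reverse conversion; assumes surrogates are correctly paired.
inline std::string utf16_to_utf8_sketch(const std::u16string& str)
{
    std::string out;
    for (std::size_t i = 0; i < str.size(); ++i)
    {
        std::uint32_t cp = str[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < str.size())
        {
            // Combine a high and low surrogate into one code point.
            std::uint32_t low = str[++i];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        }
        if (cp < 0x80)
        {
            out.push_back(static_cast<char>(cp));
        }
        else if (cp < 0x800)
        {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else if (cp < 0x10000)
        {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else
        {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

The sketch makes the size difference between the two encodings visible: a code point outside the Basic Multilingual Plane occupies four bytes in UTF-8 but two 16-bit units (a surrogate pair) in UTF-16, so string lengths in code units differ between the two forms.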

Author
Chase Geigle