espnet2.text package

espnet2.text.build_tokenizer

espnet2.text.build_tokenizer.build_tokenizer(token_type: str, bpemodel: Union[pathlib.Path, str, Iterable[str]] = None, non_linguistic_symbols: Union[pathlib.Path, str, Iterable[str]] = None, remove_non_linguistic_symbols: bool = False, space_symbol: str = '<space>', delimiter: str = None, g2p_type: str = None, nonsplit_symbol: Iterable[str] = None, encode_kwargs: Dict = None) → espnet2.text.abs_tokenizer.AbsTokenizer[source]

A helper function to instantiate a tokenizer.
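
Example (a minimal sketch for the character-level case; other token_type values such as "bpe" or "word" need their respective extra arguments, e.g. bpemodel):

>>> from espnet2.text.build_tokenizer import build_tokenizer
>>> tokenizer = build_tokenizer("char")
>>> tokenizer.text2tokens("Hi a")
['H', 'i', '<space>', 'a']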

espnet2.text.cleaner

class espnet2.text.cleaner.TextCleaner(cleaner_types: Collection[str] = None)[source]

Bases: object

Text cleaner.

Examples

>>> cleaner = TextCleaner("tacotron")
>>> cleaner("(Hello-World);   &  jr. & dr.")
'HELLO WORLD, AND JUNIOR AND DOCTOR'

espnet2.text.whisper_token_id_converter

class espnet2.text.whisper_token_id_converter.OpenAIWhisperTokenIDConverter(model_type: str = 'whisper_multilingual')[source]

Bases: object

get_num_vocabulary_size() → int[source]
ids2tokens(integers: Union[numpy.ndarray, Iterable[int]]) → List[str][source]
tokens2ids(tokens: Iterable[str]) → List[int][source]
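
Example (a sketch; assumes the Whisper tokenizer dependency is installed, and the IDs shown are placeholders whose meaning depends on the Whisper vocabulary):

>>> from espnet2.text.whisper_token_id_converter import OpenAIWhisperTokenIDConverter
>>> converter = OpenAIWhisperTokenIDConverter("whisper_multilingual")
>>> n = converter.get_num_vocabulary_size()
>>> tokens = converter.ids2tokens([0, 1, 2])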

espnet2.text.phoneme_tokenizer

class espnet2.text.phoneme_tokenizer.G2p_en(no_space: bool = False)[source]

Bases: object

A wrapper of g2p_en.G2p.

g2p_en.G2p isn't picklable, so it can't be copied to other processes via the multiprocessing module. As a workaround, the underlying g2p_en.G2p is instantiated when this class is called.
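
Example (output is indicative, taken from the g2p_en/CMUdict phone set; G2pk and IsG2p below follow the same callable pattern):

>>> from espnet2.text.phoneme_tokenizer import G2p_en
>>> g2p = G2p_en(no_space=True)
>>> g2p("hello")
['HH', 'AH0', 'L', 'OW1']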

class espnet2.text.phoneme_tokenizer.G2pk(descritive=False, group_vowels=False, to_syl=False, no_space=False, explicit_space=False, space_symbol='<space>')[source]

Bases: object

A wrapper of g2pk.G2p.

g2pk.G2p isn't picklable, so it can't be copied to other processes via the multiprocessing module. As a workaround, the underlying g2pk.G2p is instantiated when this class is called.

class espnet2.text.phoneme_tokenizer.IsG2p(dialect: str = 'standard', syllabify: bool = True, word_sep: str = ', ', use_dict: bool = True)[source]

Bases: object

Minimal wrapper for https://github.com/grammatek/ice-g2p

The g2p module uses a Bi-LSTM model along with a pronunciation dictionary to generate the phonemization. Unfortunately, it does not yet support multi-threaded phonemization.

class espnet2.text.phoneme_tokenizer.Jaso(space_symbol=' ', no_space=False)[source]

Bases: object

JAMO_LEADS = 'ᄀᄁᄂᄃᄄᄅᄆᄇᄈᄉᄊᄋᄌᄍᄎᄏᄐᄑᄒ'
JAMO_TAILS = 'ᆨᆩᆪᆫᆬᆭᆮᆯᆰᆱᆲᆳᆴᆵᆶᆷᆸᆹᆺᆻᆼᆽᆾᆿᇀᇁᇂ'
JAMO_VOWELS = 'ᅡᅢᅣᅤᅥᅦᅧᅨᅩᅪᅫᅬᅭᅮᅯᅰᅱᅲᅳᅴᅵ'
PUNC = "!'(),-.:;?"
SPACE = ' '
VALID_CHARS = "ᄀᄁᄂᄃᄄᄅᄆᄇᄈᄉᄊᄋᄌᄍᄎᄏᄐᄑ하ᅢᅣᅤᅥᅦᅧᅨᅩᅪᅫᅬᅭᅮᅯᅰᅱᅲᅳᅴᅵᆨᆩᆪᆫᆬᆭᆮᆯᆰᆱᆲᆳᆴᆵᆶᆷᆸᆹᆺᆻᆼᆽᆾᆿᇀᇁᇂ!'(),-.:;? "
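
Example (a sketch, assuming the class is invoked as a callable like the other g2p wrappers in this module; it decomposes Hangul syllables into the conjoining jamo listed above):

>>> from espnet2.text.phoneme_tokenizer import Jaso
>>> jaso = Jaso(space_symbol="<space>", no_space=False)
>>> tokens = jaso("안녕")  # one conjoining jamo character per element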

class espnet2.text.phoneme_tokenizer.PhonemeTokenizer(g2p_type: Union[None, str], non_linguistic_symbols: Union[pathlib.Path, str, Iterable[str]] = None, space_symbol: str = '<space>', remove_non_linguistic_symbols: bool = False)[source]

Bases: espnet2.text.abs_tokenizer.AbsTokenizer

text2tokens(line: str) → List[str][source]
text2tokens_svs(syllable: str) → List[str][source]
tokens2text(tokens: Iterable[str]) → str[source]
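
Example (a minimal sketch; g2p_type selects one of the g2p functions and wrapper classes in this module, here the g2p_en backend):

>>> from espnet2.text.phoneme_tokenizer import PhonemeTokenizer
>>> tokenizer = PhonemeTokenizer(g2p_type="g2p_en")
>>> tokenizer.text2tokens("hello")
['HH', 'AH0', 'L', 'OW1']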

class espnet2.text.phoneme_tokenizer.Phonemizer(backend, word_separator: Optional[str] = None, syllable_separator: Optional[str] = None, phone_separator: Optional[str] = ' ', strip=False, split_by_single_token: bool = False, **phonemizer_kwargs)[source]

Bases: object

Phonemizer module for various languages.

This is a wrapper module for https://github.com/bootphon/phonemizer. You can define various g2p modules by specifying options for the phonemizer.

See available options:

https://github.com/bootphon/phonemizer/blob/master/phonemizer/phonemize.py#L32
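
Example (a sketch, assuming the espeak backend of phonemizer is installed; backend options such as language are passed through phonemizer_kwargs):

>>> from espnet2.text.phoneme_tokenizer import Phonemizer
>>> g2p = Phonemizer(backend="espeak", language="en-us")
>>> phones = g2p("hello world")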

espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p(text) → List[str][source]
espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p_accent(text) → List[str][source]
espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p_accent_with_pause(text) → List[str][source]
espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p_kana(text) → List[str][source]
espnet2.text.phoneme_tokenizer.pyopenjtalk_g2p_prosody(text: str, drop_unvoiced_vowels: bool = True) → List[str][source]

Extract a phoneme + prosody symbol sequence from input full-context labels.

The algorithm is based on "Prosodic features control by symbols as input of sequence-to-sequence acoustic modeling for neural TTS", with some of r9y9's tweaks.

Parameters:
  • text (str) – Input text.

  • drop_unvoiced_vowels (bool) – Whether to drop unvoiced vowels.

Returns:
  List of phoneme + prosody symbols.

Return type:
  List[str]

Examples

>>> from espnet2.text.phoneme_tokenizer import pyopenjtalk_g2p_prosody
>>> pyopenjtalk_g2p_prosody("こんにちは。")
['^', 'k', 'o', '[', 'N', 'n', 'i', 'ch', 'i', 'w', 'a', '$']

espnet2.text.phoneme_tokenizer.pypinyin_g2p(text) → List[str][source]
espnet2.text.phoneme_tokenizer.pypinyin_g2p_phone(text) → List[str][source]
espnet2.text.phoneme_tokenizer.pypinyin_g2p_phone_without_prosody(text) → List[str][source]
espnet2.text.phoneme_tokenizer.split_by_space(text) → List[str][source]
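
Example for split_by_space (the pypinyin_g2p variants follow the same call pattern but require the pypinyin dependency):

>>> from espnet2.text.phoneme_tokenizer import split_by_space
>>> split_by_space("a b c")
['a', 'b', 'c']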

espnet2.text.hugging_face_tokenizer

class espnet2.text.hugging_face_tokenizer.HuggingFaceTokenizer(model: Union[pathlib.Path, str])[source]

Bases: espnet2.text.abs_tokenizer.AbsTokenizer

text2tokens(line: str) → List[str][source]
tokens2text(tokens: Iterable[str]) → str[source]
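
Example (a sketch; "bert-base-uncased" is just an illustrative Hugging Face model name, and the transformers dependency must be installed):

>>> from espnet2.text.hugging_face_tokenizer import HuggingFaceTokenizer
>>> tokenizer = HuggingFaceTokenizer("bert-base-uncased")
>>> tokens = tokenizer.text2tokens("hello world")
>>> text = tokenizer.tokens2text(tokens)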

espnet2.text.word_tokenizer

class espnet2.text.word_tokenizer.WordTokenizer(delimiter: str = None, non_linguistic_symbols: Union[pathlib.Path, str, Iterable[str]] = None, remove_non_linguistic_symbols: bool = False)[source]

Bases: espnet2.text.abs_tokenizer.AbsTokenizer

text2tokens(line: str) → List[str][source]
tokens2text(tokens: Iterable[str]) → str[source]
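
Example (with the default whitespace delimiter):

>>> from espnet2.text.word_tokenizer import WordTokenizer
>>> tokenizer = WordTokenizer()
>>> tokenizer.text2tokens("Hello World!!")
['Hello', 'World!!']
>>> tokenizer.tokens2text(['Hello', 'World!!'])
'Hello World!!'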

espnet2.text.token_id_converter

class espnet2.text.token_id_converter.TokenIDConverter(token_list: Union[pathlib.Path, str, Iterable[str]], unk_symbol: str = '<unk>')[source]

Bases: object

get_num_vocabulary_size() → int[source]
ids2tokens(integers: Union[numpy.ndarray, Iterable[int]]) → List[str][source]
tokens2ids(tokens: Iterable[str]) → List[int][source]
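
Example (tokens outside the token list map to the ID of unk_symbol, which must itself appear in the list):

>>> from espnet2.text.token_id_converter import TokenIDConverter
>>> converter = TokenIDConverter(["<blank>", "a", "b", "<unk>"])
>>> converter.tokens2ids(["a", "b", "oov"])
[1, 2, 3]
>>> converter.ids2tokens([1, 2])
['a', 'b']
>>> converter.get_num_vocabulary_size()
4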

espnet2.text.korean_cleaner

class espnet2.text.korean_cleaner.KoreanCleaner[source]

Bases: object

classmethod normalize_text(text)[source]
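
Example (a sketch; normalize_text is a classmethod and, by assumption here, spells out digits and Latin characters in Hangul):

>>> from espnet2.text.korean_cleaner import KoreanCleaner
>>> normalized = KoreanCleaner.normalize_text("abc 123")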

espnet2.text.char_tokenizer

class espnet2.text.char_tokenizer.CharTokenizer(non_linguistic_symbols: Union[pathlib.Path, str, Iterable[str]] = None, space_symbol: str = '<space>', remove_non_linguistic_symbols: bool = False, nonsplit_symbols: Iterable[str] = None)[source]

Bases: espnet2.text.abs_tokenizer.AbsTokenizer

text2tokens(line: str) → List[str][source]
tokens2text(tokens: Iterable[str]) → str[source]
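
Example (spaces are mapped to space_symbol and back):

>>> from espnet2.text.char_tokenizer import CharTokenizer
>>> tokenizer = CharTokenizer()
>>> tokenizer.text2tokens("Hi a")
['H', 'i', '<space>', 'a']
>>> tokenizer.tokens2text(['H', 'i', '<space>', 'a'])
'Hi a'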

espnet2.text.abs_tokenizer

class espnet2.text.abs_tokenizer.AbsTokenizer[source]

Bases: abc.ABC

abstract text2tokens(line: str) → List[str][source]
abstract tokens2text(tokens: Iterable[str]) → str[source]
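
Example (a hypothetical minimal subclass; a concrete tokenizer only needs to implement these two methods):

>>> from espnet2.text.abs_tokenizer import AbsTokenizer
>>> class MyTokenizer(AbsTokenizer):
...     def text2tokens(self, line):
...         return line.split()
...     def tokens2text(self, tokens):
...         return " ".join(tokens)
>>> MyTokenizer().text2tokens("a b")
['a', 'b']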

espnet2.text.__init__

espnet2.text.sentencepiece_tokenizer

class espnet2.text.sentencepiece_tokenizer.SentencepiecesTokenizer(model: Union[pathlib.Path, str], encode_kwargs: Dict = {})[source]

Bases: espnet2.text.abs_tokenizer.AbsTokenizer

text2tokens(line: str) → List[str][source]
tokens2text(tokens: Iterable[str]) → str[source]
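
Example (a sketch; "bpe.model" is a placeholder path to a trained SentencePiece model):

>>> from espnet2.text.sentencepiece_tokenizer import SentencepiecesTokenizer
>>> tokenizer = SentencepiecesTokenizer("bpe.model")
>>> tokens = tokenizer.text2tokens("hello world")
>>> text = tokenizer.tokens2text(tokens)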

espnet2.text.whisper_tokenizer

class espnet2.text.whisper_tokenizer.OpenAIWhisperTokenizer(model_type: str)[source]

Bases: espnet2.text.abs_tokenizer.AbsTokenizer

text2tokens(line: str) → List[str][source]
tokens2text(tokens: Iterable[str]) → str[source]
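
Example (a sketch; the model_type value mirrors the ID converter above and requires the Whisper tokenizer dependency):

>>> from espnet2.text.whisper_tokenizer import OpenAIWhisperTokenizer
>>> tokenizer = OpenAIWhisperTokenizer("whisper_multilingual")
>>> tokens = tokenizer.text2tokens("hello")
>>> text = tokenizer.tokens2text(tokens)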