espnet2.mt package
espnet2.mt.espnet_model
-
class espnet2.mt.espnet_model.ESPnetMTModel(vocab_size: int, token_list: Union[Tuple[str, ...], List[str]], frontend: Optional[espnet2.asr.frontend.abs_frontend.AbsFrontend], preencoder: Optional[espnet2.asr.preencoder.abs_preencoder.AbsPreEncoder], encoder: espnet2.asr.encoder.abs_encoder.AbsEncoder, postencoder: Optional[espnet2.asr.postencoder.abs_postencoder.AbsPostEncoder], decoder: espnet2.asr.decoder.abs_decoder.AbsDecoder, src_vocab_size: int = 0, src_token_list: Union[Tuple[str, ...], List[str]] = [], ignore_id: int = -1, lsm_weight: float = 0.0, length_normalized_loss: bool = False, report_bleu: bool = True, sym_space: str = '<space>', sym_blank: str = '<blank>', extract_feats_in_collect_stats: bool = True, share_decoder_input_output_embed: bool = False, share_encoder_decoder_input_embed: bool = False)
Bases: espnet2.train.abs_espnet_model.AbsESPnetModel
Encoder-decoder model for machine translation (MT).
-
collect_feats(text: torch.Tensor, text_lengths: torch.Tensor, src_text: torch.Tensor, src_text_lengths: torch.Tensor, **kwargs) → Dict[str, torch.Tensor]
-
encode(src_text: torch.Tensor, src_text_lengths: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]
Frontend + Encoder. Note that this method is used by mt_inference.py.
- Parameters:
src_text – (Batch, Length, …)
src_text_lengths – (Batch,)
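The (Batch,) lengths argument indicates which positions of the padded (Batch, Length, …) input hold real tokens. The following is a minimal pure-Python sketch of that idea, using nested lists in place of torch.Tensor. The helper name make_pad_mask is illustrative here, and the code is not claimed to match ESPnet's internal implementation.

```python
from typing import List


def make_pad_mask(lengths: List[int]) -> List[List[bool]]:
    """Return a (Batch, MaxLength) mask that is True at padded positions."""
    max_len = max(lengths)
    return [[pos >= length for pos in range(max_len)] for length in lengths]


# Two source sentences with 3 and 1 valid tokens respectively.
src_text_lengths = [3, 1]
mask = make_pad_mask(src_text_lengths)
print(mask)  # [[False, False, False], [False, True, True]]
```

An encoder can use such a mask to keep padded positions from contributing to its output.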
-
forward(text: torch.Tensor, text_lengths: torch.Tensor, src_text: torch.Tensor, src_text_lengths: torch.Tensor, **kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor], torch.Tensor]
Frontend + Encoder + Decoder + loss calculation.
- Parameters:
text – (Batch, Length)
text_lengths – (Batch,)
src_text – (Batch, Length)
src_text_lengths – (Batch,)
kwargs – additional keyword arguments; “utt_id” is among the inputs.
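To make the shape conventions above concrete, here is a small sketch that turns variable-length token ID sequences into the padded (Batch, Length) arrays and (Batch,) lengths that forward expects. It uses plain lists rather than torch.Tensor. The helper pad_batch and the source padding value SRC_PAD_ID are assumptions made for illustration; only the target padding value follows the default ignore_id = -1 from the constructor signature.

```python
from typing import List, Tuple

IGNORE_ID = -1  # default ignore_id in the ESPnetMTModel signature
SRC_PAD_ID = 0  # assumption for illustration only


def pad_batch(seqs: List[List[int]], pad_value: int) -> Tuple[List[List[int]], List[int]]:
    """Right-pad every sequence to the longest one; also return true lengths."""
    lengths = [len(s) for s in seqs]
    max_len = max(lengths)
    padded = [s + [pad_value] * (max_len - len(s)) for s in seqs]
    return padded, lengths


# Hypothetical token IDs for a batch of two sentence pairs.
text, text_lengths = pad_batch([[5, 6, 7], [8, 9]], IGNORE_ID)
src_text, src_text_lengths = pad_batch([[11, 12], [13, 14, 15, 16]], SRC_PAD_ID)

print(text, text_lengths)          # [[5, 6, 7], [8, 9, -1]] [3, 2]
print(src_text, src_text_lengths)  # [[11, 12, 0, 0], [13, 14, 15, 16]] [2, 4]
```

In real use these values would be integer tensors passed to forward as the keyword arguments text, text_lengths, src_text, and src_text_lengths.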
-
espnet2.mt.__init__
espnet2.mt.frontend.embedding
Embedding frontend for text-based inputs.
-
class espnet2.mt.frontend.embedding.Embedding(input_size: int = 400, embed_dim: int = 400, pos_enc_class=<class 'espnet.nets.pytorch_backend.transformer.embedding.PositionalEncoding'>, positional_dropout_rate: float = 0.1)
Bases: espnet2.asr.frontend.abs_frontend.AbsFrontend
Embedding frontend for text-based inputs.
Initialize.
- Parameters:
input_size – Number of input tokens (vocabulary size).
embed_dim – Embedding dimension.
pos_enc_class – Positional encoding class: PositionalEncoding or ScaledPositionalEncoding.
positional_dropout_rate – Dropout rate applied after adding the positional encoding.
-
forward(input: torch.Tensor, input_lengths: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]
Embed the input tokens and add the positional encoding.
- Parameters:
input – Input (B, T) or (B, T, D), with feature dimension D.
input_lengths – Input lengths within batch.
- Returns:
Output with dimensions (B, T, D), and the output lengths within the batch.
- Return type:
Tuple[Tensor, Tensor]
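As a rough illustration of what an embedding frontend with positional encoding computes, the sketch below looks up one vector per token ID and adds a standard sinusoidal positional encoding. It is written in pure Python for a single sequence (T,) → (T, D), with no batch dimension and no dropout. The function names, the toy embedding table, and the sqrt(D) scaling of the embeddings are assumptions for illustration, not a description of ESPnet's exact implementation.

```python
import math
from typing import List


def sinusoidal_encoding(length: int, dim: int) -> List[List[float]]:
    """Standard sinusoidal positional encoding of shape (T, D); dim must be even."""
    pe = []
    for pos in range(length):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        pe.append(row)
    return pe


def embed(token_ids: List[int], table: List[List[float]]) -> List[List[float]]:
    """Look up each token's vector, scale it, and add the positional encoding."""
    dim = len(table[0])
    pe = sinusoidal_encoding(len(token_ids), dim)
    scale = math.sqrt(dim)  # assumption: embeddings are scaled by sqrt(D)
    return [
        [table[tok][d] * scale + pe[t][d] for d in range(dim)]
        for t, tok in enumerate(token_ids)
    ]


# Hypothetical 3-token vocabulary with embedding dimension D = 4.
table = [[0.0] * 4, [0.1] * 4, [0.2] * 4]
out = embed([2, 0, 1], table)
print(len(out), len(out[0]))  # 3 4
```

The output keeps the input length T, which is consistent with forward returning the input lengths unchanged alongside the (B, T, D) output.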