espnet2.iterators package
espnet2.iterators.multiple_iter_factory
espnet2.iterators.chunk_iter_factory
class espnet2.iterators.chunk_iter_factory.ChunkIterFactory(dataset, batch_size: int, batches: Union[espnet2.samplers.abs_sampler.AbsSampler, Sequence[Sequence[Any]]], chunk_length: Union[int, str], chunk_shift_ratio: float = 0.5, num_cache_chunks: int = 1024, num_samples_per_epoch: Optional[int] = None, seed: int = 0, shuffle: bool = False, num_workers: int = 0, collate_fn=None, pin_memory: bool = False, excluded_key_prefixes: Optional[List[str]] = None)

Bases: espnet2.iterators.abs_iter_factory.AbsIterFactory
Creates chunks from a sequence.
Examples
>>> batches = [["id1"], ["id2"], ...]
>>> batch_size = 128
>>> chunk_length = 1000
>>> iter_factory = ChunkIterFactory(
...     dataset,
...     batch_size=batch_size,
...     batches=batches,
...     chunk_length=chunk_length,
... )
>>> it = iter_factory.build_iter(epoch)
>>> for ids, batch in it:
...     ...
The number of mini-batches varies from epoch to epoch, and it cannot be computed in advance because the length information is not given to the IterFactory. For this reason, "num_iters_per_epoch" cannot be implemented for this iterator; "num_samples_per_epoch" is provided instead.
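To illustrate the chunking idea only (this is not the actual ChunkIterFactory implementation), the sketch below cuts one long sequence into fixed-length, overlapping chunks. cut_into_chunks is a hypothetical helper, and the hop of chunk_length * chunk_shift_ratio between chunk starts is an assumption made for the illustration.

import numpy as np

# Hypothetical helper, not espnet2 code: the hop between chunk starts is
# assumed here to be chunk_length * chunk_shift_ratio.
def cut_into_chunks(sequence: np.ndarray, chunk_length: int, chunk_shift_ratio: float = 0.5):
    shift = max(1, int(chunk_length * chunk_shift_ratio))
    # Slide a fixed-length window over the sequence; trailing samples
    # that do not fill a whole chunk are dropped.
    return [
        sequence[start : start + chunk_length]
        for start in range(0, len(sequence) - chunk_length + 1, shift)
    ]

speech = np.random.randn(3500)  # e.g. one utterance of 3500 samples
chunks = cut_into_chunks(speech, chunk_length=1000, chunk_shift_ratio=0.5)
print(len(chunks), chunks[0].shape)  # 6 (1000,)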
espnet2.iterators.sequence_iter_factory
class espnet2.iterators.sequence_iter_factory.SequenceIterFactory(dataset, batches: Union[espnet2.samplers.abs_sampler.AbsSampler, Sequence[Sequence[Any]]], num_iters_per_epoch: int = None, seed: int = 0, shuffle: bool = False, shuffle_within_batch: bool = False, num_workers: int = 0, collate_fn=None, pin_memory: bool = False)

Bases: espnet2.iterators.abs_iter_factory.AbsIterFactory
Build an iterator for each epoch.
This class simply creates a pytorch DataLoader, except for the following points:
- The random seed is decided according to the epoch number. This guarantees reproducibility when resuming from the middle of a training process.
- The number of samples for one epoch can be restricted. This controls the interval between training and evaluation (see the sketch below).
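A minimal sketch of the two points above, assuming a map-style pytorch dataset; build_epoch_loader and its argument names are hypothetical and do not mirror the actual SequenceIterFactory code.

import numpy as np
from torch.utils.data import DataLoader

def build_epoch_loader(dataset, batches, epoch, seed=0, num_iters_per_epoch=None):
    # Same (seed, epoch) pair -> same permutation, so resuming at epoch N
    # reproduces the batch order of an uninterrupted run.
    rng = np.random.RandomState(seed + epoch)
    shuffled = [batches[i] for i in rng.permutation(len(batches))]
    if num_iters_per_epoch is not None:
        # Truncating the batch list shortens each epoch, which controls how
        # often evaluation runs relative to training steps.
        shuffled = shuffled[:num_iters_per_epoch]
    # batch_sampler yields one pre-built mini-batch of dataset indices per
    # iteration, so DataLoader performs no additional batching.
    return DataLoader(dataset, batch_sampler=shuffled)

Each epoch then rebuilds the loader with its own epoch number, e.g. build_epoch_loader(dataset, batches, epoch=3).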