espnet.distributed package¶
Initialize sub package.
espnet.distributed.__init__¶
Initialize sub package.
espnet.distributed.pytorch_backend.launch¶
This is a helper module for distributed training.
This code uses the official distributed data parallel launcher (https://github.com/pytorch/pytorch/blob/v1.8.2/torch/distributed/launch.py) only as a reference. The main difference is that this code focuses on launching a simple function with given arguments.
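The launching pattern described above can be sketched without this module's own entry point (whose signature is not shown in this section). The following minimal example is an illustration only, assuming torch.multiprocessing and the gloo backend are available; it spawns one process per worker and calls a simple function with given arguments.

# Minimal sketch (not this module's own API): launch a simple function
# with given arguments in several worker processes.
import os

import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank, world_size, message):
    """Run in each spawned worker process."""
    # Assumed rendezvous settings for illustration only.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    print(f"worker {rank}/{world_size}: {message}")
    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = 2
    # spawn() calls worker(rank, *args) in each child process.
    mp.spawn(worker, args=(world_size, "hello"), nprocs=world_size, join=True)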
exception espnet.distributed.pytorch_backend.launch.MainProcessError(*, signal_no)[source]¶
Bases: multiprocessing.context.ProcessError
An error raised from the main process.
Initialize error class.
property signal_no¶
Return the signal number that stopped the main process.
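As a sketch of how a caller might handle this error, using only the keyword-only constructor and the signal_no property documented above (the launcher code that would actually raise it is hypothetical here):

import signal

from espnet.distributed.pytorch_backend.launch import MainProcessError

try:
    # Hypothetical: a launcher re-raises this when the main process is
    # stopped by a signal such as SIGINT.
    raise MainProcessError(signal_no=signal.SIGINT)
except MainProcessError as e:
    # signal_no returns the signal number that stopped the main process.
    print(f"main process stopped by signal {e.signal_no}")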
exception espnet.distributed.pytorch_backend.launch.WorkerError(*, msg, exitcode, worker_id)[source]¶
Bases: multiprocessing.context.ProcessError
An error raised within a worker process.
Initialize error class.
property exitcode¶
Return the exit code of the worker process.
property worker_id¶
Return the worker ID of the process that caused this error.
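Similarly, a hedged sketch of handling WorkerError, relying only on the documented constructor and the exitcode and worker_id properties (the code that would actually raise it is hypothetical here):

from espnet.distributed.pytorch_backend.launch import WorkerError

try:
    # Hypothetical: a launcher raises this when a worker process exits
    # abnormally.
    raise WorkerError(msg="worker crashed", exitcode=1, worker_id=3)
except WorkerError as e:
    # exitcode and worker_id identify the failing worker process.
    print(f"worker {e.worker_id} failed with exit code {e.exitcode}")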