Mixed precision training? #487

Closed
ghost opened this issue Aug 12, 2020 · 6 comments
Labels
enhancement New feature or request

Comments


ghost commented Aug 12, 2020

PyTorch 1.6 has native support for automatic mixed precision (AMP) training: https://pytorch.org/blog/pytorch-1.6-released/

Should we take advantage of this? In particular, I think the larger batches that fp16 makes room for would be nice for encoder and synthesizer training.
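
For reference, the basic torch.cuda.amp pattern looks roughly like this. This is a minimal sketch with toy stand-ins for the real models, not code from this repo:

import torch
from torch import nn
from torch.cuda.amp import GradScaler, autocast

# Toy stand-ins; the real encoder/synthesizer model would go here.
model = nn.Linear(80, 80).cuda()
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.MSELoss()
scaler = GradScaler()  # scales the loss to avoid fp16 gradient underflow

for step in range(10):
    x = torch.randn(16, 80, device="cuda")
    y = torch.randn(16, 80, device="cuda")
    optimizer.zero_grad()
    with autocast():  # ops inside run in fp16 or fp32 as appropriate
        loss = loss_fn(model(x), y)
    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscales grads; skips the step on inf/nan
    scaler.update()

The memory saved by fp16 activations is what would let us raise the batch size.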


ghost commented Aug 13, 2020

Mozilla TTS tried it and encountered a bug with RNNs that is expected to be fixed in a future PyTorch release. mozilla/TTS#486 (comment)

@ghost ghost added the enhancement New feature or request label Aug 20, 2020

ghost commented Oct 27, 2020

PyTorch 1.7 was just released, so it is time to try again. If we add AMP, the implementation must be clean while preserving support for older versions of PyTorch.
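
One way to keep it clean is to gate on whether torch.cuda.amp exists and fall back to plain fp32 on older versions. A rough sketch (hparams.use_amp is a hypothetical flag, and model/optimizer/loss_fn are the same toy placeholders as in the sketch above):

import torch

# torch.cuda.amp only exists in PyTorch >= 1.6.
amp_available = hasattr(torch.cuda, "amp")
use_amp = amp_available and hparams.use_amp  # hypothetical config flag

scaler = torch.cuda.amp.GradScaler() if use_amp else None

def train_step(x, y):
    optimizer.zero_grad()
    if use_amp:
        with torch.cuda.amp.autocast():
            loss = loss_fn(model(x), y)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
    else:  # plain fp32 path, identical to current behavior
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    return loss.item()

That would keep the existing fp32 code path untouched when AMP is off or unavailable.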


ghost commented Mar 24, 2021

See fatchord/WaveRNN#229 for an example of how to do this.


ghost commented Apr 1, 2021

Closing due to lack of developer interest at this time.
Please comment and reopen if you would like to work on this.

@ghost ghost closed this as completed Apr 1, 2021

ghost commented Nov 7, 2021

I made a branch that supports mixed precision training. It is not recommended for use at this time.
https://github.com/blue-fish/Real-Time-Voice-Cloning/tree/487_mixed_precision_training

For me, mixed precision training is much slower than training without it, and the loss is occasionally nan. I also had to set up my Python environment with Anaconda due to a matrix multiplication problem: pytorch/pytorch#56747 (comment)

PyTorch AMP enabled (Python 3.9.7 with Anaconda, pytorch==1.10.0):

{| Epoch: 1/8 (20/2564) | Loss: nan | 0.24 steps/s | Step: 0k | }
Average execution time over 10 steps:
  Blocking, waiting for batch (threaded) (10/10):  mean:    0ms   std:    0ms
  Data to cuda (10/10):                            mean:    1ms   std:    0ms
  Forward pass (10/10):                            mean:  956ms   std:  151ms
  Loss (10/10):                                    mean:   14ms   std:    3ms
  Backward pass (10/10):                           mean: 3013ms   std:  456ms
  Parameter update (10/10):                        mean:   71ms   std:    5ms
  Extras (visualizations, saving) (10/10):         mean:    0ms   std:    0ms

Same setup without AMP:

{| Epoch: 1/8 (20/2564) | Loss: 5.778 | 0.76 steps/s | Step: 0k | }
Average execution time over 10 steps:
  Blocking, waiting for batch (threaded) (10/10):  mean:    0ms   std:    0ms
  Data to cuda (10/10):                            mean:    1ms   std:    0ms
  Forward pass (10/10):                            mean:  451ms   std:   52ms
  Loss (10/10):                                    mean:   10ms   std:    2ms
  Backward pass (10/10):                           mean:  753ms   std:   73ms
  Parameter update (10/10):                        mean:   31ms   std:    3ms
  Extras (visualizations, saving) (10/10):         mean:    5ms   std:    0ms

@ghost ghost reopened this Nov 7, 2021

ghost commented Nov 15, 2021

Dropping this due to poor performance and lack of interest.

@ghost ghost closed this as completed Nov 15, 2021