
Use with static (unrolled) RNN? #13

Open

eundersander opened this issue Jan 23, 2018 · 4 comments
@eundersander

Hi guys, thanks for your contribution. I wanted to give some feedback and request that you add a static (unrolled) RNN to your test suite. If/when I get a chance to spend more time on this, I'm happy to contribute this myself.

I tried using your code with a 2-layer LSTM RNN using dynamic_rnn and hit the same issue as here: #9

I converted my model to use static_rnn. This removes the while loop by statically unrolling for a fixed sequence length. At this point, your code was unable to automatically find articulation points. So I tried adding manual checkpoints in a few intuitive places (at the output of each layer, at every unrolled loop iteration, or at every k unrolled loop iterations). In all cases, memory usage was still higher than the baseline. I investigated the modified backprop graph: it seemed to be doing a lot of redundant computation and not working as described in your write-up, so I suspect I wasn't checkpointing correctly. A working static-RNN test case would be a helpful reference.
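For reference, here is a minimal, framework-free sketch of what checkpointing an unrolled recurrence should achieve, using a toy scalar RNN h_{t+1} = w·h_t + x_t (all function names here are illustrative, not from this library). The forward pass keeps only every k-th hidden state, and the backward pass recomputes each length-k segment from its checkpoint, so peak stored state is O(T/k + k) rather than O(T):

```python
def rnn_forward_full(w, x, h0):
    # baseline: store every hidden state (O(T) memory)
    hs = [h0]
    for xt in x:
        hs.append(w * hs[-1] + xt)
    return hs

def grad_w_full(w, x, h0):
    # BPTT with full activation storage; loss L = h_T
    hs = rnn_forward_full(w, x, h0)
    dL_dh, dL_dw = 1.0, 0.0
    for t in reversed(range(len(x))):
        dL_dw += hs[t] * dL_dh   # dh_{t+1}/dw = h_t
        dL_dh *= w               # dh_{t+1}/dh_t = w
    return dL_dw

def grad_w_checkpointed(w, x, h0, k):
    T = len(x)
    # forward: keep only every k-th state (O(T/k) stored)
    ckpts = {0: h0}
    h = h0
    for t in range(T):
        h = w * h + x[t]
        if (t + 1) % k == 0:
            ckpts[t + 1] = h
    dL_dh, dL_dw = 1.0, 0.0
    # backward: walk the segments in reverse, recomputing each
    # segment's hidden states from its checkpoint (O(k) stored)
    for seg_start in range(((T - 1) // k) * k, -1, -k):
        seg_end = min(seg_start + k, T)
        hs = [ckpts[seg_start]]
        for t in range(seg_start, seg_end):
            hs.append(w * hs[-1] + x[t])
        for t in reversed(range(seg_start, seg_end)):
            dL_dw += hs[t - seg_start] * dL_dh
            dL_dh *= w
    return dL_dw
```

The two versions should produce identical gradients; the checkpointed one trades one extra forward pass per segment for the reduced storage.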

@yaroslavvb (Collaborator) commented Jan 23, 2018

We have not tested with any RNN examples. I'm somewhat reluctant to extend this to work on RNNs, given that there's a parallel effort to integrate memory saving into TensorFlow using the Grappler framework; @allenlavoie may know more details.

@allenlavoie

I'd note that while_loop has its swap_memory argument, which can help for dynamic RNNs on GPUs (it won't recompute, it'll just swap intermediate Tensors to host memory until they're needed).
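As a conceptual model of the swap-versus-recompute distinction (this is a plain-Python simulation, not TensorFlow code; `forward_with_swap` and the `host` dict are illustrative stand-ins): the forward pass offloads each intermediate state to host memory as soon as the next step no longer needs it, and the backward pass fetches it back on demand. Nothing is recomputed, but every intermediate still exists somewhere, unlike checkpointing, which discards and recomputes:

```python
def forward_with_swap(w, x, h0):
    # toy recurrence h_{t+1} = w*h_t + x_t; only the live state
    # stays on "device", each intermediate is swapped to "host"
    host = {}
    h = h0
    for t, xt in enumerate(x):
        host[t] = h          # offload before the next step overwrites h
        h = w * h + xt
    return h, host

def backward_with_swap(w, x, host):
    # BPTT for loss L = h_T, fetching each needed intermediate
    # back from host memory instead of recomputing it
    dL_dh, dL_dw = 1.0, 0.0
    for t in reversed(range(len(x))):
        h_t = host.pop(t)
        dL_dw += h_t * dL_dh  # dh_{t+1}/dw = h_t
        dL_dh *= w            # dh_{t+1}/dh_t = w
    return dL_dw
```

The trade-off is transfer bandwidth rather than extra FLOPs, which is why swapping can be a good fit for dynamic RNNs on GPUs with spare PCIe headroom.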

The plan is that Grappler's memory optimizer will (eventually) do checkpointing. Currently it will only recompute things once. Once checkpointing is implemented I'm happy to look at static RNNs if they're an issue. I'm not working on it at the moment, and it will likely be at least a quarter or two until I can get back to it. Happy to chat/make connections if someone reading this is interested in picking it up.

@eundersander (Author)

Thanks for the info. We'll take a closer look at Grappler.

@jianlong-yuan

@yaroslavvb I have the same error with an RNN.
