Running GPT-2 with Tensorflow2.2+ issues #265

Closed
ManuelitoMartinez opened this issue Aug 26, 2020 · 6 comments

@ManuelitoMartinez

Hi! I have been trying to install GPT-2 locally using several methods, but with Tensorflow 2.2 (and greater) I have encountered many issues. The first was the hparam import in the first lines of model.py, which I solved thanks to this: https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/hparam.py After that, several errors appeared about Tensorflow not having attributes like tf.Session, tf.get_variable, etc. I solved those by putting import tensorflow.compat.v1 as tf at the beginning of the file, but that led to other, more complex errors when the interpreter reaches these lines of code in model.py:

def mlp(x, scope, n_state, *, hparams):
    with tf2.variable_scope(scope):
        nx = x.shape[-1].value
        h = gelu(conv1d(x, 'c_fc', n_state))
        h2 = conv1d(h, 'c_proj', nx)
        return h2

The console reports that an "int" object has no attribute "value".
At this point I have no idea how to fix this issue, so I tried this: https://github.com/akanyaani/gpt-2-tensorflow2.0 But it tries to install tensorflow 2.0.1, which is not available.

I have also tried installing earlier versions of Tensorflow, but apart from the fact that the tf.contrib module is no longer available in TF 2.0, the console printed an error saying that it was not compatible.

Is there a different way to run GPT-2 now that only Tensorflow 2.2 (and greater) is available? I would really love to try GPT-2 in many projects, and it would also be really good practice until the GPT-3 API is released. I would really appreciate any help. Thank you!

@DaveXanatos

Not sure if this will be helpful or not, but I gave up on Tensorflow 2 for my GPT-2 runup. I found that only Tensorflow 1.13.1 allowed a clean, un-hacked-around-with install and run of GPT-2 as it stands now.

Also see issue #231, "Modifying to work for tensorflow 2.0", where Nikolay Neupokoev writes:

_

There is no need to worry about the whole tensorflow.contrib module. In this project only the HParams class is used from tensorflow.contrib.training (import in model.py).

What I have found (tensorflow/community#148) is that the original class can be replaced with a nice fork. Saving hparam.py in src folder and replacing import with

from hparam import HParams

solves the original problem.

_
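To illustrate what the replacement class has to provide, here is a minimal sketch of a stand-in. It assumes the GPT-2 code only constructs HParams with keyword arguments, reads attributes, and calls override_from_dict; that is my reading of the code, not something confirmed in this thread. The class below is hypothetical and is not the tensor2tensor implementation; the hparam.py fork mentioned above remains the safer choice. The hyperparameter values are illustrative.

```python
class HParams:
    """Hypothetical minimal stand-in for tensorflow.contrib.training.HParams.

    Assumption: callers only need keyword construction, attribute access,
    and override_from_dict. This is NOT the tensor2tensor hparam.py code.
    """

    def __init__(self, **kwargs):
        # Store every keyword argument as an instance attribute.
        for name, value in kwargs.items():
            setattr(self, name, value)

    def override_from_dict(self, values):
        """Overwrite existing hyperparameters; reject unknown names."""
        for name, value in values.items():
            if name not in self.__dict__:
                raise ValueError("Unknown hyperparameter: %s" % name)
            setattr(self, name, value)
        return self


# Illustrative values only (not read from any real checkpoint).
hparams = HParams(n_vocab=0, n_ctx=1024, n_embd=768, n_head=12, n_layer=12)
hparams.override_from_dict({"n_vocab": 50257, "n_layer": 24})
print(hparams.n_vocab, hparams.n_ctx, hparams.n_layer)
```

If anything else in the code base calls other HParams methods, this sketch will not cover it, which is one more reason to prefer the full hparam.py file.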

That said, I'm very happy with the performance of GPT-2 with TF 1.13.1. You will get a boatload of warnings about various code used in GPT-2 being deprecated, but it runs perfectly. Hopefully you're using something with some power otherwise it's a wee bit slow.... I have a copy of GPT-2 running on a Raspberry Pi (345M on a Pi 4 8 gig, Raspbian Buster) and it takes about 90 seconds to generate with nsamples=1, length of 200, temperature of 1.04, top p of .5 and top k of 35. It does pretty damn good! Good luck.

@ManuelitoMartinez
Author

Thank you @DaveXanatos! I managed to make it run! As you said, I used TF 1.13.1. This is the method I used: I created a new Anaconda environment in which I installed only Python 3.6.5. I installed TF and the other requirements, and then downloaded the 774M model. Then I used the hparam.py you mentioned to work around the tensorflow.contrib issue. After that I had to install CUDA 10.0 (which needed Nvidia drivers) and cuDNN 7.6.4 (Sept 27th, 2019) for CUDA 10.0. Commands only worked with python and not python3 for some reason. Thank you for your help!
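For anyone repeating this setup, a quick sanity check of the environment might look like the sketch below. The expected versions are the ones reported in this comment; the function names are my own and hypothetical, and the check is deliberately limited to Python and Tensorflow (it does not inspect CUDA or cuDNN).

```python
import sys

# Versions reported in this thread as working together.
EXPECTED = {"python": "3.6.5", "tensorflow": "1.13.1"}


def version_mismatches(found, expected=EXPECTED):
    """Return {name: (expected, found)} for every version that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


def current_python_version():
    """Format the running interpreter's version as 'major.minor.micro'."""
    return "%d.%d.%d" % sys.version_info[:3]


found = {"python": current_python_version()}
try:
    import tensorflow as tf  # may be missing; handled below
    found["tensorflow"] = tf.__version__
except ImportError:
    found["tensorflow"] = None

# An empty dict means the environment matches the versions above.
print(version_mismatches(found))
```

Running it with the same interpreter used to launch GPT-2 (python rather than python3 in the setup above) shows which environment is actually being picked up.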

@DaveXanatos

DaveXanatos commented Aug 27, 2020 via email

@gselsidi

@DaveXanatos I tried hparam.py but I'm still running into issues with tensorflow.contrib. Maybe I didn't add the right imports in model.py?

This is what I have for imports:
import numpy as np
from hparam import HParams

@UtilityHotbar

GPT-2 as it stands does work with Tensorflow 2.0+! What I did involved several steps:

  • First, converting using the conversion utility tf_upgrade_v2
  • Then, grabbing hparam.py from tensor2tensor and adding it to src, then changing the import statement in model.py from tensorflow.contrib to hparam
  • Finally, removing all instances of .value from model.py (Tensorflow 2.0+ simply returns an integer, so no need for .value)
Hope this helps!
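As an alternative to deleting every .value by hand, the last step could also be handled with a small helper that accepts both styles. This is only a sketch: static_dim and FakeDimension are hypothetical names, and FakeDimension merely imitates the TF 1.x Dimension object so the example runs without Tensorflow installed.

```python
def static_dim(dim):
    """Return a static dimension as a plain int (or None if unknown).

    TF 1.x style: the dimension is an object carrying a .value attribute.
    TF 2.x style: the dimension is already an int (or None).
    """
    return getattr(dim, "value", dim)


class FakeDimension:
    """Hypothetical stand-in for the TF 1.x Dimension type (demo only)."""

    def __init__(self, value):
        self.value = value


print(static_dim(FakeDimension(768)))  # TF 1.x style input
print(static_dim(768))                 # TF 2.x style input
```

In model.py this would mean writing nx = static_dim(x.shape[-1]) instead of nx = x.shape[-1].value, so the same file runs under either version.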

@DaveXanatos

I can confirm @UtilityHotbar's comments: I also have GPT-2 running with Tensorflow 2.1.1 currently, having followed the same steps. Raspberry Pi OS64, on a Pi 4B 8 gig. It runs the 345M model (startup time 62 seconds), 774M model (startup time 135 seconds), and 1558M model (startup time 272 seconds). The 774M and 1558M models can only be run on a 64-bit OS due to the 2GB file size limit in signed 32-bit OSs.
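A rough back-of-the-envelope check of that limit, as a sketch: it assumes each parameter is stored as a 4-byte float32 and that the checkpoint size is roughly parameters times 4 bytes. Both are my assumptions, not measured file sizes, and the model names are treated as nominal parameter counts.

```python
INT32_MAX = 2**31 - 1   # largest offset a signed 32-bit integer can hold
BYTES_PER_PARAM = 4     # assumption: float32 weights


def approx_checkpoint_bytes(millions_of_params):
    """Estimate checkpoint size from a nominal parameter count in millions."""
    return millions_of_params * 1_000_000 * BYTES_PER_PARAM


def fits_signed_32bit(num_bytes):
    """True if a file of this size stays within the signed 32-bit limit."""
    return num_bytes <= INT32_MAX


for name, millions in [("345M", 345), ("774M", 774), ("1558M", 1558)]:
    size = approx_checkpoint_bytes(millions)
    print(name, size, "fits" if fits_signed_32bit(size) else "too large")
```

Under these assumptions only the 345M model stays below the limit, which matches the behaviour described above.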


4 participants