train loss equal to 0 #2
Comments
Besides, add 'tf.compat.v1.disable_eager_execution()' at the beginning of the 'def placeholder(h)' function.
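For reference, a minimal sketch of where that call has to go (the `placeholder(h)` signature is taken from the comment above; the repo's actual code may differ). Eager execution has to be disabled before any graph ops or placeholders are created:

```python
import tensorflow as tf

# Disable eager mode before any graph ops exist; module import time,
# just before the helper is defined, is a safe spot.
tf.compat.v1.disable_eager_execution()

def placeholder(h):  # signature from the comment above; illustrative only
    # tf.compat.v1.placeholder only works with eager execution disabled
    return tf.compat.v1.placeholder(tf.float32, shape=(None, h))
```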
I met the same problem: the loss became 0 after several epochs. Could you help me? Much appreciated!
I translated their code into PyTorch and encountered the same issue you mentioned. I think the problem is that they didn't normalize the inputs (presumably so that masking NaN values in the loss function would stay simple). However, that causes the gradient to explode after 4 or 5 epochs.
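For anyone hitting the same thing, a sketch of the combination described above: z-score normalization fitted on the training split, plus a loss that masks missing entries. The helper names (`zscore`, `masked_mae`) and the `null_val=0.0` convention are illustrative, not from the repo:

```python
import numpy as np

def zscore(x, mean, std):
    """Normalize with statistics computed on the training split only."""
    return (x - mean) / std

def masked_mae(preds, labels, null_val=0.0):
    """Mean absolute error over valid entries only (missing = null_val)."""
    if np.isnan(null_val):
        mask = ~np.isnan(labels)
    else:
        mask = labels != null_val
    mask = mask.astype(np.float64)
    mask /= mask.mean()                  # re-weight so the loss scale is stable
    loss = np.abs(preds - labels) * mask
    return np.nan_to_num(loss).mean()    # guard against 0/0 if a batch is all-missing
```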
When I lowered the learning rate to 0.0001 it trained fine, but the results were not as good as in the paper.
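Lowering the learning rate is one way to keep the gradients in check; gradient clipping (not in the original code, as far as the thread shows) targets the explosion more directly and may cost less accuracy. A sketch in the same TF1-compat style as the migrated code, where the toy `loss` stands in for whatever train.py computes and the 5.0 max-norm is illustrative:

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

# Toy loss so the snippet runs on its own; substitute the model's real loss.
w = tf.Variable(1.0)
loss = tf.square(w - 3.0)

optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=1e-4)
grads_and_vars = optimizer.compute_gradients(loss)
clipped = [(tf.clip_by_norm(g, 5.0), v)      # cap each gradient's norm
           for g, v in grads_and_vars if g is not None]
train_op = optimizer.apply_gradients(clipped)
```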
Hello, may I ask whether you have solved this problem? Were you able to reproduce the results from the paper?
Hi, I tried to run train.py on METR-LA. Due to the TensorFlow version, I used tf_upgrade_v2 to migrate model.py to TF 2.x.
Specifically,
(1) line 76: 'tf.nn.rnn_cell.GRUCell' to 'tf.compat.v1.nn.rnn_cell.GRUCell'
(2) lines 80 and 87: 'tf.layers.dense' to 'tf.compat.v1.layers.dense'
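Written out, with illustrative shapes (the real `num_units` and layer sizes come from model.py), the two replacements look like this:

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()
inputs = tf.compat.v1.placeholder(tf.float32, shape=(None, 12, 64))

# line 76: TF1 RNN cells live under the v1 compat namespace in TF 2.x
cell = tf.compat.v1.nn.rnn_cell.GRUCell(num_units=64)
outputs, state = tf.compat.v1.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

# lines 80 and 87: tf.layers.dense was removed; the compat alias still works
pred = tf.compat.v1.layers.dense(state, units=1)
```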
Then, when I ran train.py and test.py, there were several issues, as follows:
(1) line 47 in test.py and line 58 in train.py: 'x.value' raises an error because x is an int. After I changed "x.value for x in xxx" to "x for x in xxx", it worked (see the sketch below).
(2) After 4-5 epochs, the training and validation losses become 0 and the test result becomes nan. I ran the code several times and the issue does not disappear. Meanwhile, test.py runs normally and outputs results. Since the repository doesn't include metr-la.h5, I used the file downloaded from IGNNK (see the data check below).
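For (1), the fix works because TF 2.x changed what shape iteration returns; a self-contained sketch (the tensor shape is just an example):

```python
import tensorflow as tf

x = tf.zeros((64, 207, 2))            # illustrative tensor

# TF 1.x: get_shape() yields Dimension objects, hence the old `d.value`.
# TF 2.x: it yields plain ints (or None), so `.value` raises AttributeError.
dims = [d for d in x.get_shape()]     # [64, 207, 2]

# Equivalent and more explicit:
dims = x.get_shape().as_list()
```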
Now I am not sure of the reasons for these issues. I hope to hear some suggestions. Much appreciated.
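For (2), it may be worth checking the data itself. Assuming the IGNNK copy of metr-la.h5 follows the usual DCRNN layout (a pandas HDF5 frame, timestamps by sensors, with missing readings stored as 0), an unmasked loss over those zeros could explain the collapse:

```python
import pandas as pd

df = pd.read_hdf('metr-la.h5')       # rows: 5-min timestamps, cols: sensors
print(df.shape)

# Missing readings are stored as 0; if the loss does not mask them,
# training can collapse toward predicting zeros.
print('fraction of zero entries:', (df.values == 0).mean())
```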