About Original Nth Farthest #3
Comments
Hi! Sadly, I have not tried to run the official Sonnet code myself; I only ported the core implementation of RMC, using the Sonnet code as a reference. So I'm afraid I can't share any useful pointers. The Nth farthest task implementation is from a contributor. (Mind if I ask about this topic, @jessicayung ?) Side note here, I've been running
Hey! Thanks for the quick reply! Wow, thanks. I'll use your version in my experiments!
@L0SG Hey there, one more question. I started running your N-Farthest script. Accuracy still seems to hover around 0.25 (it's been 1 day). Could you tell me whether there is a sudden spike in performance (to 91%), and roughly how many epochs it takes to get close to that score? And are the default hyperparameters correct for achieving this result? Thanks!
I fired up the code, let it run, and actually forgot about it for about 5 days. When I checked it after seeing your issue, it was reaching 91% at around 180000 epochs.
The original paper reports a wall-clock time of around 40 to 50 hours before breaking the 25% mark, so running the code for at least that long is a viable choice, I suppose.
Regarding the default hyperparameters, I've not checked every last detail of them yet, but I believe the contributor made a great effort to match them as faithfully as possible.
Currently I'm working on another project (not related to sequences, unfortunately :( ), so I'll double-check the faithfulness when I have spare time.
Meanwhile, if you find any difference between the Sonnet code and this repo, please let me know and I'll fix it. Thanks!
@vanzytay The hyperparameters in this implementation were set based on the paper first and the official Sonnet implementation second. I'm not sure whether there were differences between the two. Let me know if you find any problems. I spoke with one of the authors, and they did say that the RRNN tends to run for a while before having something like an 'aha' moment and a spike in performance (as shown in the graphs in the paper). Also, really glad to hear that the implementation has broken the 25% barrier. Thanks for running it for longer, Sang-gil!
Thanks @L0SG and @jessicayung for your replies!
I've uploaded some slightly overdue experimental results for the Nth farthest task. It definitely takes far longer than the results reported in the paper. I will play with other hyperparameters when I have spare GPU resources available.
@L0SG Thanks!
Hey!
I've been wondering whether you have tried the original Nth Farthest code (from Sonnet) on a GPU with 16 GB of memory. I keep running into out-of-memory errors no matter what I do (on a Volta GPU).
Do you have any clue? (Sorry, this is not directly related to your repository; I'm just wondering whether you got the original Sonnet version to work.)
Thanks!