- I don't think this script uses any operation that can be accelerated on a GPU. It mostly needs raw Python string processing and dictionary operations to tokenize the text, so using a GPU will not speed it up.
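
  As a rough illustration of why this is CPU-bound: a dictionary-based tokenizer boils down to Python string splits and dict lookups, none of which map onto GPU kernels. The sketch below is hypothetical (the function and variable names are not taken from the actual script):

  ```python
  from collections import Counter

  def tokenize(text, vocab):
      # Plain Python string processing and dict lookups; nothing here runs on a GPU.
      tokens = text.lower().split()
      return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

  def token_stats(texts):
      # Accumulate token frequencies with a Counter, again pure CPU work.
      vocab, counts = {}, Counter()
      for text in texts:
          counts.update(tokenize(text, vocab))
      return vocab, counts

  if __name__ == "__main__":
      vocab, counts = token_stats(["hello world", "hello again"])
      print(len(vocab), counts.most_common(2))
  ```

  Profiling a loop like this typically shows all of the time spent in Python-level string and dictionary work, which is why moving data to the GPU has no effect.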
- It seems that this script outputs tokenized statistics for whole episodes, while `data_stats` gives statistics per turn. Is this correct?

  Getting the statistics is horribly slow and, as far as I can tell, it does not use CUDA. If my assumptions are correct, why is that? Right now my best estimate is that I need more than 48 hours to generate statistics for my dataset on the CPU. Changing the code to `self.opt['no_cuda'] = False` does not change this behavior.
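
  Since the bottleneck is Python-level string processing rather than anything CUDA could run, one possible workaround (not something the script does out of the box; the episode format and helper below are assumed) is to spread the counting across CPU cores with `multiprocessing`. The sketch also illustrates the per-turn versus whole-episode distinction, assuming an episode is simply a list of turn strings:

  ```python
  from multiprocessing import Pool

  def count_tokens(episode):
      # Per-turn whitespace token counts plus the episode total (assumed data layout).
      per_turn = [len(turn.split()) for turn in episode]
      return sum(per_turn), per_turn

  if __name__ == "__main__":
      # Hypothetical episodes standing in for the real dataset.
      episodes = [["hello there", "hi how are you"], ["what time is it", "around noon"]]
      with Pool() as pool:
          results = pool.map(count_tokens, episodes)
      print(results)  # [(6, [2, 4]), (6, [4, 2])]
  ```

  Counting like this usually scales close to linearly with the number of worker processes, which is a much bigger lever here than the `no_cuda` flag.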