[QUESTION] Getting tools/preprocess_data.py to work is painful #974
Unanswered
sambar1729 asked this question in Q&A
Your question
Can `tools/preprocess_data.py` be simplified?
Right now, it requires nltk, torch, transformer_engine, as well as apex.
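For anyone trying to reproduce this, here is a minimal sketch for checking which of those four packages Python can locate in the current environment. The helper name `missing_modules` is hypothetical, not something from the repo, and it only uses the standard library. It checks that a package can be found, not that it imports cleanly, so a build with missing CUDA extensions could still pass.

```python
import importlib.util

# Packages this question lists as requirements of tools/preprocess_data.py.
REQUIRED = ["nltk", "torch", "transformer_engine", "apex"]


def missing_modules(names):
    """Return the top-level module names that cannot be located.

    Hypothetical helper for illustration only. It does not import the
    modules, so it cannot detect broken or partial installs.
    """
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for malformed or half-initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED)
    if absent:
        print("Not found:", ", ".join(absent))
    else:
        print("All listed packages were found.")
```

Running this before `tools/preprocess_data.py` at least shows which install step is the one that failed.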
Installing transformer_engine does not work out of the box; I had to install it manually (on an A100).
Installing apex has similar problems when following the Linux instructions at https://github.com/NVIDIA/apex?tab=readme-ov-file#linux
Given that the repo does not have any sample `.idx`/`.bin` files, one would expect the `preprocess_data` process to be relatively simple. Could this process be simplified?
Installing apex gives