-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issues introduced by refactoring and other miscellaneous #14
Comments
Also in several parts of I don't understand |
Another issue introduced during the refactor: in Things were good back when this line wasn't commented out
|
In It is then referenced outside any such block in line 182 |
@molereddy Thank you so much for the reports! We are working on the fix now. I have some comments here:
|
I only know of deepspeed as well. It's just it hasn't been straighforward to get deepspeed setup on the machines I have available. I am currently putting that issue on the backburner, I will report again with any errors in that aspect. |
Does the recent adding of index to the dataset object change anything about the evaluation results or is it just to make debugging easy or something? |
It should make everything easier. Previously the interleaving logic only works for a very specific batch size and num_gpus, and we have a bug which causes some data duplication. The error shouldn't change anything from a statistical point of view, but it may alter the numbers. I would stay away from interleaving from now on. |
I understand it is currently deprecated for being slow, but I think the
The I might have missed it if this functionality of evaluating is existing already.. I think this fix would take care of the else:
eval_logs = json.load(open(os.path.join(curr_save_dir, f"{eval_task}.json")))
aggregated_eval_logs[f'{eval_task}.json'] = eval_logs
new_save_filename = os.path.join(curr_save_dir, f"{eval_task}.json")
with open(new_save_filename, "w") as f:
# pretty write json to f
json.dump(eval_logs, f, indent=4) |
Also a reminder to fix the local_rank forget.py bug mentioned in the earlier comment #14 (comment) |
Wanted to inform about some issues which cam up after merging with the latest code.
natsort
to requirements.txt (needed after the refactor)python forget.py --config-name=forget.yaml
or provide the parameters likesplit
,model
andlr
.The text was updated successfully, but these errors were encountered: