ConnectionError: Couldn't reach 'synthetic_data_llama-3-8b-instruct-sppo-iter3_score' on the Hub (ConnectionError) #2
Comments
Hi, this file should appear in your local folder (under the directory where you started the script) if the generation pipeline has run successfully. Please check for any errors in the generation process.
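To make the failure mode above concrete: `datasets.load_dataset()` treats a bare name as a Hugging Face Hub repo id when no matching local folder exists, which is why a missing generated dataset surfaces as a `ConnectionError` against the Hub. A minimal sketch of checking for the local folder first, so the error is explicit (the dataset name is taken from the traceback; the helper itself is hypothetical, not part of the SPPO codebase):

```python
import os

def resolve_dataset_path(name: str, base_dir: str = ".") -> str:
    """Return the local dataset path if the generated folder exists;
    otherwise fall back to treating `name` as a Hub repo id."""
    local = os.path.join(base_dir, name)
    if os.path.isdir(local):
        return local  # load_dataset(local) reads from disk
    # No local folder: load_dataset(name) will try to reach the Hub,
    # producing the ConnectionError seen in the traceback below.
    return name

print(resolve_dataset_path("synthetic_data_llama-3-8b-instruct-sppo-iter3_score"))
```

If the folder is missing, that confirms the generation step (vllm + PairRM ranking) did not complete, and the generation logs are the place to look.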
Yes. It is generated by vllm and PairRM and is automatically included in our pipeline.
Great work!
I commented out all the push_to_hub calls in the code. Is the synthetic_data_llama-3-8b-instruct-sppo-iter3_score dataset generated by PairRM?
[rank4]: Traceback (most recent call last):
[rank4]: File "/training-data/huangxing/software/SPPO/sppo/run_dpo.py", line 249, in <module>
[rank4]: main()
[rank4]: File "/training-data/huangxing/software/SPPO/sppo/run_dpo.py", line 43, in main
[rank4]: main_inner(model_args, data_args, training_args)
[rank4]: File "/training-data/huangxing/software/SPPO/sppo/run_dpo.py", line 78, in main_inner
[rank4]: raw_datasets = get_datasets(data_args, splits=["train"])
[rank4]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank4]: File "/training-data/huangxing/software/SPPO/sppo/alignment/data.py", line 164, in get_datasets
[rank4]: raw_datasets = mix_datasets(dataset_mixer, splits=splits, shuffle=shuffle)
[rank4]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank4]: File "/training-data/huangxing/software/SPPO/sppo/alignment/data.py", line 189, in mix_datasets
[rank4]: dataset = load_dataset(ds, split=split)
[rank4]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank4]: File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 2129, in load_dataset
[rank4]: builder_instance = load_dataset_builder(
[rank4]: ^^^^^^^^^^^^^^^^^^^^^
[rank4]: File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 1815, in load_dataset_builder
[rank4]: dataset_module = dataset_module_factory(
[rank4]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank4]: File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 1512, in dataset_module_factory
[rank4]: raise e1 from None
[rank4]: File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 1468, in dataset_module_factory
[rank4]: raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({type(e).__name__})")
[rank4]: ConnectionError: Couldn't reach 'synthetic_data_llama-3-8b-instruct-sppo-iter3_score' on the Hub (ConnectionError)