[Bug]: Seed issue with Pipeline Parallel #6449

Closed
andoorve opened this issue Jul 15, 2024 · 9 comments · Fixed by #6698

@andoorve
Collaborator

Your current environment

v0.5.1

🐛 Describe the bug

The OpenAI API specifies that you can provide a seed: https://platform.openai.com/docs/api-reference/chat/create#chat-create-seed

This allows reproducible output, for example when sampling with a non-zero temperature.

Currently, the random state for a seeded request is stored and advanced on the driver process only. With pipeline parallelism, we need to extend this to the worker that actually does the sampling.
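
As a rough illustration of why the seed matters, here is a minimal sketch of seeded sampling using only the Python standard library. It is not vLLM code: `sample_tokens` and the toy distribution are hypothetical names made up for this example, and `random.Random` stands in for whatever generator the real sampler uses. The point is that a per-request generator advances its state on every draw, so that state has to live in the process that draws the samples.

```python
import random


def sample_tokens(probs, seed, steps):
    """Draw `steps` token ids from a fixed toy distribution.

    Hypothetical helper for illustration only; not part of vLLM.
    """
    # Per-request generator: its internal state advances on every draw.
    rng = random.Random(seed)
    token_ids = list(range(len(probs)))
    return [rng.choices(token_ids, weights=probs, k=1)[0] for _ in range(steps)]


probs = [0.1, 0.2, 0.3, 0.4]  # toy next-token distribution (assumption)

# Same seed, same distribution: the sampled sequence repeats exactly.
first = sample_tokens(probs, seed=1234, steps=8)
second = sample_tokens(probs, seed=1234, steps=8)
```

Here `first` and `second` are identical because both runs start from the same seed and advance the generator the same number of times. If the state were advanced in a different process than the one sampling, that guarantee would no longer hold.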

@andoorve andoorve added the bug Something isn't working label Jul 15, 2024
@binxuan

binxuan commented Jul 16, 2024

+1, I got `openai.InternalServerError: Internal Server Error` when I specified a seed in the completion request.

@andoorve
Collaborator Author

@binxuan do you have any other debug information? That shouldn't be expected

@binxuan

binxuan commented Jul 18, 2024

Below is my error trace; I'm not sure if this is the root cause.

Traceback (most recent call last):
  File "python/ray/_raylet.pyx", line 919, in ray._raylet.prepare_args_internal
  File "/home/binxuan/.conda/envs/vllm_dist/lib/python3.10/site-packages/ray/_private/serialization.py", line 519, in serialize
    return self._serialize_to_msgpack(value)
  File "/home/binxuan/.conda/envs/vllm_dist/lib/python3.10/site-packages/ray/_private/serialization.py", line 497, in _serialize_to_msgpack
    pickle5_serialized_object = self._serialize_to_pickle5(
  File "/home/binxuan/.conda/envs/vllm_dist/lib/python3.10/site-packages/ray/_private/serialization.py", line 444, in _serialize_to_pickle5
    raise e
  File "/home/binxuan/.conda/envs/vllm_dist/lib/python3.10/site-packages/ray/_private/serialization.py", line 439, in _serialize_to_pickle5
    inband = pickle.dumps(
  File "/home/binxuan/.conda/envs/vllm_dist/lib/python3.10/site-packages/ray/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/home/binxuan/.conda/envs/vllm_dist/lib/python3.10/site-packages/ray/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle 'torch._C.Generator' object
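
The trace above shows a generator object failing to serialize on its way to a Ray worker. One general way around this kind of error is to send only the integer seed across the process boundary and build the generator on the worker. The sketch below illustrates that pattern with the standard library only. It is an assumption-laden illustration, not the actual fix: `FakeGenerator` is a hypothetical stand-in for an unpicklable object such as `torch.Generator`, and `Worker`, `request_id`, and the payload layout are invented for this example.

```python
import pickle
import random


class FakeGenerator:
    """Hypothetical stand-in for an unpicklable generator object."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def __reduce__(self):
        # Mimic an object that refuses to be pickled.
        raise TypeError("cannot pickle 'FakeGenerator' object")

    def randint(self, n):
        return self._rng.randrange(n)


class Worker:
    """Hypothetical worker that owns generator state, keyed by request id."""

    def __init__(self):
        self._generators = {}

    def sample(self, payload, vocab_size=100):
        params = pickle.loads(payload)  # only plain data crosses the boundary
        gen = self._generators.get(params["request_id"])
        if gen is None:
            # Build the generator where sampling happens, from the seed alone.
            gen = FakeGenerator(params["seed"])
            self._generators[params["request_id"]] = gen
        return gen.randint(vocab_size)


# Shipping the generator object itself fails, as in the trace above.
try:
    pickle.dumps({"request_id": "r1", "generator": FakeGenerator(42)})
    pickled_generator = True
except TypeError:
    pickled_generator = False

# Shipping the seed works, and two independent workers agree on the samples.
payload = pickle.dumps({"request_id": "r1", "seed": 42})
worker_a, worker_b = Worker(), Worker()
run_a = [worker_a.sample(payload) for _ in range(5)]
run_b = [worker_b.sample(payload) for _ in range(5)]
```

Keeping the generator in a per-request dictionary on the worker means its state advances across sampling steps without ever being serialized. A real implementation would also need to drop the entry when the request finishes.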

@andoorve
Collaborator Author

@njhill any comments? Did you run into the above?

@njhill
Member

njhill commented Jul 18, 2024

@andoorve yes, this is a known issue. I should hopefully have time to fix it tomorrow.

@sekh77

sekh77 commented Jul 21, 2024

Hi @njhill - Hope you have had some chance to fix this issue. Is it available now in the latest version (0.5.2)?

@njhill njhill self-assigned this Jul 22, 2024
@njhill
Member

njhill commented Jul 22, 2024

@sekh77 it's not in 0.5.2. I am working on it right now and it should be ready today.

@sekh77

sekh77 commented Jul 22, 2024

@njhill - Ok, thanks. Will this also be available for people who are on 0.5.1?

@njhill
Member

njhill commented Jul 22, 2024

@sekh77 yes, it will be available to everyone! You'll just have to upgrade to the latest version (hopefully will be in a new release in the next day or two).
