[Core] ray job submit doesn't always catch the last lines of the job logs #48701
kpouget added the bug and triage labels on Nov 12, 2024
rynewang added the P1 and core labels and removed the triage label on Nov 12, 2024
MortalHappiness added the kuberay label on Nov 13, 2024
Note for myself:

Reproduction

Here is a simpler reproduction script.

import ray

ray.init(address="auto")

@ray.remote
def f():
    for i in range(1000000):
        print(f"Hello world: {i}")

ray.get([f.remote()])

ray start --head --include-dashboard=True
ray job submit --working-dir . -- python task.py
cd /tmp/ray/session_latest/logs
grep -i 'hello world: 0' *.out *.log

And then
MortalHappiness removed the kuberay label on Dec 13, 2024
MortalHappiness added eight commits to MortalHappiness/ray that referenced this issue (five on Dec 18, 2024 and three on Dec 19, 2024), each with the message:
… prevent pending logs Closes: ray-project#48701 Signed-off-by: Chi-Sheng Liu <[email protected]>
What happened + What you expected to happen
When I launch Ray jobs as part of OpenShift AI (RayJobs in K8sJobMode mode), I observe that the end of the job's logs isn't always correctly captured.
The submit command (part of the Job created from the RayJob) is the following:
and sometimes, the logs of this Pod do not contain the last lines printed by my entrypoint.sh script:
However, if I rsh into Ray's head Pod, I see that it is correctly captured:
This issue is at the boundary between Ray and KubeRay, but I think that it should be reproducible outside of the K8s environment, so I chose to file the issue in this repository.
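The comparison described above (the submitter Pod's logs versus the copy kept on the head Pod) can be checked mechanically. The following is only a minimal sketch, not part of Ray or KubeRay: `missing_tail` is a hypothetical helper, and it assumes both captures have already been saved as text and reduced to the same line format, with the client-side capture being a possibly truncated prefix of the head-node capture.

```python
from typing import List


def missing_tail(client_log: str, head_log: str) -> List[str]:
    """Return the trailing lines of head_log that client_log never received.

    Hypothetical helper for illustration. Assumes client_log is a (possibly
    truncated) prefix of head_log, which matches the symptom in this issue.
    """
    client_lines = client_log.splitlines()
    head_lines = head_log.splitlines()

    # Count how many leading lines the two captures have in common.
    common = 0
    for client_line, head_line in zip(client_lines, head_lines):
        if client_line != head_line:
            break
        common += 1

    # Everything after the shared prefix is what the client-side capture lost.
    return head_lines[common:]


if __name__ == "__main__":
    # Synthetic data shaped like the "Hello world: {i}" reproduction output.
    head = "\n".join(f"Hello world: {i}" for i in range(10))
    client = "\n".join(f"Hello world: {i}" for i in range(7))
    for line in missing_tail(client, head):
        print("missing:", line)
```

An empty result means the client-side capture received every line the head node recorded; a non-empty result lists exactly the lines that were dropped at the end.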
Versions / Dependencies
2.35.0
quay.io/rhoai/ray:2.35.0-py311-cu121-torch24-fa26
Reproduction script
Sample job (ray-job-sample.yaml)
Sample launcher:
Issue Severity
None