While onboarding Yaris, we observed a few very confusing issues with capturing CPython STDIO/print() activity in some Python "learning scripts." I've reproduced some of them myself locally just now so I'll share and describe the behavior below.
While this is unlikely to show up in, say, large National Lab code workflows, I can certainly see how it could cause a great deal of confusion for someone learning the workflow from darshan runtime monitoring through to the HTML report.
First, let's try the first exercise I suggest: just print on two ranks. It's hard to imagine something simpler that is MPI-aware:
# example code from Yaris' exercise
import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    print("rank 0", flush=True)
if rank == 1:
    print("rank 1", flush=True)
And the report shows a lack of captured IO data (even with flush=True, on my machine).
If I increase the amount of data printed, there is still no capture of IO (same red text on the report):
# example code from Yaris' exercise
import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    print("rank 0" * 100, flush=True)
if rank == 1:
    print("rank 1" * 100, flush=True)
Even if I add a few seconds of sleep after, the HTML report still indicates no IO capture:
# example code from Yaris' exercise
import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    print("rank 0" * 100, flush=True)
    time.sleep(5)
if rank == 1:
    print("rank 1" * 100, flush=True)
    time.sleep(5)
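As a sanity check independent of Darshan, here is a rough sketch that counts how many bytes a flushed print() actually sends to file descriptor 1. The helper name `bytes_written_by_print` is hypothetical, and the text stream it opens over fd 1 only stands in for sys.stdout. This says nothing about why the report shows no STDIO data; it only confirms that the bytes do leave the process.

```python
import os
import tempfile


def bytes_written_by_print(text):
    """Return how many bytes print(text, flush=True) sends to file descriptor 1."""
    saved_fd = os.dup(1)  # keep the real stdout so it can be restored
    with tempfile.TemporaryFile() as capture:
        try:
            os.dup2(capture.fileno(), 1)  # temporarily point fd 1 at the temp file
            # A text stream over fd 1, standing in for sys.stdout (an assumption
            # made so the check also works when sys.stdout has been replaced).
            with open(1, "w", closefd=False, newline="\n") as stream:
                print(text, file=stream, flush=True)
        finally:
            os.dup2(saved_fd, 1)  # restore the real stdout
            os.close(saved_fd)
        capture.seek(0)
        return len(capture.read())


count = bytes_written_by_print("rank 0" * 100)
print(f"print() wrote {count} bytes to fd 1")
```

If the count matches the expected payload (600 characters plus a newline here), the missing data in the report is a capture problem rather than a buffering problem in the script.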
If I switch to explicit POSIX by writing to a file, all is good in the world again:
# example code from Yaris' exercise
import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    with open(f"{rank}.txt", "w") as outfile:
        outfile.write("hello")
if rank == 1:
    with open(f"{rank}.txt", "w") as outfile:
        outfile.write("hello")
We can stagger the IO with POSIX as well, which was the original purpose of the exercise: to understand IO patterns with simple examples like the one below. But STDIO was invisible, so that made the exercise pretty confusing!
# example code from Yaris' exercise
import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    with open(f"{rank}.txt", "w") as outfile:
        outfile.write("hello")
if rank == 1:
    time.sleep(5)
    with open(f"{rank}.txt", "w") as outfile:
        outfile.write("hello")
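The staggered POSIX variant could be generalized to any number of ranks along these lines. This is only a sketch: `staggered_write`, `delay_per_rank`, and `directory` are hypothetical names, and the two "ranks" are simulated in one process with a short delay so the snippet runs without MPI.

```python
import os
import tempfile
import time


def staggered_write(rank, delay_per_rank=5.0, directory="."):
    """Sleep rank * delay_per_rank seconds, then write "hello" to <rank>.txt."""
    time.sleep(rank * delay_per_rank)  # rank 0 writes immediately, rank 1 later, ...
    path = os.path.join(directory, f"{rank}.txt")
    with open(path, "w") as outfile:
        outfile.write("hello")
    return path


def read_text(path):
    with open(path) as infile:
        return infile.read()


# Simulate ranks 0 and 1 in a single process, using a tiny delay.
with tempfile.TemporaryDirectory() as tmp:
    paths = [staggered_write(r, delay_per_rank=0.01, directory=tmp) for r in range(2)]
    names = [os.path.basename(p) for p in paths]
    written = [read_text(p) for p in paths]
```

In the MPI script itself, the call would be `staggered_write(comm.Get_rank())`, which reproduces the 5 second offset between rank 0 and rank 1 from the example above.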
For reference, the first exercise (test.py) was run under Darshan with:
mpirun -x LD_PRELOAD=/home/tyler/darshan_install/lib/libdarshan.so -x DARSHAN_LOGPATH=/home/tyler/LANL/rough_work/darshan/python_stagger_tests -n 2 python test.py
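When re-running the exercises repeatedly, the mpirun invocation above can be assembled in Python. A minimal sketch: `darshan_mpirun_argv` is a hypothetical helper, and it assumes the same Open MPI style `-x VAR=value` flags used in the command above.

```python
def darshan_mpirun_argv(darshan_lib, log_dir, script, nprocs=2):
    """Build the mpirun argument list that preloads Darshan for a Python script."""
    return [
        "mpirun",
        "-x", f"LD_PRELOAD={darshan_lib}",    # interpose the Darshan runtime library
        "-x", f"DARSHAN_LOGPATH={log_dir}",   # directory where the log is written
        "-n", str(nprocs),                    # number of MPI ranks
        "python", script,
    ]


argv = darshan_mpirun_argv(
    "/home/tyler/darshan_install/lib/libdarshan.so",
    "/home/tyler/LANL/rough_work/darshan/python_stagger_tests",
    "test.py",
)
print(" ".join(argv))
```

The list can be handed to `subprocess.run(argv, check=True)` to launch the job; the paths are the ones from my machine and would need adjusting elsewhere.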