Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output log files are truncated. #64

Open
iandobbie opened this issue Sep 28, 2024 · 11 comments
Open

Output log files are truncated. #64

iandobbie opened this issue Sep 28, 2024 · 11 comments

Comments

@iandobbie
Copy link
Contributor

I think the log file is not flushed before it is copied out of the temp directory and the original deleted so the output log files are truncated at the end, finishing mid word in many cases. Here are the last few lines of such a log file.

....
Before fftplan3d 6954MB free 
After fftplan 6954MB free 
re-transforming centerband
inserting centerband
centerband assembly completed
mo
@thomasmfish
Copy link
Collaborator

Just added a couple of new commits #40 that should help - specifically using os.fsync to ensure everything is fully written to the files before they are closed. This isn't an issue I've had on my laptop but apparently it's related to the disk cache, so is hardware dependent.

@thomasmfish
Copy link
Collaborator

I'll merge that into dev in a sec

@thomasmfish
Copy link
Collaborator

Closed by #67

@iandobbie
Copy link
Contributor Author

I just checked out the latest main branch (e613e2b) and the log files are truncated but not as I thought.

I am processing the data from Lothar with 4 channels and channel specific OTFs (trying to sort out the wiener filter stuff). The end of the file is.....

Before fftplan3d 6954MB free
After fftplan 6954MB free
re-transforming centerband
inserting centerband
centerband assembly completed
mo

(pysim) T:\OMX-data>

Looking at the 4 different segments I think the mo comes from earlier steps. I think the later channels are dumping into the log file while the earlier channels are not finished. eg one earlier section ends....

centerband assembly compl
--------------------------------------------------------------------------------

So it looks like the "eted" was truncated from "completed" this segment. Don't understand where the "mo" came from though!

@thomasmfish thomasmfish reopened this Oct 14, 2024
@thomasmfish
Copy link
Collaborator

That's odd. The log files should be completely separate in the processing directory, only combined at the end. If you run it with --no-cleanup and inspect the log files in there, what do you see?

@iandobbie
Copy link
Contributor Author

I just tried this again and am finding the weird variable truncation in the log files in the sub directory.

Its getting to late for me. I will have another look tomorrow if I get time.

@iandobbie iandobbie reopened this Oct 15, 2024
@iandobbie
Copy link
Contributor Author

The final few lines from the 4 log files. All starting at the final occrance of K0 x% line and ending at different places.


447 log file
--------------------

best fit for k0 is 0.036% from expected value.
moving centerband
Before fftplan3d 6954MB free 
After fftplan 6954MB free 
re-transforming centerband
inserting centerband
centerband assembly compl


527 log file
--------------------

best fit for k0 is 0.085% from expected value.
moving centerband
Before fftplan3d 6954MB free 
After fftplan 6954MB free 
re-transforming centerband
inserting centerband
centerband assembly comp

603 log file
--------------------

best fit for k0 is 0.038% from expected value.
moving centerband
Before fftplan3d 6954MB free 
After fftplan 6954M


647 log file
--------------------

best fit for k0 is 0.070% from expected value.
moving centerband
Before fftplan3d 6954MB free 
After fftplan 6954MB free 
re-transforming centerband
inserting centerband
centerband assembly completed
mo

@iandobbie
Copy link
Contributor Author

Also it doesn't seem to be a race condition as it appears that the truncations are the same each time the software is run.

@thomasmfish
Copy link
Collaborator

thomasmfish commented Oct 16, 2024

This is very odd... I've tried reordering how things are handled in #75, so could you try that branch and let me know if that changes anything? The other thing I was thinking is perhaps adding print(f"\n{"-" * 80}\ncudasirecon complete") to reconstruct_from_processing_info after reconstruct is called would give an idea of whether the problem is with the redirection overall or if it's just the cudasirecon output. I'm frustrated that I can't get it to happen on my end, though I haven't tried with the 4 channel data yet.

@iandobbie
Copy link
Contributor Author

I'll try that branch and let you know what the results are. I agree this is very strange, especially in its reproducibility which makes me think it is definitely not a race condition issue.

@thomasmfish
Copy link
Collaborator

Each reconstruction runs on a different process, so the only thing in terms of a race condition I can think of is the flushing and syncing of the output, or perhaps some oddity to do with how the reconstruction process exits (though that should only be a problem if it crashes, so I don't think it's that). I wonder if a very short wait would make a difference, perhaps time.sleep(0.1) even... Let me know how that branch goes anyway and we can go from there!

thomasmfish added a commit that referenced this issue Nov 6, 2024
…ing stdout and stderr (#75)

Possible fix for #64
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants