Protect NetCDF saving from bad Python-vs-HDF file lock timing #6760
🚀 Pull Request
Description
Thanks to @RachelNorth and @TeresaHughes for reporting this.
Even when our NetCDF save operations are fully serialised, with no parallelism, HDF still occasionally fails to acquire the file. This happens even though all Python locks are available at the expected moments and the file reports as closed. During testing, the second retry always succeeded. The likely cause is that HDF-level locking runs on a different timescale to Python-level locking, i.e. sometimes Python has released its locks but HDF has not yet released its own. The behaviour is thought to be filesystem-dependent. Further investigation is needed, but time is limited at the moment, so it seemed best to get the protective code in immediately.
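For readers unfamiliar with the pattern, here is a minimal sketch of a retry-with-delay guard of the kind described above. It is illustrative only and is not the code in this PR: the names `call_with_retries` and `FlakySave`, the default attempt count and delay, and the assumption that the lock failure surfaces as an `OSError` are all hypothetical.

```python
import time


def call_with_retries(action, max_attempts=5, delay_seconds=0.1,
                      retry_on=(OSError,)):
    """Call ``action()``, retrying after a short pause if it raises.

    Hypothetical helper: the exception type and the timing defaults are
    assumptions, not values taken from Iris or netCDF4.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except retry_on:
            # Out of attempts: let the original error propagate.
            if attempt == max_attempts:
                raise
            # Give the lower-level (e.g. HDF) lock time to be released.
            time.sleep(delay_seconds)


class FlakySave:
    """Stand-in for a save that fails a fixed number of times first."""

    def __init__(self, failures):
        self.failures_left = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise OSError("simulated: unable to lock file")
        return "saved"


if __name__ == "__main__":
    save = FlakySave(failures=2)
    print(call_with_retries(save, delay_seconds=0.01), save.calls)
```

Capping the number of attempts means a genuine, persistent failure (for example a file that is truly locked by another process) is still raised rather than retried indefinitely.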
Consult Iris pull request check list
Add any of the below labels to trigger actions on this PR: