This repository has been archived by the owner on Jun 1, 2023. It is now read-only.
The UV data munge was causing memory failures in R. My solution was to reduce the data to daily mean values in the combine step, so that the raw data is not preserved in the shared cache. I think this approach is OK, since we have a reproducible pipeline and are not using the raw data.
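The daily-mean reduction could look something like the following minimal sketch. This is not the actual pipeline code, and the column names (`site_no`, `dateTime`, `value`) are assumptions, not taken from the repo:

```r
# Sketch: collapse high-frequency UV records to one mean value per
# site per day before writing to the shared cache, so the full raw
# pull never has to live in memory downstream.
daily_means <- function(uv_data) {
  uv_data$date <- as.Date(uv_data$dateTime, tz = "UTC")
  # aggregate() drops NA rows via its default na.action
  aggregate(value ~ site_no + date, data = uv_data, FUN = mean)
}

# Toy example: two days of 15-minute records collapse to two rows
uv <- data.frame(
  site_no = "01234567",
  dateTime = as.POSIXct("2020-01-01 00:00", tz = "UTC") +
    seq(0, 86400 * 2 - 900, by = 900),
  value = 1
)
daily_means(uv)
```

In the real combine step this would run per-site before the results are bound together, so peak memory stays proportional to one site's raw pull rather than the whole dataset.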
I agree that this is OK, but I also want to document David's suggestion from standup that this pull could be done on one of the USGS clusters. This could potentially solve two problems: (1) the UV data munge probably(?) won't cause memory failures if the available memory is larger, and (2) in theory, though we've not yet tried this, having the data pull on a cluster would allow multiple people access to the raw data pull without going through the shared cache. Given the unknowns with each of these objectives, I'm not pushing hard for this switch, but let's at least keep it on the table.
If we did this, I think we'd do the pulling on a data transfer node (to be good cluster citizens) and then switch over to a login node -> SLURM-allocated job to get the larger memory needed to process the data.
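The two-step workflow above (transfer node for the pull, SLURM allocation for the munge) could be sketched roughly as below. The script names, module name, memory request, and account are all placeholders, not real cluster config:

```shell
# Step 1 (run on a data transfer node, to be good cluster citizens):
#   Rscript pull_uv_data.R   # hypothetical pull script

# Step 2 (from a login node): write and submit a batch job that
# requests more memory than a local R session would have.
cat > munge_uv.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=uv-munge
#SBATCH --mem=128G            # placeholder memory request
#SBATCH --time=04:00:00
#SBATCH --account=<account>   # placeholder
module load R                 # module name varies by cluster
Rscript munge_uv_data.R       # hypothetical munge script
EOF

# Submit from the login node:
#   sbatch munge_uv.slurm
```

Keeping the pull and the munge as separate steps also means a failed munge job can be rerun against the already-transferred raw data.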