This repository has been archived by the owner on Jun 1, 2023. It is now read-only.

Move processing to Denali? #24

Open
aappling-usgs opened this issue Jul 31, 2020 · 0 comments

Comments

@aappling-usgs
Member

From #23:

  1. UV data munge was causing memory failures in R. My solution was to reduce down to daily mean values in the combine step, so that the raw data is not preserved in the shared cache. I think this approach is ok, since we have a reproducible pipeline + are not using the raw data.

I agree that this is OK, but I also want to document David's suggestion from standup that this pull could be done on one of the USGS clusters. This could potentially solve two problems: (1) the UV data munge probably(?) won't cause memory failures if more memory is available, and (2) in theory, though we've not yet tried this, running the data pull on a cluster would give multiple people access to the raw data without going through the shared cache. Given the unknowns with each of these objectives, I'm not pushing hard for this switch, but let's at least keep it on the table.

If we did this, I think we'd do the pulling on a data transfer node (to be good cluster citizens) and then switch over to a SLURM-allocated job, submitted from a login node, to get the larger memory needed to process the data.
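
For reference, the daily-mean reduction quoted from #23 above could be sketched roughly as follows. The pipeline itself is in R, so this Python version is illustrative only; the function name `daily_means`, the `(timestamp, value)` record layout, and the sample values are all assumptions, not the pipeline's actual code or data. The sketch keeps only a running sum and count per day, so the raw sub-daily values never need to be held in memory all at once.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(records):
    """Reduce (ISO timestamp, value) pairs to one mean value per calendar day.

    Hypothetical helper: streams over `records`, keeping only a running
    sum and count per day instead of the raw sub-daily values.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in records:
        if value is None:  # skip missing observations
            continue
        day = datetime.fromisoformat(timestamp).date()
        sums[day] += value
        counts[day] += 1
    # One mean per day, in chronological order
    return {day: sums[day] / counts[day] for day in sorted(sums)}


# Made-up sample values, for illustration only
sample = [
    ("2020-07-30T00:00:00", 10.0),
    ("2020-07-30T00:15:00", 12.0),
    ("2020-07-31T00:00:00", 20.0),
    ("2020-07-31T00:15:00", None),
]
result = daily_means(sample)
```

Because `records` can be any iterable, the same function could in principle be fed one file or one site at a time, which is the property that matters for the memory question here.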
