-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bnb/masked fwp #231
Bnb/masked fwp #231
Conversation
c50f16e
to
606eeb7
Compare
…pi should not interrupt a multi download run
…all points masked. mask is given as variable in same file paths used for fwp input. adjusted collection to make sure all unique spatiotemporal regions are included in time index and meta.
…ed to start at the zeroth hour, if it has not been shifted already.
… instead of multi variable monthly files
…rent time indices, like for monthly files. also added catch in cacher for weird dimensions that we shouldnt be writing to h5
…oad, for era downloader.
…orrupted and need to be redownloaded
…se must be determined from the config
…g. u_30m from u_10m and u_100m, with u pressure level array
…ed to era downloader
…e height. Annua scalar correction bias correction method. removed random string from exo cache naming.
e4b5c21
to
07877a3
Compare
…es. each node uses this data so it doesn't make sense to have each node try to cache this data.
…derive ws_20m from u, v, zg, and topography. also some edits to allow interp on time independent variables.
978f3a5
to
7eb2b8e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes and clarification: does the mask variable only need to have one pixel False in an otherwise large chunk to be included in the fwp? Or does the whole chunk need to be false? Also, does this work for temporal masking or is this spatial only? Is the mask dataset assumed to be 2D?
sup3r/bias/bias_transforms.py
Outdated
@@ -396,6 +397,9 @@ def monthly_local_linear_bc( | |||
effect of extreme values within aggregations over large number of | |||
pixels. This value is the standard deviation for the gaussian_filter | |||
kernel. | |||
range_kwargs : dict | None | |||
Dictionary of ranges for scalar and adder values. e.g. {'scalar': (0, | |||
3), 'adder': (-2, 2)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally i dont think we should use kwargs in a public-facing method unless you can point to a subsequent docstring that describes the options. I would think scalar_range
and adder_range
would be more clear and explicit? what are your thoughts bnb?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree with that. Changed.
sup3r/bias/bias_calc.py
Outdated
f'bias_{bias_feature}_mean': np.nanmean(bias_data), | ||
f'bias_{bias_feature}_std': bias_std, | ||
f'base_{base_dset}_mean': np.nanmean(base_data), | ||
f'base_{base_dset}_std': np.nanstd(base_data), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of confusing if the base data is vortex data that is only 0D. Does std just come out to be nan/inf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point. This isn't even used for scalar correction calculations so maybe it shouldn't be written at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i would argue it should not be calculated then and the adder should be fixed as zero if you're using the linear function from the bias transforms
sup3r/bias/utilities.py
Outdated
@@ -18,7 +18,7 @@ | |||
logger = logging.getLogger(__name__) | |||
|
|||
|
|||
def lin_bc(handler, bc_files, threshold=0.1): | |||
def lin_bc(handler, bc_files, reference_feature=None, threshold=0.1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we've been using "bias_feature" elsewhere. you might consider changing for uniformity but i agree reference feature is maybe a better name. just won't match the bias calculation work and could cause confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this to match reference_feature
used in qdm_bc
but you're right these both should match what is used in bias_calc.py
sup3r/pipeline/strategy.py
Outdated
masked.""" | ||
def unmasked_chunks(self): | ||
"""List of chunk indices that are not masked from the input spatial | ||
region.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you be more explicit in what is masked vs. unmasked? You could mask included or excluded points. I feel like the word "mask" does not intuitively convey included or excluded. I actually would have defined things the other way around but that's okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was thinking of masked as synonymous with filtered but I suppose it could go either way. Added somre more info here and elsewhere.
@bnb32 can you clarify these general comments too? |
…ask is 2D and chunks with _any_ unmasked points will still be sent through the generator)
Oh yeah, no prob. Just added some info to ForwardPassStrategy doc string. Mask is spatial only and a single unmasked point will send the chunk through the generator. |
This is just a few additions to the refactor branch, and would like to merge before it diverges too much. This adds the option to skip chunks in the forward pass routine if all coordinates covered by that chunk are masked. The mask is provided as an additional variable through
file_paths
, where mask is 1 / True if the coordinate should not be included in the forward pass. An example which masks conus except for areas around observation locations is shown below.Mask:
Collected forward pass output: