Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird behavior with wet_mask #85

Closed
roxyboy opened this issue Sep 8, 2021 · 11 comments
Closed

Weird behavior with wet_mask #85

roxyboy opened this issue Sep 8, 2021 · 11 comments

Comments

@roxyboy
Copy link

roxyboy commented Sep 8, 2021

Firstly, I'd like to thank the developers for this awesome package and documentation!

I am currently playing around with a submesoscale permitting simulation but I am getting some weird behavior where at the boundaries, although I pad zeros to the wet_mask, the filter seems to ignore the 0 values in the mask and uses the data there to apply the filtering. I've attached a screenshot below where the values of the filtered field tend to go to zero towards the boundary.

Screenshot 2021-09-08 at 15 54 33

I have a reproducible example here: https://github.com/roxyboy/Cal1_SWOT-AdAC/blob/master/Filter.ipynb

Could you let me know what I'm doing wrong?

@iangrooms
Copy link
Member

Hi Takaya (@roxyboy)! Glad to hear you're using the package. It took me a little while to track down what was going on here, but I think I've figured it out. Several of the "required grid args" that you passed to the filter constructor included NaNs. These values are accessed & used during the Laplacian step, which produces NaNs internally. When this gets passed back in as input for another Laplacian step, we set all NaNs to 0. So values near the land boundaries were getting set to 0, and these values were then getting diffused into the interior.

Whether the explanation is completely correct or not, I was able to produce something that looks a lot better.
First I replaced area = wet_mask.copy() with area = xr.ones_like(wet_mask). The wet mask has 0 values, which means that the area will have 0 values. In the finite-volume Laplacian we divide by the area, which produces NaNs. So I replaced the 0s with 1s.

Next I added a bunch of lines like this
dxw[:,:] = np.nan_to_num(dxw[:,:],nan=1)
to remove NaNs in the cell edge lengths with nonzero values. For edges adjacent to land these aren't really used, but they are accessed and if they're NaNs then it messes up the internal works.

Finally, and I don't understand this one, the way that you were using sst.interp was causing problems. So I replaced that with

sst = wet_mask.copy()
sst[pad:-pad,pad:-pad] = ds.sosstsst.isel(time_counter=0)

and now it looks like it's working.

@roxyboy
Copy link
Author

roxyboy commented Sep 13, 2021

Hi @iangrooms , thank you for the support. Your solution fixed the issue so I'm going to close this :) It was helpful to know that the area shouldn't include 0s.

@roxyboy roxyboy closed this as completed Sep 13, 2021
@NoraLoose
Copy link
Member

@roxyboy, thanks for posting the issue!

If I understand correctly, 0s or NaNs in area (or one of the other required grid variables) over land caused the problem? If yes, we should change the code. The code should ignore all values on land if the Laplacian takes a wet_mask.

@roxyboy
Copy link
Author

roxyboy commented Sep 13, 2021

If I understand correctly, 0s or NaNs in area (or one of the other required grid variables) over land caused the problem? If yes, we should change the code. The code should ignore all values on land if the Laplacian takes a wet_mask.

@NoraLoose I progressively changed my code back prior to Ian's solution and it seems to originate from a combination of NaNs in the kappas and dxs where the wet_mask is 0/land. 0s in area actually ran fine when there were no NaNs in kappas and dxs...

@rabernat
Copy link
Contributor

Thanks a lot Takaya for reporting a useful issue and helping us iron out the package.

I think the best solution here is not necessarily to change the gcm-filters code but to clearly document this possible pitfalll in the documentation.

@roxyboy
Copy link
Author

roxyboy commented Sep 13, 2021

Feeling encouraged with the success with NEMO, I've been trying to apply the IRREGULAR_WITH_LAND filter to a CROCO simulation for the past week but I haven't been able to figure out why the filtered field basically comes out as noise...

Screenshot 2021-09-13 at 17 39 19

The area, kappas and dxs don't include any NaNs so I think the behavior is originating from a different issue. The reproducible example is here: https://github.com/roxyboy/Cal1_SWOT-AdAC/blob/master/Filter-GIGATL.ipynb

I thought that the grids being irregular may be causing the issue so I then tried the REGULAR_WITH_LAND filter but I then got the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_4431/1295202153.py in <module>
----> 1 SST_filtered_to_50km_regular = filter_50km_regular.apply(sst.reset_coords(drop=True), 
      2                                                          dims=['eta_rho', 'xi_rho']
      3                                                         ).compute()
      4 SST_filtered_to_50km_regular.plot()

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/xarray/core/dataarray.py in compute(self, **kwargs)
    909         """
    910         new = self.copy(deep=False)
--> 911         return new.load(**kwargs)
    912 
    913     def persist(self, **kwargs) -> "DataArray":

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
    883         dask.compute
    884         """
--> 885         ds = self._to_temp_dataset().load(**kwargs)
    886         new = self._from_temp_dataset(ds)
    887         self._variable = new._variable

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    848 
    849             # evaluate all the dask arrays simultaneously
--> 850             evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    851 
    852             for k, data in zip(lazy_data, evaluated_data):

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/dask/base.py in compute(*args, **kwargs)
    566         postcomputes.append(x.__dask_postcompute__())
    567 
--> 568     results = schedule(dsk, keys, **kwargs)
    569     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    570 

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/distributed/client.py in get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2702                     should_rejoin = False
   2703             try:
-> 2704                 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2705             finally:
   2706                 for f in futures.values():

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   2016             else:
   2017                 local_worker = None
-> 2018             return self.sync(
   2019                 self._gather,
   2020                 futures,

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    857             return future
    858         else:
--> 859             return sync(
    860                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    861             )

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    324     if error[0]:
    325         typ, exc, tb = error[0]
--> 326         raise exc.with_traceback(tb)
    327     else:
    328         return result[0]

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/distributed/utils.py in f()
    307             if callback_timeout is not None:
    308                 future = asyncio.wait_for(future, callback_timeout)
--> 309             result[0] = yield future
    310         except Exception:
    311             error[0] = sys.exc_info()

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/tornado/gen.py in run(self)
    760 
    761                     try:
--> 762                         value = future.result()
    763                     except Exception:
    764                         exc_info = sys.exc_info()

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1881                             exc = CancelledError(key)
   1882                         else:
-> 1883                             raise exception.with_traceback(traceback)
   1884                         raise exc
   1885                     if errors == "skip":

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/dask/optimization.py in __call__()
    967         if not len(args) == len(self.inkeys):
    968             raise ValueError("Expected %d args, got %d" % (len(self.inkeys), len(args)))
--> 969         return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
    970 
    971     def __reduce__(self):

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/dask/core.py in get()
    149     for key in toposort(dsk):
    150         task = dsk[key]
--> 151         result = _execute_task(task, cache)
    152         cache[key] = result
    153     result = _execute_task(out, cache)

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/dask/core.py in _execute_task()
    119         # temporaries by their reference count and can execute certain
    120         # operations in-place.
--> 121         return func(*(_execute_task(a, cache) for a in args))
    122     elif not ishashable(arg):
    123         return arg

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/gcm_filters/filter.py in filter_func()
    188             if filter_spec.is_laplacian[i]:
    189                 s_l = np.real(filter_spec.s[i])
--> 190                 tendency = laplacian(field_bar)  # Compute Laplacian
    191                 field_bar += (1 / s_l) * tendency  # Update filtered field
    192             else:

/mnt/meom/workdir/uchidat/miniconda3/envs/py39/lib/python3.9/site-packages/gcm_filters/kernels.py in __call__()
    168 
    169         out = (
--> 170             -self.wet_fac * out
    171             + np.roll(out, -1, axis=-1)
    172             + np.roll(out, 1, axis=-1)

TypeError: The numpy boolean negative, the `-` operator, is not supported, use the `~` operator or the logical_not function inst

@roxyboy roxyboy reopened this Sep 13, 2021
@iangrooms
Copy link
Member

iangrooms commented Sep 13, 2021

The second error you mentioned is easy to correct: wet_mask should be an array of floats, not bools. If you just update this

wet_mask = xr.DataArray(mask, dims=ds.temp.isel(time=0).dims,
                        coords=ds.temp.isel(time=0).coords
                       )

to this

wet_mask = xr.DataArray(mask.astype(float), dims=ds.temp.isel(time=0).dims,
                        coords=ds.temp.isel(time=0).coords
                       )

it fixes the error.

The noise output is the result of roundoff errors. Per the gcm-filters docs, you can improve this by using double precision. Your dataset uses float32 (single precision), so if you just promote using ds = ds.astype(np.float64) it fixes the noise.

Almost. The filtered sst field still has some noise. The original filter used 56 steps. One quick way to try to reduce noise is to manually reduce the number of filter steps. I tried the following

filter_50km_regular = gcm_filters.Filter(
    filter_scale=factor,
    dx_min=dx_min,
    filter_shape=filter_shape,
    n_steps = 50,
    grid_type=gcm_filters.GridType.REGULAR_WITH_LAND_AREA_WEIGHTED,
    grid_vars={'area': area.isel(eta_rho=slice(None,-1),xi_rho=slice(None,-1)), 
               'wet_mask': wet_mask.isel(eta_rho=slice(None,-1),xi_rho=slice(None,-1))}
)

and it worked quite well. (At least the plots didn't appear to have noise.)
To verify that the filter is still acting like it has a 50 km scale, I tried filter_50km_regular.plot_shape() and it looked like it was still quite accurate with only 50 steps.
(The default number of steps is conservative.)

You might play around with finding a value for n_steps that is as large as possible while avoiding noise in the output.
I verified that the behavior for the IRREGULAR_WITH_LAND filter is similar, i.e. switching to double precision and reducing n_steps fixes the problem.
In case you're interested, the noise is not a property just of the filter; it depends on the input too. So if you apply the exact same filter to a smoother field you might not see any noise. This makes it hard for us to set up defaults that will be guaranteed to avoid noise.

[edit:typo]

@rabernat
Copy link
Contributor

wet_mask should be an array of floats, not bools.

This is definitely counterintuitive to users. I hit the same problem when I was demoing gcm-filters the other day. I have opened #88 to track this specifically.


Takaya, I'm curious whether you had read the documentation on numerical instability before posting this. If so, is there some way we can make that section more prominent or more clear?

This makes it hard for us to set up defaults that will be guaranteed to avoid noise.

I think we should keep thinking about this. I have also hit the same noise problems quite often when playing around with gcm-filters recently. Perhaps our defaults need to be adjusted so this is less common? Or maybe providing a series of "recipes" known to work for common models?

@roxyboy
Copy link
Author

roxyboy commented Sep 14, 2021

Thanks @iangrooms for the detailed feedback. Your solutions worked like a charm :)

Takaya, I'm curious whether you had read the documentation on numerical instability before posting this. If so, is there some way we can make that section more prominent or more clear?

And my apologies for having missed the section on numerical instability. Perhaps having the package raise a warning with a link to the documentation when the input data is single precision could help the user in debugging the noise.

@roxyboy
Copy link
Author

roxyboy commented Sep 14, 2021

And my apologies for having missed the section on numerical instability. Perhaps having the package raise a warning with a link to the documentation when the input data is single precision could help the user in debugging the noise.

Shall I start a PR?

@NoraLoose
Copy link
Member

I'm closing this issue because the numerical instability issues is fixed as per version v0.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants