--sanitize-with option seems to be behaving weirdly with "newly computed (test) output" #204

Open
tlvu opened this issue Mar 23, 2024 · 0 comments

tlvu commented Mar 23, 2024

We have the failure below and cannot understand why our output-sanitize.cfg regex file does not cover the diff.

Why does the "newly computed (test) output" contain the "Text(0.5, 91.20243008191655, 'Longitude')" part?

  _ pavics-sdi-fix_nbs_jupyter_alpha/docs/source/notebooks/regridding.ipynb::Cell 27 _
  Notebook cell execution failed
  Cell 27: Cell outputs differ

  Input:
  # Now we can plot easily the results as a choropleth map!
  ax = shapes_data.plot(
      "tasmin", legend=True, legend_kwds={"label": "Minimal temperature 1993-05-20 [K]"}
  )
  ax.set_ylabel("Latitude")
  ax.set_xlabel("Longitude");

  Traceback:
  dissimilar number of outputs for key "text/plain"<<<<<<<<<<<< Reference outputs from ipynb file:
  <Figure size LENGTHxWIDTH with N Axes>
  ============ disagrees with newly computed (test) output:
  Text(0.5, 91.20243008191655, 'Longitude')
  <Figure size LENGTHxWIDTH with N Axes>
  >>>>>>>>>>>>

output-sanitize.cfg:

[finch-figure-size]
regex: <Figure size \d+x\d+\swith\s\d\sAxes>
replace: <Figure size LENGTHxWIDTH with N Axes>

Full output-sanitize.cfg file: https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/a4592eab55ad177b00cba77f126be56ff6566287/notebooks/output-sanitize.cfg#L92-L94
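
To illustrate why a substitution regex alone may not be able to cover this diff, here is a minimal sketch. It assumes (this is not taken from nbval's source) that sanitizing only rewrites the text of each output and never drops an output. The `sanitize` helper is hypothetical and the figure size `640x480 with 2 Axes` is illustrative; the regex and replacement are the ones from the `[finch-figure-size]` section above.

```python
import re

# Regex and replacement copied from the [finch-figure-size] section above.
FIGURE_SIZE = re.compile(r"<Figure size \d+x\d+\swith\s\d\sAxes>")
REPLACEMENT = "<Figure size LENGTHxWIDTH with N Axes>"


def sanitize(outputs):
    """Hypothetical stand-in for sanitizing: rewrite each output, drop none."""
    return [FIGURE_SIZE.sub(REPLACEMENT, text) for text in outputs]


# Reference notebook: a single text/plain output (illustrative figure size).
reference = sanitize(["<Figure size 640x480 with 2 Axes>"])

# Test run: the return value of set_xlabel shows up as an extra output.
computed = sanitize([
    "Text(0.5, 91.20243008191655, 'Longitude')",
    "<Figure size 640x480 with 2 Axes>",
])

# The figure line is normalized on both sides, but the counts still differ.
print(reference)
print(computed)
print(len(reference) == len(computed))
```

If that assumption holds, the failure is about the number of "text/plain" outputs rather than their content, which would match the "dissimilar number of outputs" message in the traceback.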

Notebook: https://github.com/Ouranosinc/pavics-sdi/blob/4ffec2df463b413c78991e9481fbe182537b3a65/docs/source/notebooks/regridding.ipynb

command:

py.test --nbval pavics-sdi-fix_nbs_jupyter_alpha_refresh_output/docs/source/notebooks/regridding.ipynb --sanitize-with notebooks/output-sanitize.cfg --dist=loadscope --numprocesses=0

============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.1.1, pluggy-1.4.0
rootdir: /home/jenkins/agent/workspace/_workflow-tests_new-docker-build
plugins: anyio-4.3.0, dash-2.16.1, nbval-0.11.0, tornasync-0.6.0.post2, xdist-3.5.0

tlvu added a commit to Ouranosinc/pavics-sdi that referenced this issue Mar 26, 2024
Work-around for
computationalmodelling/nbval#204

We have to use NBVAL_IGNORE_OUTPUT because the sanitize file does not seem
to work in this case; see the issue above.

```
  _ pavics-sdi-fix_nbs_jupyter_alpha_refresh_output/docs/source/notebooks/regridding.ipynb::Cell 4 _
  Notebook cell execution failed
  Cell 4: Cell outputs differ

  Input:
  ds_tgt.cf.plot.scatter(x="longitude", y="latitude", s=0.1)
  plt.title("Target regular grid");

  Traceback:
  dissimilar number of outputs for key "text/plain"<<<<<<<<<<<< Reference outputs from ipynb file:
  <Figure size LENGTHxWIDTH with N Axes>
  ============ disagrees with newly computed (test) output:
  Text(0.5, 1.0, 'Target regular grid')
  <Figure size LENGTHxWIDTH with N Axes>
  >>>>>>>>>>>>

  _ pavics-sdi-fix_nbs_jupyter_alpha_refresh_output/docs/source/notebooks/regridding.ipynb::Cell 6 _
  Notebook cell execution failed
  Cell 6: Cell outputs differ

  Input:
  reg_bil = xe.Regridder(ds_in, ds_tgt, "bilinear")
  reg_bil  # Show information about the regridding

  Traceback:
  Missing output fields from running code: {'stderr'}

  _ pavics-sdi-fix_nbs_jupyter_alpha_refresh_output/docs/source/notebooks/regridding.ipynb::Cell 7 _
  Notebook cell execution failed
  Cell 7: Cell outputs differ

  Input:
  # xesmf/frontend.py:476: FutureWarning: ``output_sizes`` should be given in the ``dask_gufunc_kwargs`` parameter. It will be removed as direct parameter in a future version.
  warnings.filterwarnings("ignore", category=FutureWarning)

  # Apply the regridding weights to the input sea ice concentration data
  sic_bil = reg_bil(ds_in.siconc)

  # Plot the results
  sic_bil.isel(time=0).plot(cmap=cmap)
  plt.title("Regridded sic data (Jan 2020)");

  Traceback:
  dissimilar number of outputs for key "text/plain"<<<<<<<<<<<< Reference outputs from ipynb file:
  <Figure size LENGTHxWIDTH with N Axes>
  ============ disagrees with newly computed (test) output:
  Text(0.5, 1.0, 'Regridded sic data (Jan 2020)')
  <Figure size LENGTHxWIDTH with N Axes>
  >>>>>>>>>>>>

  _ pavics-sdi-fix_nbs_jupyter_alpha_refresh_output/docs/source/notebooks/regridding.ipynb::Cell 20 _
  Notebook cell execution failed
  Cell 20: Cell outputs differ

  Input:
  reg_mask_cons = xe.Regridder(ds_in_mask, ds_tgt_mask, "conservative")
  tasmin_mask_cons = reg_mask_cons(ds_in_mask.tasmin)

  fig, ax = plt.subplots(figsize=(6, 4))
  tasmin_mask_cons.plot(cmap=cmap, ax=ax)
  ax.set_xlim(210, 320)
  ax.set_ylim(38, 86)
  ax.set_title("Conservative regridding without normalization - zoom on Canada")
  ax.annotate(
      "Some values are close to 0 Kelvins.\nCanada can get cold, but not that cold!",
      (280, 40),
      xytext=(1.3, 0.3),
      xycoords="data",
      textcoords="axes fraction",
      fontsize="x-large",
      arrowprops=dict(arrowstyle="->", connectionstyle="arc3, rad=-0.3"),
  );

  Traceback:
  dissimilar number of outputs for key "text/plain"<<<<<<<<<<<< Reference outputs from ipynb file:
  <Figure size LENGTHxWIDTH with N Axes>
  ============ disagrees with newly computed (test) output:
  Text(1.3, 0.3, 'Some values are close to 0 Kelvins.\nCanada can get cold, but not that cold!')
  <Figure size LENGTHxWIDTH with N Axes>
  >>>>>>>>>>>>

  _ pavics-sdi-fix_nbs_jupyter_alpha_refresh_output/docs/source/notebooks/regridding.ipynb::Cell 27 _
  Notebook cell execution failed
  Cell 27: Cell outputs differ

  Input:
  # Now we can plot easily the results as a choropleth map!
  ax = shapes_data.plot(
      "tasmin", legend=True, legend_kwds={"label": "Minimal temperature 1993-05-20 [K]"}
  )
  ax.set_ylabel("Latitude")
  ax.set_xlabel("Longitude");

  Traceback:
  dissimilar number of outputs for key "text/plain"<<<<<<<<<<<< Reference outputs from ipynb file:
  <Figure size LENGTHxWIDTH with N Axes>
  ============ disagrees with newly computed (test) output:
  Text(0.5, 91.20243008191655, 'Longitude')
  <Figure size LENGTHxWIDTH with N Axes>
  >>>>>>>>>>>>
```
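
The NBVAL_IGNORE_OUTPUT work-around named in the commit message above is a comment marker placed in the affected cell. As a rough sketch only, the marker could be added to a cell of the notebook JSON as shown below. The `add_ignore_marker` helper and the sample cell are hypothetical and not part of nbval or of the referenced commit.

```python
import json

# Comment marker that nbval recognizes in a code cell.
MARKER = "# NBVAL_IGNORE_OUTPUT"


def add_ignore_marker(cell):
    """Return a copy of a code cell whose source starts with the marker."""
    source = cell.get("source", "")
    # Notebook JSON stores source either as one string or as a list of lines.
    text = "".join(source) if isinstance(source, list) else source
    # Leave non-code cells and already marked cells unchanged.
    if cell.get("cell_type") != "code" or MARKER in text:
        return dict(cell)
    return {**cell, "source": f"{MARKER}\n{text}"}


# Hypothetical cell modelled on the end of Cell 27 from the log above.
cell = {
    "cell_type": "code",
    "source": ['ax.set_ylabel("Latitude")\n', 'ax.set_xlabel("Longitude");'],
}
print(json.dumps(add_ignore_marker(cell), indent=2))
```

With the marker in place the cell is still executed, but its output is not compared against the reference, so this hides the mismatch rather than explaining it.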
tlvu added a commit to Ouranosinc/PAVICS-e2e-workflow-tests that referenced this issue May 9, 2024
…t xclim and ravenpy to smooth transition (#121)

# Overview

This new full build has the latest version of almost everything except
`xclim` and `ravenpy`, as an intermediate step to smooth the transition to
the `pandas` 2.2 frequency-string changes.

## Changes

- New: save the conda env export, DockerHub build logs and Jenkins test
results in the repo to track changes much more easily between releases

- Jenkins: add `SAVE_RESULTING_NOTEBOOK_TIMEOUT` for slow notebooks or
slow machines

- Jupyter env changes:
  - add `conda-pack` so we can export the conda env outside of the docker
    image if we need to run locally without docker
  - upgrade from Python 3.9 to 3.11
  - Relevant changes (alphabetical order):
```diff
-  - birdy=0.8.4=pyh1a96a4e_0
+      - birdhouse-birdy==0.8.7

# major upgrade from v2 to v3
-  - bokeh=2.4.3=pyhd8ed1ab_3
+  - bokeh=3.4.1=pyhd8ed1ab_0

-  - cartopy=0.21.1=py39h6e7ad6e_0
+  - cartopy=0.23.0=py311h320fe9a_0

-  - cf_xarray=0.8.0=pyhd8ed1ab_0
+  - cf_xarray=0.9.0=pyhd8ed1ab_0

-  - cfgrib=0.9.10.4=pyhd8ed1ab_0
+  - cfgrib=0.9.11.0=pyhd8ed1ab_0

-  - cftime=1.6.2=py39h2ae25f5_1
+  - cftime=1.6.3=py311h1f0f07a_0

-  - climpred=2.3.0=pyhd8ed1ab_0
+  - climpred=2.4.0=pyhd8ed1ab_0

-  - clisops=0.9.6=pyh1a96a4e_0
+  - clisops=0.13.0=pyhca7485f_0

-  - dask=2023.5.1=pyhd8ed1ab_0
+  - dask=2024.5.0=pyhd8ed1ab_0

-  - geopandas=0.13.0=pyhd8ed1ab_0
+  - geopandas=0.14.4=pyhd8ed1ab_0

-  - hvplot=0.8.3=pyhd8ed1ab_0
+  - hvplot=0.9.2=pyhd8ed1ab_0

-  - numpy=1.23.5=py39h3d75532_0
+  - numpy=1.24.4=py311h64a7726_0

-  - numba=0.57.0=py39hb75a051_1
+  - numba=0.59.1=py311h96b013e_0

# major upgrade from v1 to v2
-  - pandas=1.3.5=py39hde0f152_0
+  - pandas=2.1.4=py311h320fe9a_0

# major upgrade to v1
-  - panel=0.14.4=pyhd8ed1ab_0
+  - panel=1.4.2=pyhd8ed1ab_0

# major upgrade from v1 to v2
-  - pydantic=1.10.8=py39hd1e30aa_0
+  - pydantic=2.7.1=pyhd8ed1ab_0

# Python 3.9 to 3.11
-  - python=3.9.16=h2782a2a_0_cpython
+  - python=3.11.6=hab00c5b_0_cpython

-  - raven-hydro=0.2.1=py39h8e2dbb5_1
+  - raven-hydro=0.2.4=py311h64a4d7b_0

-  - ravenpy=0.12.1=py39hf3d152e_0
+      - ravenpy==0.13.1

-  - rioxarray=0.14.1=pyhd8ed1ab_0
+  - rioxarray=0.15.5=pyhd8ed1ab_0

-  - roocs-utils=0.6.4=pyh1a96a4e_0
+  - roocs-utils=0.6.8=pyhd8ed1ab_0

-  - scipy=1.9.1=py39h8ba3f38_0
+  - scipy=1.13.0=py311h517d4fd_1

-  - xarray=2023.1.0=pyhd8ed1ab_0
+  - xarray=2023.8.0=pyhd8ed1ab_0

-  - xclim=0.43.0=py39hf3d152e_1
+  - xclim=0.47.0=py311h38be061_0

-  - xesmf=0.7.1=pyhd8ed1ab_0
+  - xesmf=0.8.5=pyhd8ed1ab_0

-  - xskillscore=0.0.24=pyhd8ed1ab_0
+  - xskillscore=0.0.26=pyhd8ed1ab_0

+  - xscen=0.8.2=pyhd8ed1ab_0

+      - figanos==0.3.0

-      - xncml==0.2
+      - xncml==0.4.0

```
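
A version diff like the one above could be derived from two saved conda env exports. The following is only a sketch under assumptions: the `parse_specs` and `changed_versions` helpers are hypothetical, they expect only dependency lines of the form `name=version=build` or `name==version`, and the sample lines are taken from the diff above.

```python
def parse_specs(lines):
    """Map package name to version for conda or pip style dependency lines."""
    specs = {}
    for line in lines:
        # Drop list dashes and surrounding whitespace.
        spec = line.strip().lstrip("-").strip()
        if not spec or spec.startswith("#"):
            continue
        name, _, rest = spec.partition("=")
        # "==0.8.7" leaves "=0.8.7" here; conda specs also carry a build string.
        specs[name] = rest.lstrip("=").split("=")[0]
    return specs


def changed_versions(old_lines, new_lines):
    """Return {name: (old_version, new_version)} for every package that differs."""
    old, new = parse_specs(old_lines), parse_specs(new_lines)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


old_env = ["  - bokeh=2.4.3=pyhd8ed1ab_3", "  - pandas=1.3.5=py39hde0f152_0"]
new_env = [
    "  - bokeh=3.4.1=pyhd8ed1ab_0",
    "  - pandas=2.1.4=py311h320fe9a_0",
    "  - xscen=0.8.2=pyhd8ed1ab_0",
]
for name, (before, after) in changed_versions(old_env, new_env).items():
    print(f"{name}: {before} -> {after}")
```

A package present on only one side is reported with `None` for the missing version.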


## Test

- Deployed as "beta" image in production for bokeh visualization
performance regression testing.
- Manually tested notebook
https://github.com/Ouranosinc/PAVICS-landing/blob/master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb
for bokeh visualization performance and it looks fine.
- Jenkins build:
  - Default notebooks, all passed:
    https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/54792e6510adfcd1bb21e1bd31fdfa36c5c634e0/docker/saved_buildout/jenkins-buildlogs-default.txt
  - Raven notebooks, only known `HydroShare_integration.ipynb` failing:
    https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/931cfc924a147d07b59e88badff9f170e852a03b/docker/saved_buildout/jenkins-buildlogs-raven.txt


## Related Issue / Discussion

- Matching notebook fixes:
  - Pavics-sdi: PR Ouranosinc/pavics-sdi#321
  - Finch: PR url: None
  - PAVICS-landing: PR Ouranosinc/PAVICS-landing#78
  - RavenPy: PR CSHS-CWRA/RavenPy#356
  - Resolves Ouranosinc/PAVICS-landing#65
  - Resolves Ouranosinc/PAVICS-landing#66
  - Resolves Ouranosinc/PAVICS-landing#66

- Deployment to PAVICS:
bird-house/birdhouse-deploy#453

- Jenkins-config changes for new notebooks: PR url: None

- Other issues found while working on this one
  - computationalmodelling/nbval#204
  - jupyterlab-contrib/jupyter-archive#132
  - CSHS-CWRA/RavenPy#357
  - CSHS-CWRA/RavenPy#361
  - CSHS-CWRA/RavenPy#362

- Previous release: PR
#134


## Additional Information

Full diff conda env export:

81deb99...931cfc9#diff-e8f2a6a53085ae29bb7cedc701c1d345a330651ae971555e85a5c005e94f4cd9


Full new conda env export:

https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/931cfc924a147d07b59e88badff9f170e852a03b/docker/saved_buildout/conda-env-export.yml


DockerHub build log

https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/931cfc924a147d07b59e88badff9f170e852a03b/docker/saved_buildout/docker-buildlogs.txt