Doc updates #184

Merged
merged 13 commits into from
Dec 6, 2024
62 changes: 27 additions & 35 deletions README.md
@@ -4,32 +4,39 @@
![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)
![CI](https://github.com/alan-turing-institute/clim-recal/actions/workflows/ci.yaml/badge.svg)

Welcome to `clim-recal`, a specialized resource designed to tackle systematic errors or biases in **Regional Climate Models (RCMs)**. As researchers, policy-makers, and various stakeholders explore publicly available RCMs, they need to consider the challenge of biases that can affect the accurate representation of climate change signals.
Welcome to `clim-recal`, a specialised resource which provides a data-processing pipeline for extracting parts of the **UK Climate Projections 2018 Convection Permitting model (UKCP18-CPM)** in order to apply and assess **bias correction methods** via adjustment to and comparison with the **HadUK-Grid**.

`clim-recal` provides both a **broad review** of available bias correction methods as well as **software**, **practical tutorials** and **guidance** that help users apply these methods to various datasets.

`clim-recal` is an **extensive software library and guide to application of Bias Correction (BC) methods**:
`clim-recal:`

- Contains accessible information about the [why and how of bias correction for climate data](#why-bias-correction)
- Is a software library for the application of BC methods (see our full pipeline for bias correction of the ground-breaking local-scale (2.2km) [Convection Permitting Model (CPM)](https://www.metoffice.gov.uk/pub/data/weather/uk/ukcp18/science-reports/UKCP-Convection-permitting-model-projections-report.pdf)). `clim-recal` brings together different software packages in `python` and `R` that implement a variety of bias correction methods, making it easy to apply them to data and compare their outputs.
- Is a software library for pre-processing climate data to ready it for bias-correction
- Was developed in partnership with the MetOffice to ensure the propriety, quality, and usability of our work
- Provides a framework for open additions of new software libraries/bias correction methods (in planning)

# Overview: Bias Correction Pipeline
# Overview: Data-processing Pipeline

`clim-recal` is a debiasing pipeline, with the following steps:
Regional climate models (RCMs) contain systematic errors, or biases, in their output [^1]. Biases arise in RCMs for a number of reasons, such as the assumptions in the general circulation models (GCMs), and in the downscaling process from GCM to RCM.

Researchers, policy-makers and other stakeholders wishing to use publicly available RCMs need to consider a range of "bias correction" methods (sometimes referred to as "bias adjustment" or "recalibration").
Bias correction methods offer a means of adjusting the outputs of RCMs in a manner that might better reflect future climate change signals whilst preserving the natural and internal variability of climate [^2].

However, in order to apply and assess these methods, the climate model of interest needs to be overlaid to corresponding observation data. This can be a time-consuming and laborious process where data is spatially and temporally very granular.

The `clim-recal` pipeline addresses this by providing preprocessed data, including the innovative [UKCP18-CPM datasets](# The Datasets), to facilitate the assessment of these methods on aligned, reprojected data, without requiring the whole (very large) dataset.
Collaborator

I think the link to the datasets section might need to be specified as:

Suggested change
The `clim-recal` pipeline addresses this by providing preprocessed data, including the innovative [UKCP18-CPM datasets](# The Datasets), to facilitate the assessment of these methods on aligned, reprojected data, without requiring the whole (very large) dataset.
The `clim-recal` pipeline addresses this by providing preprocessed data, including the innovative [UKCP18-CPM datasets](#the-datasets), to facilitate the assessment of these methods on aligned, reprojected data, without requiring the whole (very large) dataset.
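
For reference, the anchor form in the suggestion follows from how GitHub-style Markdown renderers usually derive heading anchors: the heading text is lowercased, punctuation is dropped and spaces become hyphens. A rough shell sketch of that rule, assuming ASCII headings only (`heading_to_anchor` is a hypothetical helper, not part of `clim-recal`, and real renderers handle more cases, such as duplicate headings):

```shell
# Hypothetical helper: approximate the anchor a Markdown heading receives.
# Assumes an ASCII heading; this is not the renderer's exact algorithm.
heading_to_anchor() {
    printf '%s\n' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sed -E 's/[^a-z0-9 _-]//g; s/ /-/g'
}

heading_to_anchor "The Datasets"   # prints: the-datasets
```

A link to the `# The Datasets` heading would therefore be written as `(#the-datasets)`.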


`clim-recal` is a data-processing pipeline, with the following steps:

1. **Set-up & data download**
*We provide custom scripts to facilitate download of data*
2. **Preprocessing**
*This includes reprojecting, resampling & splitting the data prior to bias correction*
3. **Apply bias correction**
*Our pipeline embeds two distinct methods of bias correction*
4. **Assess the debiased data**
*We have developed a way to assess the quality of the debiasing step across multiple alternative methods*


For a quick start on bias correction, refer to our [pipeline guide](python/README.md).

Our work is, however, just like climate data, intended to be dynamic, and we welcome collaboration from researchers who wish to further our aims!


# Documentation

We are in the process of developing comprehensive documentation for our code base to supplement the guidance provided in this and other `README.md` files. In the interim, there is documentation available in the following forms:
@@ -44,50 +51,32 @@ We are in the process of developing comprehensive documentation for our code bas
## To use `clim-recal` programmatically

- There is an extensive [`API Reference`](docs/reference) for the Python code.
- Comments within `R` scripts

## To contribute to `clim-recal`

- See the [Contributing](docs/contributing.md) section below

# The Datasets

## UKCP18
## UKCP18-CPM
The [UK Climate Projections 2018 (UKCP18)](https://www.metoffice.gov.uk/research/approach/collaboration/ukcp) dataset offers insights into the potential climate changes in the UK. UKCP18 is an advancement of the UKCP09 projections and delivers the latest evaluations of the UK's possible climate alterations in land and marine regions throughout the 21st century. This crucial information aids in future Climate Change Risk Assessments and supports the UK’s adaptation to climate change challenges and opportunities as per the National Adaptation Programme.
We make use of the [Convection Permitting Model (CPM)](https://www.metoffice.gov.uk/pub/data/weather/uk/ukcp18/science-reports/UKCP-Convection-permitting-model-projections-report.pdf). This dataset represents a much finer resolution of climate model (2.2km grid) than typical climate-models, representing a step forward in the ability to simulate small scale behavior (in particular 'atmospheric convection'), and the influence of mountains, coastlines and urban areas. As a result, the CPM provides access to credible climate information important for small-scale weather features and also on local (kilometre) scale; which is particularly important for improving our understanding of climate change in cities.

The UKCP18-CPM is comprised of 12 ensemble members (or runs). In addition to run 1, we selected the runs which represented the mean, 2nd highest and 2nd lowest daily tasmax values across the whole sequence (runs 5, 6, 7 & 8) to provide users with enough uncertainty in their estimates to appropriately assess bias correction methods.

## HADS
[HadUK-Grid](https://www.metoffice.gov.uk/research/climate/maps-and-data/data/haduk-grid/haduk-grid) is a comprehensive collection of climate data for the UK, compiled from various land surface observations across the country. This data is organized into a uniform grid to ensure consistent coverage throughout the UK at up to 1km x 1km resolution. The dataset, spanning from 1836 to the present, includes a variety of climate variables such as air temperature, precipitation, sunshine, and wind speed, available on daily, monthly, seasonal, and annual timescales.

# Why Bias Correction?

Regional climate models contain systematic errors, or biases, in their output [^1]. Biases arise in RCMs for a number of reasons, such as the assumptions in the general circulation models (GCMs), and in the downscaling process from GCM to RCM.

Researchers, policy-makers and other stakeholders wishing to use publicly available RCMs need to consider a range of "bias correction" methods (sometimes referred to as "bias adjustment" or "recalibration"). Bias correction methods offer a means of adjusting the outputs of RCMs in a manner that might better reflect future climate change signals whilst preserving the natural and internal variability of climate [^2].

Part of the `clim-recal` project is to review several bias correction methods. This work is ongoing and you can find our initial [taxonomy here](https://docs.google.com/spreadsheets/d/18LIc8omSMTzOWM60aFNv1EZUl1qQN_DG8HFy1_0NdWk/edit?usp=sharing). When we've completed our literature review, it will be submitted for publication in an open peer-reviewed journal.

Our work is, however, just like climate data, intended to be dynamic, and we are in the process of setting up a pipeline so that researchers creating new methods of bias correction can submit their methods for inclusion in the `clim-recal` repository.

[^1]: Senatore et al., 2022, <https://doi.org/10.1016/j.ejrh.2022.101120>
[^2]: Ayar et al., 2021, <https://doi.org/10.1038/s41598-021-82715-1>


# Contributing

If you have suggestions on the repository, or would like to include a new method (see below) or library, please
If you have suggestions on the repository, please:
- raise an [issue](https://github.com/alan-turing-institute/clim-recal/issues)
- [get in touch](mailto:[email protected])
- see our [contributing](docs/contributing.md) section, which includes details on contriubting to the documentation.
Collaborator

Suggested change
- see our [contributing](docs/contributing.md) section, which includes details on contriubting to the documentation.
- see our [contributing](docs/contributing.md) section, which includes details on contributing to the documentation.


All are welcome and appreciated.

# Future plans
- **Finish refactor for BC**: The infrastructure for testing bias correction methods needs some reworking and documentation.
- **Release BC results**: Provide results from example BC runs.
- **More BC Methods**: Further bias correction of UKCP18 products. *This is planned for a future release and is not available yet.*
- **Pipeline for adding new methods**: *This is planned for a future release and is not available yet.*


## Acknowledgements

Prior to 12th September 2024 we included a reference to the [python-cmethods](https://github.com/btschwertfeger/python-cmethods) library, written by Benjamin Thomas Schwertfeger.
@@ -102,6 +91,9 @@ Inadvertently, we did not identify that the license for the `python-cmethods` li
* Added the citation below.


## Citation
## Citations

[^1]: Senatore et al., 2022, <https://doi.org/10.1016/j.ejrh.2022.101120>
[^2]: Ayar et al., 2021, <https://doi.org/10.1038/s41598-021-82715-1>

**python-cmethods**: Benjamin T. Schwertfeger. (2024). btschwertfeger/python-cmethods: v2.3.0 (v2.3.0). Zenodo. https://doi.org/10.5281/zenodo.12168002
3 changes: 0 additions & 3 deletions _quarto.yml
@@ -9,9 +9,6 @@ project:
- "README.md"
- "setup-instructions.md"
- "!clim-recal.Rproj"
- "R/README.md"
- "R/misc/Identifying_Runs.md"
- "R/comparing-r-and-python/HADs-reprojection/WIP-Comparing-HADs-grids.md"
- "docs/cpm_projection.qmd"
- "docs/reference"
- "docs/contributing.md"
8 changes: 4 additions & 4 deletions docs/download.qmd
Collaborator

I'd be interested to know why you changed the region names to lowercase? Is this just cosmetic or did it not work otherwise?

  • If it is cosmetic, I'm happy to go with lowercase.
  • If the download commands didn't work with the title case, then we need to check this. It implies there is something wrong - possibly a setting in the Azure storage that we need to double-check.

(All the commands use the grep switches -i and -E. It might be appropriate to add a small note to explain the meaning of these and/or point to existing documentation for these standard tools).

Collaborator

Thanks for flagging this @andrewphilipsmith - also happy to go with lowercase. I've taken a look and both title and lower case versions of the example command seem to function as expected.

Collaborator Author

Ah thanks folks - I thought when I tried only the lowercase version worked for some reason! But probably I just had a typo in the actual name or something - I will change back!

@@ -34,21 +34,21 @@ grep -iE "resample.*cpm.*rainfall.*01.*_[0-9]{8}-[0-9]{8}.*" data-v1.0.txt | xar

## Crops
### HADS
For a given region `<REGION>` (either `Scotland`, `Glasgow`, `Manchester` or `London`), for measurement `<MEASURE>` (either `tasmax`, `tasmin` or `pr`), the monthly data can be downloaded and decompressed with:
For a given region `<REGION>` (either `scotland`, `glasgow`, `manchester` or `london`), for measurement `<MEASURE>` (either `tasmax`, `tasmin` or `pr`), the monthly data can be downloaded and decompressed with:
```shell
grep -iE "crop.*hads.*<REGION>.*<MEASURE>.*_[0-9]{8}-[0-9]{8}.*" data-v1.0.txt | xargs -n 1 curl -O; gunzip *.nc.gz
```
For example, for region `Manchester`, measure `tasmax`:
For example, for region `manchester`, measure `tasmax`:
```shell
grep -iE ".*crop.*hads.*manchester.*tasmax.*_[0-9]{8}-[0-9]{8}\.nc\.gz" data-v1.0.txt | xargs -n 1 curl -O; gunzip *.nc.gz
```
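
The commands in this section rely on two standard `grep` switches: `-i` makes matching case-insensitive (so `Manchester` and `manchester` select the same lines), and `-E` enables extended regular expressions (so `[0-9]{8}` means exactly eight digits). The sketch below shows their effect without downloading anything. It uses a made-up manifest: the URLs are hypothetical and only mimic the shape of the pattern; they are not entries from `data-v1.0.txt`.

```shell
# Hypothetical manifest: these URLs are invented for illustration only.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
https://example.org/crop/hads/Manchester/tasmax/tasmax_hads_19800101-19800131.nc.gz
https://example.org/crop/hads/Manchester/tasmin/tasmin_hads_19800101-19800131.nc.gz
https://example.org/crop/hads/Glasgow/tasmax/tasmax_hads_19800101-19800131.nc.gz
EOF

pattern='.*crop.*hads.*manchester.*tasmax.*_[0-9]{8}-[0-9]{8}\.nc\.gz'

# -i: ignore case, so "manchester" in the pattern matches "Manchester" in the file.
# -E: extended regex, so {8} is read as a repetition count.
# -c: print the number of matching lines instead of the lines themselves.
matched_insensitive=$(grep -ciE "$pattern" "$manifest")

# Without -i, the lowercase pattern no longer matches the title-case path.
# grep exits non-zero when nothing matches, hence the "|| true".
matched_sensitive=$(grep -cE "$pattern" "$manifest" || true)

echo "with -i: $matched_insensitive, without -i: $matched_sensitive"
rm -f "$manifest"
```

Because every download command here passes `-i`, the case used for the region name in the pattern should not change which files are selected.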

### CPM
For a given region `<REGION>` (either `Scotland`, `Glasgow`, `Manchester` or `London`), for measurement `<MEASURE>` (either `tasmax`, `tasmin` or `pr`), for run `<RUN>` (either `01`, `05`, `06`, `07`, `08`), the yearly data can be downloaded and decompressed with:
For a given region `<REGION>` (either `scotland`, `glasgow`, `manchester` or `london`), for measurement `<MEASURE>` (either `tasmax`, `tasmin` or `pr`), for run `<RUN>` (either `01`, `05`, `06`, `07`, `08`), the yearly data can be downloaded and decompressed with:
```shell
grep -iE "crop.*cpm.*<REGION>.*<MEASURE>.*<RUN>_[0-9]{8}-[0-9]{8}.*" data-v1.0.txt | xargs -n 1 curl -O; gunzip *.nc.gz
```
For example, for region `Manchester`, measure `tasmax`, run `01`:
For example, for region `manchester`, measure `tasmax`, run `01`:
```shell
grep -iE ".*crop.*cpm.*manchester.*tasmax.*01_[0-9]{8}-[0-9]{8}\.nc\.gz" data-v1.0.txt | xargs -n 1 curl -O; gunzip *.nc.gz
```
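
When several region, measure and run combinations are needed, it may help to build the pattern in one place and reject values outside the documented lists before anything is downloaded. A minimal sketch, assuming the template above; `cpm_crop_pattern` is a hypothetical helper and not part of `clim-recal`:

```shell
# Hypothetical helper: print the grep pattern for a CPM crop,
# or return 1 if an argument is not one of the documented values.
cpm_crop_pattern() {
    # Lowercase the region so that "Manchester" and "manchester" are both accepted.
    region=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    measure=$2
    run=$3
    case "$region" in
        scotland|glasgow|manchester|london) ;;
        *) echo "unknown region: $1" >&2; return 1 ;;
    esac
    case "$measure" in
        tasmax|tasmin|pr) ;;
        *) echo "unknown measure: $measure" >&2; return 1 ;;
    esac
    case "$run" in
        01|05|06|07|08) ;;
        *) echo "unknown run: $run" >&2; return 1 ;;
    esac
    printf 'crop.*cpm.*%s.*%s.*%s_[0-9]{8}-[0-9]{8}.*\n' "$region" "$measure" "$run"
}

# Dry run: print the pattern only.
cpm_crop_pattern Manchester tasmax 01
# To download, pass the pattern to the command shown above, for example:
#   grep -iE "$(cpm_crop_pattern Manchester tasmax 01)" data-v1.0.txt | xargs -n 1 curl -O
```

Printing the pattern first makes it easy to check the selection with `grep -ciE` before starting a large download.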