diff --git a/README.md b/README.md
index e5c6d95a..516dff15 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # regional-mom6
 
-*Python package for automatic generation of regional configurations for the [Modular Ocean Model 6](https://github.com/mom-ocean/MOM6).*
+*Python package for automatic generation of regional configurations for the [Modular Ocean Model version 6](https://github.com/mom-ocean/MOM6) (MOM6).*
 
 [![Repo status](https://www.repostatus.org/badges/latest/active.svg?style=flat-square)](https://www.repostatus.org/#active)
 [![conda forge](https://img.shields.io/conda/vn/conda-forge/regional-mom6.svg)](https://anaconda.org/conda-forge/regional-mom6)
@@ -27,10 +27,12 @@ The idea behind this package is that it should let the user sidestep some of the
 
 ## Limitations
 
-- Only supports one type of regional horizontal grid, namely one that's equally spaced in longitude
-  and latitude. Users can provide their own grid, or ideally [open a pull request](https://github.com/COSIMA/regional-mom6/pulls) with a method that implements another type of horizontal grid!
-- Only boundary segments that are parallel to either lines of constant longitude or constant latitude
-  lines are supported.
+- Only generates regional horizontal grids with uniform spacing in longitude and latitude.
+  However, users can provide their own non-uniform grid, or ideally
+  [open a pull request](https://github.com/COSIMA/regional-mom6/pulls) with a method that
+  generates other types of horizontal grids.
+- Only supports boundary segments that are parallel to either lines of constant longitude or
+  lines of constant latitude.
 
 ## We want to hear from you
@@ -64,11 +66,11 @@ conda install conda-forge::regional-mom6
 
 That's it -- now enjoy!
 
-#### "*But I want `pip`, can't I install with `pip`*?"
+#### "*But I want pip, can't I install with pip*?"
 
 To install via `pip` is a bit more cumbersome.
 
-A prerequisite is the binary `esmpy` dependency, which provides regridding capabilities.
+A prerequisite is the binary `esmpy` dependency, which provides re-gridding capabilities.
 The easiest way to install `esmpy` is via conda:
 
 ```bash
@@ -97,7 +99,7 @@ pip install git+https://github.com/COSIMA/regional-mom6.git
 ```
 to get the version that corresponds to the latest commit in GitHub.
 
-Alternatively, install the version that corresponds to a particular git commit using
+Alternatively, install the version that corresponds to a particular git commit using, for example,
 
 ```bash
 pip install git+https://github.com/COSIMA/regional-mom6.git@061b0ef80c7cbc04de0566df329c4ea472002f7e
@@ -122,7 +124,6 @@ necessarily those used by the GFDL's [`MOM6_examples`](https://github.com/NOAA-G
 The [example notebooks](https://regional-mom6.readthedocs.io/en/latest/demos.html) walk you through how to use the package using two different sets of input datasets.
-Please ensure that you can get at least one of these working on your setup with your MOM6 executable before trying modify the example to suit your domain with your bathymethry, forcing, and boundary conditions.
-
-You can download the notebooks [from Github](https://github.com/COSIMA/regional-mom6/tree/ncc/installation/demos) or by clicking on the download download button, e.g., at the top-right of the [regional tasmania forced by ERA5 example](https://regional-mom6.readthedocs.io/en/latest/demo_notebooks/reanalysis-forced.html).
+Please ensure that you can get at least one of these working on your setup with your MOM6 executable before trying to modify the example to suit your domain with your bathymetry, forcing, and boundary conditions.
+You can download the notebooks [from GitHub](https://github.com/COSIMA/regional-mom6/tree/main/demos) or by clicking on the download button, e.g., at the top-right of the [regional Tasmania forced by ERA5 example](https://regional-mom6.readthedocs.io/en/latest/demo_notebooks/reanalysis-forced.html).
diff --git a/demos/access_om2-forced.ipynb b/demos/access_om2-forced.ipynb
index 9b3cb19a..4ceac2ce 100644
--- a/demos/access_om2-forced.ipynb
+++ b/demos/access_om2-forced.ipynb
@@ -870,7 +870,7 @@
     "expt.initial_condition(\n",
     "    tmp_dir + '/ic_unprocessed.nc', # directory where the unprocessed initial condition is stored, as defined earlier\n",
     "    ocean_varnames,\n",
-    "    gridtype=\"B\"\n",
+    "    arakawa_grid=\"B\"\n",
     "    )\n",
     "\n",
     "# Now iterate through our four boundaries \n",
diff --git a/demos/premade_run_directories/README.md b/demos/premade_run_directories/README.md
new file mode 100644
index 00000000..a3488a41
--- /dev/null
+++ b/demos/premade_run_directories/README.md
@@ -0,0 +1,5 @@
+## Premade Run Directories
+
+These directories are used for the demo notebooks, and can be used as templates for setting up a new experiment. The [documentation](https://regional-mom6.readthedocs.io/en/latest/mom6-file-structure-primer.html) explains what all the files are for.
+
+The `common_files` folder contains all of the required files for a regional MOM6 run, including a `data_table` with constant surface forcing as a placeholder. The other two directories offer different choices for the surface forcing, affecting the `data_table`.
diff --git a/demos/premade_run_directories/readme_premade_rundirs.md b/demos/premade_run_directories/readme_premade_rundirs.md
deleted file mode 100644
index 18b5da0c..00000000
--- a/demos/premade_run_directories/readme_premade_rundirs.md
+++ /dev/null
@@ -1,6 +0,0 @@
-
-## Premade Run Directories
-
-These folders are used for the demo notebooks, and can be used as templates for setting up a new experiment. The difference in the choices is the surface forcing, affecting the `data_table`.
-
-The `common_files` folder contains all of the required files for a regional MOM6 run, including a `data_table` with constant surface forcing as a placeholder. If you call `.setup_rundir` and specify `jra`
diff --git a/demos/reanalysis-forced.ipynb b/demos/reanalysis-forced.ipynb
index dce24e32..7dd06778 100644
--- a/demos/reanalysis-forced.ipynb
+++ b/demos/reanalysis-forced.ipynb
@@ -257,7 +257,7 @@
     "expt.initial_condition(\n",
     "    glorys_path / \"ic_unprocessed.nc\", # directory where the unprocessed initial condition is stored, as defined earlier\n",
     "    ocean_varnames,\n",
-    "    gridtype=\"A\"\n",
+    "    arakawa_grid=\"A\"\n",
     "    )\n",
     "\n",
     "# Now iterate through our four boundaries \n",
diff --git a/docs/contributing.md b/docs/contributing.md
index 76ea192c..ca7e363c 100644
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -10,20 +10,21 @@ how new docstrings or any new bits of documentation that you may have added look
 ## Testing
 
 To run the tests from a local clone of the repository we first need to create a conda
-environment with all the dependencies required. From the repositories local clone main
-directory do
+environment with all the required dependencies.
+
+We create the environment by calling
 
 ```{code-block} bash
 conda env create --prefix ./env --file environment-ci.yml
 ```
 
-and then activate it
+from the repository's local clone main directory. Then we activate it via
 
 ```{code-block} bash
 conda activate ./env
 ```
 
-Now we need to install the package in this environment as well as `pytest`
+We then install the package in this environment as well as the `pytest` package:
 
 ```{code-block} bash
 python -m pip install .
@@ -36,8 +37,8 @@ Now we can run the tests with
 python -m pytest tests/
 ```
 
-If we also want to run the doctests (tests that appear as examples in various docstrings), we
-can use
+If we also want to run the doctests (that is, the tests that appear as examples in
+various docstrings), we can use
 
 ```{code-block} bash
 python -m pytest --doctest-modules tests/ regional_mom6/
 ```
@@ -46,30 +47,31 @@ python -m pytest --doctest-modules tests/ regional_mom6/
 ## Documentation
 
 To build the docs from a local clone of the repository we first need to create a conda
-environment. Navigate to the `docs` directory of your local repository clone (e.g., `cd docs`)
-and then
+environment after navigating to the `docs` directory of our local repository clone.
 
 ```{code-block} bash
 cd docs
 conda create --name docs-env --file requirements.txt
 ```
 
-Then activate this environment and install the package itself as an editable install (`pip install -e`).
+We activate this environment and install the package itself as an editable install (`pip install -e`).
 
 ```{code-block} bash
 conda activate docs-env
 pip install -e ..
 ```
 
-Now we can make the docs
+Now we can build the docs via `make`:
 
 ```{code-block} bash
 make html
 ```
 
-and open `_build/html/index.html` in your favorite browser.
+and upon successful build, we preview the documentation by opening `_build/html/index.html`
+in our favorite browser.
 
-Alternatively, we can install the dependencies needed for the docs via `pip`; the rest is same, that is
+Alternatively, instead of creating a conda environment, we can install the required
+dependencies for the docs via `pip`; the rest is the same, that is
 
 ```{code-block} bash
 cd docs
diff --git a/docs/index.rst b/docs/index.rst
index f1534b06..1b1fcba0 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,7 +1,7 @@
 Regional MOM6 Documentation
 ===========================
 
-*Python package for automatic generation of regional configurations for the `Modular Ocean Model 6`_.*
+Python package for automatic generation of regional configurations for the `Modular Ocean Model version 6`_ (MOM6).
 
 In brief...
 
@@ -35,11 +35,11 @@ Features
 Limitations
 ------------
 
-- Only supports one type of regional horizontal grid, namely one that's equally spaced in longitude
-  and latitude. Users can provide their own grid, or ideally `open a pull request`_ with a method
-  that implements another type of horizontal grid!
-- Only boundary segments that are parallel to either lines of constant longitude or constant latitude
-  lines are supported.
+- Only generates regional horizontal grids with uniform spacing in longitude and latitude.
+  However, users can provide their own non-uniform grid, or ideally `open a pull request`_
+  with a method that generates other types of horizontal grids.
+- Only supports boundary segments that are parallel to either lines of constant longitude or lines of
+  constant latitude.
 
 
 What you need to get started
@@ -50,12 +50,12 @@
 3. a bathymetry file that at least covers your domain,
 4. 3D ocean forcing files *of any resolution* on your choice of A, B, or C Arakawa grid,
 5. surface forcing files (e.g., from ERA or JRA reanalysis), and
-6. `GFDL's FRE tools `_ be downloaded and compiled on the machine you are using.
+6. `GFDL's FRE tools `_ downloaded and compiled on the machine you are using.
 
 Browse through the `demos `_.
 
-.. _Modular Ocean Model 6: https://github.com/mom-ocean/MOM6
+.. _Modular Ocean Model version 6: https://github.com/mom-ocean/MOM6
 .. _open a pull request: https://github.com/COSIMA/regional-mom6/pulls
diff --git a/docs/installation.md b/docs/installation.md
index 950d31e9..0fad0d1f 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -1,4 +1,5 @@
-# Installation
+Installation
+============
 
 We encourage creating a new or using an existing conda environment.
 
@@ -16,7 +17,7 @@ That's it -- now enjoy!
 
 To install via `pip` is a bit more cumbersome.
 
-A prerequisite is the binary `esmpy` dependency, which provides regridding capabilities.
+A prerequisite is the binary `esmpy` dependency, which provides re-gridding capabilities.
 The easiest way to install `esmpy` is via conda:
 
 ```bash
@@ -45,7 +46,7 @@ pip install git+https://github.com/COSIMA/regional-mom6.git
 ```
 to get the version that corresponds to the latest commit in GitHub.
 
-Alternatively, install the version that corresponds to a particular git commit using
+Alternatively, install the version that corresponds to a particular git commit using, for example,
 
 ```bash
 pip install git+https://github.com/COSIMA/regional-mom6.git@061b0ef80c7cbc04de0566df329c4ea472002f7e
diff --git a/docs/mom6-file-structure-primer.md b/docs/mom6-file-structure-primer.md
index 503eb6c0..708fdebd 100644
--- a/docs/mom6-file-structure-primer.md
+++ b/docs/mom6-file-structure-primer.md
@@ -1,44 +1,47 @@
 A primer on MOM6 file structure
 ===============================
 
-Here we describe the various directories and files that `regional-mom6` package produces alongside with
-userful insights that, hopefully, will help users deal with troubleshooting and more advanced customisations.
+Here we describe the various directories and files that the `regional-mom6` package produces, and provide
+useful insights that, hopefully, will help users deal with troubleshooting and more advanced customisations.
 
 ## `run` directory
 
 The directory, specified by the `mom_run_dir` path keyword argument in the `experiment` class, contains only
 text files that configure MOM6 and are used at model initialisation.
-You can see examples of these files in the `premade_run_directories`.
+You can see examples of these files in [`demos/premade_run_directories`](https://github.com/COSIMA/regional-mom6/tree/main/demos/premade_run_directories).
 These files are:
 
 * `input.nml`:
   High-level information that is passed directly to each component of your MOM6 setup.
-  The paths of to the `SIS` and `MOM` input directories and outputs are included.
+  The paths of the `SIS` and `MOM` input directories and outputs are included.
   The `coupler` section turns on or off different model components, and specifies how long to run the experiment for.
 
 * `diag_table`:
   The diagnostics to output at model runtime.
   We need to choose wisely the quantities/output frequency that are relevant to our experiment and the
-  analysis we plan to do otherwise the size of output can quickly grow a lot.
+  analysis we plan to do; otherwise, the size of the output can quickly become excessive.
   Each line in the `diag_table` either specifies a new output *file* and its associated characteristics, or a
   new output *variable* and the matching file that it should go in (which needs to already have been specified).
-  If uncertain of the available diagnostics, we can runthe model for a short period (e.g., 1 hour) and then
+  If uncertain of the available diagnostics, we can run the model for a short period (e.g., 1 hour) and then
   look in the output directory for `available_diags` that lists every available diagnostic for our model
   configuration, also mentioning which grids the quantity can be output on.
   Aside from the native model grid, we can create our own custom vertical coordinates to output on.
-  To output on a custom vertical coordinate, create a netCDF that contains all of the vertical points
+  To output on a custom vertical coordinate, create a netCDF file that contains all of the vertical points
   (in the coordinate of your choice) and then edit the `MOM_input` file to specify additional diagnostic coordinates.
   After that, we can use this custom vertical coordinate in the `diag_table`.
   Instructions for how to format the `diag_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/api/generated/pages/Diagnostics.html).
 
-* `data_table`
+* `data_table`:
   The data table is read by the coupler to provide the different model components with inputs.
   Instructions for how to format the `data_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/forcing.html).
 
-* `MOM_input / SIS_input`
+* `field_table`:
+  The field table defines tracer fields.
+
+* `MOM_input / SIS_input`:
   Basic settings for the core MOM and SIS code with reasonably good documentation.
   After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which
   lists every possible setting you can modify for your experiment.
   The `regional-mom6` package can copy and modify a default set of input files to work with your experiment.
@@ -50,18 +53,20 @@ These files are:
   Another important section for regional modelling is the specification of open boundary segments.
   A separate line for each boundary in our domain is included, and also any additional tracers need to be specified here.
 
-* `MOM_override`
-  This file serves to override settings chosen in other input files. This is helpful for making a distinction between the thing you're fiddling with and 95% of the settings that you'll always be leaving alone. For instance, you might need to temporarily change your baroclinic (`DT`), thermal (`DT_THERM`) or baroclinic (`DT_BT`) timesteps, or are doing perturbation experiments that requires you to switch between different bathymetry files.
+* `MOM_override`:
+  This file serves to override settings chosen in other input files. This is helpful for making a distinction between the thing you're fiddling with and 95% of the settings that you'll always be leaving alone. For instance, you might need to temporarily change your baroclinic (`DT`), tracer (`DT_THERM`) or barotropic (`DT_BT`) timesteps, or are doing perturbation experiments that require you to switch between different bathymetry files.
 
-* `config file`
-  This file is machine dependent and environment dependent. For instance, if you're using Australia's National Computational Infrastructure (NCI), then you're likely using the `payu` framework, and you'll have a `config.yml` file.
-  Regardless of what it looks like, this file should contain information that points to the executable, your input directory (aka the `mom_input_dir` you specified), the computational resources you'd like to request and other various settings.
+* `MOM_layout`:
+  This file provides information on the model grid, processor layout and I/O layout.
 
-  The package does come with a premade `config.yml` file for payu users which is automatically copied and modified when the appropriate flag is passed to the `setup_rundir` method. If you find this package useful and you use a different machine, I'd encourage you to provide an example config file for your institution! Then this could be copied into.
+* `config` file:
+  This file is machine dependent and environment dependent. For instance, if you're using Australia's National Computational Infrastructure (NCI), then you're likely using the [`payu`](https://payu.readthedocs.io/en/latest/) framework, and you'll have a `config.yaml` file. Regardless of what it looks like, this file should contain information that points to the executable, your input directory (aka the `mom_input_dir` you specified), the computational resources you'd like to request and other various settings.
+  The package does come with a premade `config.yaml` file for payu users which is automatically copied and modified when the appropriate flag is passed to the `setup_rundir` method. If you find this package useful and you use a different machine, we'd encourage you to provide an example config file for your institution!
 
 ## `input` directory
 
-The directory referred to by as `mom_input_dir` path that hosts mostly netCDF files that are read by MOM6 at runtime.
+The `mom_input_dir` directory stores mostly netCDF files that are read by MOM6 at runtime.
 These files can be big, so it is usually helpful to store them somewhere without any disk limitations.
 
 * `hgrid.nc`
@@ -69,24 +74,24 @@ These files can be big, so it is usually helpful to store them somewhere without
   horizontal dimension as one would expect from the domain extent and the chosen resolution. This is because
   *all* points on the Arakawa C grid are included: both velocity and tracer points live in the 'supergrid'.
   For a regional configuration, we need to use the 'symmetric memory' configuration of the MOM6 executable. This implies that the
-  horizontal grids boundary must be entirely composed of cell edge points (like those used by velocities). Therefore,
-  for example, a model configuration that is 20-degrees wide in longitude and has 0.5 degrees longitudinal resolution,would imply that it has 40 cells in the `x` dimension and thus a supergrid with `nx = 41`.
+  horizontal grid's boundary must be entirely composed of cell edge points (i.e., those used by velocities). Therefore,
+  for example, a model configuration that is 20 degrees wide in longitude and has 0.5 degrees longitudinal resolution would have 40 cells in the `x` dimension and thus a supergrid with `nx = 41`.
 
   The `nx` and `ny` points are where data is stored, whereas `nxp` and `nyp` here define the spaces between points used to compute area.
   The `x` and `y` variables in `hgrid` refer to the longitude and latitude. Importantly, `x`
-  and `y` are both two-dimensional (they both depend on both `nyx` and `nyp`) meaning that the grid does not have
-  to follow lines of constant latitude or longitude. Users that create their own own custom horizontal and vertical
-  grids can `read_existing_grid` to `True` when creating an experiment.
+  and `y` are both two-dimensional (they both depend on both `nx` and `ny`) meaning that the grid does not have
+  to follow lines of constant latitude or longitude. Users who create their own custom horizontal and vertical
+  grids can set `read_existing_grid` to `True` when creating an experiment.
 
 * `vcoord.nc`
-  The values of the vertical coordinate. By default, `regional-mom6` sets up a `z*` vertical coordinate but other
-  coordinates may be provided after appropriate adjustments in the `MOM_input` file. Users that would like to
-  customise the vertical coordinate can initialise an `experiment` object to begin with, then modify and the `vcoord.nc`
+  The values of the vertical coordinate. By default, `regional-mom6` sets up a `z*` vertical coordinate, but other
+  coordinates may be provided after appropriate adjustments in the `MOM_input` file. Users who would like to
+  customise the vertical coordinate can initialise an `experiment` object to begin with, then modify the `vcoord.nc`
   file and save. Users can provide additional vertical coordinates (under different names) for diagnostic purposes.
-  These additional vertical coordinates allow diagnostics to be remapped and output during runtime.
+  These additional vertical coordinates allow diagnostics to be remapped and output during the model run.
 
 * `bathymetry.nc`
-  Fairly self explanatory, but can be the source of some difficulty. The package automatically attempts to remove "non advective cells". These are small enclosed lakes at the boundary that can cause numerical problems whereby water might flow in but have no way to flow out. Likewise, there can be issues with very shallow (only 1 or 2 layers) or very narrow (1 cell wide) channels. If your model runs for a while but then gets extreme sea surface height values, it could be caused by an unlucky combination of boundary and bathymetry.
+  Fairly self-explanatory, but can be the source of some difficulty. The package automatically attempts to remove "non-advective cells". These are small enclosed lakes at the boundary that can cause numerical problems whereby water might flow in but have no way to flow out. Likewise, there can be issues with very shallow (only 1 or 2 layers) or very narrow (1 cell wide) channels. If your model runs for a while but then gets extreme sea surface height values, it could be caused by an unlucky combination of boundary and bathymetry.
 
   Another thing to note is that the bathymetry interpolation can be computationally intensive. For a
   high-resolution dataset like GEBCO and a large domain, one might not be able to execute the `.setup_bathymetry` method within
diff --git a/regional_mom6/regional_mom6.py b/regional_mom6/regional_mom6.py
index 91757925..6b791c1d 100644
--- a/regional_mom6/regional_mom6.py
+++ b/regional_mom6/regional_mom6.py
@@ -33,8 +33,9 @@
 def longitude_slicer(data, longitude_extent, longitude_coords):
     """
-    Slice longitudes, handling periodicity and 'seams' where the
-    data wraps around (commonly either in domain [0, 360], [-180, 180], or [-270, 90]).
+    Slice longitudes while handling periodicity and the 'seams', that is, the
+    longitude values where the data wraps around in a global domain (for example,
+    longitudes are defined, usually, within the domain [0, 360] or [-180, 180]).
 
     The algorithm works in five steps:
 
@@ -50,11 +51,11 @@ the information we have about the way the dataset was shifted/rolled.
 
     - Slice the ``data`` index-wise.
      We know that ``|longitude_extent[1] - longitude_extent[0]| / 360``
-     multiplied by the number of discrete longitude points will give
-     the total width of our slice, and we've already set the midpoint
+     multiplied by the number of discrete longitude points in the global input data gives
+     the number of longitude points in our slice, and we've already set the midpoint
      to be the middle of the target domain.
 
-    - Finally re-add the right multiple of 360 so the whole domain matches
+    - Finally re-add the correct multiple of 360 so the whole domain matches
      the target.
 
    Args:
@@ -120,7 +121,7 @@ def longitude_slicer(data, longitude_extent, longitude_coords):
 
         new_lon[new_seam_index:] += 360
 
-        ## new_x is used to recentre the midpoint to match that of target domain
+        ## new_lon is used to re-centre the midpoint to match that of the target domain
         new_lon -= i * 360
 
         new_data = new_data.assign_coords({lon: new_lon})
@@ -163,7 +164,7 @@ def hyperbolictan_thickness_profile(nlayers, ratio, total_depth):
     ``(1 + ratio * exp(2π)) / (ratio + exp(2π))``. This slight departure comes
     about because of the value of the hyperbolic tangent profile at the end-points
     ``tanh(π)``, which is approximately 0.9963 and not 1. Note that because ``exp(2π)`` is much greater
-    than 1, the value of the actual ratio is not that different from prescribed value
+    than 1, the value of the actual ratio is not that different from the prescribed value
     ``ratio``, e.g., for ``ratio`` values between 1/100 and 100 the final ratio of the
     bottom-most layer to the top-most layer only departs from the prescribed ``ratio``
     by ±20%.
@@ -262,8 +263,8 @@ def hyperbolictan_thickness_profile(nlayers, ratio, total_depth):
 
 def rectangular_hgrid(λ, φ):
     """
-    Construct a horizontal grid with all the metadata required by MOM6 provided
-    an array of longitudes (``λ``) and latitudes (``φ``) on the supergrid.
+    Construct a horizontal grid with all the metadata required by MOM6, based on
+    arrays of longitudes (``λ``) and latitudes (``φ``) on the supergrid.
 
     Here, 'supergrid' refers to both cell edges and centres, meaning that there
     are twice as many points along each axis as for any individual field.
@@ -322,15 +323,15 @@ def rectangular_hgrid(λ, φ):
         "y": {"standard_name": "geographic_latitude", "units": "degree_north"},
         "dx": {
             "standard_name": "grid_edge_x_distance",
-            "units": "metres",
+            "units": "meters",
         },
         "dy": {
             "standard_name": "grid_edge_y_distance",
-            "units": "metres",
+            "units": "meters",
         },
         "area": {
             "standard_name": "grid_cell_area",
-            "units": "m2",
+            "units": "m**2",
         },
         "angle_dx": {
             "standard_name": "grid_vertex_x_angle_WRT_geographic_east",
@@ -368,13 +369,16 @@ class experiment:
     from MOM6 variable/coordinate name to the name in the input dataset.
 
     The class can be used to generate the grids for a new experiment, or to read in
-    an existing one by providing with ``read_existing_grids=True``.
+    an existing one (when ``read_existing_grids=True``; see argument description below).
 
     Args:
-        longitude_extent (Tuple[float]): Extent of the region in longitude in degrees.
-        latitude_extent (Tuple[float]): Extent of the region in latitude in degrees.
-        date_range (Tuple[str]): Start and end dates of the boundary forcing window.
-        resolution (float): Lateral resolution of the domain, in degrees.
+        longitude_extent (Tuple[float]): Extent of the region in longitude (in degrees). For
+            example: ``(40.5, 50.0)``.
+        latitude_extent (Tuple[float]): Extent of the region in latitude (in degrees). For
+            example: ``(-20.0, 30.0)``.
+        date_range (Tuple[str]): Start and end dates of the boundary forcing window. For
+            example: ``("2003-01-01", "2003-01-31")``.
+        resolution (float): Lateral resolution of the domain (in degrees).
         number_vertical_layers (int): Number of vertical layers.
         layer_thickness_ratio (float): Ratio of largest to smallest layer thickness;
             used as input in :func:`~hyperbolictan_thickness_profile`.
@@ -388,8 +392,9 @@ class experiment:
         repeat_year_forcing (Optional[bool]): When ``True`` the experiment runs with
             repeat-year forcing. When ``False`` (default) then inter-annual forcing is used.
         read_existing_grids (Optional[Bool]): When ``True``, instead of generating the grids,
-            reads the grids and ocean mask from ``mom_input_dir`` and ``mom_run_dir``. Useful
-            for modifying or troubleshooting experiments. Default: ``False``.
+            the grids and the ocean mask are read from within the ``mom_input_dir`` and
+            ``mom_run_dir`` directories. Useful for modifying or troubleshooting experiments.
+            Default: ``False``.
     """
 
     def __init__(
@@ -416,6 +421,7 @@
 
         self.mom_run_dir = Path(mom_run_dir)
         self.mom_input_dir = Path(mom_input_dir)
+        self.toolpath_dir = Path(toolpath_dir)
 
         self.mom_run_dir.mkdir(exist_ok=True)
         self.mom_input_dir.mkdir(exist_ok=True)
@@ -428,7 +434,6 @@
         self.number_vertical_layers = number_vertical_layers
         self.layer_thickness_ratio = layer_thickness_ratio
         self.depth = depth
-        self.toolpath_dir = Path(toolpath_dir)
         self.grid_type = grid_type
         self.repeat_year_forcing = repeat_year_forcing
         self.ocean_mask = None
@@ -438,7 +443,8 @@
             self.vgrid = xr.open_dataset(self.mom_input_dir / "vcoord.nc")
         except:
             print(
-                "Error in reading in existing grids. Make sure you've got files called `hgrid.nc` and `vcoord.nc` in {self.mom_input_dir}"
+                "Error while reading in existing grids!\n\n"
+                + f"Make sure `hgrid.nc` and `vcoord.nc` exist in the {self.mom_input_dir} directory."
             )
             raise ValueError
         else:
@@ -460,30 +466,34 @@ def __getattr__(self, name):
             method for method in dir(self) if not method.startswith("__")
         ]
         error_message = (
-            f"'{name}' method not found. Available methods are: {available_methods}"
+            f"{name} method not found. Available methods are: {available_methods}"
         )
         raise AttributeError(error_message)
 
     def _make_hgrid(self):
         """
         Set up a horizontal grid based on user's specification of the domain.
-        The default behaviour provides with a grid evenly spaced both in
-        longitude and in latitude.
-
-        The latitudinal resolution is scaled with the cosine of the central latitude of
-        the domain, i.e., ``Δφ = cos(φ_central) * Δλ``, where ``Δλ`` is the longitudinal
-        spacing. This way, and given that the domain is small enough, the linear
-        distances between grid points are nearly identical: ``Δx = R * cos(φ) * Δλ``
-        and ``Δy = R * Δφ = R * cos(φ_central) * Δλ``. That is, given that the domain is
-        small enough so that so that ``cos(φ_North_Side)`` is not much different from
-        ``cos(φ_South_Side)`` then ``Δx`` and ``Δy`` are similar.
+        The default behaviour generates a grid evenly spaced both in longitude
+        and in latitude.
+
+        The latitudinal resolution is scaled with the cosine of the central
+        latitude of the domain, i.e., ``Δφ = cos(φ_central) * Δλ``, where ``Δλ``
+        is the longitudinal spacing. This way, for a sufficiently small domain,
+        the linear distances between grid points are nearly identical:
+        ``Δx = R * cos(φ) * Δλ`` and ``Δy = R * Δφ = R * cos(φ_central) * Δλ``
+        (here ``R`` is Earth's radius and ``φ``, ``φ_central``, ``Δλ``, and ``Δφ``
+        are all expressed in radians).
+        That is, if the domain is small enough so that ``cos(φ_North_Side)``
+        is not much different from ``cos(φ_South_Side)``, then ``Δx`` and ``Δy``
+        are similar.
 
         Note:
-            The intention is for the horizontal grid (``hgrid``) generation to be very flexible.
+            The intention is for the horizontal grid (``hgrid``) generation to be flexible.
             For now, there is only one implemented horizontal grid included in the package,
-            but you can customise it by simply overwriting the ``hgrid.nc`` file in the ``rundir``
-            after initialising an ``experiment``. To conserve the metadata, it might be easiest
-            to read the file in, then modify the fields before re-saving.
+            but you can customise it by simply overwriting the ``hgrid.nc`` file in the
+            ``mom_input_dir`` directory after initialising an ``experiment``. To preserve the
+            metadata, it might be easiest to read the file in, then modify the fields before
+            re-saving.
         """
 
         assert (
@@ -532,9 +542,10 @@ def _make_hgrid(self):
 
     def _make_vgrid(self):
         """
-        Generate a vertical grid based on the number of layers ``nlayers`` and
-        the prescribed ratio of the vertical layer thicknesses (``layer_thickness_ratio``)
-        specified at the class level.
+        Generates a vertical grid based on the ``number_vertical_layers``, the ratio
+        of largest to smallest layer thickness (``layer_thickness_ratio``) and the
+        total ``depth`` parameters.
+        (All these parameters are specified at the class level.)
         """
 
         thicknesses = hyperbolictan_thickness_profile(
@@ -555,22 +566,25 @@ def _make_vgrid(self):
 
         return vcoord
 
-    def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height"):
+    def initial_condition(
+        self, ic_path, varnames, arakawa_grid="A", vcoord_type="height"
+    ):
         """
-        Read the initial condition files, interpolates to the model grid fixes
-        up metadata and saves to the input directory.
+        Reads the initial condition from files in ``ic_path``, interpolates to the
+        model grid, fixes up metadata, and saves back to the input directory.
 
         Args:
             ic_path (Union[str, Path]): Path to initial condition file.
             varnames (Dict[str, str]): Mapping from MOM6 variable/coordinate names to the names
                 in the input dataset. For example, ``{'xq': 'lonq', 'yh': 'lath', 'salt': 'so', ...}``.
-            gridtype (Optional[str]): Arakawa grid staggering of input; either ``'A'``, ``'B'``,
-                or ``'C'``.
+            arakawa_grid (Optional[str]): Arakawa grid staggering type of the initial condition.
+                Either ``'A'`` (default), ``'B'``, or ``'C'``.
             vcoord_type (Optional[str]): The type of vertical coordinate used in the forcing files.
                 Either ``'height'`` or ``'thickness'``.
         """
 
-        # Remove time dimension if present in the IC. Assume that the first time dim is the intended on if more than one is present
+        # Remove time dimension if present in the IC.
+        # Assume that the first time dim is the intended one if more than one is present.
         ic_raw = xr.open_dataset(ic_path)
 
         if varnames["time"] in ic_raw.dims:
@@ -585,25 +599,25 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
             ]
         except:
             raise ValueError(
-                "Error in reading in initial condition tracers. Terminating"
+                "Error in reading in initial condition tracers. Terminating!"
             )
 
         try:
             ic_raw_u = ic_raw[varnames["u"]]
             ic_raw_v = ic_raw[varnames["v"]]
         except:
             raise ValueError(
-                "Error in reading in initial condition tracers. Terminating"
+                "Error in reading in initial condition velocities. Terminating!"
             )
 
         try:
             ic_raw_eta = ic_raw[varnames["eta"]]
         except:
             raise ValueError(
-                "Error in reading in initial condition tracers. Terminating"
+                "Error in reading in initial condition free surface height. Terminating!"
             )
 
         # Rename all coordinates to have 'lon' and 'lat' to work with the xesmf regridder
-        if gridtype == "A":
+        if arakawa_grid == "A":
             if (
                 "xh" in varnames.keys() and "yh" in varnames.keys()
             ):  ## Handle case where user has provided xh and yh rather than x & y
@@ -622,9 +636,12 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
                 )
             else:
                 raise ValueError(
-                    "Can't find required coordinates in initial condition. Given that gridtype is 'A' the 'x' and 'y' coordinates should be provided in the varnames dictionary. E.g., {'x': 'lon','y': 'lat'}. Terminating"
+                    "Can't find required coordinates in initial condition.\n\n"
+                    + "Given that arakawa_grid is 'A' the 'x' and 'y' coordinates should be provided "
+                    + "in the varnames dictionary. For example, {'x': 'lon', 'y': 'lat'}.\n\n"
+                    + "Terminating!"
                 )
-        if gridtype == "B":
+        if arakawa_grid == "B":
             if (
                 "xq" in varnames.keys() and "yq" in varnames.keys()
@@ -645,9 +662,13 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
                 )
             else:
                 raise ValueError(
-                    "Can't find coordinates in initial condition. Given that gridtype is 'B' the names of the cell centre ('xh' & 'yh') as well as the cell edge ('xq' & 'yq') coordinates should be provided in the varnames dictionary. E.g {'xh':'lonh','yh':'lath' etc }. Terminating"
+                    "Can't find coordinates in initial condition.\n\n"
+                    + "Given that arakawa_grid is 'B' the names of the cell centers ('xh' & 'yh') "
+                    + "as well as the cell edges ('xq' & 'yq') coordinates should be provided in "
+                    + "the varnames dictionary. For example, {'xh': 'lonh', 'yh': 'lath', ...}.\n\n"
+                    + "Terminating!"
                 )
-        if gridtype == "C":
+        if arakawa_grid == "C":
             if (
                 "xq" in varnames.keys() and "yq" in varnames.keys()
@@ -668,9 +689,13 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
                 )
             else:
                 raise ValueError(
-                    "Can't find coordinates in initial condition. Given that gridtype is 'C' the names of the cell centre ('xh' & 'yh') as well as the cell edge ('xq' & 'yq') coordinates should be provided in the varnames dictionary. E.g {'xh':'lonh','yh':'lath' etc }. Terminating"
+                    "Can't find coordinates in initial condition.\n\n"
+                    + "Given that arakawa_grid is 'C' the names of the cell centers ('xh' & 'yh') "
+                    + "as well as the cell edges ('xq' & 'yq') coordinates should be provided "
+                    + "in the varnames dictionary. For example, {'xh': 'lonh', 'yh': 'lath', ...}.\n\n"
+                    + "Terminating!"
                 )
 
-        ## Construct the xq,yh and xh yq grids
+        ## Construct the xq, yh and xh, yq grids
         ugrid = (
             self.hgrid[["x", "y"]]
             .isel(nxp=slice(None, None, 2), nyp=slice(1, None, 2))
             .rename({"x": "lon", "y": "lat"})
             .set_coords(["lat", "lon"])
         )
@@ -684,7 +709,7 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
             .set_coords(["lat", "lon"])
         )
 
-        ## Construct the cell centre grid for tracers (xh,yh).
+        ## Construct the cell centre grid for tracers (xh, yh).
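+        # On the MOM6 supergrid every other point along each axis is a cell centre
+        # and every other point is a cell edge/corner: centres sit at odd (nxp, nyp)
+        # indices and edges at even indices. The stride-2 slices above therefore pick
+        # out the velocity points, while the tracer grid below picks the cell centres.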
         tgrid = xr.Dataset(
             {
                 "lon": (
@@ -699,7 +724,8 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
         )
 
         ### Drop NaNs to be re-added later
-        # NaNs might be here from the land mask of the model that the IC has come from. If they're not removed then the coastlines from this other grid will be retained!
+        # NaNs might be here from the land mask of the model that the IC has come from.
+        # If they're not removed then the coastlines from this other grid will be retained!
         ic_raw_tracers = (
             ic_raw_tracers.interpolate_na("lon", method="linear")
             .ffill("lon")
@@ -799,6 +825,7 @@ def initial_condition(self, ic_path, varnames, gridtype="A", vcoord_type="height
         eta_out.yh.attrs = ic_raw_tracers.lat.attrs
         eta_out.attrs = ic_raw_eta.attrs
 
+        # if temp units are K, convert to C
         if np.min(tracers_out["temp"].isel({"zl": 0})) > 100:
             tracers_out["temp"] -= 273.15
 
@@ -857,8 +884,8 @@
     def rectangular_boundary(
         self, path_to_bc, varnames, orientation, segment_number, arakawa_grid="A"
     ):
         """
-        Setup a boundary forcing file for a given orientation. Here the term 'rectangular'
-        implies boundaries along lines of constant latitude or longitude.
+        Set up a boundary forcing file for a given orientation. Here the term 'rectangular'
+        means boundaries along lines of constant latitude or longitude.
 
         Args:
             path_to_bc (str): Path to boundary forcing file. Ideally this should be a pre cut-out
@@ -885,7 +912,7 @@ def rectangular_boundary(
             segment_name="segment_{:03d}".format(segment_number),
             orientation=orientation,  # orientation
             startdate=self.date_range[0],
-            gridtype=arakawa_grid,
+            arakawa_grid=arakawa_grid,
             repeat_year_forcing=self.repeat_year_forcing,
         )
 
@@ -918,9 +945,9 @@ def setup_bathymetry(
             bathymetry_path (str): Path to the netCDF file with the bathymetry.
             longitude_coordinate_name (Optional[str]): The name of the longitude coordinate in the bathymetry
                 dataset at ``bathymetry_path``. For example, for GEBCO bathymetry: ``'lon'`` (default).
-            latitude_coordinate_name (Optional[str]): The name of the longitude coordinate in the bathymetry
+            latitude_coordinate_name (Optional[str]): The name of the latitude coordinate in the bathymetry
                 dataset at ``bathymetry_path``. For example, for GEBCO bathymetry: ``'lat'`` (default).
-            vertical_coordinate_name (Optional[str]): The name of the longitude coordinate in the bathymetry
+            vertical_coordinate_name (Optional[str]): The name of the vertical coordinate in the bathymetry
                 dataset at ``bathymetry_path``. For example, for GEBCO bathymetry: ``'elevation'`` (default).
             fill_channels (Optional[bool]): Whether or not to fill in diagonal channels.
                This removes more narrow inlets, but can also connect extra islands to land.
@@ -1006,7 +1033,7 @@ def setup_bathymetry(
         bathymetry_output.lon.attrs["units"] = "degrees_east"
         bathymetry_output.lat.attrs["units"] = "degrees_north"
         bathymetry_output.elevation.attrs["_FillValue"] = -1e20
-        bathymetry_output.elevation.attrs["units"] = "m"
+        bathymetry_output.elevation.attrs["units"] = "meters"
         bathymetry_output.elevation.attrs["standard_name"] = (
             "height_above_reference_ellipsoid"
         )
@@ -1065,7 +1092,7 @@ def setup_bathymetry(
         tgrid.lon.attrs["_FillValue"] = 1e20
         tgrid.lat.attrs["units"] = "degrees_north"
         tgrid.lat.attrs["_FillValue"] = 1e20
-        tgrid.elevation.attrs["units"] = "m"
+        tgrid.elevation.attrs["units"] = "meters"
         tgrid.elevation.attrs["coordinates"] = "lon lat"
         tgrid.to_netcdf(
             self.mom_input_dir / "bathymetry_unfinished.nc", mode="w", engine="netcdf4"
         )
@@ -1080,7 +1107,7 @@ def setup_bathymetry(
             + "directly from a terminal in the input directory via\n\n"
             + "mpirun ESMF_Regrid -s bathymetry_original.nc -d bathymetry_unfinished.nc -m bilinear --src_var elevation --dst_var elevation --netcdf4 --src_regional --dst_regional\n\n"
             + "For details see https://xesmf.readthedocs.io/en/latest/large_problems_on_HPC.html\n\n"
-            + "Aftewards, we run 'tidy_bathymetry' method to skip the expensive interpolation step, and finishing metadata, encoding and cleanup."
+            + "Afterwards, run the 'tidy_bathymetry' method to skip the expensive interpolation step and finish the metadata, encoding, and cleanup."
         )
 
         # If we have a domain large enough for chunks, we'll run regridder with parallel=True
@@ -1113,19 +1140,19 @@ def tidy_bathymetry(
         method from :func:`~setup_bathymetry` allows for the regridding to be done separately,
         since regridding can be really expensive for large domains.
 
-        If the bathymetry is already regridded and what is left to be done is fixing the metadata,
-        or fill in some channels then call this function directly to read in the existing
+        If the bathymetry is already regridded and what is left to be done is fixing the metadata
+        or filling in some channels, then call this function directly to read in the existing
         ``bathymetry_unfinished.nc`` file that should be in the input directory.
 
         Args:
-            fill_channels (Optional[bool]): Whether or not to fill in
+            fill_channels (Optional[bool]): Whether to fill in
                diagonal channels. This removes more narrow inlets,
                but can also connect extra islands to land. Default: ``False``.
            minimum_layers (Optional[int]): The minimum depth allowed as an integer number
                of layers. The default value of ``3`` layers means that anything shallower
                than the 3rd layer (as specified by the ``vcoord``) is deemed land.
-            positive_down (Optional[bool]): If ``True`` (default), it assumes that
+            positive_down (Optional[bool]): If ``True`` (default), assume that the
                bathymetry vertical coordinate is positive down.
        """
 
@@ -1368,7 +1395,7 @@ def setup_run_directory(
        overwrite=False,
    ):
        """
-        Setup the run directory for MOM6. Either copy a pre-made set of files, or modify
+        Set up the run directory for MOM6. Either copy a pre-made set of files, or modify
        existing files in the 'rundir' directory for the experiment.
 
        Args:
@@ -1393,7 +1420,8 @@ def setup_run_directory(
        base_run_dir = premade_rundir_path / "common_files"
        if not premade_rundir_path.exists():
            raise ValueError(
-                f"Cannot find the premade run directory files at \n{premade_rundir_path}\n. Something is not right about how the package has been installed as these files are missing!"
+ f"Cannot find the premade run directory files at {premade_rundir_path}.\n\n" + + "These files missing might be indicating an error during the package installation!" ) if surface_forcing: overwrite_run_dir = premade_rundir_path / f"{surface_forcing}_surface" @@ -1454,11 +1482,14 @@ def setup_run_directory( ncpus = (x * y) - int(masked) if mask_table == None: print( - "No mask table found! This suggests the domain is mostly water, so there are no `non compute` cells that are entirely land. If this doesn't seem right, ensure you've already run .FRE_tools()." + "No mask table found! This suggests the domain is mostly water, so there are " + + "no `non compute` cells that are entirely land. If this doesn't seem right, " + + "ensure you've already run `FRE_tools`." ) if not hasattr(self, "layout"): raise AttributeError( - "No layout information found. This suggests you haven't run .FRE_tools() yet. Please do so first so I know how many processors you'd like to use." + "No layout information found. This suggests that `FRE_tools()` hasn't been called yet. " + + "Please do so, in order for the number of processors required is computed." ) ncpus = self.layout[0] * self.layout[1] print("Number of CPUs required: ", ncpus) @@ -1636,7 +1667,8 @@ class segment: from the provided startdate. Function ignores the time metadata and puts it on Julian calendar. - Only supports z-star (z*) vertical coordinate! + Note: + Only supports z-star (z*) vertical coordinate. Args: hgrid (xarray.Dataset): The horizontal grid used for domain. @@ -1644,14 +1676,15 @@ class segment: outfolder (Union[str, Path]): Path to folder where the model inputs will be stored. varnames (Dict[str, str]): Mapping between the variable/dimension names and - standard naming convension of this pipeline, e.g., ``{"xq": "longitude, + standard naming convention of this pipeline, e.g., ``{"xq": "longitude, "yh": "latitude", "salt": "salinity", ...}``. Key "tracers" points to nested dictionary of tracers to include in boundary. segment_name (str): Name of the segment, e.g., ``'segment_001'``. - orientation (str): Cardinal direction (lowercase) of the boundary segment. + orientation (str): Cardinal direction (lowercase) of the boundary segment, + i.e., ``'east'``, ``'west'``, ``'north'``, or ``'south'``. startdate (str): The starting date to use in the segment calendar. - gridtype (Optional[str]): Arakawa staggering of input grid, one of ``'A'``, ``'B'``, - or ``'C'`` + arakawa_grid (Optional[str]): Arakawa grid staggering type of the boundary forcing. + Either ``'A'`` (default), ``'B'``, or ``'C'``. time_units (str): The units used by the raw forcing files, e.g., ``hours``, ``days`` (default). 
         tidal_constituents (Optional[int]): An integer determining the number of tidal
@@ -1673,17 +1706,17 @@ def __init__(
         segment_name,
         orientation,
         startdate,
-        gridtype="A",
+        arakawa_grid="A",
         time_units="days",
         tidal_constituents=None,
         repeat_year_forcing=False,
     ):
         ## Store coordinate names
-        if gridtype == "A":
+        if arakawa_grid == "A":
             self.x = varnames["x"]
             self.y = varnames["y"]
-        elif gridtype in ("B", "C"):
+        elif arakawa_grid in ("B", "C"):
             self.xq = varnames["xq"]
             self.xh = varnames["xh"]
             self.yq = varnames["yq"]
@@ -1702,10 +1735,19 @@ def __init__(
         self.time_units = time_units
 
         ## Store other data
+        orientation = orientation.lower()
+        if orientation not in ("north", "south", "east", "west"):
+            raise ValueError(
+                "orientation must be one of: 'north', 'south', 'east', or 'west'"
+            )
+        self.orientation = orientation
+
+        if arakawa_grid not in ("A", "B", "C"):
+            raise ValueError("arakawa_grid must be one of: 'A', 'B', or 'C'")
+        self.arakawa_grid = arakawa_grid
+
         self.infile = infile
         self.outfolder = outfolder
-        self.orientation = orientation.lower()  ## might not be needed? NSEW
-        self.grid = gridtype
         self.hgrid = hgrid
         self.segment_name = segment_name
         self.tidal_constituents = tidal_constituents
@@ -1713,9 +1755,8 @@ def __init__(
 
     def rectangular_brushcut(self):
         """
-        Cut out and interpolates tracers. This method assumes that the boundary
-        is a simple Northern, Southern, Eastern, or Western boundary. Cuts out
-        and interpolates tracers.
+        Cut out and interpolate tracers. This method assumes that the boundary
+        is a simple Northern, Southern, Eastern, or Western boundary.
         """
         if self.orientation == "north":
             self.hgrid_seg = self.hgrid.isel(nyp=[-1])
@@ -1759,7 +1800,7 @@ def rectangular_brushcut(self):
 
         rawseg = xr.open_dataset(self.infile, decode_times=False, engine="netcdf4")
 
-        if self.grid == "A":
+        if self.arakawa_grid == "A":
             rawseg = rawseg.rename({self.x: "lon", self.y: "lat"})
             ## In this case velocities and tracers all on same points
             regridder = xe.Regridder(
@@ -1783,7 +1824,7 @@ def rectangular_brushcut(self):
                 ]
             )
 
-        if self.grid == "B":
+        if self.arakawa_grid == "B":
             ## All tracers on one grid, all velocities on another
             regridder_velocity = xe.Regridder(
                 rawseg[self.u].rename({self.xq: "lon", self.yq: "lat"}),
@@ -1820,7 +1861,7 @@ def rectangular_brushcut(self):
                 ]
             )
 
-        if self.grid == "C":
+        if self.arakawa_grid == "C":
             ## All tracers on one grid, all velocities on another
             regridder_uvelocity = xe.Regridder(
                 rawseg[self.u].rename({self.xq: "lon", self.yh: "lat"}),
@@ -1913,7 +1954,8 @@ def rectangular_brushcut(self):
         dz.name = "dz"
         dz = xr.concat([dz, dz[-1]], dim=self.z)
 
-        # Here, keep in mind that 'var' keeps track of the mom6 variable names we want, and self.tracers[var] will return the name of the variable from the original data
+        # Here, keep in mind that 'var' keeps track of the mom6 variable names we want, and self.tracers[var]
+        # will return the name of the variable from the original data
 
         allfields = {
             **self.tracers,
diff --git a/regional_mom6/utils.py b/regional_mom6/utils.py
index 14c6df75..fb0ce865 100644
--- a/regional_mom6/utils.py
+++ b/regional_mom6/utils.py
@@ -5,7 +5,7 @@ def vecdot(v1, v2):
     """Return the dot product of vectors ``v1`` and ``v2``.
     ``v1`` and ``v2`` can be either numpy vectors or numpy.ndarrays
     in which case the last dimension is considered the dimension
-    over which the dot product is taken of.
+    over which the dot product is taken.
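+
+    Example:
+
+        A minimal doctest (runnable via ``pytest --doctest-modules``):
+
+        >>> import numpy as np
+        >>> v1 = np.array([1.0, 2.0, 3.0])
+        >>> v2 = np.array([4.0, 5.0, 6.0])
+        >>> float(vecdot(v1, v2))  # cast to float for a repr that is stable across numpy versions
+        32.0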
""" return np.sum(v1 * v2, axis=-1) @@ -88,7 +88,7 @@ def quadrilateral_area(v1, v2, v3, v4): def latlon_to_cartesian(lat, lon, R=1): - """Convert latitude-longitude (in degrees) to Cartesian coordinates on + """Convert latitude and longitude (in degrees) to Cartesian coordinates on a sphere of radius ``R``. By default ``R = 1``. Args: @@ -136,7 +136,7 @@ def quadrilateral_areas(lat, lon, R=1): Returns: numpy.array: Array with the areas of the quadrilaterals defined by the ``lat``-``lon`` grid provided. If the provided ``lat``, ``lon`` arrays are of dimension *m* :math:`\\times` *n* - then return areas array is of dimension (*m-1*) :math:`\\times` (*n-1*). + then returned areas array is of dimension (*m-1*) :math:`\\times` (*n-1*). Example: diff --git a/tests/test_expt_class.py b/tests/test_expt_class.py index 9466425c..d61f2afe 100644 --- a/tests/test_expt_class.py +++ b/tests/test_expt_class.py @@ -248,7 +248,7 @@ def test_ocean_forcing( expt.initial_condition( tmp_path / "ic_unprocessed", varnames, - gridtype="A", + arakawa_grid="A", )