-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes to support subsetting datasets by constraining coordinate values. #494
base: develop
Are you sure you want to change the base?
Conversation
Produces a problem - which I think is to do with the 2D nature of the latitude and longitudes. |
@dataclass | ||
class Coordinates2D: | ||
"""Represent a point in one-to-two-dimensional space with optional X and Y coordinates.""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coordinates2D is an example of an object where the user needs to know the ordering convention: Coordinates2D(45, 50) is obviously a different place to Coordinates2D(50,45), but which is which.
Convention would have x preceding y, but also that latitude proceeds longitude...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the parameters are named "x" and "y", what would be you preferred way of indicating that?
Returns: | ||
The corresponding coordinate array from the provided dataset. | ||
""" | ||
return get_coord(dataset, "x") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, a coordinate longitude can be 2 dimensional. E.g. if you want a box that is 1000 km x 1000 km, then near the north pole it would vary with latitude and longitude.
Or is the terminology for "coordinates" referring to indices in an array, which are 1-dimensional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is the terminology for "coordinates" referring to indices in an array, which are 1-dimensional?
That's right, I was basing the name off xr.Dataset.coords
, what's the most correct term to use here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. If I load your copernicus example and inspect the coords
I get the following:
import coast
fn_config_t_grid = "./config/example_cmems_grid_t.json"
database = coast.Copernicus(username, password, "nrt")
forecast = database.get_product("global-analysis-forecast-phy-001-024")
nemo_t = coast.Gridded(fn_data=forecast, config=fn_config_t_grid)
start = coast.Coordinates2D(50, -10)
end = coast.Coordinates2D(60, 5)
nemo_t.set_constraint(start, end)
nemo_t.dataset.coords
Out[10]:
Coordinates:
longitude (x_dim) float32 50.0 50.08 50.17 50.25 ... 59.75 59.83 59.92 60.0
latitude (y_dim) float32 -10.0 -9.917 -9.833 -9.75 ... 4.833 4.917 5.0
* z_dim (z_dim) float32 0.494 1.541 2.646 ... 5.275e+03 5.728e+03
time (t_dim) datetime64[ns] 2020-01-01T12:00:00 ... 2022-07-10T12:0...
So the horizontal coords are have names latitude
and longitude
and their content are floats not index integers.
Indices are generally referred to as j
and i
, which are a nightmare for grepping. So we are shifting towards j_ind
and i_ind
notation. j_ind
is the index for the nominally northward, or y-direction. i_ind
is the index for the nomillay eastward, or x-direction.
So, from the above example:
longitude[i_ind=0] = 50.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, from data in the example files:
import coast
import unit_testing.unit_test_files as files
nemo_t = coast.Gridded(files.fn_nemo_dat, files.fn_nemo_dom, config=files.fn_config_t_grid)
#start = coast.Coordinates2D(50, -10)
#end = coast.Coordinates2D(60, 5)
#nemo_t.set_constraint(start, end)
# Set some indices values
j_ind = 0
i_ind = 0
nemo_t.dataset.latitude[j_ind, i_ind].values # notice the pain in the but `j` before `i` expected ordering.
Gives a floating point latitude, which will vary with both indices.
Out[26]: array(40.066406, dtype=float32)
In summary indices are more basic than coordinates because coordinates are a function of indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #494 (comment)
# Conflicts: # coast/__init__.py # coast/data/coast.py
@jpolton There are a couple of knownn limitations to this version: Are these features we want to handle now or do we push them back to another ticket? |
Matt Incase you spot this before the end of the day.
|
Hmm maybe ignore the previous comment. I need to have a proper look at what you've been able to do. |
@jpolton Apologies if I've used the term incorrectly (I was basing this in a conversation with @anwiseNOCL) - you're right. The Copernicus dataset I was using in the original example works as I expected. |
@jpolton I'm hoping to get you to try this out and tell me...
A: Is this way of defining coordinates and shapes something that feels good for you to use?
B: Have I used the correct terminology in my docstrings/method names?
Example code based on what I have been playing with locally: