Support creating global graphs by introducing haversine distance metric #37

leifdenby · 2024-11-25T11:29:38Z

Describe your changes

A student I am working with would like to look at emulating 1D fluid simulations on a periodic 1D domain with neural-lam. I thought an easy way to create a 1D periodic domain was to just create a line along the equator and use haversine distance to do the mesh-generation and grid-to-mesh connections. This PR is a proof-of-concept that builds on #32 to make that possible.

The PR introduces the distance_metric argument (which defaults to euclidean) to all graph generation functions, so that this metric is used when computing distances. By using the haversine distance metric and also not providing a projection argument then we can create a 1D periodic domain that wraps the equator:

The PR also adapts the flat-mesh creation routines to ensure that both the "x" and "y" directions have at least one grid point (this is necessary when the point-to-point span of on direction, here the poleward direction, is zero).

The haversine distance calculation is used within the Ball-tree implementation in scikit-learn which has been added as a dependency.

NB: the hierarchical graph generation doesn't work yet, I think because the graph refinement is applied in both "x" and "y" directions for now. I will try and fix this.

NB no 2: this PR does not add more clever mesh-geometries than the "rectangular" mesh that we already have. I think how this should fit into the existing codebase needs a bit more thought. But I just wanted to show you all what I have done :)

Issue Link

This builds on #32 and the suggestion I gave in there that the generalisation in #32 allows for.

Type of change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
I have performed a self-review of my code
For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
I have updated the documentation to cover introduced code changes
I have added tests that prove my fix is effective or that my feature works
I have given the PR a name that clearly describes the change, written in imperative form (context).
I have requested a reviewer and an assignee (assignee is responsible for merging)

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

the code is readable
the code is well tested
the code is documented (including return types and parameters)
the code is easy to maintain

Author checklist after completed review

I have added a line to the CHANGELOG describing this change, in a section
reflecting type of change (add section where missing):
- added: when you have added new functionality
- changed: when default behaviour of the code has been changed
- fixes: when your contribution fixes a bug

Checklist for assignee

PR is up to date with the base branch
the tests pass
author has added an entry to the changelog (and designated the change as added, changed or fixed)
Once the PR is ready to be merged, squash commits and merge the PR.

Co-authored-by: sadamov <[email protected]>

joeloskarsson · 2024-11-26T13:39:12Z

This is neat :) I did not know that sklearn.neighbors.BallTree had that many different metrics implemented, so that's quite useful!

Something I am thinking about around this is to what extent the "rectangular" layout of mesh nodes makes sense when we use a different metric as distance. In some sense the concept of equally spaced rows and columns of mesh nodes would be different for a different metric. Now this implementation is only concerned with the connections between mesh and grid.
We could very much treat these as two separate issues I think. Just a consideration in how this is presented. Maybe it would make more sense to see the distance metric here as a property of the g2m_connectivity and m2g_connectivity rather than the full graph?

And yes, I understand this is just an example demonstration. Just wanted to put my thoughts here for future reference 😄

leifdenby · 2024-11-28T15:45:03Z

This is neat :) I did not know that sklearn.neighbors.BallTree had that many different metrics implemented, so that's quite useful!

Yes! It's quite a heavy dependency though, so we might want to make it optional... We'll see

Something I am thinking about around this is to what extent the "rectangular" layout of mesh nodes makes sense when we use a different metric as distance. In some sense the concept of equally spaced rows and columns of mesh nodes would be different for a different metric. Now this implementation is only concerned with the connections between mesh and grid.

Yes, that is true. The current "rectangular" layout (I wish I could think of a better name...) implemented would spacing lat/lon values equally apart, but that would then not be equidistantly spaced with for example the haversine metric. Hmm...

I was also thinking that if want to introduce other types of mesh layout than this "rectangular" layout then we might need to change/adapt the arguments a bit. m2m_connectivity="hierarchical" or m2m_connectivity="flat" could still apply with a triangular grid I guess though, so maybe we just need a another argument called mesh_layout with mesh_layout="rectangular" by default, and we could pass kwargsthrough withmesh_layout_kwargs. Or should mesh_connectivitymaybe become part of them2m_connectivity_kwargs`?

Maybe if we do that then we could add a warning/raise an exception if someone tries to use a "rectangular" layout while trying to use the haversine distance for the g2m and m2g connectivity. Does that make sense? I think this would benefit from some examples in a notebook... 😆

We could very much treat these as two separate issues I think. Just a consideration in how this is presented. Maybe it would make more sense to see the distance metric here as a property of the g2m_connectivity and m2g_connectivity rather than the full graph?

Yes unless it (the distance metric) starts playing a role in mesh layout too. If you contrast making a triangular mesh layout on a plane vs on a sphere, you might want to set the mesh-spacing by applying a different metric in the two cases.

leifdenby · 2024-11-28T21:54:02Z

Maybe rectilinear would be a better name for the "layout" of the mesh?

joeloskarsson · 2024-11-29T14:12:08Z

Rectilinear is probably the correct term yes. We wrote that in a paper and then I have stopped using it. I realize that there's hardly anyone that read rectilinear and immediately understands what you mean, whereas just writing rectangular is wrong, but people understand you 😄

joeloskarsson and others added 27 commits October 2, 2024 17:58

Start looking into where xy has to be changed

8a11527

Change shapes in docstrings

2c7e4eb

Make flat mesh graphs work with new coordinate layout

6f03908

Merge PR mllam#26 into branch

d2a081a

Fix coordinate handling in multirange graph creation

9dbd2a5

Rename grid_refinement_factor to mesh_node_distance

eada6ea

Fix existing tests to work with new coordinate format

01180a1

Add test for irregularlygridded coordinates

d103ad6

Remove unneeded eps in mesh level calculation

7ba387e

Change documentation to use new format and arguments for coordinates

9069c84

Fix bug in coordinate order for flat graphs

d592c2b

Start working on allowing latlon coordinates

fd695b1

Introduce coords and projection

2c781ee

Merge branch 'main' into general_coordinates

5cf3bbf

Fix linting

0f17d2c

Fix tests with coords keyword argument

f145025

Implement lat-lon transformation through projection

cc4cc5e

Add documentation page about graphs constructed using lat-lons

2627e37

Adjust coords keyword arg in docs

c764fd7

Add test for lat-lon coordinates

f6ae35b

Fix linting of docs

22caf65

Merge main into branch

7ec34ff

Fix typos and clarifications as suggested from code review

3ee25ae

Co-authored-by: sadamov <[email protected]>

Change euclidean coordinates to Cartesian coordinates

70eef3e

Merge branch 'main' into general_coordinates

3c4866b

Merge branch 'main' into general_coordinates

4fb69dc

add 1D periodic domain as havesine dist example

e0f0825

leifdenby mentioned this pull request Nov 25, 2024

Allow non-regularly gridded and lat-lon coordinates #32

Merged

20 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support creating global graphs by introducing haversine distance metric #37

Support creating global graphs by introducing haversine distance metric #37

leifdenby commented Nov 25, 2024 •

edited

Loading

joeloskarsson commented Nov 26, 2024 •

edited

Loading

leifdenby commented Nov 28, 2024 •

edited

Loading

leifdenby commented Nov 28, 2024

joeloskarsson commented Nov 29, 2024

Support creating global graphs by introducing haversine distance metric #37

Are you sure you want to change the base?

Support creating global graphs by introducing haversine distance metric #37

Conversation

leifdenby commented Nov 25, 2024 • edited Loading

Describe your changes

Issue Link

Type of change

Checklist before requesting a review

Checklist for reviewers

Author checklist after completed review

Checklist for assignee

joeloskarsson commented Nov 26, 2024 • edited Loading

leifdenby commented Nov 28, 2024 • edited Loading

leifdenby commented Nov 28, 2024

joeloskarsson commented Nov 29, 2024

leifdenby commented Nov 25, 2024 •

edited

Loading

joeloskarsson commented Nov 26, 2024 •

edited

Loading

leifdenby commented Nov 28, 2024 •

edited

Loading