-
Notifications
You must be signed in to change notification settings - Fork 30
feat(sql): Implement ST_Azimuth()
#183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| pub fn st_azimuth_udf() -> SedonaScalarUDF { | ||
| SedonaScalarUDF::new_stub( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since ST_Azimuth is simple enough to be implemented manually, we should move the entire implementation that you wrote inside of sedona-geo to this file in sedona-functions. No need for this stub function. sedona-geo is really for functions where we need to use the geo_generic_alg package to call more complicated algorithms. You can see an example implementation in st_isempty.rs. It should be as simple as copy-pasting over the implementation and modifying this function a bit to use SedonaScalarUDF::new instead of SedonaScalarUDF::new_stub.
| ArgMatcher::is_geometry_or_geography(), | ||
| ArgMatcher::is_geometry_or_geography(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ArgMatcher::is_geometry_or_geography(), | |
| ArgMatcher::is_geometry_or_geography(), | |
| ArgMatcher::is_geometry(), | |
| ArgMatcher::is_geometry(), |
Then, we can reduce this to geometry only for now, since we don't want to call the geometry implementation on geography objects.
rust/sedona-geo/src/st_azimuth.rs
Outdated
| } | ||
|
|
||
| // Note: When the two points are completely coincident, PostGIS's ST_Azimuth() | ||
| // returns NULL. However, this returns 0.0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's definitely add a test for this case.
But first, we should confirm what the desired behavior is. I suspect this is a just something that was missed in the original Sedona, and maybe we should follow PostGIS behavior here instead. There doesn't seem to be any discussion about this in the original Sedona PR. @jiayuasu was this difference intentional or maybe just something that was missed? Shall we fix it in Sedona as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To date, SedonaDB tests against PostGIS for feature parity and we file bugs with Sedona when we notice something is inconsistent.
|
Also, add a benchmark here in native-functions.rs (they're in alphabetical order). There's an example of a bench for a function with two geometries as inputs here |
Co-authored-by: Peter Nguyen <[email protected]>
|
Thanks for reviewing! |
|
I think I addressed all the comments. Thanks for the detailed explanation, it really helps! I switch the implementation to return |
|
Looks great. I just remembered, we should also add a new You can follow these directions in the contributors-guide.md for testing Python. It will also require a running instance of PostGIS, which you can spin up using Docker by following these instructions I'm adding. |
|
Thanks, I'll try it. I also wonder if I should update this page, but probably there's no tool for this yet, guessing from #180? https://github.com/apache/sedona-db/blob/main/docs/reference/sql.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! (And thanks Peter for the review!)
Looks great. I just remembered, we should also add a new test_st_azimuth test
These will be similar to the ones for st_distance():
sedona-db/python/sedonadb/tests/functions/test_distance.py
Lines 21 to 47 in 7f91135
| @pytest.mark.parametrize("eng", [SedonaDB, PostGIS]) | |
| @pytest.mark.parametrize( | |
| ("geom1", "geom2", "expected"), | |
| [ | |
| (None, None, None), | |
| ("POINT (0 0)", None, None), | |
| (None, "POINT (0 0)", None), | |
| ("POINT (0 0)", "POINT (0 0)", 0), | |
| ( | |
| "POINT(-72.1235 42.3521)", | |
| "LINESTRING(-72.1260 42.45, -72.123 42.1546)", | |
| 0.0015056772638228177, | |
| ), | |
| ( | |
| "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))", | |
| "POLYGON ((5 5, 6 5, 6 6, 5 6, 5 5))", | |
| 5.656854249492381, | |
| ), | |
| ], | |
| ) | |
| def test_st_distance(eng, geom1, geom2, expected): | |
| eng = eng.create_or_skip() | |
| eng.assert_query_result( | |
| f"SELECT ST_Distance({geom_or_null(geom1)}, {geom_or_null(geom2)})", | |
| expected, | |
| numeric_epsilon=1e-8, | |
| ) |
...and could go here:
I also wonder if I should update this page
You're correct...don't worry about this, as long as the rust function documentation is there it will get updated.
(We'll hopefully do a better job documenting the process of adding a function in the future 😬 )
| // If either of the points is empty, the result is NULL | ||
| _ => Ok(None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking: does PostGIS allow a MULITPOINT with a single child here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems PostGIS also rejects MULTIPOINT.
postgres=# SELECT ST_Azimuth(ST_Point(0, 0), ST_GeomFromText('MULTIPOINT (1 1)'));
ERROR: Argument must be POINT geometries
rust/sedona-geo/src/st_azimuth.rs
Outdated
| } | ||
|
|
||
| // Note: When the two points are completely coincident, PostGIS's ST_Azimuth() | ||
| // returns NULL. However, this returns 0.0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To date, SedonaDB tests against PostGIS for feature parity and we file bugs with Sedona when we notice something is inconsistent.
|
During adding these tests, I found PostGIS raise errors for these two cases where my implementation returns The first one is two The second one is empty point. I thought this returns |
Co-authored-by: Dewey Dunnington <[email protected]>
…sedona-db into feat/st-azimuth
|
Thanks for your help. I think I addressed your comments.
Good to know! |
|
@yutannihilation does it make sense to show off your benchmark result (compared to DuckDB and PostGIS)? 😁 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
| # TODO: PostGIS fails without explicit ::GEOMETRY type cast, but casting | ||
| # doesn't work on SedonaDB yet. | ||
| # (None, None, None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this! We have a few cases where we're not sure exactly what this kind of thing should do (luckily people typing SQL NULLs for this kind of thing is pretty rare 🙂 )
|
Thanks for reviewing! @jiayuasu https://github.com/apache/sedona-db/blob/main/benchmarks/test_functions.py |
|
@yutannihilation we have a tool in SedonaDB to produce benchmark results like what's shown in this PR: #171 @petern48 @paleolimbot can you help @yutannihilation figure out how to run it? |
|
Thanks, I guess I can generate the benchmark result if I add a case to https://github.com/apache/sedona-db/blob/main/benchmarks/README.md |
|
I think you are right. Looking forward to it! |
|
Sure, I'll try it! |
|
Yep, that README.md is exactly it. I was reluctant to suggest adding a benchmark because there are still some issues to iron out about its purpose and validity (discussion in this PR. Regardless, it surely doesn't hurt to add it. Especially since this is a native (manual) implementation, I'd expect to see some appealing numbers compared to DuckDB. |
|
Hmm, before adding the case, it seems all benchmark tests are skipped on my local. Here's what I did. Am I missing some steps? Since I'm not very good at Python, I might make some silly mistake... python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install "python/sedonadb[test]" pytest pytest-benchmark
cd benchmarks
pytest test_functions.py::TestBenchFunctions
Since DuckDB's |
|
I've also never successfully run the Python benchmarks, so no worries 🙂 (We're stoked you're here at all...feel free to spend your time doing whatever brings you the most joy!) |
|
I'm not sure how you got actual Here's the benchmark screenshot copied over.
I guess I'll instead thank you that DuckDB is slow enough 😉 |
Sorry, it was turned out that I simply forgot to launch PostGIS docker image this time... Anyway, thanks for adding benchmark! Probably I can do it next time.
😉 |
|
Got it, thanks for catching that. It helps smooth the process for everyone when we know to document these small details. It's not ideal that the whole thing is skipped when the container isn't running, but I added a note about it to that same PR for now. |

This pull request is my attempt to implement
ST_Azimuth().This pull request is mostly for figuring out how to implement a function so that I can contribute more. So, please feel free to close if there's another plan to implement this.
I picked
ST_Azimuth()because it looks relatively simple. I follow the implementation of Sedona, although I might not understand the code correctly as I'm poor at Java...https://github.com/apache/sedona/blob/f9069e7dc6682d53335f0e0c6fb4bd444024d3b5/common/src/main/java/org/apache/sedona/common/Functions.java#L202-L209
I think the implementation is straightforward. One thing to note is that, if I understand the Sedona's implementation correctly, it returns
0.0when the two points are the same. However, PostGIS returnsNULL. The document says: