speed up large polygon intersections using STRtrees and many other things #259
base: as/stabletasks
@rafaqz this is where I want to make it super easy to have a DimMatrix of benchmarks, to test scaling dynamics....
- Move STRDualQuery and the utils to utils/
- Create a new module LoopStateMachine that just defines some actions, and a macro @processloopactions that you can use to break out of a loop if you want, in a `map_some_things(f1, f2, f3, ...)`.
- Add tests for LoopStateMachine
- Use LoopStateMachine in the `maybe` function I added to the clipping processor
I just realized, the tests aren't actually testing anything :D
* lay off 3/4 of the testsets: this uses the ContextTestSet from Test to reduce the amount of visual cruft. It doesn't reduce it all the way, but it can help. Next step is to have a custom testset that can elide the excessive display.
* print better context
* also test densified geoms
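For reference, this is roughly how `Test.ContextTestSet` gets used via the `@testset let` form (available in Julia 1.9+); the bound variable name and value here are just illustrative:

```julia
using Test

# `@testset let` wraps the enclosed tests in a ContextTestSet: on failure,
# the bound value is printed alongside the failing expression, so you get
# context without sprinkling `@info` calls through the test file.
@testset let geom = "densified ring with 128 points"
    @test 1 + 1 == 2  # if this failed, `geom`'s value would be shown too
end
```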
See #260 for the testset diff
Seems the densified module isn't actually testing anything...
The last 4 commits before this comment implement the natural tree. The natural tree has been tested to perform a scanline query of Boston's coordinates against the USA polygon (35,000 vertices) in ~80 ns! That's insane timing. It ought to speed up point-in-polygon to a speed similar to the C library one, if we don't use exact predicates.
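As an aside, the natural-index idea can be sketched in a few lines: edges stay in their original order, consecutive runs of edges are grouped under one bounding rectangle, and those rectangles are grouped again with a fixed fanout, so construction is a single linear pass with no sorting. Everything below (`Rect`, `natural_levels`, `query`, the default fanout of 8) is illustrative scaffolding, not the GeometryOps API:

```julia
# Minimal sketch of "natural" indexing over a ring's edges: build levels
# of bounding rectangles bottom-up, preserving edge order. Illustrative
# only; names and fanout are assumptions, not the GeometryOps types.
struct Rect
    xmin::Float64
    ymin::Float64
    xmax::Float64
    ymax::Float64
end

merge_rect(a::Rect, b::Rect) = Rect(min(a.xmin, b.xmin), min(a.ymin, b.ymin),
                                    max(a.xmax, b.xmax), max(a.ymax, b.ymax))

edge_rect((p1, p2)) = Rect(min(p1[1], p2[1]), min(p1[2], p2[2]),
                           max(p1[1], p2[1]), max(p1[2], p2[2]))

contains_pt(r::Rect, x, y) = r.xmin <= x <= r.xmax && r.ymin <= y <= r.ymax

# levels[1] holds one Rect per edge; each higher level groups `fanout`
# consecutive rects from the level below. No sorting happens anywhere.
function natural_levels(edges; fanout=8)
    levels = Vector{Vector{Rect}}()
    current = map(edge_rect, edges)
    push!(levels, current)
    while length(current) > fanout
        next = [reduce(merge_rect, current[i:min(i + fanout - 1, end)])
                for i in 1:fanout:length(current)]
        push!(levels, next)
        current = next
    end
    return levels
end

# Walk from the top level down, keeping only children of rects that
# contain the point; returns indices of candidate edges.
function query(levels, x, y; fanout=8)
    cand = collect(1:length(levels[end]))
    for lvl in (length(levels)-1):-1:1
        nxt = Int[]
        for i in cand
            contains_pt(levels[lvl+1][i], x, y) || continue
            append!(nxt, (i-1)*fanout+1 : min(i*fanout, length(levels[lvl])))
        end
        cand = nxt
    end
    return [j for j in cand if contains_pt(levels[1][j], x, y)]
end
```

Because nothing is sorted, building the index is cheap enough to do on the fly, which is what makes preparing geometry "naturally" so fast.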
With preparation we can be 2.5 orders of magnitude faster; without it, the indexing approach is 2x slower, at least for now. There are likely some efficiencies in memory allocation that I am not exploiting, and the GeometryOps implementation is not as cache-friendly as the tg implementation.

```julia
# Original
julia> @be GO.contains($(usa_poly), $((-76, 42)))
Benchmark: 3590 samples with 4 evaluations
 min    5.729 μs
 median 6.604 μs
 mean   6.426 μs
 max    16.208 μs

# After ExactPredicates
julia> @be GO.contains($(usa_poly), $((-76, 42)))
Benchmark: 3672 samples with 4 evaluations
 min    5.708 μs
 median 6.396 μs
 mean   6.235 μs
 max    8.198 μs

# After NaturalIndex (this IS constructing the natural index as well)
# this is substantially slower than brute force!
julia> @be GO.contains($(usa_poly), $((-76, 42)))
Benchmark: 1359 samples with 1 evaluation
 min    22.291 μs (11 allocs: 403.688 KiB) # compared to 14 microseconds for TG given a GI geom
 median 38.500 μs (11 allocs: 403.688 KiB) # compared to 38 microseconds for TG given a GI geom
 mean   58.483 μs (11 allocs: 403.688 KiB, 1.34% gc time)
 max    3.341 ms (11 allocs: 403.688 KiB, 98.75% gc time)

# With pre-computed natural index on usa_poly
# 2.5 orders of magnitude faster!
julia> @be GO.contains($(GO.prepare_naturally(usa_poly)), $((-76, 42)))
Benchmark: 3629 samples with 66 evaluations
 min    341.545 ns (1 allocs: 16 bytes)
 median 381.318 ns (1 allocs: 16 bytes)
 mean   390.878 ns (1 allocs: 16 bytes)
 max    901.515 ns (1 allocs: 16 bytes)
```
This allows a user to fully break out of a recursive function, like a tree query. Useful for geometric predicates, when you want to return once you've hit some criterion.
but at least we have it on the books now
So fast!! I guess we can reduce those allocs a bit or something
Turns out `intersection_point` tests are failing because the algorithm is returning strange intersection points...
Brief experiment to confirm whether a lazy generator edge list has equivalent speed to a non-lazy version:

```julia
# Setup code
import GeometryOps as GO
import GeometryOps.SpatialTreeInterface as STI
import GeoInterface as GI
using Extents

function doityourself(f, pred, geom)
    p1 = GI.getpoint(geom, 1)
    for i in 2:GI.npoint(geom)
        p2 = GI.getpoint(geom, i)
        ext = Extent(X = extrema(GI.x, (p1, p2)), Y = extrema(GI.y, (p1, p2)))
        pred(ext) && f(i)
        p1 = p2
    end
end

# get data together
using NaturalEarth
all_countries = naturalearth("admin_0_countries", 10)
usa_multipoly = all_countries.geometry[findfirst(==("United States of America"), all_countries.NAME)] |> GO.tuples
usa_poly = GI.getgeom(usa_multipoly, findmax(GO.area.(GI.getgeom(usa_multipoly)))[2]) # isolate the poly with the most area
usa_ring = GI.getexterior(usa_poly)

# now try things
using Chairmarks
# benchmark: find all edges that cross 42 degrees latitude
tree_lazy = STI.FlatNoTree(GO.lazy_edge_extents(usa_ring))
tree_eager = STI.FlatNoTree(GO.edge_extents(usa_ring))
tree_natural = GO.NaturalIndexing.NaturalIndex(GO.lazy_edge_extents(usa_ring))
```

Natural indexing (fastest, of course):

```julia
julia> @be Int[] STI.do_query(Base.Fix1(push!, _), $(ext -> ext.Y[1] <= 42 <= ext.Y[2]), $tree_natural)
Benchmark: 2717 samples with 126 evaluations
 min    205.357 ns (0.06 allocs: 177.524 bytes)
 median 224.532 ns (0.06 allocs: 177.524 bytes)
 mean   276.546 ns (0.06 allocs: 177.524 bytes, 0.11% gc time)
 max    36.939 μs (0.06 allocs: 177.524 bytes, 98.80% gc time)
```

Lazy extent list in FlatNoTree (what it says on the tin - no tree at all):

```julia
julia> @be Int[] STI.do_query(Base.Fix1(push!, _), $(ext -> ext.Y[1] <= 42 <= ext.Y[2]), $tree_lazy)
Benchmark: 4073 samples with 2 evaluations
 min    11.188 μs (1 allocs: 232 bytes)
 median 11.229 μs (1 allocs: 232 bytes)
 mean   11.490 μs (1 allocs: 232 bytes)
 max    48.896 μs (1 allocs: 232 bytes)
```

Materialized vector of extents in an eager tree:

```julia
julia> @be Int[] STI.do_query(Base.Fix1(push!, _), $(ext -> ext.Y[1] <= 42 <= ext.Y[2]), $tree_eager)
Benchmark: 4983 samples with 3 evaluations
 min    5.611 μs (0.67 allocs: 154.667 bytes)
 median 6.194 μs (0.67 allocs: 154.667 bytes)
 mean   6.200 μs (0.67 allocs: 154.667 bytes)
 max    15.597 μs (0.67 allocs: 154.667 bytes)
```

DIY solution - compute extents and check on the fly:

```julia
julia> @be Int[] doityourself(Base.Fix1(push!, _), $(ext -> ext.Y[1] <= 42 <= ext.Y[2]), $(usa_ring))
Benchmark: 4080 samples with 2 evaluations
 min    11.146 μs (1 allocs: 232 bytes)
 median 11.188 μs (1 allocs: 232 bytes)
 mean   11.332 μs (1 allocs: 232 bytes)
 max    21.354 μs (1 allocs: 232 bytes)
```

It's clear that the lazy approach takes the same amount of time as the DIY solution, meaning we can use the lazy extent list without a performance penalty.
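The lazy edge-extent list in the experiment above can be sketched with a plain generator. This is a dependency-free illustration over a vector of point tuples, with NamedTuples standing in for `Extents.Extent`; it mirrors the idea of `GO.lazy_edge_extents`, not its exact API:

```julia
# Lazily yield one (X, Y) extent per edge: nothing is materialized,
# each extent is computed on demand as the consumer iterates.
lazy_edge_extents(points) =
    ((X = minmax(points[i][1], points[i+1][1]),
      Y = minmax(points[i][2], points[i+1][2]))
     for i in 1:length(points)-1)

# The eager version materializes the same extents up front.
edge_extents(points) = collect(lazy_edge_extents(points))

# Example query in the spirit of the benchmark: indices of edges
# whose Y-extent crosses a given latitude.
crossing(points, y) = [i for (i, ext) in enumerate(lazy_edge_extents(points))
                       if ext.Y[1] <= y <= ext.Y[2]]
```

The benchmark's conclusion is exactly that iterating this kind of generator costs the same as computing extents inline, so the lazy form loses nothing.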
Just pulled natural trees in to `intersection_points`.

With SingleSTRTree plus extent thinning:

```julia
julia> @be GO.intersection_points($usa_poly, $usa_reflected)
Benchmark: 37 samples with 1 evaluation
 min    1.967 ms (32601 allocs: 3.785 MiB)
 median 2.214 ms (32601 allocs: 3.785 MiB)
 mean   2.729 ms (32601 allocs: 3.785 MiB, 7.83% gc time)
 max    8.556 ms (32601 allocs: 3.785 MiB, 65.28% gc time)
```

With SingleNaturalTree plus extent thinning (~2x performance improvement):

```julia
julia> @be GO.intersection_points($(GO.Planar()), $(GO.SingleNaturalTree()), $usa_poly, $usa_reflected)
Benchmark: 50 samples with 1 evaluation
 min    1.087 ms (30294 allocs: 2.958 MiB)
 median 1.178 ms (30294 allocs: 2.958 MiB)
 mean   2.004 ms (30294 allocs: 2.958 MiB, 8.10% gc time)
 max    26.876 ms (30294 allocs: 2.958 MiB, 94.69% gc time)
```

With SingleNaturalTree and no extent thinning:

```julia
julia> @be GO.intersection_points($(GO.Planar()), $(GO.SingleNaturalTree()), $usa_poly, $usa_reflected)
Benchmark: 57 samples with 1 evaluation
 min    1.086 ms (30230 allocs: 4.443 MiB)
 median 1.316 ms (30230 allocs: 4.443 MiB)
 mean   1.754 ms (30230 allocs: 4.443 MiB, 10.87% gc time)
 max    6.512 ms (30230 allocs: 4.443 MiB, 74.14% gc time)
```

So it doesn't seem like extent thinning helps too much, maybe a 100 microsecond difference. Will now try a double-tree query with natural trees.

**Dual tree queries**

Since the cost of creating a tree has gone way down with SingleNaturalTree, we can do dual tree queries easily. It turns out that extent thinning takes too long on natural trees, so we can exclude those.

With no extent thinning:

```julia
julia> @be GO.intersection_points($(GO.Planar()), $(GO.DoubleNaturalTree()), $usa_poly, $usa_reflected) seconds=1
Benchmark: 1155 samples with 1 evaluation
 min    485.375 μs (2283 allocs: 5.967 MiB)
 median 606.000 μs (2283 allocs: 5.967 MiB)
 mean   791.566 μs (2283 allocs: 5.967 MiB, 11.66% gc time)
 max    5.663 ms (2283 allocs: 5.967 MiB, 88.21% gc time)
```

With extent thinning on both sides:

```julia
julia> @be GO.intersection_points($(GO.Planar()), $(GO.ThinnedDoubleNaturalTree()), $usa_poly, $usa_reflected) seconds=1
Benchmark: 1186 samples with 1 evaluation
 min    582.583 μs (2368 allocs: 2.744 MiB)
 median 693.104 μs (2368 allocs: 2.744 MiB)
 mean   798.452 μs (2368 allocs: 2.744 MiB, 5.87% gc time)
 max    5.503 ms (2368 allocs: 2.744 MiB, 78.88% gc time)
```
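For intuition, the dual-tree traversal pattern can be sketched like this. The `Leaf`/`Node` types and `dual_query` below are illustrative stand-ins written over plain rect tuples, not the SpatialTreeInterface types:

```julia
# Dual-tree traversal sketch: recurse both trees at once, pruning any
# pair of subtrees whose rects are disjoint, so the O(n*m) pairwise
# extent test never happens for well-separated geometry.
# Rects are (xmin, ymin, xmax, ymax); all names here are assumptions.
struct Leaf
    id::Int
    rect::NTuple{4,Float64}
end
struct Node
    rect::NTuple{4,Float64}
    children::Vector{Any}
end

rect(n) = n.rect
overlaps(a, b) = a[1] <= b[3] && b[1] <= a[3] && a[2] <= b[4] && b[2] <= a[4]

function dual_query(f, a, b)
    overlaps(rect(a), rect(b)) || return nothing  # prune both subtrees at once
    if a isa Leaf && b isa Leaf
        f(a.id, b.id)                 # candidate pair: run the exact test here
    elseif a isa Leaf
        foreach(c -> dual_query(f, a, c), b.children)
    else
        foreach(c -> dual_query(f, c, b), a.children)
    end
    return nothing
end
```

Since construction of a natural tree is nearly free, building an index for *both* geometries and pruning pairwise like this becomes worthwhile, which is what DoubleNaturalTree exploits.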
Yeah, extent thinning was speeding up the huge case. But that looks 4x faster than before! So, probably faster than GEOS in all cases?
Unfortunately not; it's still about the same speed for the circle case and the super-segmentized collection of squares. But for natural polygons it seems pretty good, and more efficient than computing an STR tree for sure.
Huh - turns out there are situations in which dual-tree is slower than single tree by a substantial amount. I guess it's good we have options then!
Note
This is effectively the GeometryOps dev branch now.
This PR uses STRtrees to substantially accelerate polygon queries. It pulls the actual looping all the way out and allows the use of STRtrees to skip substantial sections of polygons.
There are three methods here:
This algorithm should be easily portable to s2 when that becomes available - and the structure allows e.g. manifold passthrough as well.

This PR also adds a LoopStateMachine module that basically just defines an `Action` wrapper struct, and a macro that can take care of that action wrapper struct. With this macro, a function running within a loop can return `Break()` or `Continue()`, which indicates to the caller that it should break or continue. E.g.:
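A standalone sketch of that pattern follows. The real PR defines these in a LoopStateMachine module with a `@processloopactions` macro; here a plain helper function interprets the actions instead, and the `Action`/`Break`/`Continue` definitions are assumptions modeled on the description above:

```julia
# A callback returns an action value instead of throwing or using flags,
# and the driving loop interprets it. Sketch only; the actual
# GeometryOps.LoopStateMachine API may differ.
struct Action{name} end
const Break = Action{:Break}
const Continue = Action{:Continue}

# Loop driver: stop when `f` returns Break(), otherwise keep going.
function foreach_with_actions(f, xs)
    for x in xs
        f(x) isa Action{:Break} && break
    end
end

seen = Int[]
foreach_with_actions(1:10) do i
    push!(seen, i)
    i == 3 ? Break() : Continue()
end
```

After the loop, `seen` holds `[1, 2, 3]`: the callback signalled `Break()` on the third element, and the driver stopped without the callback needing any knowledge of the surrounding loop.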
TODOs: