Weird behavior when playing around with intersects(Sphere, Box) #611

aprokop · 2022-01-05T19:47:35Z

aprokop
Jan 5, 2022
Maintainer

I've been playing around with intersects(Sphere, Box) and observed some weird behavior.

I've been trying to use a (modified) version of Jim Arvo's algorithm. Something like this patch (diff against fb59980):

--- a/src/details/ArborX_DetailsAlgorithms.hpp
+++ b/src/details/ArborX_DetailsAlgorithms.hpp
@@ -208,7 +208,21 @@ constexpr bool intersects(Point const &point, Box const &other)
 KOKKOS_INLINE_FUNCTION
 bool intersects(Sphere const &sphere, Box const &box)
 {
-  return distance(sphere.centroid(), box) <= sphere.radius();
+  auto const &bmin = box.minCorner();
+  auto const &bmax = box.maxCorner();
+  auto const &c = sphere.centroid();
+
+  float dist = sphere.radius() * sphere.radius();
+  for (int d = 0; d < 3; ++d)
+  {
+    if (c[d] < bmin[d])
+      dist -= (c[d] - bmin[d])*(c[d] - bmin[d]);
+    else if (c[d] > bmax[d])
+      dist -= (c[d] - bmax[d])*(c[d] - bmax[d]);
+    if (dist < 0)
+      return false;
+  }
+  return true;
 }
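
For anyone who wants to check the two variants outside of ArborX, here is a minimal standalone sketch. The `Box` and `Sphere` structs and both function names are hypothetical stand-ins, not the ArborX types or API. It also assumes the original `distance(point, box)` computes a Euclidean distance with a square root. The patched variant starts from the squared radius, subtracts the squared per-axis gap, and bails out as soon as the remainder goes negative.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-ins for the ArborX geometry types (not the library's API).
struct Box
{
  std::array<float, 3> min_corner;
  std::array<float, 3> max_corner;
};

struct Sphere
{
  std::array<float, 3> center;
  float radius;
};

// Distance-based test, in the spirit of the original one-liner:
// accumulate the full squared point-to-box distance over all three axes,
// take the square root, and compare against the radius.
bool intersectsDistance(Sphere const &sphere, Box const &box)
{
  float squared = 0.f;
  for (int d = 0; d < 3; ++d)
  {
    float const c = sphere.center[d];
    float gap = 0.f;
    if (c < box.min_corner[d])
      gap = box.min_corner[d] - c;
    else if (c > box.max_corner[d])
      gap = c - box.max_corner[d];
    squared += gap * gap;
  }
  return std::sqrt(squared) <= sphere.radius;
}

// Arvo-style test, mirroring the patch: start from r^2, subtract the squared
// per-axis gap, and exit early once the remaining budget becomes negative.
bool intersectsArvo(Sphere const &sphere, Box const &box)
{
  float budget = sphere.radius * sphere.radius;
  for (int d = 0; d < 3; ++d)
  {
    float const c = sphere.center[d];
    if (c < box.min_corner[d])
      budget -= (c - box.min_corner[d]) * (c - box.min_corner[d]);
    else if (c > box.max_corner[d])
      budget -= (c - box.max_corner[d]) * (c - box.max_corner[d]);
    if (budget < 0)
      return false;
  }
  return true;
}
```

The two functions should agree away from the tangent case, where rounding could differ. The only intended differences are the missing square root and the data-dependent early exit inside the loop.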

I've been running it on an A100 using FDBSCAN with the 37M HACC dataset. With minPts = 2, it is much faster than the original:

# master (fb59980)
$ ./ArborX_DBSCAN.exe --binary --filename ~/hacc.arborx --core-min-size 2 --eps 0.042 --print-dbscan-timers --impl fdbscan
<snip>
-- construction     :      0.020
-- query+cluster    :      0.510
-- postprocess      :      0.020
total time          :      0.552

# branch
$ ./ArborX_DBSCAN_master.exe --binary --filename ~/hacc.arborx --core-min-size 2 --eps 0.042 --print-dbscan-timers --impl fdbscan
<snip>
-- construction     :      0.020
-- query+cluster    :      0.375
-- postprocess      :      0.020
total time          :      0.417

So the query+cluster phase takes about 27% less time (0.510 vs. 0.375). However, with minPts = 5, the situation is reversed:

# master (fb59980)
$ ./ArborX_DBSCAN.exe --binary --filename ~/hacc.arborx --core-min-size 5 --eps 0.042 --print-dbscan-timers --impl fdbscan
<snip>
-- construction     :      0.020
-- query+cluster    :      0.623
---- neigh          :      0.051
---- query          :      0.564
-- postprocess      :      0.022
total time          :      0.667

# branch
$ ./ArborX_DBSCAN.exe --binary --filename ~/hacc.arborx --core-min-size 5 --eps 0.042 --print-dbscan-timers --impl fdbscan
-- construction     :      0.021
-- query+cluster    :      0.697
---- neigh          :      0.052
---- query          :      0.637
-- postprocess      :      0.020
total time          :      0.739

So here the neighbor counting takes the same time, but the query is 13% slower (0.564 vs. 0.637).

In fact, when using the standard benchmark, it is almost always either the same or slower for CUDA, with a couple of benchmarks up to 50% slower.

I don't understand this at all. It boggles my mind that such a simple change in the intersects implementation could have such a disproportionate effect on the overall time.
