-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[✨ feature] Scene: allow bypassing ray intersection acceleration data structures #297
base: master
Are you sure you want to change the base?
Conversation
This can be useful to completely bypass OptiX or other heavy acceleration data structures when the scene is trivial (e.g. a single cube).
Mostly adapted from the existing render tests.
src/render/scene.cpp
Outdated
accel_init_cpu(props); | ||
// Decide whether to use acceleration data structures for ray intersections | ||
// TODO: do we even want a heuristic? Could lead to surprising changes in performance for the user. | ||
bool naive_intersection_desirable = m_shapes.size() <= 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed with @Speierers, we may not want to enable this feature unless the user explicitly asks for it. The downside is that users may not find out it exists, but it's probably better than creating surprising performance jumps due to this heuristic.
src/render/scene.cpp
Outdated
dr::masked(pi.prim_index, valid) = prim_pi.prim_index; | ||
dr::masked(pi.shape, valid) = prim_pi.shape; | ||
dr::masked(pi.instance, valid) = prim_pi.instance; | ||
dr::masked(pi.shape_index, valid) = shape_index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix this. Currently encountering a compilation error if trying to do a masked assignment to pi
, something about type computation from masked arrays of Shape
:
In file included from /home/merlin/Code/mitsuba3-master/src/render/scene.cpp:3:
In file included from /home/merlin/Code/mitsuba3-master/include/mitsuba/render/bsdf.h:4:
/home/merlin/Code/mitsuba3-master/include/mitsuba/render/interaction.h:560:36: error: no type named 'Ray3f' in 'drjit::detail::MaskedArray<mitsuba::Shape<dr::DiffArray<dr::CUDAArray<float>>, Color<dr::DiffArray<dr::CUDAArray<float>>, 3>>>'
using Ray3f = typename Shape_::Ray3f;
~~~~~~~~~~~~~~~~~^~~~~
/home/merlin/Code/mitsuba3-master/ext/drjit/include/drjit/array_router.h:2044:21: note: in instantiation of template class 'mitsuba::PreliminaryIntersection<drjit::detail::MaskedArray<drjit::DiffArray<CUDAArray<float>>>, drjit::detail::MaskedArray<mitsuba::Shape<dr::DiffArray<dr::CUDAArray<float>>, Color<dr::DiffArray<dr::CUDAArray<float>>, 3>>>>' requested here
masked_t<T> result;
^
/home/merlin/Code/mitsuba3-master/src/render/scene.cpp:138:17: note: in instantiation of function template specialization 'drjit::masked<mitsuba::PreliminaryIntersection<drjit::DiffArray<CUDAArray<float>>, mitsuba::Shape<dr::DiffArray<dr::CUDAArray<float>>, Color<dr::DiffArray<dr::CUDAArray<float>>, 3>>>, drjit::DiffArray<drjit::CUDAArray<bool>>>' requested here
dr::masked(pi, valid) = prim_pi;
^
/home/merlin/Code/mitsuba3-master/src/render/scene.cpp:460:22: note: in instantiation of member function 'mitsuba::Scene<drjit::DiffArray<CUDAArray<float>>, mitsuba::Color<drjit::DiffArray<CUDAArray<float>>, 3>>::ray_intersect' requested here
MI_INSTANTIATE_CLASS(Scene)
^
src/render/scene.cpp
Outdated
dr::masked(pi.prim_index, valid) = prim_pi.prim_index; | ||
dr::masked(pi.shape, valid) = prim_pi.shape; | ||
dr::masked(pi.instance, valid) = prim_pi.instance; | ||
dr::masked(pi.shape_index, valid) = shape_index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise.
|
||
print(f'\n--- {mi.variant()} ---') | ||
for bypass, values in results.items(): | ||
print(f'{"naive" if bypass else "accel"}: {1000 * np.mean(values):.6f} ms') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently getting these results for a scene containing a single analytic sphere, with the direct
integrator:
--- scalar_rgb ---
naive: 71.434951 ms
[0.06400513648986816, 0.06670022010803223, 0.07629013061523438, 0.07175850868225098, 0.06787610054016113, 0.08331823348999023, 0.06868195533752441, 0.0772852897644043, 0.07218050956726074, 0.0662534236907959]
accel: 73.885083 ms
[0.07530069351196289, 0.07167458534240723, 0.07322049140930176, 0.06887698173522949, 0.07498621940612793, 0.08003997802734375, 0.07488870620727539, 0.0752708911895752, 0.07259535789489746, 0.07199692726135254]
------------------
--- llvm_ad_rgb ---
naive: 51.368260 ms
[0.04977750778198242, 0.055547237396240234, 0.04845452308654785, 0.04771161079406738, 0.05972123146057129, 0.04860043525695801, 0.0483553409576416, 0.059114694595336914,
0.047663211822509766, 0.0487368106842041]
accel: 51.119304 ms
[0.0474705696105957, 0.05318307876586914, 0.0463564395904541, 0.04656219482421875, 0.061937808990478516, 0.05034995079040527, 0.046396732330322266, 0.04927563667297363, 0.06153106689453125, 0.04812955856323242]
------------------
--- cuda_ad_rgb ---
naive: 13.897085 ms
[0.014396190643310547, 0.013736724853515625, 0.013788223266601562, 0.013866901397705078, 0.013941049575805664, 0.014055013656616211, 0.013768434524536133, 0.013695955276489258, 0.013820171356201172, 0.01390218734741211]
accel: 12.030840 ms
[0.012321233749389648, 0.01208186149597168, 0.011997222900390625, 0.011968374252319336, 0.01200246810913086, 0.011963605880737305, 0.011999368667602539, 0.01200246810913086, 0.011951684951782227, 0.012020111083984375]
------------------
This doesn't include the savings from accel build up and updates, e.g. in the context of an optimization loop.
Also, the case where I saw huge gains before was with a hardcoded axis-aligned cube intersection routine. Currently our cube
shape is a type of Mesh
, so we'd need a new shape type to find out if it still holds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the output values be exactly the same? (They are close, but it is curious that the equivalence is not perfect).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, are those things in brackets also time values? 🤦
Regarding the actually computed intersections: differences around the machine epsilon (relative error ~1e-07) would be expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay yeah we're not at machine epsilon :p
To be investigated, it might be that the implementations actually compute slightly different results (not just due to e.g. order of operations).
In my test scene all of the differences seem to be concentrated around the sphere (which happens to be the only refractive object). Could also be a matter of mint
handling.
MI_DECLARE_CLASS() | ||
private: | ||
/// Axis-aligned bounding box in world space | ||
field<BoundingBox3f, ScalarBoundingBox3f> m_bbox; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it enough to only carry a ScalarBoundingBox3f
in this plugin?
4af72bd
to
199b607
Compare
df4cbe0
to
64fedcd
Compare
3f3b8d0
to
1bdea6e
Compare
Description
For very simple scenes, the cost of building, maintaining and using acceleration data structures can sometime be greater than the time savings.
This PR enables skipping them entirely.
Testing
See
test_accel_bypass.py
.Checklist
cuda_*
andllvm_*
variants. If you can't test this, please leave below