Early feedback request for the API shape #2
A couple of first thoughts:
Thanks for taking a look! Responses below.
We can extend the API to expose this capability via a method on XRWebGLBinding that hands out a WebGLTexture directly - would that work here? That way, implementations could provide the data with less overhead if the underlying platform APIs allow it. Two things worth calling out about the WebGLTexture approach:
I'd say each eye can have its own depth buffer. The API already accepts an XRView, which is associated with a specific eye, so the implementation can use that to decide what needs to be returned; if it is able to provide stereo depth buffers, it could do so. Let me know if that makes sense!
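To make the GPU-path idea above concrete, here is a minimal sketch of what a binding-level method could look like, assuming a hypothetical getDepthInformation(view) on XRWebGLBinding that returns an object exposing an opaque WebGLTexture (the method and attribute names are illustrative, not settled API):

```js
// Hypothetical GPU path: the UA hands back a WebGLTexture per XRView.
// `getDepthInformation` and `texture` are assumed names, not final API.
const gl = canvas.getContext('webgl2', { xrCompatible: true });
const binding = new XRWebGLBinding(xrSession, gl);

function onXRFrame(time, frame) {
  const pose = frame.getViewerPose(xrRefSpace);
  if (pose) {
    for (const view of pose.views) {
      const depthInfo = binding.getDepthInformation(view); // hypothetical
      if (depthInfo) {
        // Bind the UA-owned texture so an occlusion shader can sample it.
        gl.activeTexture(gl.TEXTURE0);
        gl.bindTexture(gl.TEXTURE_2D, depthInfo.texture);
        // ... render the view, comparing fragment depth against the samples ...
      }
    }
  }
  xrSession.requestAnimationFrame(onXRFrame);
}
xrSession.requestAnimationFrame(onXRFrame);
```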
The layers spec defines a texture that is only valid during the rAF callback in which it was created: https://immersive-web.github.io/layers/#xropaquetextures
You could also do it all in the browser process so you can avoid having to send the data to the render process and back.
Does that mean that devices that have a single physical depth sensor won't be able to support this spec? Or would they have to reproject the depth map for each eye?
Ooh, thanks for the pointer! I think this is effectively what we currently do in our prototype implementation of the raw camera access API, but I was concerned about exposing the data as a WebGLTexture since that seemed like a potential footgun - for example, users could still try to call gl.deleteTexture() on it. Maybe we can work around this by providing very detailed diagnostics for the cases where the textures are not used correctly.
Good point; if we do not care about exposing the data on the CPU at all, this is definitely something we should do. The question then is: should we focus only on the GPU use case? From what I've seen, the data we get from ARCore is detailed enough that leveraging it on the CPU for physics is probably going to be beneficial, but for GPU use cases (occlusion), it won't be good enough.
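For the CPU use case, here is a sketch of the kind of sampling helper an app might write, assuming the depth data arrives as a buffer of millimeter values plus its dimensions (the data, width, and height members and the millimeter unit are assumptions here, mirroring what ARCore exposes):

```js
// Hypothetical CPU path: look up depth at a normalized (u, v) coordinate.
// Assumes `depthInfo` exposes `width`, `height`, and `data` (a Uint16Array
// of depth samples in millimeters) - names and units are illustrative only.
function depthInMetersAt(depthInfo, u, v) {
  const x = Math.min(depthInfo.width - 1, Math.floor(u * depthInfo.width));
  const y = Math.min(depthInfo.height - 1, Math.floor(v * depthInfo.height));
  const depthMm = depthInfo.data[y * depthInfo.width + x];
  return depthMm / 1000; // millimeters to meters, e.g. for a physics query
}
```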
Either reproject the depth map, or expose an additional XRView that will only be used to obtain that single depth map. In that case, it may be better to expose the more sensor-oriented features via something other than XRView (the raw camera access API also happens to need such a mechanism, so we can think about the best way to expose those kinds of features).
I suspect the system would have to do the reprojection because it would be a big burden on the author to do this correctly.
No, the
Won't this depend on the intended usage of the depth map? If the goal is to perform some simple environment reconstruction based on depth map data, I imagine the authors will already need to know what they are doing when trying to leverage the data we return.
Unfortunately, I do not - I've seen articles about HoloLens Research Mode that referred to depth cameras, but I did not see a way to access the data through the API. FWIW, based on documentation available for ARCore, it seems that an RGB camera is sufficient to provide a (maybe less precise) depth map, so maybe it's something that could be made available on stereoscopic devices with multiple cameras as well.
Call for action for potential implementers of the depth-sensing API! Please take a look at the issues linked above, the explainer, and PR #8 with an early draft of the spec (it should hopefully contain most of the relevant details), taking extra care to ensure that the API can be implemented in other browsers! Other things that will be adjusted that I am tracking:
/agenda for visibility & to ensure that the initially proposed API shape is implementable for other form factors
For texture lifetime, consider a flow where the user allocates a texture, and receives data in it through |
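One way such a flow could look, assuming a hypothetical binding method (called updateDepthTexture below) that fills an app-allocated texture, so the page rather than the UA controls the texture's lifetime:

```js
// Hypothetical app-allocated texture flow: the app owns the texture and its
// lifetime; the UA only writes into it. `updateDepthTexture` is an assumed name.
const depthWidth = 160, depthHeight = 120; // example depth map dimensions
const depthTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, depthTex);
// Allocate immutable storage once, sized to match the incoming depth map.
gl.texStorage2D(gl.TEXTURE_2D, 1, gl.R16UI, depthWidth, depthHeight);

function onXRFrame(time, frame) {
  // `view` is the current XRView; the UA copies the latest depth map for
  // that view into the app's texture.
  binding.updateDepthTexture(view, depthTex); // hypothetical
  // ... sample depthTex in a shader for occlusion ...
  xrSession.requestAnimationFrame(onXRFrame);
}
```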
In the explainer, it looks like the proposal is to use |
We really should have a texImage2D(XRDepthInformation), and make the cpu-data path optional. (maybe |
In particular, having to go through the CPU data path means a couple of extra copies:
There are some concerns about how to upload this to WebGL, but I think DEPTH_COMPONENT16 would always work and would be great for this, allowing both direct sampling of values (through the R channel) as well as depth-sampling operations, which I can imagine might be desirable here. ;)
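As a reference point, a WebGL2 upload along these lines could look like the following sketch, assuming the CPU path hands back a Uint16Array called depthData with known width and height (variable names are illustrative):

```js
// Upload CPU-side depth data as a DEPTH_COMPONENT16 texture in WebGL2.
// `depthData` is assumed to be a Uint16Array of width * height samples.
const tex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, tex);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.DEPTH_COMPONENT16,
              width, height, 0,
              gl.DEPTH_COMPONENT, gl.UNSIGNED_SHORT, depthData);
// Without TEXTURE_COMPARE_MODE set, a shader reads the value from the R
// channel; with it set, the texture can be used for depth-comparison sampling.
```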
An async API here would definitely not be good, as real-time applications need this data synchronously. There is time in background IO to pass the data along before it is needed by the JS thread, so synchronous access should not cause blocking issues.
On HoloLens 2, serving up depth images to apps will involve rendering the world mesh on the GPU, so it would be wasteful for an app that intends to use the data on the GPU (e.g. for occlusion) to move the images from GPU to CPU and back to GPU again. Excited to find the right GPU-first path here for apps that end up using GPU-derived images on the GPU!
Please take a look at PR #11; hopefully it addresses the issues we chatted about during the call. Most importantly, it allows the user to specify the desired usage of the data, thus allowing user agents to do the right thing and minimize the number of data round trips. Additionally, it allows UAs to express their preferred usage and data format, with the intent that if the user picks the preferred settings, it will incur the lowest possible cost.
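If the preference-based shape in PR #11 lands roughly as described, session setup could look something like the sketch below; the exact feature name, dictionary members, and enum values are assumptions based on this discussion, not final spec text:

```js
// Sketch: the app states its preferred usage and data format up front so the
// UA can pick the cheapest combination it supports. Names are assumptions.
const session = await navigator.xr.requestSession('immersive-ar', {
  requiredFeatures: ['depth-sensing'],
  depthSensing: {
    usagePreference: ['gpu-optimized', 'cpu-optimized'],
    dataFormatPreference: ['luminance-alpha', 'float32'],
  },
});
// An app that accepts the UA's preferred usage and format should avoid the
// extra data round trips discussed above.
```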
I took a look at the current API and couldn't find anything about "raw depth" information - I'm not sure if this is the right place to ask or give feedback. Currently the reported depth values seem to be smoothed, which might be good in a lot of cases, but in situations where a device (like a smartphone) is used to 3D scan an environment (just collect data and post-process later), it would be great if there were an option to report raw depth (and confidence, if available). I haven't thought about what a good API for that would look like, but IMO it would certainly be good to have that option, because currently the only way to do that is to use native ARCore (or ARKit).
Hey all,
I'd like to ask people to take a look at the initial version of the explainer and let me know if there are any major problems with the current approach (either as a comment under this issue, or by filing a new issue). I'm looking mostly for feedback around API ergonomics / general usage, and possible challenges related to implementation of the API on various different kinds of hardware / in different browsers / etc.
+@toji, @thetuvix, @grorg, @cabanier, @mrdoob, @elalish