OpenGL based GPU ISP notes and discussions

Existing/Alternative implementations

https://www.nxp.com/design/designs/i-mx8-software-image-signal-processing:SOFTISP-I.MX8

2022-10-12 - #dri-devel discussions about buffers/GBM

* frieder ([email protected]) has joined
<pq> kbingham, GBM is not OpenGL. GBM is not EGL, either. And obviously OpenGL is not EGL. Nor Vulkan. You do not want to allocate in OpenGL if you need a dmabuf, there are no sufficient APIs for that.
<pq> kbingham, CPU cache is not the only cache in a machine. GPUs have caches and who knows what has caches.
<Prf_Jakob> You can allocate in Vulkan tho plenty of API to go around there.
* vliaskov ([email protected]) has joined
<pq> kbingham, AFAIU, while CPU may have uncached mapping, that means nothing to GPU caches which still need flushing in both directions correctly. Depending on the actual hardware, of course.
<pq> kbingham, when you have a dmabuf, you can assume the worst case: that the buffer resides in VRAM, behind a slow bus and some caches you don't know about. That makes the regular mmap() perilous, which is why gbm_bo_map() exists. Copying with DMA may be faster than direct access, even if you only ever read the buffer once.
<emersion> pq, you mean copying to a staging GPU buffer?
* anshuma1 ([email protected]) has joined
* tursulin ([email protected]) has joined
<pq> emersion, staging CPU buffer, or whatever magic gbm_bo_map() does.
<pq> naturally the optimal magic to do is completely hardware dependent
* pH5 ([email protected]) has joined
<pq> kbingham, regular mmap() of a dmabuf might be fast, or too slow, or not work at all. That depends on where the buffer physically resides, the exact hardware, and maybe the driver too. So if you must mmap, use gbm_bo_map().
* jkrzyszt_ ([email protected]) has joined
<pq> kbingham, also, it's really hard to beat glReadPixels performance if that would be the simplest way to get the pixels.
* Jeremy_Rand_Talos_ has quit (Remote host closed the connection)
* Jeremy_Rand_Talos_ ([email protected]) has joined
<mlankhorst> airlied: no pull reques from me this week eiher it seems, both drm-misc-next-fixes and drm-misc-fixes are empthy
<mlankhorst> empty*
* mvlad (~mvlad@2a02:2f08:4605:ca00:24d7:51ff:fed6:906d) has joined
* lynxeye (~lynxeye@2a02:560:58a6:3600:20e1:7881:a58b:630d) has joined
* swalker_ ([email protected]) has joined
* swalker_ is now known as Guest2880
* frankbinns ([email protected]) has joined
* swalker__ ([email protected]) has joined
<kbingham> pq, That all sounds understandable, but it's up to dorota's implementation (which I haven't seen yet). But an application using libcamera will provide a buffer, which is expected to be either allocated by a V4L2 device (an encoder) or a display device (drm) ... or potentially routed through to dorota's allocator so she could allocate using GBM methods. But the application generically has control of what the target buffers are that her code will be writing data to with the shaders.
<kbingham> The aim was to be able to move image processing away from the CPU and to the GPU. I expect there will be limitations along the way though.
<kbingham> Essentially dcz is working on an implementation like this : https://www.nxp.com/design/designs/i-mx8-software-image-signal-processing:SOFTISP-I.MX8 (but not that one)
* Guest2880 has quit (Ping timeout: 480 seconds)
* frankbinns1 ([email protected]) has joined
<pq> kbingham, right. That makes me wonder where is mmap going to be needed?
* frankbinns has quit (Ping timeout: 480 seconds)
<pq> kbingham, if libcamera is offering a helper library for allocating suitable buffers for CPU consumers, shouldn't that library also have API for mmapping?
<kbingham> pq, I think so yes ;-)
<kbingham> mmap can be needed to read the completed image.
<pq> it sounded a bit like the application is given a dmabuf and then left to deal with it.
<kbingham> It won't always be encoded or displayed - it could be written to file directly, or encoded with a software encoder.
<pq> *dmabuf fd
<kbingham> I created a 'MappedFrameBuffer' class, that could handle the sync ioctl. But pinchartl didn't like it - and wants to do something else. But that something else isn't done yet.
* Major_Biscuit ([email protected]) has joined
<pq> until that something else, I'd very much recommend gbm_bo_map() (*after* glFlush), or glReadPixels for CPU access.
<kbingham> pq, The difficulty is how to know if CPU access will occur or not. Or perhaps we assume it will always occur, and we do a glReadPixels into a second dmabuf ... but that seems ... wasteful.
<pq> kbingham, yeah, that's what I don't understand. Why would you need to know if CPU access will occur before an application actually asks to mmap something?
<pq> or rather, in my imagination, only applications that need CPU access will use the helper allocator API, so anything allocated through that helper is always going to be CPU-accessed. I suppose that's not how it's designed then?
<kbingham> The 'soft isp' component is several layers down and separate is the distinction.
<pq> why does that make a difference?
<kbingham> And there is no public 'api' to help applications map the buffer.
<pq> aha!
<kbingham> So right now - applications have to do it directly 
<pq> yeah, that's the problem
<pq> so applications have to use GBM themselves to import the dmabuf and then gbm_bo_map()
<kbingham> Yes - but that won't work for our design, as the application *won't know* GBM is being used.
<kbingham> The application shouldn't know if it's OpenGL ISP - or a V4L2 ISP
<pq> the app does not need to know
<pq> the does not need to choose a DRM device to initialize GBM on, and that device would need to be compatible with the buffer that's actually used.
<pq> that might be risky
<pq> ok, another solution is that you tell apps what API to use to handle the buffers
<pq> dmabufs are not equal vs. devices used to access them
* anshuma1 has quit ()
<kbingham> indeed. But somehow we have to make a generic API that applications 'get a picture from the camera'
<pq> right, and Wayland has a precedent of such buffer API
<kbingham> And the application shouldn't care if that came from a USB UVC camera, or through a mipi-csi2 receiver into a hardware ISP ... or in this case that dorota is looking into, from a CSI2 receiver, then processed with the GPU.
<kbingham> I don't know much about wayland internals. Can you point me to any documentation for that please?
* jernej has quit (Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net)
<pq> https://gitlab.freedesktop.org/wayland/wayland-protocols/-/tree/main/unstable/linux-dmabuf
<pq> this involves knowing which device a buffer must be usable on, and communicating the supported formats and format modifiers.
* jernej ([email protected]) has joined
<emersion> there are two sides: buffer producer and buffer consumer
<pq> it's basically saying: the application must allocate a buffer that is usable on at least this device, and uses a format+modifier from this list.
<emersion> one side needs to allocate, and the other side needs to communicate what constraints it needs to be able to use the buffer
<pq> in Wayland, the allocating side (client/application) also writes the buffer and the server only reads it, but having server write and client read doesn't make much difference.
* jernej_ ([email protected]) has joined
<pq> I guess one problem is figuring out the set of formats+modifiers the writer side can write, because with GPUs they can often read a lot more formats than write.
* jernej has quit (Ping timeout: 480 seconds)
* JohnnyonFlame has quit (Ping timeout: 480 seconds)
* fahien ([email protected]) has joined
* flto_ ([email protected]) has joined
* root__ ([email protected]) has joined
* Daanct12 ([email protected]) has joined
<pq> kbingham, I should also point out that Wayland has a completely separate interface (wl_shm) for CPU accessible buffers. Dmabuf are assumed to be not CPU accessed.
* flto has quit (Ping timeout: 480 seconds)
<kbingham> It will be harder for us to make that same assumption :(
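
As an illustration of the "sync ioctl" kbingham mentions for the MappedFrameBuffer class, here is a minimal sketch of a plain mmap() read of a dmabuf bracketed by DMA_BUF_IOCTL_SYNC. This is the direct-mmap path that pq warns may be slow or may not work at all depending on the hardware, so it is shown only to make the bracket concrete, not as a recommendation over gbm_bo_map() or glReadPixels. The helper names dmabuf_sync and dmabuf_read_to_cpu are hypothetical and are not taken from libcamera.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/* Hypothetical helper: issue one DMA_BUF_IOCTL_SYNC, retrying if interrupted. */
static int dmabuf_sync(int fd, uint64_t flags)
{
    struct dma_buf_sync sync = { .flags = flags };
    int ret;

    do {
        ret = ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

    return ret;
}

/*
 * Hypothetical helper: copy `size` bytes from the dmabuf `fd` into `dst`.
 * Assumes the exporter supports mmap() at all, which is not guaranteed.
 * Returns 0 on success, -1 on any failure.
 */
int dmabuf_read_to_cpu(int fd, size_t size, void *dst)
{
    assert(dst != NULL);

    void *map = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    int ret = -1;
    /* Begin CPU read access so the exporter can make the data coherent. */
    if (dmabuf_sync(fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ) == 0) {
        memcpy(dst, map, size);
        /* End CPU access before the buffer goes back to a device. */
        if (dmabuf_sync(fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ) == 0)
            ret = 0;
    }

    munmap(map, size);
    return ret;
}
```

The sketch copies into a CPU buffer in one pass instead of handing out the mapping, which follows pq's point that a dmabuf should be assumed to sit behind a slow bus. It does not wait for GPU rendering to finish; any GL-side flush or fence handling is outside its scope.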

https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml
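
The negotiation pq and emersion describe (the consumer communicates the format+modifier pairs it can use, and the allocating side picks one it can write) can be sketched as a simple list intersection. This is only an illustration under stated assumptions: struct fmt_mod and pick_common are hypothetical names, the fourcc values mirror the definitions in drm_fourcc.h, and EXAMPLE_MOD_TILED is a made-up placeholder modifier, not a real vendor layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same packing as fourcc_code() in drm_fourcc.h. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define FMT_XRGB8888 FOURCC('X', 'R', '2', '4')
#define FMT_ARGB8888 FOURCC('A', 'R', '2', '4')
#define FMT_NV12     FOURCC('N', 'V', '1', '2')

#define MOD_LINEAR        UINT64_C(0)        /* DRM_FORMAT_MOD_LINEAR */
#define EXAMPLE_MOD_TILED UINT64_C(0x1234)   /* placeholder, not a real modifier */

/* Hypothetical type: one format+modifier pair. */
struct fmt_mod {
    uint32_t fourcc;
    uint64_t modifier;
};

/*
 * Hypothetical helper: walk the consumer's list in its preference order and
 * return the first pair the producer can also write, or NULL if none match.
 */
const struct fmt_mod *pick_common(const struct fmt_mod *consumer, size_t n_consumer,
                                  const struct fmt_mod *producer, size_t n_producer)
{
    assert(consumer != NULL && producer != NULL);

    for (size_t i = 0; i < n_consumer; i++) {
        for (size_t j = 0; j < n_producer; j++) {
            if (consumer[i].fourcc == producer[j].fourcc &&
                consumer[i].modifier == producer[j].modifier)
                return &consumer[i];
        }
    }
    return NULL;
}
```

Keeping the producer's list separate reflects pq's remark that GPUs can often read many more formats than they can write, so the writer's set is usually the narrower one. A real implementation would also have to check that the buffer is usable on the consumer's device, which this sketch does not model.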
