Vulkan renderer crashes after only a couple thousand draw calls. #3112
Comments
Actually, I see the same issue on macOS with the Metal backend enabled: after 4k draw calls it crashes, so some limit is definitely being hit. However, this happens ONLY in debug mode; in release mode there is no problem:
Make a debug build and look at the debug output.
I already have a debug build; that's how I was able to do the analysis in the issue description. I'm actually getting slightly different behaviour now: it crashes pretty much immediately. I'm not sure why, as I haven't really used Vulkan since I originally created the ticket. But the out-of-bounds access error, the stack trace and everything else are the same, so it's still the same issue.
Update your drivers.
I've isolated my problem and fixed it. Change src/renderer_mtl.mm:1556 (line 1556 in 954c18b):
void setShaderUniform(uint8_t _flags, uint32_t _loc, const void* _val, uint32_t _numRegs)
{
uint32_t offset = 0 != (_flags&kUniformFragmentBit)
? m_uniformBufferFragmentOffset
: m_uniformBufferVertexOffset
;
uint8_t* dst = (uint8_t*)m_uniformBuffer.contents();
bx::memCopy(&dst[offset + _loc], _val, _numRegs*16);
}
To check for the
void setShaderUniform(uint8_t _flags, uint32_t _loc, const void* _val, uint32_t _numRegs)
{
uint32_t offset = 0 != (_flags&kUniformFragmentBit)
? m_uniformBufferFragmentOffset
: m_uniformBufferVertexOffset
;
uint8_t* dst = (uint8_t*)m_uniformBuffer.contents();
if (offset + _loc > UNIFORM_BUFFER_SIZE) {
return;
}
bx::memCopy(&dst[offset + _loc], _val, _numRegs*16);
}
I can also just increase the buffer size instead, at src/renderer_mtl.mm:19
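Note that the guard in the snippet above only compares the start of the write against the buffer size, so a copy that begins inside the buffer but ends past it would still go through. Below is a minimal sketch of a check on the end of the write instead. The helper `copyUniformChecked` and its parameters are hypothetical and not part of bgfx, and `std::memcpy` stands in for `bx::memCopy` so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of bgfx: copies numRegs 16-byte registers
// into dst at byte position offset + loc, refusing any write that would
// end past dstSize. Returns true only if the copy was performed.
inline bool copyUniformChecked(uint8_t* dst, uint32_t dstSize, uint32_t offset,
                               uint32_t loc, const void* val, uint32_t numRegs)
{
    assert(dst != nullptr && val != nullptr);

    // Use 64-bit arithmetic so the sums cannot wrap around.
    const uint64_t begin = uint64_t(offset) + loc;
    const uint64_t bytes = uint64_t(numRegs) * 16;

    if (begin + bytes > dstSize)
    {
        return false; // the end of the write, not just its start, is out of range
    }

    std::memcpy(dst + begin, val, static_cast<std::size_t>(bytes));
    return true;
}
```

Returning a flag rather than silently skipping the copy lets the caller log or assert on the dropped uniform, which may make a limit like this easier to spot in a debug build.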
@joseph-montanez I'm not entirely sure we are seeing the same issue. I'm not testing this with my code; this is happening with example 17. @bkaradzic If you mean my NVIDIA drivers, then they are up to date. Is there any other Vulkan-specific driver that I should have and am not aware of?
@magester1 That limit
But that's what I mean: this is happening because of vk's scratch memory, which I believe has nothing to do with Metal (please correct me if that's wrong). The number being the same seems like a happy coincidence to me, or maybe it's because bgfx is using this magic "128" for both of them? Oh, I feel like an idiot, I forgot to add the line numbers to the stack trace! Thank you for pointing that out. Just to clarify, I do have this running in debug mode, and I know exactly which lines are causing the issue (linked in the original description). But I don't know enough about Vulkan to understand the design decision behind the size of the scratch memory, which is why I created this ticket. Here's the trace with the line numbers; sorry, I didn't realize they were missing:
So here is the issue:
uint8_t m_fsScratch[64<<10];
uint8_t m_vsScratch[64<<10];
Take anything that increments in steps of 16 and you hit the 3971 limit. BTW, it's also used for...
void setShaderUniform(uint8_t _flags, uint32_t _regIndex, const void* _val, uint32_t _numRegs)
{
if (_flags & kUniformFragmentBit)
{
bx::memCopy(&m_fsScratch[_regIndex], _val, _numRegs*16);
}
else
{
bx::memCopy(&m_vsScratch[_regIndex], _val, _numRegs*16);
}
}
Why the limit... no idea. In my case, macOS running on Arm64 doesn't have VRAM since it's all shared memory. I am not sure why this needs to be limited to 64KB for Vulkan.
In my case the main culprit was the But yeah, like you, I don't know why this limit exists or how it was determined. Especially considering that what goes here depends on the shader size (is it the size in number of uniforms?), since with the original example shader it works fine up to the max draw calls.
64k / 16 is 4096. If you're running out of fs/vsScratch that means you're setting over 4k uniforms.
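For reference, the arithmetic behind that figure can be written out as compile-time checks. This is only a sketch: the constant and function names are made up for illustration, and it assumes `_regIndex` is a byte offset into the scratch array, as the indexing in the `setShaderUniform` code quoted earlier suggests.

```cpp
#include <cstdint>

// Size of m_fsScratch / m_vsScratch as declared in the quoted code.
constexpr uint32_t kScratchBytes = 64 << 10;
// setShaderUniform copies _numRegs * 16 bytes, so one register is 16 bytes.
constexpr uint32_t kBytesPerRegister = 16;
// Number of 16-byte registers that fit in one scratch array.
constexpr uint32_t kMaxRegisters = kScratchBytes / kBytesPerRegister;

static_assert(kScratchBytes == 65536, "64<<10 is 64 KiB");
static_assert(kMaxRegisters == 4096, "64 KiB holds 4096 registers");

// Hypothetical predicate: does a write of numRegs registers starting at
// byte offset regIndex stay inside the scratch array?
constexpr bool fitsInScratch(uint32_t regIndex, uint32_t numRegs)
{
    return uint64_t(regIndex) + uint64_t(numRegs) * kBytesPerRegister <= kScratchBytes;
}

static_assert(fitsInScratch(0, 4096), "exactly full still fits");
static_assert(!fitsInScratch(16, 4096), "one register too far does not");
```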
I don't think example-17 is setting any uniforms besides the default ones (you know, view transformations, etc.), so I don't think that's the issue.
Describe the bug
The Vulkan renderer crashes after only a couple thousand draw calls if you use anything but the simplest vertex shader. It seems that the vk renderer is running out of scratch memory.
In case it matters, I'm building on Windows 10 (SDK 10.0.20348.0) with clang 16.0.0. I'm not on the latest bgfx master, but I have looked through the commits and there doesn't seem to be any significant change since then (this is where I'm at).
Below are the steps to reproduce with one of the examples, but I'll describe some more first.
The application crashes with:
Exception thrown at 0x00007FFB2EBD14F7 (vcruntime140d.dll) in example-17-drawstress.exe: 0xC0000005: Access violation writing location 0x00000194D4000000.
and this is the call stack:
Basically, the problem is in this line. The copy is being called with the values:
bx::memCopy(&m_data[8386752], ..., 2112)
which exceeds the maximum value: m_size == 8388480.
I can see 2 problems here:
m_pos < (m_size - _size)
or something similar.
To Reproduce
Steps to reproduce the behavior:
./example-17-drawstress.exe --vk
Expected behavior
I assume the application shouldn't crash. Unless there are some limitations with Vulkan? In which case, I think those should be exposed somewhere in the capabilities so we can know what the limit is. But it doesn't look very promising: with a simple shader like vs_metaballs I can only send about 4000 draw calls before reaching the limit. It's about
(64<<10) * 128 / 2112 ~ 3971.
Note that it works perfectly well with other renderers like d3d or gl.
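The estimate above can be cross-checked against the numbers from the crash. The sketch below uses only the values reported in this issue (m_size, the 2112-byte copy and the failing offset). The names are hypothetical, and it assumes every draw call consumes exactly 2112 bytes of scratch with no extra padding, which is consistent with the failing offset being an exact multiple of 2112.

```cpp
#include <cstdint>

constexpr uint64_t kScratchSize   = 8388480; // m_size reported above
constexpr uint64_t kBytesPerDraw  = 2112;    // size of the failing copy
constexpr uint64_t kFailingOffset = 8386752; // index passed to bx::memCopy

// Whole draws that fit in the scratch buffer under the assumption above.
constexpr uint64_t maxDraws(uint64_t size, uint64_t perDraw)
{
    return size / perDraw;
}

// Overflow-safe form of the "m_pos < (m_size - _size)" style check:
// true if [pos, pos + bytes) lies entirely inside a buffer of `size` bytes.
constexpr bool fits(uint64_t pos, uint64_t bytes, uint64_t size)
{
    return bytes <= size && pos <= size - bytes;
}

static_assert(maxDraws(kScratchSize, kBytesPerDraw) == 3971, "3971 draws fit");
static_assert(kFailingOffset == 3971 * kBytesPerDraw, "the crash is draw number 3972");
static_assert(fits(3970 * kBytesPerDraw, kBytesPerDraw, kScratchSize), "draw 3971 still fits");
static_assert(!fits(kFailingOffset, kBytesPerDraw, kScratchSize), "draw 3972 ends past m_size");
```

Under that assumption, draw number 3972 would end at byte 8388864, which is 384 bytes past m_size.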
Screenshots
N/A
Additional context
I think the issue looks pretty straightforward, but please let me know if other information about my system would be useful. I could provide the logs if you think they would help.