-
-
Notifications
You must be signed in to change notification settings - Fork 21.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reserve space for LocalVector if size is known #100269
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I agree with this PR: It's good practice to reserve space before adding a bunch of items sequentially, to improve efficiency.
I think this may have trouble getting merged though since not all of the reserve calls are trivial. There's quite a lot of text to sift through to find whether each of the reserves are correct. Especially, some may be better served with a different approach entirely (in-place allocation, known-length map, etc.).
It may be better to see if any of the spots are in particular need of speed up, adding a reserve there, making a small PR, and getting it approved by someone with knowledge of that particular spot of code.
@@ -2229,6 +2230,9 @@ RDD::RenderPassID RenderingDevice::_render_pass_create(RenderingDeviceDriver *p_ | |||
for (int i = 0; i < p_passes.size(); i++) { | |||
const FramebufferPass *pass = &p_passes[i]; | |||
RDD::Subpass &subpass = subpasses[i]; | |||
subpass.color_references.reserve(pass->color_attachments.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the subpass lists are empty at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the whole subpasses
is freshly defined just 7 lines above.
@@ -3181,6 +3188,7 @@ RID RenderingDevice::uniform_set_create(const Collection &p_uniforms, RID p_shad | |||
RDD::BoundUniform &driver_uniform = driver_uniforms[i]; | |||
driver_uniform.type = uniform.uniform_type; | |||
driver_uniform.binding = uniform.binding; | |||
driver_uniform.ids.reserve(uniform.get_id_count()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the ids
are empty at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole driver_uniforms
is defined just above this loop and resized, so all these items are just default constructed, unless you do some weird stuff in constructor it should be empty.
@@ -368,6 +368,7 @@ String ShaderPreprocessor::vector_to_string(const LocalVector<char32_t> &p_v, in | |||
|
|||
String ShaderPreprocessor::tokens_to_string(const LocalVector<Token> &p_tokens) { | |||
LocalVector<char32_t> result; | |||
result.reserve(p_tokens.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be better served with an in-place String
construction; though I'm not sure whether the token.text
can be trusted to be correct UTF-32
.
All these cases are trival enough. I restrict myself to behave somehow like a smarter linter and avoid any possible regressions. I've filtered these changes 2 times to ensure every call to the
Actually I suspect these are ethier too corner case or just not hot enough to make people aware of them. Should probably be treated as a refactor instead of optimization. That said, no one knows if this won't benefit performance a bit (like editor startup time). |
So godot/core/templates/local_vector.h Lines 140 to 141 in dc5f1b7
tight is false by default, which means each reserve() will actually reserve more than necessary. push_back() has the same behavior though, just pointing that it might still be improved.
|
That's a good point, if the maximum size is guaranteed, we can use a TightLocalVector instead |
The difference should be small and I don't know if power of 2 size plays better with malloc or not. And tight may give people an impression that we have some assumption about memory layout (as it is now mostly used in rendering code). |
Follow up #100251, covers other part of the engine.
Change in
scene/property_utils.cpp
is special, all caller passes in a newly constructedLocalVector
so reserve helps.