Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reserve space for LocalVector if size is known #100269

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

YYF233333
Copy link
Contributor

Follow up #100251, covers other part of the engine.

Change in scene/property_utils.cpp is special, all caller passes in a newly constructed LocalVector so reserve helps.

@YYF233333 YYF233333 requested review from a team as code owners December 11, 2024 08:08
@Chaosus Chaosus added this to the 4.4 milestone Dec 11, 2024
@AThousandShips AThousandShips modified the milestones: 4.4, 4.x Dec 11, 2024
Copy link
Contributor

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I agree with this PR: It's good practice to reserve space before adding a bunch of items sequentially, to improve efficiency.

I think this may have trouble getting merged though since not all of the reserve calls are trivial. There's quite a lot of text to sift through to find whether each of the reserves are correct. Especially, some may be better served with a different approach entirely (in-place allocation, known-length map, etc.).

It may be better to see if any of the spots are in particular need of speed up, adding a reserve there, making a small PR, and getting it approved by someone with knowledge of that particular spot of code.

@@ -2229,6 +2230,9 @@ RDD::RenderPassID RenderingDevice::_render_pass_create(RenderingDeviceDriver *p_
for (int i = 0; i < p_passes.size(); i++) {
const FramebufferPass *pass = &p_passes[i];
RDD::Subpass &subpass = subpasses[i];
subpass.color_references.reserve(pass->color_attachments.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the subpass lists are empty at this point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the whole subpasses is freshly defined just 7 lines above.

@@ -3181,6 +3188,7 @@ RID RenderingDevice::uniform_set_create(const Collection &p_uniforms, RID p_shad
RDD::BoundUniform &driver_uniform = driver_uniforms[i];
driver_uniform.type = uniform.uniform_type;
driver_uniform.binding = uniform.binding;
driver_uniform.ids.reserve(uniform.get_id_count());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the ids are empty at this point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole driver_uniforms is defined just above this loop and resized, so all these items are just default constructed, unless you do some weird stuff in constructor it should be empty.

@@ -368,6 +368,7 @@ String ShaderPreprocessor::vector_to_string(const LocalVector<char32_t> &p_v, in

String ShaderPreprocessor::tokens_to_string(const LocalVector<Token> &p_tokens) {
LocalVector<char32_t> result;
result.reserve(p_tokens.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be better served with an in-place String construction; though I'm not sure whether the token.text can be trusted to be correct UTF-32.

@YYF233333
Copy link
Contributor Author

I think this may have trouble getting merged though since not all of the reserve calls are trivial. There's quite a lot of text to sift through to find whether each of the reserves are correct. Especially, some may be better served with a different approach entirely (in-place allocation, known-length map, etc.).

All these cases are trival enough. I restrict myself to behave somehow like a smarter linter and avoid any possible regressions. I've filtered these changes 2 times to ensure every call to the reserve can at least reduce one realloc, so they should all benifit performance (although not much if on cold path).

It may be better to see if any of the spots are in particular need of speed up, adding a reserve there, making a small PR, and getting it approved by someone with knowledge of that particular spot of code.

Actually I suspect these are ethier too corner case or just not hot enough to make people aware of them. Should probably be treated as a refactor instead of optimization.

That said, no one knows if this won't benefit performance a bit (like editor startup time).

@KoBeWi
Copy link
Member

KoBeWi commented Dec 13, 2024

So

_FORCE_INLINE_ void reserve(U p_size) {
p_size = tight ? p_size : nearest_power_of_2_templated(p_size);

tight is false by default, which means each reserve() will actually reserve more than necessary. push_back() has the same behavior though, just pointing that it might still be improved.

@clayjohn
Copy link
Member

tight is false by default, which means each reserve() will actually reserve more than necessary. push_back() has the same behavior though, just pointing that it might still be improved.

That's a good point, if the maximum size is guaranteed, we can use a TightLocalVector instead

@YYF233333
Copy link
Contributor Author

tight is false by default, which means each reserve() will actually reserve more than necessary. push_back() has the same behavior though, just pointing that it might still be improved.

The difference should be small and I don't know if power of 2 size plays better with malloc or not. And tight may give people an impression that we have some assumption about memory layout (as it is now mostly used in rendering code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants