Add microbenchmark framework to godot to reduce the work of performance PR #11285

YYF233333 · 2024-12-04T08:25:28Z

Describe the project you are working on

Godot PR with tag :)

Describe the problem or limitation you are having in your project

Trying to get an optimizing PR merged is not as easy as optimizing the code itself. Sometimes it feels like constructing the benchmark and benching themselves are much more time-consuming than the code change. We keep creating new MRPs and barely reuse them. And sometimes self-proving is just impossible like this PR, if the changes scatter around in the codebase.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Give developers a way to write beanchmarks easily. Which mean:

No need for boilerplate code. You write a function and the framework takes care of env setup and teardown.
Stable, repeatble result, with straightforward visualization. Results like this is just meaningless:

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Integrate a proper microbenchmark framework into godot (or reinvent the wheel). I search the web and grab this topmost google benchmark just for example. Feel free to introduce other frameworks.

Should be something like the unit test we are using now. You build with benchmark=yes and run ./bin/godot --benchmark and get the result.

Once the framework is setup, people can add a benchmark when they enconter a perofrmance regression or submit a PR, and these benches can help with other issues/PRs.

If this enhancement will not be used often, can it be worked around with a few lines of script?

Just take a look at godot bencmark.

Is there a reason why this should be core and not an add-on in the asset library?

Needing some add-on to develop godot itself seems odd.

I see #6747 but that's quite different with my proposal so I open this one.

The text was updated successfully, but these errors were encountered:

Calinou · 2024-12-04T14:09:52Z

Related to Add microbenchmarking macros to ease benchmarking portions of C++ code #6747.

Stable, repeatble result, with straightforward visualization. Results like this is just meaningless:

I've been considering removing that particular category of benchmarks. I don't know why the time taken to run is so variable. Maybe it's intentional for security reasons (to avoid timing attacks)?

AThousandShips · 2024-12-04T15:03:12Z

Are you sure those are that variable? That looks the same as my graph on chrome which is broken due to known issues, do they look the same in Firefox?

YYF233333 · 2024-12-04T15:24:34Z

I've been considering removing that particular category of benchmarks. I don't know why the time taken to run is so variable. Maybe it's intentional for security reasons (to avoid timing attacks)?

Well actually others suffer from the same problem, this is just the most significant one. Generally I see 5% or above diff between identical runs locally. That's not good enough to evaluate micro optimizations with 1-2 point improvements. Ideally I'd expect cpu core binding and auto confidence analysis.

Are you sure those are that variable? That looks the same as my graph on chrome which is broken due to known issues, do they look the same in Firefox?

Oh, Firefox's graph looks much better, I'll update the image.

AThousandShips · 2024-12-04T15:52:40Z

Those variations do indeed look more like natural anti-timing-attack variations

Ivorforce · 2024-12-05T10:24:39Z

A good way to test performance changes would be welcome! The past days, I've used this:

godot-benchmarks if the feature was covered in a benchmark.
- e.g. godot --headless -- --run-benchmarks --include-benchmarks="C++/StringManipulation/*"
Inject a benchmark into some unit test otherwise, run as godot --test
- A good place to put it is test_main.cpp at the top of test_main.

For the benchmarks, I've created a small ipython notebook for re-use:

results_old_str = "..."
results_new_str = "..."

import pandas as pd
import json

def read_json_to_dict(s: str):
    as_dict = json.loads(s)
    return {
        d["name"].lower(): d["results"]["time"]
        for d in as_dict["benchmarks"]
    }

results_df = pd.DataFrame(
    {"old": read_json_to_dict(results_old_str), "new": read_json_to_dict(results_new_str)},
)
print(results_df)

import plotly.express as px
fig = px.bar(results_df, barmode='group')
fig.update_layout(yaxis_title="time: ms", xaxis_title="benchmark")
fig.show()

As for the quick mini benchmark code, someone may find this handy:

// Setup
String s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";

auto t0 = std::chrono::high_resolution_clock::now();
for (int i = 0; i < 100000; i ++) {
	// Test
	String s1 = s.replace("e", "b");
}
auto t1 = std::chrono::high_resolution_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";

clayjohn · 2024-12-05T20:45:12Z

For optimizations that impact engine code we should pretty much always be profiling using a proper profiler instead of a benchmark. Benchmarks are deceptive and can easily lead you astray, especially if you aren't benchmarking inside the engine. The gold standard is always to benchmark the code that is running in the engine, using the data that the engine provides instead of arbitrary data created for the benchmark.

Of course for many things you can't do that, especially for convenience functions (in classes like String for example) where the benefit is more for end-users who call these APIs than it is for engine internals.

AThousandShips added the performance label Dec 4, 2024

Calinou added the topic:core label Dec 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add microbenchmark framework to godot to reduce the work of performance PR #11285

Add microbenchmark framework to godot to reduce the work of performance PR #11285

YYF233333 commented Dec 4, 2024 •

edited

Loading

Calinou commented Dec 4, 2024 •

edited

Loading

AThousandShips commented Dec 4, 2024

YYF233333 commented Dec 4, 2024

AThousandShips commented Dec 4, 2024

Ivorforce commented Dec 5, 2024 •

edited

Loading

clayjohn commented Dec 5, 2024

Add microbenchmark framework to godot to reduce the work of performance PR #11285

Add microbenchmark framework to godot to reduce the work of performance PR #11285

Comments

YYF233333 commented Dec 4, 2024 • edited Loading

Describe the project you are working on

Describe the problem or limitation you are having in your project

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

If this enhancement will not be used often, can it be worked around with a few lines of script?

Is there a reason why this should be core and not an add-on in the asset library?

Calinou commented Dec 4, 2024 • edited Loading

AThousandShips commented Dec 4, 2024

YYF233333 commented Dec 4, 2024

AThousandShips commented Dec 4, 2024

Ivorforce commented Dec 5, 2024 • edited Loading

clayjohn commented Dec 5, 2024

YYF233333 commented Dec 4, 2024 •

edited

Loading

Calinou commented Dec 4, 2024 •

edited

Loading

Ivorforce commented Dec 5, 2024 •

edited

Loading