Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm-decompile: add function index comments #2482

Merged
merged 1 commit into from
Oct 8, 2024

Conversation

python273
Copy link
Contributor

No description provided.

@python273 python273 force-pushed the wasm-decompile-fn-num-comment branch from 83e2564 to 047514f Compare October 4, 2024 20:22
Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you find this feature useful? Can you explain a little more about why its useful? I assumed the point of the decompiler was so that the user would not have to worry about things like the internal index spaces that live in the binary format?

@@ -826,9 +826,9 @@ struct Decompiler {
}
}
if (is_import) {
s += ";";
s += cat("; // f", std::to_string(func_index));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about index: instead of f?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are other types that are also indexed? So a unique prefix seems better, can change to func if it makes more sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've just never seen f0, f1 used to refer to function indexes, so its not obvious to me as the reader what it means.

@python273
Copy link
Contributor Author

You still have to cross-reference with other tools, all of which show function indexes: Chromium debugger, .wat, Cetus. So it's easier to search for functions in text editors when there's a numeric index.

@sbc100
Copy link
Member

sbc100 commented Oct 8, 2024

You still have to cross-reference with other tools, all of which show function indexes: Chromium debugger, .wat, Cetus. So it's easier to search for functions in text editors when there's a numeric index.

But doesn't the chromium debugger and the wat format both use function names (from the name section)?

Also, I assume in the case when there is no name in the name section the decompiler will use the function index as the function name?

@python273 python273 force-pushed the wasm-decompile-fn-num-comment branch from 047514f to c533954 Compare October 8, 2024 17:40
@python273
Copy link
Contributor Author

I'm new to WASM, so not sure in which cases names are available. Here's how most functions look in a random WASM game (Unity, Emscripten):

Chromium / Firefox devtools:
(func $func704 (param $var0 i32) (param $var1 i32)

.wat
(func (;704;) (type 4) (param i32 i32)

wasm-decompile:
function f_caa(a:int, b:int) {

The names are only shown for imported/exported functions:
(func $stackRestore (;415;) (export "stackRestore") (param $var0 i32)
(func $env._glClientWaitSync (;411;) (import "env" "_glClientWaitSync") (param i32 i32 i32 i32) (result i32))

So in the end you need numerical indexes to navigate.

@sbc100
Copy link
Member

sbc100 commented Oct 8, 2024

I'm new to WASM, so not sure in which cases names are available. Here's how most functions look in a random WASM game (Unity, Emscripten):

Chromium / Firefox devtools: (func $func704 (param $var0 i32) (param $var1 i32)

.wat (func (;704;) (type 4) (param i32 i32)

If you build you module with a name section I think you will see the actual names here.

wasm-decompile: function f_caa(a:int, b:int) {

Ah... where is caa coming from here? It seems like this should be f_703 in the absence of a better/real name. Perhaps we should fix that?

The names are only shown for imported/exported functions: (func $stackRestore (;415;) (export "stackRestore") (param $var0 i32) (func $env._glClientWaitSync (;411;) (import "env" "_glClientWaitSync") (param i32 i32 i32 i32) (result i32))

So in the end you need numerical indexes to navigate.

import/exported names are just one way to give a function a name. The better way to to use the names section.

@sbc100
Copy link
Member

sbc100 commented Oct 8, 2024

BTW, this change LGTM, with comments resolved.

@sbc100 sbc100 merged commit b15a13e into WebAssembly:main Oct 8, 2024
18 checks passed
@python273
Copy link
Contributor Author

If you build you module with a name section I think you will see the actual names here.

It's not my module and I don't have the source code :)

Ah... where is caa coming from here? It seems like this should be f_703 in the absence of a better/real name. Perhaps we should fix that?

So I tried adding a cmd arg for NameOpts::None (#2477), but it turns it off for other things like variables or globals, making the output way less readable, as it becomes a mess of numbers. Maybe adding an option to turn it off only for function names might be nice, though alphabetic names seem better than 5 digit numbers, so not sure, opted for comments.

@sbc100
Copy link
Member

sbc100 commented Oct 8, 2024

If you build you module with a name section I think you will see the actual names here.

It's not my module and I don't have the source code :)

In those cases I would hope the decompiler would put the function index into the name, not only in a comment.

Ah... where is caa coming from here? It seems like this should be f_703 in the absence of a better/real name. Perhaps we should fix that?

So I tried adding a cmd arg for NameOpts::None (#2477), but it turns it off for other things like variables or globals, making the output way less readable, as it becomes a mess of numbers.

So is caa just a made up string? Is that somehow less messy than using a number?

Maybe adding an option to turn it off only for function names might be nice, though alphabetic names seem better than 5 digit numbers, so not sure, opted for comments.

@python273
Copy link
Contributor Author

So is caa just a made up string?

It's generated from an index, so 704 = caa, 25 = z, 26 = aa, 27 = ba:

inline std::string IndexToAlphaName(Index index) {
std::string s;
do {
// For multiple chars, put most frequently changing char first.
s += 'a' + (index % 26);
index /= 26;
// Continue remaining sequence with 'a' rather than 'b'.
} while (index--);
return s;
}

Is that somehow less messy than using a number?

Just slightly easier to read/remember f_ficb instead of f37419. But yes, it'd be a good option to add, my main problem was finding the functions having an index from debugger, so a comment works well.

@sbc100
Copy link
Member

sbc100 commented Oct 8, 2024

It's generated from an index, so 704 = caa, 25 = z, 26 = aa, 27 = ba:

Interesting! That seems like pointless obfuscation to me... why not put the useful information in the name?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants