Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mumble lags and freezing when Left 4 Dead 2 is launched. PA not working #6558

Closed
I-ToRTUGa-I opened this issue Sep 21, 2024 · 9 comments · Fixed by #6598
Closed

Mumble lags and freezing when Left 4 Dead 2 is launched. PA not working #6558

I-ToRTUGa-I opened this issue Sep 21, 2024 · 9 comments · Fixed by #6598
Assignees
Labels
bug A bug (error) in the software client

Comments

@I-ToRTUGa-I
Copy link

Description

So I was trying to use positional audio in Left 4 Dead 2 (2.2.4.1). "Link to game and transmit position", "enable" and "PA" for Source Engine is ticked, but Mumble is getting super laggy and freezing every 2 seconds for me and my friend when the game is launched. Like voice is cracking and even the windows of the program freeze when you move them. Despite this, we checked if the positional audio even works and no, it's not working. We hear each other like in a normal chat (with voice cracking and huge latency ofc huh)

P.S. I am using Mumble for the first time, and there are no specific guides for L4D2.

Steps to reproduce

  1. Tick "Link to game and transmit position", "enable" and "PA" for Source Engine
  2. Launch Left 4 Dead 2

Mumble version

1.5.634

Mumble component

Client

OS

Windows

Reproducible?

Yes

Additional information

No response

Relevant log output

No response

Screenshots

No response

@I-ToRTUGa-I I-ToRTUGa-I added bug A bug (error) in the software triage This issue is waiting to be triaged by one of the project members labels Sep 21, 2024
@davidebeatrici davidebeatrici self-assigned this Sep 21, 2024
@davidebeatrici
Copy link
Member

For reference, the cause was found in #6491:

On my machine in a debug build, the constructor to ProcessWindows takes about 380ms for Grounded. It's largely from HostWindows::modules() and I think that is mostly time spent in VirtualQueryEx.

To be fair, VirtualQueryEx() is called several times (at least once per module) but it has no business taking so long just to retrieve some info.

And I'm not aware of another way to read the info we need...

I'm not aware how this plugin needs the information from VirtualQueryEx() anyway. It just needs the base address of the executable module. I think that tends to be the first module, but even if it looks modules up by name, the name is part of the MODULEENTRY32 from iterating with Module32First and Module32Next. So, at least for this plugin, it seems like HostWindows::modules() is doing a lot more work than it needs.

Nevertheless, it's probably not a bad idea to run this plugin initialization stuff in another thread instead of the main/GUI thread. But I haven't looked into how much work that would be.

You're absolutely right. We simply never noticed the issue until the new plugin framework became a thing (HostWindows already existed prior to that).

In my opinion plugin initialization should happen in a separate thread in any case, especially because the plugins that may be installed by a user are not guaranteed to be programmed correctly.

As for the modules: only a few games I've encountered hold memory areas of interest in libraries rather than the (main) executable. Refactoring the interface so that it doesn't iterate through all modules by default would make fully sense.

@davidebeatrici
Copy link
Member

Upon further investigation, VirtualQueryEx() appears to acquire some kind of lock in the target process, which explains the poor performance. The locking mechanism is probably there to prevent potential race conditions, but in my opinion it's just a bad design overall.

In any case, we'll take care of improving our own code so that the client is not affected by that anymore.

@I-ToRTUGa-I
Copy link
Author

I-ToRTUGa-I commented Sep 21, 2024

I hope you can solve this problem. This app has huge potential. I wish I was a coder to understand all this and help you...
Thank you for answering.

@davidebeatrici
Copy link
Member

No worries, this is high priority for us.

@davidebeatrici davidebeatrici added client and removed triage This issue is waiting to be triaged by one of the project members labels Sep 21, 2024
@Krzmbrzl
Copy link
Member

Maybe QueryVirtualMemoryInformation can also be used and has different performance characteristics?

@davidebeatrici
Copy link
Member

The author of this pull request claims that switching to QueryVirtualMemoryInformation yields a performance improvement, which is promising.

However, according to this the new function calls NtQueryVirtualMemory() under the hood, just like VirtualQueryEx().

@sqwishy
Copy link
Contributor

sqwishy commented Oct 4, 2024

I took a quick look at using QueryVirtualMemoryInformation instead of VirtualQueryEx and I'm realizing that I'm almost certain we aren't using VirtualQueryEx correctly.

Something to note: QueryVirtualMemoryInformation minimum supported version is windows 10, _WIN32_WINNT in cmake/compiler.cmake right now says that windows 7 is the oldest supported version. That flag would be set to NTDDI_WIN10_RSI or greater to use this part of the win api.

I noticed QueryVirtualMemoryInformation is considerably faster but it gives different results. As far as I can tell, it only returns one result for each mapped file, like a dll or exe. Whereas VirtualQueryEx returns a page for different section of a module, like the executable section, read only data, writable data, with different protection flags for each. Also when I ran using QueryVirtualMemoryInformation it was missing private pages, which I think are dynamic allocations? From my understanding of the docs, I expect this function can query those allocations but we happened to not see them because QueryVirtualMemoryInformation doesn't return information about holes and those allocations aren't continuous.

VirtualQueryEx returns "holes" between memory pages. Here's an example of what VirtualQueryEx can give us; the third row is a hole.

address size address + size base
0037D000 3000 00380000 00370000
00380000 4e000 003ce000 00370000
003ce000 2d2000 006a0000 00000000
006a0000 11000 006b1000 006a0000

The first two rows are pages right next to each other. The second one starts where the first one ends. They both have the same AllocationBase belonging to the module at that address. The third row is a hole between the third page (fourth row) and the second page. We see the hole from querying the virtual memory at address 003ce000 with VirtualQueryEx but QueryVirtualMemoryInformation would return false instead and we stop iteration. It's a hole because AllocationBase is zero.

The important thing here isn't that we're seeing holes and storing them in the MemoryRegion class, it's that we keep reading past the module and into other modules or mapped files or dynamic allocations and record those as MemoryRegions belonging to the module that they don't necessarily belong to. And this is done for each module. So for each module, it collects all the pages after the start of that module and adds them as MemoryRegions. So MemoryRegions for the same page get added to multiple modules.

The easiest fix is break out of the while (VirtualQueryEx(... loop if we see a page with an allocation address mbi.AllocationBase that is different from the module address me.modBaseAddr. When the source engine plugin calls ProcessBase::modules() in L4D2 it took about 145ms on my machine, after that change it takes about 3.5ms. It would still be doing more work than it needs by querying pages for modules that the plugin isn't interested in, but optimizing this further might not be super valuable to you guys idk. But at least in the short term the easy fix is easy.

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Oct 4, 2024

right now says that windows 7 is the oldest supported version. That flag would be set to NTDDI_WIN10_RSI or greater to use this part of the win api.

Ah that appears to be a remnant of old times. I don't think current Mumble versions properly run on Windows 7 anymore. Plus, if fixing this requires non-EOL Windows versions then that's just what it is. We try to support old systems as long as feasible but at some point there has to be a cut so it might as well be now 🤷

sqwishy added a commit to sqwishy/mumble that referenced this issue Oct 4, 2024
This `VirtualQueryEx()` loop is called for each module in a
process. It reads pages starting at the module address but seems to
continue past into other modules and into dynamic allocations also.

This check stops enumerating pages once it encounters one that no longer
belongs to the module for which pages are being collected.

(Also this function opens two handles, this adds a clean up for the
first handle if opening the second fails.)

Fixes mumble-voip#6558
Krzmbrzl pushed a commit that referenced this issue Oct 4, 2024
This `VirtualQueryEx()` loop is called for each module in a
process. It reads pages starting at the module address but seems to
continue past into other modules and into dynamic allocations also.

This check stops enumerating pages once it encounters one that no longer
belongs to the module for which pages are being collected.

(Also this function opens two handles, this adds a clean up for the
first handle if opening the second fails.)

Fixes #6558

(cherry picked from commit 1498b83)
@davidebeatrici
Copy link
Member

I can confirm we have Windows 10 as a baseline, not sure about the specific build though.

It would still be doing more work than it needs by querying pages for modules that the plugin isn't interested in, but optimizing this further might not be super valuable to you guys idk.

I'm always open to potential performance improvements. And I agree that ideally we should only query modules the plugin is interested in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A bug (error) in the software client
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants