Dota2 benchmark regression #228

Open
11 of 15 tasks
kvark opened this issue Sep 27, 2020 · 1 comment

kvark commented Sep 27, 2020

Somewhere between https://gfx-rs.github.io/2018/08/10/dota2-macos-performance.html and today, the benchmark regressed significantly. We used to get 33.9 / 3.5 (in the table), but now we are closer to just 30 (in immediate mode).
Here are the things to try:

  • new swapchain model is less efficient at fullscreen? Tried with "-sw" parameter, no change.
  • Dota doesn't discover some of the features we have, it needs to support VK_KHR_portability_subset
  • on macOS 11 we aren't using some of the Metal features we are supposed to use
  • we are locking too much, and that hurts on macOS 11 beta, possibly because of a parking_lot regression? Tried parking_lot-0.6, no change.
  • macOS 11 beta has some strange scheduling issues. Maybe background apps interfere too much? 🥇 disabling Firefox in the background gets us from 30 to 32 fps on average.
  • setting labels on internal encoders is slow - hid them behind a flag now, not sure if it changed any metrics
  • ExactSizeIterator bounds are confusing the compiler and making us heap-allocate more?
  • Rust optimization regression?
  • creating and filling MTLRenderPassDescriptor is too slow - implemented a pool for reusing them in gfx-rs/gfx@bbf55ff
  • binding descriptor sets is still too slow - rewritten in gfx-rs/gfx@15e456b
  • too many redundant depth-stencil changes - fixed in gfx-rs/gfx@0d581a6
  • redundant scissor changes - fixed in gfx-rs/gfx@f0051da
  • we are doing many redundant passes - investigated, no clear picture of why that would happen
  • we are rebinding too many resources - tested a fix, but it made no performance difference. We are binding them in ranges anyway, so the Metal runtime possibly does this for us.
  • we are using mutable pixel views too often - fixed in gfx-rs/gfx@22e2f4d
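
For readers unfamiliar with the "redundant scissor / depth-stencil changes" items above, here is a minimal sketch of the general technique: cache the last value sent to the encoder and skip repeats. This is not the code from the linked commits. `Rect`, `CountingEncoder`, `ScissorCache`, and `issued_scissor_calls` are hypothetical names made up for illustration, and the encoder is a stand-in that only counts calls.

```rust
/// Scissor rectangle in pixels (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Rect {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// Stand-in for a backend command encoder; it only counts the calls it receives.
#[derive(Default)]
pub struct CountingEncoder {
    pub scissor_calls: usize,
}

impl CountingEncoder {
    fn set_scissor(&mut self, _rect: Rect) {
        self.scissor_calls += 1;
    }
}

/// Remembers the last scissor sent to the encoder and skips repeats.
#[derive(Default)]
pub struct ScissorCache {
    last: Option<Rect>,
}

impl ScissorCache {
    /// Forwards `rect` only if it differs from the cached value.
    /// Returns `true` when a call was actually issued.
    pub fn set(&mut self, encoder: &mut CountingEncoder, rect: Rect) -> bool {
        if self.last == Some(rect) {
            return false;
        }
        encoder.set_scissor(rect);
        self.last = Some(rect);
        true
    }

    /// Assumption: the cache is cleared whenever a new encoder starts,
    /// because the new encoder does not inherit the old one's state.
    pub fn reset(&mut self) {
        self.last = None;
    }
}

/// Replays a list of scissor requests and returns how many reached the encoder.
pub fn issued_scissor_calls(requests: &[Rect]) -> usize {
    let mut encoder = CountingEncoder::default();
    let mut cache = ScissorCache::default();
    for &rect in requests {
        cache.set(&mut encoder, rect);
    }
    encoder.scissor_calls
}
```

The same pattern applies to depth-stencil state: compare against the cached value, forward only on change, and reset the cache at encoder boundaries.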

Related to gfx-rs/gfx#3378, gfx-rs/gfx#3382, gfx-rs/gfx#3381, gfx-rs/gfx#3383
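
The descriptor pooling mentioned in the list above can be illustrated with a simple free-list pool. This is a rough sketch of the idea only, not what gfx-rs/gfx@bbf55ff does: `PassDescriptor` is a hypothetical plain struct standing in for MTLRenderPassDescriptor, and `DescriptorPool` and `descriptors_created` are names invented here.

```rust
/// Hypothetical stand-in for a descriptor object that is expensive to create.
#[derive(Debug, Default)]
pub struct PassDescriptor {
    pub color_attachments: Vec<u32>,
}

/// A free-list pool: descriptors are handed out, then cleared and kept for reuse.
#[derive(Default)]
pub struct DescriptorPool {
    free: Vec<PassDescriptor>,
    /// Number of descriptors that had to be freshly created.
    pub created: usize,
}

impl DescriptorPool {
    /// Reuses a free descriptor if one exists, otherwise creates a new one.
    pub fn acquire(&mut self) -> PassDescriptor {
        match self.free.pop() {
            Some(desc) => desc,
            None => {
                self.created += 1;
                PassDescriptor::default()
            }
        }
    }

    /// Clears the descriptor so no stale attachments leak into the next user,
    /// then returns it to the free list.
    pub fn release(&mut self, mut desc: PassDescriptor) {
        desc.color_attachments.clear();
        self.free.push(desc);
    }
}

/// Simulates `passes` sequential render passes and reports how many
/// descriptors were actually created.
pub fn descriptors_created(passes: usize) -> usize {
    let mut pool = DescriptorPool::default();
    for i in 0..passes {
        let mut desc = pool.acquire();
        desc.color_attachments.push(i as u32);
        pool.release(desc);
    }
    pool.created
}
```

With passes recorded one after another, a single descriptor ends up being recycled; the creation cost is paid once instead of once per pass.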

kvark added the bug label Sep 27, 2020

kvark commented Jan 17, 2021

Dota doesn't discover some of the features we have

I just noticed that it does indeed print the following lines when run on gfx-portability:

Vulkan physical device (0): does not support Metal depthSampleCompare.
Vulkan physical device (0): using transform constant buffer: false
Vulkan physical device (0): supports shader clip distance: true
Vulkan physical device (0): using secondary command buffers: false

I imagine working around the missing comparison samplers has a solid GPU performance cost, and on an Intel GPU it would show up as an overall regression. So I believe this issue is blocked on ValveSoftware/Dota-2-Vulkan#351.
