Unoptimized gamma correction shader math in crt-pi #35

battaglia01 · 2017-10-08T03:20:29Z

There are a few unoptimized lines of code in the gamma correction part of crt-pi.glsl, which is linked here for reference:
https://github.com/libretro/glsl-shaders/blob/master/crt/shaders/crt-pi.glsl

Gamma correction has been noted to be a potential source of slowdown in the code, and also in this thread here. However, all of the math here is really unoptimized, which is likely what is causing the slowdown.

Gamma correction is done on lines 190-208, reproduced here for reference:

#if defined(SCANLINES)
#if defined(GAMMA)
#if defined(FAKE_GAMMA)
		colour = colour * colour;
#else
		colour = pow(colour, vec3(INPUT_GAMMA));
#endif
#endif
		scanLineWeight *= BLOOM_FACTOR;
		colour *= scanLineWeight;

#if defined(GAMMA)
#if defined(FAKE_GAMMA)
		colour = sqrt(colour);
#else
		colour = pow(colour, vec3(1.0/OUTPUT_GAMMA));
#endif
#endif
#endif

If we assume SCANLINES, GAMMA and FAKE_GAMMA are all defined, the above reduces to the following:

		colour = colour * colour;
		scanLineWeight *= BLOOM_FACTOR;
		colour *= scanLineWeight;
		colour = sqrt(colour);

Is there a reason it's being done like this? All of that is equivalent to

		colour *= sqrt(scanLineWeight * BLOOM_FACTOR);

This saves one multiplication and three assignments per fragment! We avoid the unnecessary squaring and subsequent square-rooting of colour, and we also don't need to update scanLineWeight, as it's never used again in this scope.
I don't know how much the assignments matter or if they're optimized out anyway, but fighting with the emulator over memory accesses has been noted as one of the major causes of slowdown, so worth bringing up...
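To make the claimed equivalence concrete, here is a minimal sketch that puts the two forms side by side. The function names are hypothetical helpers, not part of crt-pi.glsl, and the BLOOM_FACTOR value is only a placeholder. The identity relies on colour and scanLineWeight * BLOOM_FACTOR being non-negative, which should hold for sampled texture colours, and the two forms may differ in the last bits of floating-point precision.

```glsl
// Placeholder value for illustration only; crt-pi.glsl defines its own BLOOM_FACTOR.
#define BLOOM_FACTOR 1.5

// Original FAKE_GAMMA path: square every channel, scale, then take a
// per-channel square root.
vec3 fakeGammaOriginal(vec3 colour, float scanLineWeight)
{
    colour = colour * colour;
    scanLineWeight *= BLOOM_FACTOR;
    colour *= scanLineWeight;
    return sqrt(colour);
}

// Folded form: sqrt(c * c * w) == c * sqrt(w) when c >= 0 and w >= 0,
// so one scalar sqrt replaces the per-channel square and square root.
vec3 fakeGammaFolded(vec3 colour, float scanLineWeight)
{
    return colour * sqrt(scanLineWeight * BLOOM_FACTOR);
}
```

Whether the folded form is measurably faster depends on the GPU and on what the driver's compiler already does, so it is worth profiling rather than assuming.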

There's a similar (but slightly trickier) thing you can do with the true gamma correction, not just FAKE_GAMMA, but I'll start here for now to see if I'm on the right wavelength...
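The true gamma variant is not spelled out in this thread, so the following is only one possible reading of it: a sketch that uses the identity pow(pow(c, a) * w, b) == pow(c, a * b) * pow(w, b) for non-negative c and w. The helper name is hypothetical and the gamma and bloom values are placeholders, not taken from crt-pi.glsl.

```glsl
// Placeholder values for illustration only; crt-pi.glsl defines its own.
#define BLOOM_FACTOR 1.5
#define INPUT_GAMMA 2.4
#define OUTPUT_GAMMA 2.2

// Hypothetical helper, not part of crt-pi.glsl.
// Original path: pow(pow(colour, INPUT_GAMMA) * w, 1.0 / OUTPUT_GAMMA),
// i.e. two per-channel pow calls.
// Folded path: one per-channel pow plus one scalar pow.
vec3 trueGammaFolded(vec3 colour, float scanLineWeight)
{
    float w = scanLineWeight * BLOOM_FACTOR;
    // Valid for colour >= 0 and w >= 0; pow() is undefined for negative bases.
    return pow(colour, vec3(INPUT_GAMMA / OUTPUT_GAMMA)) * pow(w, 1.0 / OUTPUT_GAMMA);
}
```

If INPUT_GAMMA and OUTPUT_GAMMA happened to be equal, the per-channel pow would reduce to the identity and only the scalar pow would remain. As with the FAKE_GAMMA case, the result can differ slightly from the original in floating-point rounding.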

hizzlekizzle · 2017-10-09T16:03:07Z

Yeah, probably just done that way for code clarity. It'd be worth looking at the assembly to see how much of a difference it makes.

battaglia01 · 2017-10-10T16:26:39Z

I'd be really surprised if any compiler knew to optimize a squaring and subsequent square root into one operation. The assignments, probably. How can I compile this to assembly and check the output? Does OpenGL have an app for that, or do I just do something with GCC? Not used to GL shaders.

-- Mike

hizzlekizzle · 2017-10-10T19:00:12Z

That's a good question. I've used fxc.exe for HLSL shaders, but there doesn't seem to be anything as universally easy to use for GLSL, which probably shouldn't surprise me...

However, it seems this Radeon GPU Analyzer from AMD may be able to do it:
https://github.com/GPUOpen-Tools/RGA/releases

metallic77 · 2023-05-18T06:10:49Z

It gains about 15-20 fps this way in my test: 668 fps after versus 650 before, so that's a 2-3% difference.
