r500 hardware: Don't expose full “NPOT Textures” support in Gallium Nine #133

Open
lorn10 opened this issue Apr 22, 2022 · 9 comments
Comments

@lorn10

lorn10 commented Apr 22, 2022

My "brainstorming" around bug #132 (which was in the end r300 driver related and corrected soon by @ondracka) 😉 led me to the following finding.

It looks like there is another feature, NPOT textures, which should be restricted in Gallium Nine for r500 and all pre-DX10 class ATI/AMD hardware (that is, everything handled by the r300 Mesa driver). This is most likely also true for the corresponding NVIDIA GPUs of that era, such as the NV30 and NV40 series. However, regarding the latter I am not absolutely sure.

All those pre-DX10 class GPUs usually come with only limited support for NPOT textures.

The following blog page of Aras Pranckevičius shows what should be done regarding this in Gallium Nine. I quote:

Things are quite simple here. D3DCAPS9.TextureCaps has capability bits, D3DPTEXTURECAPS_POW2 and D3DPTEXTURECAPS_NONPOW2CONDITIONAL both being off indicates full support for NPOT texture sizes. When both D3DPTEXTURECAPS_POW2 and D3DPTEXTURECAPS_NONPOW2CONDITIONAL bits are on, then you have limited NPOT support.

I’ve no idea what would it mean if NONPOW2CONDITIONAL bit is set, but POW2 bit is not.

Hardware wise, limited NPOT has been generally available since 2002-2004, and full NPOT since 2006 or so.

So for the r500, and generally for all pre-DX10 class ATI/AMD hardware, both flags should be set to "on". As mentioned, this also seems to be true for the Nvidia NV30 GPU range. I have no clue about NV40, but most likely it doesn't support full NPOT either.
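As an illustration of the rule quoted above, here is a minimal sketch that classifies NPOT support from the two `TextureCaps` bits. The two macro values are copied from `d3d9caps.h` so the snippet compiles without the Windows SDK; the `npot_support` enum and `classify_npot` function are hypothetical names for this example and not part of D3D9 or Gallium Nine.

```c
#include <stdint.h>

/* Values as defined in d3d9caps.h, repeated here so the sketch is standalone. */
#define D3DPTEXTURECAPS_POW2               0x00000002u
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100u

/* Hypothetical classification, named for this example only. */
enum npot_support {
    NPOT_NONE,        /* POW2 set, NONPOW2CONDITIONAL clear: power-of-two only */
    NPOT_LIMITED,     /* both bits set: conditional NPOT */
    NPOT_FULL,        /* both bits clear: unrestricted NPOT */
    NPOT_UNSPECIFIED  /* NONPOW2CONDITIONAL without POW2: meaning unclear */
};

static enum npot_support classify_npot(uint32_t texture_caps)
{
    int pow2 = (texture_caps & D3DPTEXTURECAPS_POW2) != 0;
    int cond = (texture_caps & D3DPTEXTURECAPS_NONPOW2CONDITIONAL) != 0;

    if (pow2 && cond)
        return NPOT_LIMITED;
    if (!pow2 && !cond)
        return NPOT_FULL;
    if (pow2)
        return NPOT_NONE;
    return NPOT_UNSPECIFIED;
}
```

With this reading, a `TextureCaps` value of `0x102` would be reported as limited NPOT, which is what this issue proposes for r500-class hardware.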

I assume that the default setting is simply always "off" because Gallium Nine was mainly designed for newer DX10+ hardware. 😉

Addition: ATI/AMD and Nvidia are "lying" about NPOT support in GL and GLES on DX9 hardware. Effectively, the hardware only supports the limited variant of NPOT; everything else is emulated in software.

@axeldavy
Collaborator

Right, I think you are right about setting both these flags for these old cards (ideally based on a gallium flag). Though maybe this won't be enough.
What about the volume flag?
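A rough sketch of what setting these flags from a driver capability could look like. It assumes that "the volume flag" means `D3DPTEXTURECAPS_VOLUMEMAP_POW2`, and it uses a hypothetical boolean `driver_has_full_npot` in place of whatever gallium flag would actually be queried. The macro values are copied from `d3d9caps.h`; `apply_npot_caps` is an illustrative name, not existing nine code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in d3d9caps.h, repeated here so the sketch is standalone. */
#define D3DPTEXTURECAPS_POW2               0x00000002u
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100u
#define D3DPTEXTURECAPS_VOLUMEMAP_POW2     0x00040000u

/* Hypothetical helper: returns texture_caps with the NPOT-related bits set
 * when the driver lacks full NPOT support, and cleared otherwise.
 * All other capability bits are passed through untouched. */
static uint32_t apply_npot_caps(uint32_t texture_caps, bool driver_has_full_npot)
{
    const uint32_t npot_bits = D3DPTEXTURECAPS_POW2 |
                               D3DPTEXTURECAPS_NONPOW2CONDITIONAL |
                               D3DPTEXTURECAPS_VOLUMEMAP_POW2; /* assumption: the "volume flag" */

    texture_caps &= ~npot_bits;
    if (!driver_has_full_npot)
        texture_caps |= npot_bits;
    return texture_caps;
}
```

Whether cube maps would need the same treatment is not covered by this thread, so the sketch leaves them alone.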

@lorn10
Author

lorn10 commented Apr 25, 2022

Maybe @ondracka has some more detailed ideas on how best to implement this. He is already fixing the r300 black rendering problem (#134), and maybe he has the time and motivation to also add an "NPOT Textures" correction for older hardware in Gallium Nine.

@ondracka

Actually I won't be looking into this. There is a huge pile of more pressing r300 bugs so I'm not particularly motivated to go chasing hypothetical problems. If I understand it correctly this issue is based solely on the nine code inspection, but you are actually not aware of any app where the NPOT texture handling really leads to some specific rendering issues with r300 driver?

@lorn10
Author

lorn10 commented Apr 26, 2022

I agree, this is effectively more a "hypothetical thing". 😉 So it has no priority.

However, according to the information available, and if you follow the "play book", then only "limited NPOT Textures" support should be exposed for old DX9 class hardware, because that is all it supports.

And yes, I really tried the game "A Hat in Time" (GOG version) on my RV530-based iMac5,1 as well. I hoped to somehow provoke an "NPOT texture situation" there.

But I failed dramatically right at the beginning. This game seems to be "monstrous"; it is simply too heavy for my old iMac5,1. It was even a challenge to get it working on my iMac12,2. Maybe this is one of the most complex D3D9 games ever produced. It gave me some strange errors on the CLI and then totally messed up my Wine prefix. Luckily I had made a copy of the prefix 👍 More information can be found in my comment here.

Whatever, maybe Axel Davy can put this on a nine "ToDo list". And if it's too complex to implement, then it can be left as it is.

@axeldavy
Collaborator

@lorn10 I've some code that should fix r500's support of 256 constants (Last 3 commits of https://github.com/iXit/Mesa-3D/commits/master) in case you want to test.

@ondracka

> @lorn10 I've some code that should fix r500's support of 256 constants (Last 3 commits of https://github.com/iXit/Mesa-3D/commits/master) in case you want to test.

I did some quick testing and now we fail to compile pretty much any vertex shader using relative addressing. The issue with r500 hardware is that not only constants but also immediates (I'm using the TGSI nomenclature here, as I don't know much about DX) must fit into the 256 constants limit. And what sucks the most is that for vertex shaders we don't have any inlining options (we can inline 1, 0, -1 using the constant swizzles and that's it). To be honest, I have no clue how this is supposed to work (or how this works on Windows). I'd be happy to implement/enhance stuff on the driver side, but I really have no idea. As I see it, there are two options:

  • we can overwrite the last constants with the immediates. This is what Wine does and, if I understand it correctly, also what nine did (before this series). We can do it a bit more efficiently in the driver because we can get rid of at least the 0 and 1.
  • a crazy idea could be to construct some of the constants (at least the easy ones like 2, 0.5, etc.) in the shaders, but that would have obvious code size and speed disadvantages.

But maybe this would be better discussed in a separate thread/issue; feel free to CC me on anything related.
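To make the budget problem concrete, here is a small sketch of the accounting described above. It is a simplified model, not r300 driver code: it assumes each immediate occupies one vec4 slot, that an immediate whose components are all 0, 1 or -1 can be inlined via swizzles and therefore costs nothing, and that everything else competes with the application's constants for the same 256 slots. All names (`vec4`, `imm_is_inlinable`, `shader_fits`, `R500_VS_MAX_CONSTS`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* The 256 vec4 limit discussed in this thread. */
#define R500_VS_MAX_CONSTS 256

struct vec4 { float v[4]; };

/* Assumption: an immediate is free if every component is 0, 1 or -1,
 * since those values can be produced with constant swizzles. */
static bool imm_is_inlinable(const struct vec4 *imm)
{
    for (int i = 0; i < 4; i++) {
        float c = imm->v[i];
        if (c != 0.0f && c != 1.0f && c != -1.0f)
            return false;
    }
    return true;
}

/* Returns true if the declared constants plus all non-inlinable
 * immediates fit into the shared constant budget. */
static bool shader_fits(size_t num_consts,
                        const struct vec4 *imms, size_t num_imms)
{
    size_t slots = num_consts;

    for (size_t i = 0; i < num_imms; i++) {
        if (!imm_is_inlinable(&imms[i]))
            slots++;
    }
    return slots <= R500_VS_MAX_CONSTS;
}
```

Under this model, a shader that declares all 256 constants (as happens with relative addressing after the series) fails as soon as it carries a single immediate such as (0.5, 2, 0, 1).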

BTW regarding the NPOT support discussed here, r500 emulates ALL of NPOT support in shaders (we don't have even partial support, we support only rectangular textures), so if there is something not working, open a bug at fdo and I'll take a look if I can enhance the shader emulation.

@axeldavy
Collaborator

@ondracka Is there an option to inline the offset for relative addressing?

The difference with this series is indeed that when relative addressing is used, all 256 constants will be declared as used, thus leaving no space for the immediates.
One thing nine does is read the immediate when the shader has one, and also write the immediate's value into the constant buffer so that it works with relative addressing. We could instead always read the constant buffer, not the immediate, for r500. The question is whether that will be enough for the few immediates that the d3d asm -> TGSI translation generates.
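The "write the immediate value into the constant buffer" step can be pictured roughly as follows. This is a simplified model under stated assumptions, not nine's actual code: it assumes each shader-defined constant carries the register index it was defined for, and that the constant buffer is a plain array of vec4 floats. The names `local_const` and `patch_local_consts` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for a constant defined inside the shader itself:
 * the register it targets and its four float components. */
struct local_const {
    unsigned reg;
    float value[4];
};

/* Copies each shader-defined constant into its slot of the constant
 * buffer, so a relative-addressed read of that slot sees the same value
 * the immediate would have provided. Entries outside the buffer are
 * skipped. Returns the number of slots written. */
static size_t patch_local_consts(float consts[][4], size_t num_consts,
                                 const struct local_const *lc, size_t n)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (lc[i].reg >= num_consts)
            continue;
        memcpy(consts[lc[i].reg], lc[i].value, sizeof(lc[i].value));
        written++;
    }
    return written;
}
```

If the shader always read from the patched buffer, the separate immediates for these values would no longer need slots of their own, which is the idea suggested for r500 above.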

@axeldavy
Collaborator

axeldavy commented Jul 21, 2022

I think if you can construct the few immediates that are generated by the asm conversion (-1, 0, 1, 0.5 and 2, if I checked correctly), as well as the integer offset of relative addressing, then everything could fit.

EDIT: Actually this is more complicated than that. Many of these constants are not in paths that can trigger when we use relative addressing. However there are a few immediates that are used to enforce d3d9 behaviour for a few things that I bet r500 doesn't need. For example the a0 clamping eats a few immediates, as well as the pointsize clamping. We'll need to continue this discussion elsewhere.


4 participants