Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

renderer: implement ivec_t and the ability to send integer vectors to GLSL #1040

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

illwieckz
Copy link
Member

@illwieckz illwieckz commented Feb 8, 2024

Right now we can only send a single integer to GLSL, or float vectors. Meaning that if we need to sends more than one integer, we either have to send multiple variables, or to send integers in floats vectors.

This implements the ivec_t type in engine and implements the related classes to send ivec2, ivec3 and ivec4 to GLSL in a similar way than what we already do for vec2, vec3 and vec4.

@illwieckz
Copy link
Member Author

illwieckz commented Feb 8, 2024

Right now this code is not used, but I stumbled upon this need years ago, and I may use it in the future.

My linear light computation code in #1034 currently makes use of it:

Though, since that code uses a ivec2 to send two booleans, I may implement a hacky flag system to only send a single integer (GLSL doesn't suport bitwise operations but I only have to track two bits, so the values 0, 1, 2 and 3).

In all cases, as I said I found it very annoying that we can only send a single integer and not vector integer while we can send single float and float vector, so having this implemented makes the engine easier to deal with when implementing stuff.

@slipher
Copy link
Member

slipher commented Feb 13, 2024

We should add this if/when it is actually used by something.

That said, please don't add more stuff to q_shared.h. The definitions in question are only relevant to renderer so it could go in e.g. tr_local.h

@illwieckz
Copy link
Member Author

We should add this if/when it is actually used by something.

I usually agree with this, but here, definitely not.

Those are generic types, it's a library. I faced the need of this more than once in the last years, and I always had to rely on alternative solutions. This is even needed when not merging code that use it, because this is needed to prototype implementations and evaluate solutions. That's why I gave the example that maybe some legit usage of ivec2 may be one day avoided by using hacky optimizations and hacky flag emulation, but this never removes the need for ivec2 for the canonical implementation. Even if no one code is merged using this, we need this library code to prototype solutions. I always have to rely on float vectors when I write temporary code to evaluate something, even if each time I did that in the past, after multiple steps of optimizations and collapsing I ended up removing the need for an ivec. I'm fed-up of using floats as booleans when implementing exhaustive implementations prior to simplification, or to copypaste 3 or 4 times all the boiler plate code in 3 or 4 files for every integer, while an ivec allows to only do the boilerplate copypaste once in 3 or 4 files.

That said, please don't add more stuff to q_shared.h. The definitions in question are only relevant to renderer so it could go in e.g. tr_local.h

This is doable, though this is a very basic generic type system, and the float vectors are defined in q_shared.h. In the future we may replace all the VectorCopy and other macros with type-safe functions, and then we would need to define all those types in the most global way. Implementing those types and replacing those macros with type-safe functions would even help a migration to something else, one day (like GLM).

This work is not renderer-specific, it's engine architecture and fundamental type implementation. It just happens that I need this in renderer first because I'm focusing on the renderer right now.

We did similar work for float vector types when I worked on the model code. This PR is anothere step on that same road. Last time for models, today for some rendering techniques, but it's generic ground work for the whole engine, not just the renderer.

@slipher
Copy link
Member

slipher commented Feb 13, 2024

If there are many cases in previously written code where it would have been useful, you should be able to find one of those and migrate it? I'm not saying you need to use every length of vector, but at least one of GLUniform2i/GLUniform3i/GLUniform4i/.

Anyway using foo = bar[3]; typedefs are hardly something we want to encourage greater usage of in the future as they are type-unsafe.

void f(vec3_t v);
void g() {
   vec2_t v{};
   f(v);
}

The above code compiles without protest because 'f''s declaration is equivalent to void f(float* v);

Furthermore if you're going to have a general-purpose integer vector typedef, it should at least be decent enough what size of integer it is using (like i16vec4_t or u16vec2_t defined in tr_local.h). The generic ivec naming only arguably makes sense in the specific context of the OpenGL uniform APIs which only have one integer type.

You might consider going without a typedef at all and doing it like SetValue(a, b, c).

@illwieckz
Copy link
Member Author

illwieckz commented Feb 13, 2024

Anyway using foo = bar[3]; typedefs are hardly something we want to encourage greater usage of in the future as they are type-unsafe.

void f(vec3_t v);
void g() {
   vec2_t v{};
   f(v);
}

The above code compiles without protest because 'f''s declaration is equivalent to void f(float* v);

Furthermore if you're going to have a general-purpose integer vector typedef, it should at least be decent enough what size of integer it is using (like i16vec4_t or u16vec2_t defined in tr_local.h). The generic ivec naming only arguably makes sense in the specific context of the OpenGL uniform APIs which only have one integer type.

Right, immediately after having posted my comment I remember that what we had converted before was not vec2_t but f16vec2_t, so copypasting vec2_t for ivec2_t could not make it type-safe as our vec2_t implementation is not.

So right now I looked how to implement a vec_t the same way we did with f16vec2_t, I started with this:

struct vec_t {
	operator float() const { return bits; };

	float bits;
};

using vec2_t = vec_t[2];
using vec3_t = vec_t[3];
using vec4_t = vec_t[4];

But then I struggle to implement the vec_t() cast for float to allow us to do vec_t v = 1.0;

In the end I would like to replace the Vector2Copy/VectorCopy/Vector4Copy macros with something like:

inline void VectorCopy( const vec2_t in, vec2_t out )
{
	out[ 0 ] = in[ 0 ];
	out[ 1 ] = in[ 1 ];
}

inline void VectorCopy( const vec3_t in, vec3_t out )
{
	out[ 0 ] = in[ 0 ];
	out[ 1 ] = in[ 1 ];
	out[ 2 ] = in[ 2 ];
}

inline void VectorCopy( const vec4_t in, vec4_t out )
{
	out[ 0 ] = in[ 0 ];
	out[ 1 ] = in[ 1 ];
	out[ 2 ] = in[ 2 ];
	out[ 3 ] = in[ 3 ];
}

#define Vector2Copy VectorCopy
#define Vector4Copy VectorCopy

And the same with all the other vector macro (subtraction, addition…).

@illwieckz
Copy link
Member Author

illwieckz commented Feb 13, 2024

(I edited my previous comment as some wording was saying the contrary of what I meant).

@slipher
Copy link
Member

slipher commented Feb 13, 2024

If you must use an array-like type for some reason you can use std::array<int32_t, n>

@illwieckz
Copy link
Member Author

illwieckz commented Feb 13, 2024

If there are many cases in previously written code where it would have been useful, you should be able to find one of those and migrate it?

Everytime I faced the need myself I went up with other implementations anyway, and as I said, most of the time the ivec disappeared after multiple iteration in development, but this is luck. For example by implementing a compute in GLSL for every pixel, one may read the canonical documentation for the feature to implement and need an ivec3 for three numbers. This is the first iteration. This is a direct translation from a generic math algorithm to GLSL. Then later it is discovered the result of the computation using the ivec3 can be precomputed once for all in the C++ code, instead of doing it it per pixel. So in the next development iteration, the compute on the ivec3 is done in some C++ file and a single integer for the result is sent to be used by GLSL.

The removal of the need for sending the ivec3 then becomes an optimization. It is not guaranteed that any ivec3 computation we may want to do will be exportable to C++ instead in a way only a single integer result is needed. This is just luck if it happened before.

And in all cases, this is very annoying to have to think about the optimization at the same time the naive implementation is done. I want to be able to do naive implementation first, to validate they work, then to think about optimization to save the amount of variable we pass to GLSL and to reduce the GLSL code.

Many work we merged without using any ivec2/ivec3 were optimized code from which the naive implementation in the first iteration were using ivec2/ivec3, so I had to rely on float vectors as booleans or to copy paste many single-integer variable sending code, just to deleted all of that before submitting the PR when everything was optimized after 3 or 4 optimization passes.

One good example I have in mind is some existing code using some USE_ macro to select this or that compute, my first iterations for them were using many booleans in early stage of development to focus on the actual verification of the expected result, rather than looking for performance. I postponed the implementation of macro exclusion and all the related mental load for later, for after the computation was verified to be correct. In those intermediary steps I had to rely on boolean as floats, or copy-pasting many useless lines.

@illwieckz
Copy link
Member Author

illwieckz commented Feb 13, 2024

If you must use an array-like type for some reason you can use std::array<int32_t, n>

Ah, I forgot about std_array. Is there a reason why we did not used it for f16vec2_t?

I'm trying to convert vec_t first, then my ivec_t implementation would do the same once I'm sure I have an implementation that works for all our vec_t use case. My idea is to not implement yet another vector library (that would require to port all our existing code) but to implement a type-safe drop-in implementation that requires to rewrite none of the existing code, by providing the exact same syntax, but in a type-safe way.

I did that:

using vec_t = float;
using vec2_t = std::array<vec_t, 2>;
using vec3_t = std::array<vec_t, 3>;
using vec4_t = std::array<vec_t, 4>;

I'm now getting this:

In file included from Unvanquished/daemon/src/common/Common.h:35:
Unvanquished/daemon/src/engine/qcommon/q_shared.h: In function ‘__m128 sseLoadVec3(vec3_t)’:
Unvanquished/daemon/src/engine/qcommon/q_shared.h:1122:38: error: invalid cast from type ‘const vec3_t’ {aka ‘const std::array<float, 3>’} to type ‘__m64*’
 1122 |                 v = _mm_loadl_pi( v, (__m64 *)vec );
      |                                      ^~~~~~~~~~~~
Unvanquished/daemon/src/engine/qcommon/q_shared.h: In function ‘__m128 sseLoadVec3Unsafe(vec3_t)’:
Unvanquished/daemon/src/engine/qcommon/q_shared.h:1127:38: error: cannot convert ‘const vec3_t’ {aka ‘const std::array<float, 3>’} to ‘const float*’
 1127 |                 return _mm_loadu_ps( vec );
      |                                      ^~~
      |                                      |
      |                                      const vec3_t {aka const std::array<float, 3>}
In file included from Unvanquished/daemon/src/common/Platform.h:68,
                 from Unvanquished/daemon/src/engine/qcommon/q_shared.h:156:
/usr/lib/gcc/x86_64-linux-gnu/13/include/xmmintrin.h:938:28: note:   initializing argument 1 of ‘__m128 _mm_loadu_ps(const float*)’
  938 | _mm_loadu_ps (float const *__P)
      |               ~~~~~~~~~~~~~^~~
Unvanquished/daemon/src/engine/qcommon/q_shared.h: In function ‘void sseStoreVec3(__m128, vec3_t)’:
Unvanquished/daemon/src/engine/qcommon/q_shared.h:1130:32: error: invalid cast from type ‘vec3_t’ {aka ‘std::array<float, 3>’} to type ‘__m64*’
 1130 |                 _mm_storel_pi( (__m64 *)out, in );
      |   

Any idea to solve those cast issues?

@illwieckz
Copy link
Member Author

With help from @DolceTriade on the chat, I did this:

diff --git a/src/engine/qcommon/q_shared.h b/src/engine/qcommon/q_shared.h
index e27ab1832..a191b3398 100644
--- a/src/engine/qcommon/q_shared.h
+++ b/src/engine/qcommon/q_shared.h
@@ -226,10 +226,9 @@ void  Com_Free_Aligned( void *ptr );
 	*/
 
 	using vec_t = float;
-	using vec2_t = vec_t[2];
-
-	using vec3_t = vec_t[3];
-	using vec4_t = vec_t[4];
+	using vec2_t = std::array<vec_t, 2>;
+	using vec3_t = std::array<vec_t, 3>;
+	using vec4_t = std::array<vec_t, 4>;
 
 	using axis_t = vec3_t[3];
 	using matrix3x3_t = vec_t[3 * 3];
@@ -1120,15 +1119,15 @@ inline float DotProduct( const vec3_t x, const vec3_t y )
 	inline __m128 sseLoadVec3( const vec3_t vec ) {
 		__m128 v = _mm_load_ss( &vec[ 2 ] );
 		v = sseSwizzle( v, YYXY );
-		v = _mm_loadl_pi( v, (__m64 *)vec );
+		v = _mm_loadl_pi( v, (__m64 *) &vec[0] );
 		return v;
 	}
 	ATTRIBUTE_NO_SANITIZE_ADDRESS inline __m128 sseLoadVec3Unsafe( const vec3_t vec ) {
 		// Returns garbage in 4th element
-		return _mm_loadu_ps( vec );
+		return _mm_loadu_ps( &vec[0] );
 	}
 	inline void sseStoreVec3( __m128 in, vec3_t out ) {
-		_mm_storel_pi( (__m64 *)out, in );
+		_mm_storel_pi( (__m64 *) &out[0], in );
 		__m128 v = sseSwizzle( in, ZZZZ );
 		_mm_store_ss( &out[ 2 ], v );
 	}

This solved the first errors, but then I face many other errors, I may need to port many stray float* to vec3_t/vec4_t, which is find, but then I have to descend to hell to save the poor souls, like that:

 bool CM_PlaneFromPoints( vec4_t plane, const vec3_t a, const vec3_t b, const vec3_t c )
 {
-       vec3_t d1, d2;
+       vec3_t d1, d2, d3;
 
        VectorSubtract( b, a, d1 );
        VectorSubtract( c, a, d2 );
-       CrossProduct( d2, d1, plane );
 
-       if ( VectorNormalize( plane ) == 0 )
+       CrossProduct( d2, d1, d3 );
+
+       if ( VectorNormalize( d3 ) == 0 )
        {
                return false;
        }
 
-       plane[ 3 ] = DotProduct( a, plane );
+       plane[ 0 ] = d3[ 0 ];
+       plane[ 1 ] = d3[ 1 ];
+       plane[ 2 ] = d3[ 2 ];
+       plane[ 3 ] = DotProduct( a, d3 );
        return true;
 }

It looks like implementing a type-safe vec_t implementation would require to port the engine to vec_t first. 🤯️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants