Correct floor and ceil from truncation! #141

Open
p0nce opened this issue Jul 19, 2024 · 4 comments

Comments

@p0nce
Collaborator

p0nce commented Jul 19, 2024

Do these actually work, and are they correct? Found in stb_image_resize2.h.
If so, they could be faster than changing the rounding mode.


@p0nce
Collaborator Author

p0nce commented Aug 13, 2024

  • exhaustive testing in float32

@p0nce
Collaborator Author

p0nce commented Oct 18, 2024

  • Finite numbers from -2e9 to 2e9 => OK. Beyond that range rounding makes no sense for a float anyway: every float32 with magnitude 2^23 or more is already an integer.
  • Denormals => OK
  • Preserve NaN-ness (warning: libc mandates returning the same NaN), test
  • -0 and +0, should preserve sign, test

For reference, libc floorf and ceilf return x itself if x is integral, +0, -0, NaN, or infinite.
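
The sign-of-zero and denormal items can be probed by looking at raw bits rather than values, since `-0.0f == 0.0f` compares true. Below is a small sketch in C with SSE2 intrinsics; `trunc_ceilf`, `float_bits` and `float_from_bits` are hypothetical helper names, and `trunc_ceilf` is a copy of the truncation-based ceil under discussion.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Truncation-based ceil; only meant for finite x in [-2e9, 2e9]. */
static float trunc_ceilf(float x)
{
    assert(-2e9 <= x && x <= 2e9);
    __m128 f = _mm_set_ss(x);
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f));  /* truncate toward zero */
    __m128 r = _mm_add_ss(t, _mm_and_ps(_mm_cmplt_ss(t, f), _mm_set_ss(1.0f)));
    return _mm_cvtss_f32(r);
}

/* Raw bit pattern of a float, so that -0.0f and +0.0f can be told apart. */
static uint32_t float_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a float from a raw bit pattern, e.g. 0x00000001 for the
   smallest positive denormal. */
static float float_from_bits(uint32_t u)
{
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}
```

On x86 with SSE2, `float_bits(trunc_ceilf(-0.0f))` comes back as `0x00000000` (that is, +0), because the round trip through a 32-bit integer drops the sign of zero, whereas libc `ceilf` returns -0 there. The smallest positive denormal, `float_from_bits(0x00000001u)`, rounds up to 1.0f as expected.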

@p0nce
Collaborator Author

p0nce commented Oct 18, 2024

Complete description here:

// Spec differences with ceilf:
// - May return nonsensical values for -inf and +inf (not seen on x86_64 though)
// - Doesn't preserve the sign of -0.0
// - Doesn't preserve NaN payload, does preserve being NaN
// Works similarly to ceilf on the [-2e9, 2e9] range, including denormals.
float testceilf(float x)  // martins ceilf
{
    assert(-2e9 <= x && x <= 2e9);
    __m128 f = _mm_set_ss(x);
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f));
    __m128 r = _mm_add_ss(t, _mm_and_ps(_mm_cmplt_ss(t, f), _mm_set_ss(1.0f)));
    return _mm_cvtss_f32(r);
}

// Spec differences with floorf:
// - Returns nonsensical values for -inf and +inf
// - Doesn't preserve the sign of -0.0
// - Doesn't preserve NaN payload, does preserve being NaN
// Works similarly to floorf on the [-2e9, 2e9] range, including denormals.
float testfloorf(float x)  // martins floorf
{
    assert(-2e9 <= x && x <= 2e9);
    __m128 f = _mm_set_ss(x);
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f));
    __m128 r = _mm_add_ss(t, _mm_and_ps(_mm_cmplt_ss(f, t), _mm_set_ss(-1.0f)));
    return _mm_cvtss_f32(r);
}

@p0nce
Collaborator Author

p0nce commented Oct 18, 2024

  • I wonder how it fares on arm64 => people report it working, so it could be used to emulate _mm_floor_ps and friends
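
To illustrate the emulation idea, here is a sketch of a packed, 4-lane variant of the same trick, written in C with SSE2 intrinsics only. The names `trunc_floor_ps` and `trunc_ceil_ps` are hypothetical, and the scalar caveats are assumed to carry over per lane: each lane should be finite and within [-2e9, 2e9], and the sign of zero is not preserved.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* 4-lane floor built from truncation: subtract 1 in lanes where
   truncation rounded up (negative non-integers). */
static __m128 trunc_floor_ps(__m128 f)
{
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f));  /* truncate toward zero */
    __m128 fix = _mm_and_ps(_mm_cmplt_ps(f, t), _mm_set1_ps(1.0f));
    return _mm_sub_ps(t, fix);
}

/* 4-lane ceil built from truncation: add 1 in lanes where
   truncation rounded down (positive non-integers). */
static __m128 trunc_ceil_ps(__m128 f)
{
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f));  /* truncate toward zero */
    __m128 fix = _mm_and_ps(_mm_cmplt_ps(t, f), _mm_set1_ps(1.0f));
    return _mm_add_ps(t, fix);
}
```

The comparison produces an all-ones mask in lanes that need a correction, so the AND with 1.0f yields either 1.0f or +0.0f per lane, and no branch is needed.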
