Correct floor and ceil from truncation! #141

p0nce · 2024-07-19T10:42:00Z

Do those actually work, and are they correct? Found in stb_image_resize2.h.
If true, this could be faster than changing the rounding mode.

p0nce · 2024-08-13T14:05:29Z

Needs exhaustive testing in float32.

p0nce · 2024-10-18T09:34:35Z

Finite numbers from -2e9 to 2e9 => OK. Beyond that range rounding makes no sense for a float, since every float with magnitude above 2^23 is already integral.
Denormals => OK
Preserves NaN-ness (warning: libc mandates returning the same NaN) => to test
-0 and +0 should preserve their sign => to test

For reference, the libc spec for floorf and ceilf says: if x is integral, +0, -0, NaN, or infinite, x itself is returned.
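
If those special cases ever matter, one way to recover them is to wrap the truncation trick, sketched here in plain C. This is an assumption about how it could be done, not something taken from stb_image_resize2.h; `floorf_compat` and the two bit helpers are hypothetical names. The early return costs a branch, so it gives back some of the speed advantage.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t bits_from_float(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

static float float_from_bits(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Truncation-based floor that also returns x itself for NaN, infinities
   and large values, and keeps the sign of -0.0. */
static float floorf_compat(float x)
{
    /* Every float with magnitude >= 2^23 is integral, and NaN fails both
       comparisons, so all of those inputs are returned unchanged. */
    if (!(x > -8388608.0f && x < 8388608.0f))
        return x;

    float t = (float)(int32_t)x;        /* truncate toward zero */
    float r = (x < t) ? t - 1.0f : t;   /* step down for negative non-integers */

    /* floor never changes the sign of its argument, so copying the sign
       bit of x onto r restores -0.0. */
    uint32_t sign = bits_from_float(x) & 0x80000000u;
    return float_from_bits((bits_from_float(r) & 0x7FFFFFFFu) | sign);
}
```

The same wrapper shape works for ceil, because ceil also keeps the sign of its argument (ceil of -0.5 is -0).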

p0nce · 2024-10-18T09:42:57Z

Complete description here:

#include <assert.h>
#include <emmintrin.h> // SSE2

// Spec differences with ceilf:
// - May return nonsensical values for -inf and +inf (not seen on x86_64 though)
// - Doesn't preserve the sign of -0.0
// - Doesn't preserve NaN payload, does preserve being NaN
// Works similarly to ceilf on the [-2e9, 2e9] range, including denormals.
float testceilf(float x)  // martins ceilf
{
    assert(-2e9 <= x && x <= 2e9);
    __m128 f = _mm_set_ss(x);
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f)); // truncate toward zero
    // Add 1.0f when truncation landed below x (positive non-integers).
    __m128 r = _mm_add_ss(t, _mm_and_ps(_mm_cmplt_ss(t, f), _mm_set_ss(1.0f)));
    return _mm_cvtss_f32(r);
}

// Spec differences with floorf:
// - Returns nonsensical values for -inf and +inf
// - Doesn't preserve the sign of -0.0
// - Doesn't preserve NaN payload, does preserve being NaN
// Works similarly to floorf on the [-2e9, 2e9] range, including denormals.
float testfloorf(float x)  // martins floorf
{
    assert(-2e9 <= x && x <= 2e9);
    __m128 f = _mm_set_ss(x);
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f)); // truncate toward zero
    // Add -1.0f when truncation landed above x (negative non-integers).
    __m128 r = _mm_add_ss(t, _mm_and_ps(_mm_cmplt_ss(f, t), _mm_set_ss(-1.0f)));
    return _mm_cvtss_f32(r);
}

p0nce · 2024-10-18T09:43:46Z

I wonder how it fares on arm64 => people report that it works, so it could be used to emulate _mm_floor_ps and friends.
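
A rough sketch of the four-lane version, assuming the same [-2e9, 2e9] input range as the scalar functions above. `floor4_from_trunc` is a hypothetical name; the SSE2 path uses only SSE2 intrinsics, and the scalar fallback is there only so the sketch also compiles on targets without them.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define FLOOR4_USE_SSE2 1
#endif

/* Floor of four floats, each assumed to lie in [-2e9, 2e9]. */
static void floor4_from_trunc(const float in[4], float out[4])
{
#ifdef FLOOR4_USE_SSE2
    __m128 f = _mm_loadu_ps(in);
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(f)); /* truncate toward zero */
    /* Lanes where truncation landed above the input get -1.0f added. */
    __m128 adj = _mm_and_ps(_mm_cmplt_ps(f, t), _mm_set1_ps(-1.0f));
    _mm_storeu_ps(out, _mm_add_ps(t, adj));
#else
    for (int i = 0; i < 4; ++i) {
        float t = (float)(int32_t)in[i];
        out[i] = (in[i] < t) ? t - 1.0f : t;
    }
#endif
}
```

Swapping the comparison operands and adding 1.0f instead gives the matching ceil.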

p0nce added enhancement perf labels Oct 18, 2024
