
Incorrect representation with Schubfach? #13

Open
blueglyph opened this issue Mar 9, 2023 · 4 comments

Comments

@blueglyph

With a double-precision value corresponding to the minimum value (hex representation 0x0000000000000001),

  • I'm expecting this string: "4.9E-324"
  • I get this string: "5e-324"

The expected string comes from the algorithm author Raffaello Giulietti's own Java implementation.

I haven't run a more thorough test to check for other discrepancies. Most of the results are definitely fine, but finding differing values, and seeing that the algorithm has been adapted, is not reassuring. Perhaps small adaptations were made to the original algorithm? There are several versions of the article.

Quick and dirty test file (requires schubfach_64.cpp and schubfach_64.h)

#include <cstdint>
#include <cstdio>
#include <cstring>
#include "schubfach_64.h"

using namespace std;
using namespace schubfach;

char BUFFER[DtoaMinBufferLength];

char *dtoa(double value)
{
    char *end = Dtoa(BUFFER, value);
    *end = 0;
    return BUFFER;
}

template <typename Dest, typename Source>
static inline Dest ReinterpretBits(Source source)
{
    static_assert(sizeof(Dest) == sizeof(Source), "size mismatch");

    Dest dest;
    std::memcpy(&dest, &source, sizeof(Source));
    return dest;
}

int main()
{
    uint64_t min_value_bits = 0x0000000000000001;
    double min_value = ReinterpretBits<double>(min_value_bits);
    double values[] = { 1.0, 10.0, 0.5, min_value };
    for (double i: values) {
        printf("%g: %s\n", i, dtoa(i));
    }
}
@abolz
Owner

abolz commented Mar 9, 2023

Java, C++ (std::to_chars), and JavaScript all have slightly different formatting specifications. The Java implementation requires printing two significant digits for the minimum double-precision value, although one significant digit is sufficient (i.e. 4.9e-324 and 5e-324 actually represent the same number). The algorithm implemented here always uses the shorter representation, so it does not conform to Java's specification. (C++ and JavaScript both require always printing the minimum number of digits.)

You can customize the current formatting procedure slightly by tweaking these constants

static constexpr int32_t MinFixedDecimalPoint = -6;
static constexpr int32_t MaxFixedDecimalPoint = 17;

IIRC JavaScript uses (-6, 21) and std::to_chars uses (-4, 6) here. I'm actually not sure why I chose 17 for the upper limit... but it might have something to do with the performance of the (partial) std::from_chars implementation here.

These constants do not affect the correctness of the algorithm. You can choose (almost) any pair of values here. Or you could even write your own formatting procedure if you like: The core algorithm gives you a number of the form N * 10^E.

@blueglyph
Author

I know about these values, but that's in FormatDigits(). The difference I see is already in the value returned by ToDecimal64().

In the Java implementation, toDecimal(int q, long c, int dk) yields s = 49, k+dk = -325 (the rounding to 50/-325 is not performed), and that is what is used to generate the decimal representation.

The C++ version does perform the rounding: with s = 4, a boolean (?) is added to yield s + roundup = 5, k = -324.
So the value dec = ToDecimal64(significand, exponent) is { digits: 5, exponent: -324 }.

I don't know the algorithm well enough to see why it differs, but I can probably trace where it starts to diverge. I'm not sure whether it's important or not.

@blueglyph
Author

I doubt that it changes anything about the conclusion for your project, though; it's only a small difference in output for the smallest value. 😉

@blueglyph
Author

Actually, I think it's because of this particular mode in the Java implementation, so maybe it's not an issue, or it's something that could be added if someone requires the extra digit:

In DoubleDecimal.java, toDecimal(double v), line 341

                // subnormal value
                return t < C_TINY
                       ? toDecimal(Q_MIN, 10 * t, -1)
                       : toDecimal(Q_MIN, t, 0);

with C_TINY = 3, and for this value, t = 1
