Numbers with small mantissa widths #866

Open
mnieper opened this issue Aug 26, 2024 · 12 comments

Comments

@mnieper
Contributor

mnieper commented Aug 26, 2024

Page 16 of the R6RS says that x|p represents the best binary floating-point approximation of x using a p-bit significand. Chez Scheme does not seem to comply with it. For example, at the REPL:

> (exact 0.1|1)
3602879701896397/36028797018963968

The denominator is the value of (expt 2 55). However, the best floating-point approximation of 0.1 with a mantissa width of 1 bit is itself a power of two, so its denominator should be a very small power of two.
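
The arithmetic behind this claim can be checked with exact rationals. The following is only a sketch and assumes that a 1-bit significand with an unbounded exponent range can represent exactly the powers of two, so the two candidates around 1/10 are 1/16 and 1/8.

```scheme
(import (rnrs))

;; The denominator reported at the REPL is exactly 2^55.
(= (expt 2 55) 36028797018963968)      ; => #t

;; Distance from 1/10 to its two neighbouring powers of two.
(- 1/10 1/16)                          ; => 3/80
(- 1/8 1/10)                           ; => 1/40

;; 1/8 is the closer neighbour, so it is the best 1-bit approximation.
(< (- 1/8 1/10) (- 1/10 1/16))         ; => #t
```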

@burgerrg
Contributor

From https://www.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_sec_4.2.8:

A representation of a number object with nonempty mantissa width, x|p, represents the best binary floating-point approximation of x using a p-bit significand. For example, 1.1|53 is a representation of the best approximation of 1.1 in IEEE double precision. If x is an external representation of an inexact real number object that contains no vertical bar, then its numerical value should be computed as though it had a mantissa width of 53 or more.

Implementations that use binary floating-point representations of real number objects should represent x|p using a p-bit significand if practical, or by a greater precision if a p-bit significand is not practical, or by the largest available precision if p or more bits of significand are not practical within the implementation.

Chez Scheme uses the precision of IEEE double-precision floating-point numbers.

According to https://scheme.com/tspl4/objects.html#./objects:s76,

A mantissa width |w may appear as the suffix of a real number or the real components of a complex number written in floating-point or scientific notation. The mantissa width w represents the number of significant bits in the representation of the number. The mantissa width defaults to 53, the number of significant bits in a normalized IEEE double floating-point number, or more. For denormalized IEEE double floating-point numbers, the mantissa width is less than 53. If an implementation cannot represent a number with the mantissa width specified, it uses a representation with at least as many significant bits as requested if possible, otherwise it uses its representation with the largest mantissa width.

@mnieper
Contributor Author

mnieper commented Aug 26, 2024

From https://www.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_sec_4.2.8:

A representation of a number object with nonempty mantissa width, x|p, represents the best binary floating-point approximation of x using a p-bit significand. For example, 1.1|53 is a representation of the best approximation of 1.1 in IEEE double precision. If x is an external representation of an inexact real number object that contains no vertical bar, then its numerical value should be computed as though it had a mantissa width of 53 or more.
Implementations that use binary floating-point representations of real number objects should represent x|p using a p-bit significand if practical, or by a greater precision if a p-bit significand is not practical, or by the largest available precision if p or more bits of significand are not practical within the implementation.

This is consistent with what I reported. The binary floating-point number denoted by 0.1|1 can be represented by an IEEE float (whose mantissa will have at most one bit set). We have two uses of the word representation in the standard: an abstract binary floating-point number is represented (denoted) by lexical syntax; this binary floating-point number, in turn, is represented by, in the case of Chez Scheme, an IEEE double-precision floating-point number.

The issue is relevant for accurate writing and reading of inexact numbers (which has a great tradition in Scheme 😉) across Schemes with varying precisions. A Scheme with a mantissa width of one could write 0.125 as 0.1|1. A Scheme with a mantissa width greater than one would not read it correctly if it ignored the given mantissa width. (Of course, a Scheme with a mantissa width of one is not a realistic assumption; this assumption is just for demonstration. The principle, however, remains the same when exchanging data between a Scheme that uses single-precision IEEE floats and a Scheme using double-precision IEEE floats.)
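
The single-precision versus double-precision case can be illustrated with exact rationals. This is a sketch under stated assumptions: the names `single-tenth` and `double-tenth` are made up for the example, and the scenario of a single-precision Scheme writing its value as 0.1|24 is hypothetical.

```scheme
(import (rnrs))

;; 0.1 rounded to a 24-bit significand (IEEE single precision): 13421773 / 2^27.
(define single-tenth 13421773/134217728)

;; 0.1 rounded to a 53-bit significand (IEEE double precision),
;; i.e. the value of (exact 0.1) reported earlier in this thread.
(define double-tenth 3602879701896397/36028797018963968)

;; The two are different numbers. A reader that drops the |24 and rounds
;; 0.1 to 53 bits would therefore not recover the value that was written.
(= single-tenth double-tenth)          ; => #f
```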

@burgerrg
Contributor

It looks as though s/strnum.ss discards the p of "x | p" on input, which is allowed by R6RS since it uses the largest available precision, that of the IEEE double float.

Regarding exchanging floating-point numbers between systems, I'd strongly recommend using a binary protocol like Google's protobuf. I've been burned many times when using base-10 representations. Accurately printing and reading floating-point numbers is difficult, and I've encountered many libraries that don't get all the details correct.

@burgerrg
Contributor

burgerrg commented Aug 26, 2024

If you'd like to implement what you're proposing, you can find the code that converts exact numbers to floating-point in S_floatify of c/number.c.

@mnieper
Contributor Author

mnieper commented Aug 27, 2024

> It looks as though s/strnum.ss discards the p of "x | p" on input, which is allowed by R6RS since it uses the largest available precision, that of the IEEE double float.
>
> Regarding exchanging floating-point numbers between systems, I'd strongly recommend using a binary protocol like Google's protobuf. I've been burned many times when using base-10 representations. Accurately printing and reading floating-point numbers is difficult, and I've encountered many libraries that don't get all the details correct.

I am more concerned about the correctness of Chez Scheme with respect to R6RS. If a Scheme implementation were allowed to simply drop the mantissa width on reading, the mantissa width would be mostly meaningless. I am going to try to reach the authors of SRFI 77, which became part of R6RS.

> If you'd like to implement what you're proposing, you can find the code that converts exact numbers to floating-point in S_floatify of c/number.c.

Thank you for pointing this out to me. It seems to me that lines like

(s (if i? (inexact n) n))))

need to truncate the exact number according to the mantissa width before passing it to inexact, so that the code in c/number.c won't have to be touched.
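
A minimal sketch of that idea, written as standalone R6RS code rather than as a patch to s/strnum.ss: `round-to-significand` and `finish-real` are hypothetical names, not procedures from Chez Scheme. The sketch rounds to nearest (ties to even) rather than truncating, since the report asks for the best approximation, and it assumes an unbounded exponent range, so subnormals and overflow are ignored.

```scheme
(import (rnrs))

;; Hypothetical helper: best approximation of the exact rational x
;; whose significand fits in p bits (p a positive exact integer).
(define (round-to-significand x p)
  (cond
    [(zero? x) 0]
    [(negative? x) (- (round-to-significand (- x) p))]
    [else
     (let ([lo (expt 2 (- p 1))]
           [hi (expt 2 p)])
       ;; Invariant: x = scaled * unit, where unit is a power of two.
       (let loop ([scaled x] [unit 1])
         (cond
           [(< scaled lo) (loop (* scaled 2) (/ unit 2))]
           [(>= scaled hi) (loop (/ scaled 2) (* unit 2))]
           ;; lo <= scaled < hi: round to an integer significand.
           ;; round on exact rationals rounds ties to even.
           [else (* (round scaled) unit)])))]))

;; Hypothetical stand-in for the final step of the reader:
;; n is the exact value of the digits, p is the mantissa width or #f,
;; i? says whether the result should be inexact.
(define (finish-real n p i?)
  (let ([n (if p (round-to-significand n p) n)])
    (if i? (inexact n) n)))

(finish-real 1/10 1 #f)    ; => 1/8
(finish-real 1/10 1 #t)    ; => 0.125
(finish-real 1/10 #f #f)   ; => 1/10
(finish-real 1/10 53 #f)   ; => 3602879701896397/36028797018963968
```

Because the rounding happens on the exact value before `inexact` is called, the conversion to a double in this sketch never sees more than p significant bits whenever p is at most 53.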

@mnieper
Contributor Author

mnieper commented Aug 28, 2024

Here is the thread I opened on the SRFI 77 mailing list: https://srfi-email.schemers.org/srfi-77/threads/2024/08/

Independently, it occurred to me that we can simplify the discussion about the meaning of 0.1|1 by prefixing it with #e. In this case, it is an exact number, so the actual floating-point format used by Chez should not matter.

I expect

> #e0.1|1
1/8

but I get

> #e0.1|1
1/10

@burgerrg
Contributor

What would your proposal return for #e0.1|53?

@mnieper
Contributor Author

mnieper commented Aug 28, 2024

> What would your proposal return for #e0.1|53?

The same as for (exact 0.1) (in a Scheme like Chez that approximates 0.1 by an IEEE double).

> #e0.1|53
3602879701896397/36028797018963968
> (exact 0.1)
3602879701896397/36028797018963968

@burgerrg
Contributor

But #e0.1 is 1/10?

@mnieper
Contributor Author

mnieper commented Aug 28, 2024

> But #e0.1 is 1/10?

Yes. When there is no explicit mantissa width, nothing changes compared to the current behaviour of Chez Scheme. (Virtually no code uses explicit mantissa widths, so this is an important point.)

This is consistent with R6RS, which demands:

If x is an external representation of an inexact real number object that contains no vertical bar, then its numerical value should be computed as though it had a mantissa width of 53 or more.

Chez Scheme already computes inexact real number objects using a mantissa width of 53 (barring subnormal numbers); exact numbers are not subject to any default mantissa width.

@burgerrg
Contributor

So you're proposing that the vertical bar has higher precedence than the #e prefix. If parentheses could be used, we would have #e(0.1|1). Does the standard say anything about the interaction of #e and |?

@mnieper
Contributor Author

mnieper commented Aug 29, 2024

> So you're proposing that the vertical bar has higher precedence than the #e prefix. If parentheses could be used, we would have #e(0.1|1). Does the standard say anything about the interaction of #e and |?

Yes, it does. Let me quote the third paragraph of Section 4.2.8 of R6RS:

A representation of a number object may be specified to be either exact or inexact by a prefix. The prefixes are #e for exact, and #i for inexact. An exactness prefix may appear before or after any radix prefix that is used. If the representation of a number object has no exactness prefix, the constant is inexact if it contains a decimal point, an exponent, or a nonempty mantissa width; otherwise it is exact.

It follows from that paragraph that #e0.1|1 represents an exact number, notwithstanding the decimal point or the | symbol. Because the number is exact, the Scheme implementation has to represent the mathematical number denoted by 0.1|1 exactly (by an exact number object). The mathematical number denoted by 0.1|1 is "the best binary floating-point approximation of x using a p-bit significand", which is the number 0.125 in usual notation.
