Skip to content

Commit

Permalink
Add support for allow-precision-loss in decimal operations (facebooki…
Browse files Browse the repository at this point in the history
…ncubator#10383)

Summary:
Each of the decimal operation functions is registered as two functions such as `add_deny_precision_loss` and `add`.
When allowing precision loss, establishing the result type of an arithmetic operation happens according to Hive behavior and SQL ANSI 2011 specification, i.e. rounding the decimal part of the result if an exact representation is not possible. Otherwise, NULL is returned in those cases, as previously.
When not allowing precision loss, not rounding the decimal part.

For example,
  | decimal(38, 7) + decimal(10, 0) result type | 1.1232154   + 1| decimal(38, 18) * decimal(38, 18)| 0.1234567891011 * 1234.1
-- | -- | -- | -- | --
allow   precision loss | decimal(38, 6) | 2.123215 | decimal(38, 6) | 152.358023
deny precision   loss | decimal(38, 7) | 2.1232154 | decimal(38, 36) | NULL

```
spark-sql (default)> set spark.sql.decimalOperations.allowPrecisionLoss=true;

spark-sql (default)> select cast(0.1234567891011 as decimal(38, 18)) * cast(1234.1 as decimal(38, 18));
152.358023

spark-sql (default)> set spark.sql.decimalOperations.allowPrecisionLoss=false;

spark-sql (default)> select cast(0.1234567891011 as decimal(38, 18)) * cast(1234.1 as decimal(38, 18));
NULL
```

Spark implementation: https://github.com/apache/spark/blob/branch-3.5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L814

Pull Request resolved: facebookincubator#10383

Reviewed By: pedroerp

Differential Revision: D65612198

Pulled By: kevinwilfong

fbshipit-source-id: 4910aaeb0e375dbe8817c5f3fb41185c67c6dd5b
  • Loading branch information
jinchengchenghh authored and facebook-github-bot committed Nov 12, 2024
1 parent fcce674 commit 3884939
Show file tree
Hide file tree
Showing 5 changed files with 332 additions and 68 deletions.
90 changes: 78 additions & 12 deletions velox/docs/functions/spark/decimal.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,49 +2,115 @@
Decimal Operators
=================

When calculating the result precision and scale of arithmetic operators,
the formulas follow Hive which is based on the SQL standard and MS SQL:
The result precision and scale computation of arithmetic operators contains two stages.
First stage computes precision and scale using formulas based on the SQL standard and Hive when allow-precision-loss is true.
The result may exceed maximum allowed precision of 38.

Second stage caps precision at 38 and either reduces the scale or not depending on allow-precision-loss flag.

For example, addition of decimal(38, 7) and decimal(10, 0) requires precision of 39 and scale of 7.
Since precision exceeds 38 it needs to be capped. When allow-precision-loss, precision is capped at 38 and scale is reduced by 1 to 6.
When allow-precision-loss is false, precision is capped at 38 as well, but scale is kept at 7.
With allow-precision-loss all additions will succeed, but accuracy (number of digits after period) of some operations will be reduced.
Without allow-precision-loss, some additions will return NULL.

For example,

The following queries keep accuracy or return NULL when allow-precision-loss is false:

::

select cast('1.1232154' as decimal(38, 7)) + cast('1' as decimal(10, 0)); -- 2.123215
select cast('9999999999999999999999999999999.2345678' as decimal(38, 7)) + cast('1' as decimal(10, 0)); -- NULL

These same operations succeed when allow-precision-loss is true:

::

select cast('1.1232154' as decimal(38, 7)) + cast('1' as decimal(10, 0)); -- 2.12321, lost the last digit
select cast('9999999999999999999999999999999.2345678' as decimal(38, 7)) + cast('1' as decimal(10, 0)); -- 10000000000000000000000000000000.234568

Decimal Precision and Scale Computation Formulas
------------------------------------------------

The HiveQL behavior:

https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf

https://msdn.microsoft.com/en-us/library/ms190476.aspx
Additionally, the computation of decimal division adapts to the allow-precision-loss flag,
while the decimal addition, subtraction, and multiplication do not.

Addition and Subtraction
------------------------
~~~~~~~~~~~~~~~~~~~~~~~~

::

p = max(p1 - s1, p2 - s2) + max(s1, s2) + 1
s = max(s1, s2)

Multiplication
--------------
~~~~~~~~~~~~~~

::

p = p1 + p2 + 1
s = s1 + s2

Division
--------
~~~~~~~~
When allow-precision-loss is true:

::

p = p1 - s1 + s2 + max(6, s1 + p2 + 1)
s = max(6, s1 + p2 + 1)

For above arithmetic operators, when the precision of result exceeds 38,
caps p at 38 and reduces the scale, in order to prevent the truncation of
the integer part of the decimals. Below formula illustrates how the result
precision and scale are adjusted.
When allow-precision-loss is false:

::

wholeDigits = min(38, p1 - s1 + s2);
fractionalDigits = min(38, max(6, s1 + p2 + 1));
p = wholeDigits + fractionalDigits
s = fractionalDigits

Decimal Precision and Scale Adjustment
--------------------------------------

When allow-precision-loss is true, rounds the decimal part of the result if an exact representation is not possible.
Otherwise, returns NULL.
Notice: some operations succeed if precision loss is allowed and return NULL if not.

For example,

::

select cast(0.1234567891011 as decimal(38, 18)) * cast(1234.1 as decimal(38, 18));
-- 152.358023 if allow-precision-loss, NULL otherwise.

Below formula illustrates how the result precision and scale are adjusted.

::

precision = 38
scale = max(38 - (p - s), min(s, 6))

Users experience runtime errors when the actual result cannot be represented
with the calculated decimal type.
When precision loss is not allowed, caps p at 38, and keeps scale as is.
The below formula shows how the precision and scale are adjusted for decimal addition, subtraction, and multiplication.

::

precision = 38
scale = min(38, s)

Decimal division uses a different formula:

::

precision = 38
scale = fractionalDigits - (wholeDigits + fractionalDigits - 38) / 2 - 1

Returns NULL when the actual result cannot be represented with the calculated decimal type.

Decimal Functions
-----------------
Expand Down
Loading

0 comments on commit 3884939

Please sign in to comment.