
Split change outputs with asset quantities exceeding the maximum #2536

Merged
merged 17 commits into master from jonathanknowles/upper-bound-for-change-outputs on Feb 27, 2021

Conversation


@jonathanknowles commented Feb 25, 2021

Issue Number

#2532
ADP-726

Overview

Although quantities of individual assets are effectively unlimited, a transaction on the blockchain can never include an asset quantity greater than maxBound :: Word64.

This PR tweaks the coin selection algorithm to detect change bundles containing excessively large asset quantities.

If such a change bundle is detected, we now split the change bundle up into smaller bundles using equipartitioning.

Example

Let's suppose the maximum allowable token quantity is 5, and we have a change map with the following quantities:

[("a", 11), ("b", 5), ("c", 12)]

In this case, we must divide the map into at least three smaller maps in order to not exceed the maximum allowable token quantity.

Under the equipartitioning scheme, this would give us:

[("a", 3), ("b", 1), ("c", 4)]
[("a", 4), ("b", 2), ("c", 4)]
[("a", 4), ("b", 2), ("c", 4)]

Note that while the overall sum is preserved, the individual bundles are almost equal, but not quite: this is because 11 and 5 are not divisible by 3. We must therefore accept a small loss of proportionality in the result.
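The per-quantity splitting behaviour can be sketched as a tiny standalone function (an illustrative helper, not the wallet's actual code): dividing a quantity into n parts preserves the sum, and the parts pairwise differ by at most one.

```haskell
import Numeric.Natural (Natural)

-- Illustrative sketch (not the wallet's actual implementation):
-- split a quantity into n parts whose sum equals the original and
-- whose values pairwise differ by at most one.
equipartitionNatural :: Natural -> Int -> [Natural]
equipartitionNatural x n = replicate r (q + 1) ++ replicate (n - r) q
  where
    (q, remainder) = x `divMod` fromIntegral n
    r = fromIntegral remainder
```

Applying this to each quantity above gives 11 → [4,4,3], 5 → [2,2,1], and 12 → [4,4,4], which matches the example maps modulo ordering.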

Details

An equipartition of a bundle b is a partition into multiple bundles, where for every asset a in the set of assets contained in b, the difference between the following quantities is either zero or one:

  • The smallest quantity of asset a in the resultant bundles
  • The greatest quantity of asset a in the resultant bundles

In order to determine the number of parts in which to split a given bundle, we choose the smallest number of parts that still allows us to satisfy the goal of not exceeding the maximum allowable quantity in any given bundle.
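Concretely, the smallest sufficient part count is the ceiling of the largest current quantity divided by the maximum allowable quantity. A hypothetical helper mirroring that calculation (not the wallet's code):

```haskell
import Data.Ratio ((%))

-- Hypothetical helper: the smallest number of parts such that no part
-- needs to hold a quantity greater than maxAllowed.
requiredPartCount :: Integer -> Integer -> Int
requiredPartCount currentMax maxAllowed =
    max 1 (ceiling (currentMax % maxAllowed))
```

In the example above, the largest quantity is 12 and the maximum allowable quantity is 5, giving a ceiling of 12/5, i.e. 3 parts.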

Performance

In order to avoid evaluating a partition for every single change output, we short-circuit in the event that there is no token quantity greater than the maximum allowable quantity:

```haskell
equipartitionTokenMapWithMaxQuantity m (TokenQuantity maxQuantity)
    | maxQuantity == 0 =
        maxQuantityZeroError
    | currentMaxQuantity <= maxQuantity =
        m :| []
    | otherwise =
        equipartitionTokenMap m (() :| replicate extraPartCount ())
```

Testing

Property tests

Equipartitioning behaviour is tested by the following property tests:

  • prop_equipartitionTokenBundle*
  • prop_equipartitionTokenMap*

Unit tests

As a sanity check, this PR also provides unit tests for performSelection with inputs containing token quantities that are close to the maximum, and demonstrates that change bundles are correctly partitioned in the results.

@jonathanknowles force-pushed the jonathanknowles/upper-bound-for-change-outputs branch from d0daabd to 8e82d62 on February 25, 2021 11:07
@jonathanknowles self-assigned this Feb 25, 2021
@jonathanknowles changed the title Split change ouputs with asset quantities exceeding the maximum Split change outputs with asset quantities exceeding the maximum Feb 25, 2021
@jonathanknowles linked an issue Feb 25, 2021 that may be closed by this pull request
@jonathanknowles force-pushed the jonathanknowles/upper-bound-for-change-outputs branch 2 times, most recently from 2956042 to 93bd84b on February 26, 2021 02:53
@rvl left a comment:

This will work but it was confusing to review and seems like a lot of code for something which is basically this:

```haskell
example :: [(String, Natural)]
example = [("a", 11), ("b", 5), ("c", 12)]

-- >>> limitSize 5 example
-- [[("a",5),("b",5),("c",5)],[("a",5),("c",5)],[("a",1),("c",2)]]
limitSize :: (Traversable t, Integral n) => n -> [t n] -> [[t n]]
limitSize lim = group . map (fmap (splitNum lim))
  where
    group = L.transpose . map sequence

splitNum :: Integral n => n -> n -> [n]
splitNum q a = replicate (fromIntegral n) q ++ if r == 0 then [] else [r]
  where
    n = a `div` q
    r = a `mod` q
```

So is it possible to adjust this limitSize slightly so that it can be applied to TokenMap?

.hlint.yaml (review thread, resolved)
```haskell
-- ^ Represents the number of portions in which to partition the coin.
-> NonEmpty Coin
-- ^ The partitioned coins.
equipartitionCoin c =
```
Contributor

Given the max supply of ada, there's no need to apply this fix to Coin.

@jonathanknowles commented Feb 26, 2021

Given the max supply of ada, there's no need to apply this fix to Coin.

The point here is that we need to divide output coins together with change maps, so that we can retain proportionality. (Each change map is paired with an output coin, so that when we assign coins to change maps, we can assign them in proportion to the sizes of the coins in the original outputs.)

Let's say we need to partition the following token map into two:

  • TokenMap A:
    • Banana: 100
    • Apple: 10

This would give us two token maps with the following tokens:

  • TokenMap A1:
    • Banana: 50
    • Apple: 5
  • TokenMap A2:
    • Banana: 50
    • Apple: 5

However, the change generation algorithm pairs each change token map with a coin from the original outputs, so that when we eventually assign ada values to change maps (to make change bundles), we can assign them in proportion to the sizes of the ada in the original outputs.

So we also need to divide the coin value.

Let's say this particular change map was paired with an output coin value of 100. That will mean that each divided change map will be paired with a coin value of 50.

For the sake of convenience, when dividing the pair (change map, output coin value), we treat it as a token bundle, so that we can use the equipartitionTokenBundleWithMaxQuantity function.

jonathanknowles added a commit that referenced this pull request Feb 26, 2021
This function is now redundant.

In response to review feedback:

#2536 (comment)
jonathanknowles added a commit that referenced this pull request Feb 26, 2021
@jonathanknowles

@rvl wrote:

So is it possible to adjust this limitSize slightly so that it can be applied to TokenMap?

```haskell
-- >>> limitSize 5 example
-- [[("a",5),("b",5),("c",5)],[("a",5),("c",5)],[("a",1),("c",2)]]
limitSize :: (Traversable t, Integral n) => n -> [t n] -> [[t n]]
limitSize lim = group . map (fmap (splitNum lim))
  where
    group = L.transpose . map sequence
```

I deliberately avoided the use of L.transpose here, as from past experience in using this function, it often gives poor performance in comparison to a single strict fold with an appropriately-written accumulator function.

This is what we've done in equipartitionTokenMap:

```haskell
equipartitionTokenMap m count =
    F.foldl' accumulate (TokenMap.empty <$ count) (TokenMap.toFlatList m)
  where
    accumulate
        :: NonEmpty TokenMap
        -> (AssetId, TokenQuantity)
        -> NonEmpty TokenMap
    accumulate maps (asset, quantity) = NE.zipWith (<>) maps $
        TokenMap.singleton asset <$>
            equipartitionTokenQuantity quantity count
```

This is called with the appropriate value of count by equipartitionTokenMapWithMaxQuantity.
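A self-contained analog of this strict fold, written over plain association lists (hypothetical simplified types; the real function operates on TokenMap and NonEmpty):

```haskell
import qualified Data.List as L
import Numeric.Natural (Natural)

-- Simplified sketch of the fold-based approach: build all n result
-- maps in a single strict pass, with no use of transpose.
equipartitionFlatMap :: Int -> [(String, Natural)] -> [[(String, Natural)]]
equipartitionFlatMap n = L.foldl' accumulate (replicate n [])
  where
    accumulate maps (asset, quantity) =
        zipWith (\m q -> m ++ [(asset, q)]) maps (splitQuantity quantity)
    -- Split one quantity into n parts differing by at most one.
    splitQuantity x = replicate r (q + 1) ++ replicate (n - r) q
      where
        (q, remainder) = x `divMod` fromIntegral n
        r = fromIntegral remainder
```

For example, `equipartitionFlatMap 3 [("a", 11), ("b", 5), ("c", 12)]` yields the three maps shown in the PR description, modulo ordering.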

Use this function to replace the common functionality of:

- `unsafePartitionCoin`
- `unsafePartitionTokenQuantity`
This function is useful for writing unit tests.
These functions share the same general properties, so it makes sense to
document these properties in a single place.
This change adds unit tests for `performSelection` to demonstrate what
should happen when inputs are selected whose total quantity of a given
asset is close to the maximum token quantity.

It should not be possible to produce a `SelectionResult` where one or
more change outputs contains a token quantity greater than the maximum
(maxBound :: Word64).

These unit tests serve as a simple sanity check: the primary testing of
the change map partitioning behaviour is handled by the following
property tests:

 - prop_equipartitionCoin_*
 - prop_equipartitionTokenBundle_*
 - prop_equipartitionTokenMap_*
This function is now redundant.

In response to review feedback:

#2536 (comment)
@jonathanknowles force-pushed the jonathanknowles/upper-bound-for-change-outputs branch from 0cfb86f to d5093c2 on February 26, 2021 05:07
This change extracts out the list-processing part of
`splitMapsWithExcessiveQuantities` into a separate function
`equipartitionTokenBundlesWithMaxQuantity` that can be tested.

It also adds property tests for this function.

In response to review feedback:

#2536 (comment)
@jonathanknowles force-pushed the jonathanknowles/upper-bound-for-change-outputs branch from 8ac1bf8 to 9cf72cb on February 26, 2021 06:16
@jonathanknowles requested a review from rvl on February 26, 2021 06:20

jonathanknowles commented Feb 26, 2021

@rvl wrote:

```haskell
example :: [(String, Natural)]
example = [("a", 11), ("b", 5), ("c", 12)]

-- >>> limitSize 5 example
-- [[("a",5),("b",5),("c",5)],[("a",5),("c",5)],[("a",1),("c",2)]]
limitSize :: (Traversable t, Integral n) => n -> [t n] -> [[t n]]
limitSize lim = group . map (fmap (splitNum lim))
  where
    group = L.transpose . map sequence
```

Having looked at this in more detail: your suggestion is different from what this PR is doing, which is to retain the proportionality between quantities of different tokens in a bundle to be partitioned.

So to take your example bundle (above), the equipartitionTokenBundle function would split it into the following bundles:

[("a", 3), ("b", 1), ("c", 4)]
[("a", 4), ("b", 2), ("c", 4)]
[("a", 4), ("b", 2), ("c", 4)]

Note that the original proportions are maintained (modulo rounding).


rvl commented Feb 26, 2021

OK thanks for the answers. I am trying to find out why it needs to be much more complicated than the example code I showed. In that example code, the splitNum function could/should be modified to divide the value evenly, and List.transpose can easily be swapped for something efficient. What is this proportionality and why is it needed? I thought that "equipartition" meant that the weights are [1,1,1,..], which is even, not proportional.


jonathanknowles commented Feb 26, 2021

@rvl wrote:

OK thanks for the answers.

You're welcome.

I am trying to find out why it needs to be much more complicated than the example code I showed. In that example code, the splitNum function could/should be modified to divide the value evenly, and List.transpose can easily be swapped for something efficient.

You're right. Though in doing so, I'm guessing we would still end up with something that is a special case of the already-existing partitionNatural function, which is already well-tested, and has all the properties we need, including:

  • preservation of the sum
  • exact proportionality where possible, but with rounding in the case where values are not evenly divisible.

Reusing this pre-existing function means we can worry a bit less about checking that rounding is correct.

What is this proportionality and why is it needed?

Short answer

This was the solution proposed in the original discussion thread for this PR. However, since Slack threads are bad for recording design decisions, I've copied and pasted the essential part of that discussion into https://jira.iohk.io/browse/ADP-726. (My apologies for copying it over a bit late.)

Longer answer

We're trying, to the greatest extent possible, to retain the ratios between asset quantities in the original outputs. If a user specifies an output of [("assetA", 500), ("assetB", 1000)] (in a 1:2 ratio), then when generating a change bundle for this output, we try to generate a bundle that has assets in the same 1:2 ratio. Usually, this will mean that the asset quantities themselves are also similar to those in the original outputs, as we aim to select around 200% of each asset during the initial selection phase. The idea is that over time, the user's UTxO set distribution will contain entries that are similar to the payments they commonly make.

If we reach the situation where we absolutely have to split up a change bundle to avoid overflow, then there doesn't seem to be a good reason to break this ratio.

Furthermore, retaining the ratio between assets might help to minimize the production of dust quantities.

To give an illustration: if we overflow by just 1, then we'd probably want to avoid partitioning a token quantity into [maxBound :: Word64, 1], because 1 is a value that isn't likely to be very useful when making future coin selections.

@rvl wrote:

I thought that "equipartition" meant that the weights are [1,1,1,..], which is even not proportional.

You're right, in that the weights are [1, 1, 1 ..]: this means we'll generate bundles that are approximately equal to one another.

But proportionality here refers to the fact that every bundle in the partitioned result has a similar ratio of token quantities. For example, if the original bundle has tokenA:tokenB in a 1:2 ratio, then the partitioned bundles will also have these tokens in a similar ratio.

So let's take an easy example:

Example 1

[("a", 30), ("b", 24), ("c", 18)]

The tokens 'a', 'b', and 'c' are in a 5:4:3 ratio.

Suppose the maximum token quantity is 10. In this case, we'd need to partition the bundle into three smaller bundles:

[("a", 10), ("b", 8), ("c", 6)]
[("a", 10), ("b", 8), ("c", 6)]
[("a", 10), ("b", 8), ("c", 6)]

The quantities of 'a', 'b', and 'c' in the resulting bundles are still in a 5:4:3 ratio. Since all the numbers are divisible by 3, the resulting bundles are identical.

Example 2

Suppose though that the maximum token quantity is 6. In this case, we'd need to partition into five smaller bundles:

[("a", 6), ("b", 4), ("c", 3)]
[("a", 6), ("b", 5), ("c", 3)]
[("a", 6), ("b", 5), ("c", 4)]
[("a", 6), ("b", 5), ("c", 4)]
[("a", 6), ("b", 5), ("c", 4)]

Now we hit the situation where 24 and 18 are not evenly divisible by 5, so we have to round some values down in the result, and some values up. (This is handled automatically by partitionNatural.)
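For illustration, proportional partitioning with sum-preserving rounding can be approximated with a largest-remainder scheme (a sketch only, not the actual partitionNatural implementation; it assumes at least one nonzero weight):

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)
import Numeric.Natural (Natural)

-- Partition `total` in proportion to `weights`, preserving the sum.
-- Floors every share first, then hands the leftover units to the
-- shares with the largest fractional remainders. Illustrative only.
partitionProportionally :: [Natural] -> Natural -> [Natural]
partitionProportionally weights total = zipWith (+) base extras
  where
    totalW = sum weights
    qr = [ (w * total) `divMod` totalW | w <- weights ]
    base = map fst qr
    deficit = fromIntegral (total - sum base) :: Int
    -- Indices of the `deficit` largest remainders each receive +1.
    winners = map snd
        $ take deficit
        $ sortBy (comparing (Down . fst))
        $ zip (map snd qr) [0 :: Int ..]
    extras = [ if i `elem` winners then 1 else 0 | i <- [0 ..] ]
```

Splitting 24 across five equal weights gives parts drawn from {4, 5} that sum to 24, exhibiting the same rounding behaviour as the example above.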

```haskell
TokenQuantity currentMaxQuantity = TokenMap.maximumQuantity m

extraPartCount :: Int
extraPartCount = floor $ pred currentMaxQuantity % maxQuantity
```
Member

That's a slightly odd way to compute this and quite confusing at first. But ok with the construction of the non-empty list above ✔️

That'd be clearer with a let-binding because this calculation only makes sense with the surrounding context 😏

Contributor Author

That's a slightly odd way to compute this and quite confusing at first. But ok with the construction of the non-empty list above ✔️

Yes, it does seem odd at first. Though you're absolutely right -- it's this way because of the non-empty list.

The test case for this uses a more straightforward calculation:

```haskell
-- | Computes the number of parts that 'equipartitionTokenMapWithMaxQuantity'
--   should return.
--
equipartitionTokenMapWithMaxQuantity_expectedLength
    :: TokenMap -> TokenQuantity -> Int
equipartitionTokenMapWithMaxQuantity_expectedLength
    m (TokenQuantity maxQuantity) =
        max 1 $ ceiling $ currentMaxQuantity % maxQuantity
  where
    TokenQuantity currentMaxQuantity = TokenMap.maximumQuantity m
```

```haskell
, (Coin 1_000_000, [(assetA, q2)])
]
boundaryTestChange =
[ (Coin 500_000, [(assetA, maxTxOutTokenQuantity)]) ]
```
Member
✔️


```haskell
boundaryTestChange =
[ (Coin 250_000, [(assetA, q1)])
, (Coin 250_000, [(assetA, q2)])
]
```
Member
✔️

```haskell
boundaryTestChange =
[ (Coin 250_000, [(assetA, maxTxOutTokenQuantity)])
, (Coin 250_000, [(assetA, maxTxOutTokenQuantity)])
]
```
Member
✔️

```haskell
[ (Coin 250_000, [(assetA, maxTxOutTokenQuantity)])
, (Coin 250_000, [(assetA, maxTxOutTokenQuantity)])
]
```

Member
It'd have been nice to also show an example where some assets are specified in the outputs, but the resulting change still goes above the limit.

@KtorZ left a comment:

bors merge

iohk-bors bot added a commit that referenced this pull request Feb 26, 2021
2536: Split change outputs with asset quantities exceeding the maximum r=KtorZ a=jonathanknowles

(Commit message duplicates the PR description above.)
Co-authored-by: Jonathan Knowles <[email protected]>

iohk-bors bot commented Feb 26, 2021

Timed out.

Bors itself timed out waiting on hydra.


KtorZ commented Feb 26, 2021

bors retry

iohk-bors bot added a commit that referenced this pull request Feb 26, 2021
2536: Split change outputs with asset quantities exceeding the maximum r=KtorZ a=jonathanknowles

(Commit message duplicates the PR description above.)

iohk-bors bot commented Feb 26, 2021

Build failed:

  BYRON_WALLETS
    BYRON_GET_04, DELETE_01 - Deleted wallet is not available
    BYRON_LIST_01 - Byron Wallets are listed from oldest to newest
    BYRON_LIST_01 - Interleave of Icarus and Random wallets
    BYRON_RESTORE_01, GET_01, LIST_01 - Restore a wallet
      random 9 words
      random 12 words
      random 15 words
      random 18 words
      random 21 words
      random 24 words
      icarus 9 words
      icarus 12 words
      icarus 15 words
      icarus 18 words
      icarus 21 words
      icarus 24 words
      trezor 9 words
      trezor 12 words
      trezor 15 words
      trezor 18 words
      trezor 21 words
      trezor 24 words
      ledger 9 words
      ledger 12 words
building of '/nix/store/9dya53jjbzk5d3d101gxhqc136q7inhl-cardano-wallet-test-integration-2021.2.15-check' timed out after 900 seconds of silence

#2540

@gitmachtl

I apologize for giving you such a headache with this issue; I was just testing out the limits. But I think the solution will be worth it. 🙂

@jonathanknowles

bors r+


iohk-bors bot commented Feb 27, 2021

Build succeeded:

iohk-bors bot merged commit 1b42a42 into master Feb 27, 2021
iohk-bors bot deleted the jonathanknowles/upper-bound-for-change-outputs branch February 27, 2021 03:28
@jonathanknowles

@gitmachtl wrote:

I apologize for giving you such a headache with this issue; I was just testing out the limits. But I think the solution will be worth it. 🙂

No need to apologise! We are very grateful that you took the time to create a report. It's much better that we can fix this early on, before it inconveniences people attempting to use the wallet.

Thank you again for your report!


Successfully merging this pull request may close these issues.

Major Wallet locking Issue with NativeAssets overflow in UTXOs