MASP storage optimization #2343

Closed
grarco opened this issue Dec 28, 2023 · 5 comments

Comments

@grarco
Contributor

grarco commented Dec 28, 2023

Currently the MASP storage is managed in a suboptimal way. More specifically, the following can be observed:

  1. All the keys are written under the subspace, meaning they are subject to diffs and merklization. This is unneeded because we only ever need the most recent value of the masp keys, and we already manage the masp merkle trees separately
  2. We are currently writing the Transfer and Transaction objects (plus some other metadata) into storage to allow clients to scan the history of masp transactions and construct the internal state they need to correctly produce future transactions
  3. The conversion_state is kept in memory under the Storage object:

```rust
pub struct ConversionState {
    /// The last amount of the native token distributed
    pub normed_inflation: Option<u128>,
    /// The tree currently containing all the conversions
    pub tree: FrozenCommitmentTree<Node>,
    /// A map from token alias to actual address.
    pub tokens: BTreeMap<String, Address>,
    /// Map assets to their latest conversion and position in Merkle tree
    #[allow(clippy::type_complexity)]
    pub assets: BTreeMap<
        AssetType,
        ((Address, MaspDenom), Epoch, AllowedConversion, usize),
    >,
}
```

Issue 1 should be solved by #2265.

Issue 2 could be solved by not writing this data to storage, since we already have it in the block storage. Clients could query the blocks instead (possibly with the help of an indexer) to get the same results. We should also make sure to keep the blocks around in case of a chain upgrade (see the discussion at #2248 (comment)).

Issue 3 is probably not as major as the previous two. The ConversionState is effectively only queried by clients and updated in protocol once per epoch, so we could probably move it to storage to reduce memory consumption (hopefully without increasing query times too much because of the need to retrieve the data from storage first); see the sketch below.
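A minimal sketch of what moving the ConversionState into storage could look like, assuming a generic key-value store and Borsh encoding. The KvStore trait, the storage key, and the function names are hypothetical stand-ins, not Namada's actual storage API:

```rust
use borsh::{BorshDeserialize, BorshSerialize};

/// Hypothetical key-value storage interface standing in for the node's
/// real storage abstraction.
trait KvStore {
    fn write(&mut self, key: &str, value: Vec<u8>);
    fn read(&self, key: &str) -> Option<Vec<u8>>;
}

/// Hypothetical storage key for the conversion state.
const CONVERSION_STATE_KEY: &str = "conversion_state";

/// Persist the conversion state at the epoch boundary instead of keeping
/// it resident in memory for the whole epoch.
fn commit_conversion_state<S: KvStore, C: BorshSerialize>(store: &mut S, state: &C) {
    let mut bytes = Vec::new();
    state
        .serialize(&mut bytes)
        .expect("serializing the conversion state should not fail");
    store.write(CONVERSION_STATE_KEY, bytes);
}

/// Load the conversion state lazily, e.g. when serving a client query.
fn query_conversion_state<S: KvStore, C: BorshDeserialize>(store: &S) -> Option<C> {
    let bytes = store.read(CONVERSION_STATE_KEY)?;
    Some(C::try_from_slice(&bytes).expect("stored conversion state should decode"))
}
```

The trade-off is one storage read (plus deserialization) per query in exchange for not holding the tree and the maps in memory between epochs.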

@grarco
Contributor Author

grarco commented Dec 28, 2023

@cwgoes for reference

@cwgoes
Contributor

cwgoes commented Jan 2, 2024

Thanks for the writeup. I think we should definitely fix 1 & 2. Regarding 3, I'm not too worried about memory consumption, as long as we have a reasonable understanding of the bounds and total Namada memory usage does not exceed single-digit GBs. Do we have a sense of how large (concretely) this struct would be?

@grarco
Contributor Author

grarco commented Jan 2, 2024

I've run some approximate calculations. The ConversionState struct is composed of four fields:

  • normed_inflation, an Option<u128>, is constant at ~ 17 bytes
  • tree, a FrozenCommitmentTree that we update once per epoch
  • tokens, a map from token alias to address (BTreeMap<String, Address>), should remain small
  • assets, a map from AssetType to some data about it, actually grows once per epoch. More precisely, we need two inserts per epoch for every unique (token, denom) pair, one for the old asset and one for the new asset; but the insert for the old asset overwrites the previous value, so we can count just one net insert

Here are some estimates of the sizes of the types involved (these are based only on the types; I'm not taking into account optimizations/padding that the compiler might perform, e.g. to align types in memory):

Alias ~ 3 UTF-8 chars (assume ASCII, so one byte per char) = 3 bytes, ignoring the size of the String smart pointer
AssetType ~ 33 bytes
Address ~ 20 bytes
MaspDenom = 1 byte (we have 4 denominations)
Epoch = 8 bytes
AllowedConversion = I128Sum + jubjub::ExtendedPoint
	I128Sum = BTreeMap<AssetType, i128>, but this map is always empty for us
	ExtendedPoint = 5 * 4 * 8 bytes = 160 bytes
usize = 8 bytes on 64-bit machines

Node = 32 bytes
FrozenCommitmentTree = (Vec<Node>, usize)
conv_notes = number of conversion notes, one for every value (i.e. asset type) in the assets map of ConversionState
Capacity of Vec<Node> = 31 + 2 * conv_notes

So I believe the memory cost of the conversion state can be divided into two parts: a constant (or semi-constant) part and a variable part that grows with every epoch change.

The constant part is the cost of normed_inflation plus the cost of the tokens map, as a function of the number of token addresses $t$:

$M_c(t) = 17 + t \cdot (3 + 20) = 17 + 23t$

The epoch-variable part (at a given epoch $e$) is composed of the tree and assets fields. Every AssetType is the concatenation of a token address, a masp denomination and the epoch, so, assuming no new tokens are added in the current epoch, the number of new asset types inserted per epoch, based on the number of token addresses $t$, is:

$n_a(t) = 4t$

and each new assets entry costs $33 + 20 + 1 + 8 + 160 + 8 = 230$ bytes, so the assets map grows by $230 \cdot n_a(t)$ bytes.

Each new asset type also adds one conversion note, and every extra note grows the frozen tree's vector by two nodes of 32 bytes each, so the delta of the frozen tree compared to its value at the previous epoch is:

$d_{tree}(n_a) = 32 \cdot 2 \cdot n_a = 64 n_a$

bytes, on top of a one-off $32 \cdot 31 = 992$ bytes of base capacity.

The variable part can then be expressed as:

$M_{v, e}(t) = 230 n_a + 64 n_a = 294 \cdot n_a(t) = 294 \cdot 4t = 1176t$

Ignoring the constant parts, the increase in memory consumption given by the variable part would be approximately 1176 bytes per token address per epoch, so ~ 1.2 kB.
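
For concreteness, here is the same estimate as a small Rust sketch; the constants below are the byte estimates from this comment, not compiler-verified sizes:

```rust
/// Byte-size estimates from above (type-level estimates, ignoring any
/// padding/alignment the compiler may add).
const ASSET_TYPE: usize = 33;
const ADDRESS: usize = 20;
const MASP_DENOM: usize = 1;
const EPOCH: usize = 8;
const ALLOWED_CONVERSION: usize = 160; // empty I128Sum + jubjub::ExtendedPoint
const USIZE: usize = 8;
const NODE: usize = 32;

/// One entry of the `assets` map: AssetType key + value tuple = 230 bytes.
const ASSET_ENTRY: usize =
    ASSET_TYPE + ADDRESS + MASP_DENOM + EPOCH + ALLOWED_CONVERSION + USIZE;

/// Per-epoch growth of the variable part, in bytes, for `t` token
/// addresses: 4 denominations per token give 4t net new asset entries and
/// 4t new conversion notes, each adding 2 nodes to the frozen tree.
fn variable_growth_per_epoch(t: usize) -> usize {
    let n_a = 4 * t;
    let assets_growth = ASSET_ENTRY * n_a; // 920 * t
    let tree_growth = 2 * NODE * n_a; // 256 * t
    assets_growth + tree_growth // 1176 * t
}

fn main() {
    // e.g. 100 token addresses -> ~117.6 kB of growth per epoch
    println!("{} bytes", variable_growth_per_epoch(100));
}
```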

@cwgoes
Contributor

cwgoes commented Jan 2, 2024

Thanks! Excellent write-up. I think that those memory costs are fine for the foreseeable future.

@grarco
Contributor Author

grarco commented Apr 11, 2024

Closing as completed

@grarco grarco closed this as completed Apr 11, 2024