-
Notifications
You must be signed in to change notification settings - Fork 888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stem based flat db for verkle #7778
Conversation
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
82f562e
to
ae8ab83
Compare
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
…ccessfully created to transaction start Signed-off-by: Luis Pinto <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
df4ba5b
to
5b452ae
Compare
@@ -76,6 +79,7 @@ public static DataStorageOptions create() { | |||
*/ | |||
public void validate(final CommandLine commandLine) { | |||
diffBasedSubStorageOptions.validate(commandLine, dataStorageFormat); | |||
verkleSubStorageOptions.validate(commandLine, dataStorageFormat); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you call validate
? It's a noop, not even part of an interface or abstract
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just to ensure that all the logic is coded in the same way so that if tomorrow someone adds a flag that requires a check, they don’t have to wonder why it doesn’t work or how to do it. They just need to follow the same approach as in other parts of the code and simply implement the check in validate
. But if you think it's better to remove why not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no strong opinions, you can leave it
interface VerkleUnstable { | ||
|
||
VerkleSubStorageConfiguration.VerkleUnstable DEFAULT = | ||
ImmutableVerkleSubStorageConfiguration.VerkleUnstable.builder().build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DEFAULT is essentially == DISABLED right? So why have both? Just do:
@Value.Default
default VerkleUnstable getUnstable() {
return VerkleUnstable.DISABLED;
}
And remove the DEFAULT field.
@Override | ||
public Optional<Long> getCodeSize() { | ||
return Optional.empty(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how can codeHash != Hash.EMPTY
and code size be empty? It doesn't seem to make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional.empty is not == 0 it's a way to say that the codesize is not available but I modified the code and this method is now abstract and in PmtStateTrieAccountValue I throw an exception
@Override
public Optional<Long> getCodeSize() {
throw new UnsupportedOperationException("Retrieving code size from state trie account value is not possible with Patricia Merkle Trie");
}
noOpSegmentedTx, noOpTx, worldStateKeyValueStorage.getFlatDbStrategy())), | ||
noOpSegmentedTx, | ||
noOpTx, | ||
(BonsaiFlatDbStrategy) worldStateKeyValueStorage.getFlatDbStrategy())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you really need to cast it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant that we might not need a specialized type. What about doing this?
diff --git a/ethereum/core/src/main/java/org/hyperledger/besu/ethereum/trie/diffbased/bonsai/storage/BonsaiWorldStateKeyValueStorage.java b/ethereum/core/src/main/java/org/hyperledger/besu/ethereum/trie/diffbased/bonsai/storage/BonsaiWorldStateKeyValueStorage.java
index e55bc7175b..605a8a50de 100644
--- a/ethereum/core/src/main/java/org/hyperledger/besu/ethereum/trie/diffbased/bonsai/storage/BonsaiWorldStateKeyValueStorage.java
+++ b/ethereum/core/src/main/java/org/hyperledger/besu/ethereum/trie/diffbased/bonsai/storage/BonsaiWorldStateKeyValueStorage.java
@@ -28,6 +28,7 @@ import org.hyperledger.besu.ethereum.trie.common.PmtStateTrieAccountValue;
import org.hyperledger.besu.ethereum.trie.diffbased.bonsai.storage.flat.BonsaiFlatDbStrategy;
import org.hyperledger.besu.ethereum.trie.diffbased.bonsai.storage.flat.BonsaiFlatDbStrategyProvider;
import org.hyperledger.besu.ethereum.trie.diffbased.common.storage.DiffBasedWorldStateKeyValueStorage;
+import org.hyperledger.besu.ethereum.trie.diffbased.common.storage.flat.FlatDbStrategy;
import org.hyperledger.besu.ethereum.worldstate.DataStorageConfiguration;
import org.hyperledger.besu.ethereum.worldstate.FlatDbMode;
import org.hyperledger.besu.ethereum.worldstate.WorldStateKeyValueStorage;
@@ -192,7 +193,7 @@ public class BonsaiWorldStateKeyValueStorage extends DiffBasedWorldStateKeyValue
private final SegmentedKeyValueStorageTransaction composedWorldStateTransaction;
private final KeyValueStorageTransaction trieLogStorageTransaction;
- private final BonsaiFlatDbStrategy flatDbStrategy;
+ private final FlatDbStrategy flatDbStrategy;
public Updater(
final SegmentedKeyValueStorageTransaction composedWorldStateTransaction,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or are you building this for the future, to use more specialized methods? I don't get any compilation error if I do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I prefer the current method. everything related to Diffbased or the Bonsai/Verkle world. if I know we're in a Bonsai class, I directly specify it and the subclasses should also be Bonsai as they will be required to follow that structure. This ensures that in the future, there's no need to question when a specific Bonsai method needs to be called, and it also avoids others having to question it later if needed. The right types are already in place.
* <p>This class uses a thread-safe {@link ConcurrentMap} to store data, allowing concurrent | ||
* modifications. | ||
*/ | ||
public class CodeConsumingMap extends ForwardingMap<Address, DiffBasedValue<Bytes>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like AccountConsumingMap
is almost a complete copy of this class and you can achieve the same reusing that class. Why not renaming AccountConsumingMap
-> ConsumingMap
and declare it like:
codeToUpdate = new ConsumingMap<DiffBasedValue<Bytes>>(new ConcurrentHashMap<>(), codePreloader);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not completly the same, there is a difference in the consumer.process part (put method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah true! missed that, in CodeConsumingMap
only the updated value is processed by the consumer. But is there any reason for the Consumer not to do that itself and only use the updated value, rather than doing it inside the CodeConsumingMap
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really , more because I didn't wanted to push the logic here
public DiffBasedWorldStateUpdateAccumulator(
final DiffBasedWorldView world,
final Consumer<DiffBasedValue<ACCOUNT>> accountPreloader,
final Consumer<StorageSlotKey> storagePreloader,
final Consumer<Bytes> codePreloader,
final EvmConfiguration evmConfiguration) {
super(world, evmConfiguration);
this.accountsToUpdate = new AccountConsumingMap<>(new ConcurrentHashMap<>(), accountPreloader);
this.codeToUpdate = new CodeConsumingMap(new ConcurrentHashMap<>(), codePreloader);
this.accountPreloader = accountPreloader;
this.storagePreloader = storagePreloader;
this.codePreloader = codePreloader;
this.isAccumulatorStateChanged = false;
this.evmConfiguration = evmConfiguration;
but I don't have a strong opinion about that
out.writeUInt256Scalar(balance); | ||
out.writeBytes(storageRoot); | ||
out.writeLongScalar(nonce); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you swap nonce with balance for both reading and writing? Any reason in particular?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it was to be compatible with the VerkleStateTrieAccountValue encoding part as I decided to implement it like that. no more reason than that I think I can revert this change
} else { | ||
this.codeSize = code.size(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is a kind of code smell to me here. When you construct VerkleAccount
you have a codeSize != 0
but getCode() == null
this is not clear at all unless you are totally familiar with Verkle. And even then...
I understand that code
and codeSize
are 2 different regions of the DB and so is codeHash
and that performance wise we should avoid touching slots unnecessarily but why not having a code provider with a lambda function instead of the field Code code
that only populates it and goes to the DB if someone calls getCode
? That would be more intuitive to me and consistent. Don't you know at least a way to get the code from the DB when constructing a VerkleAccount
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already working like that, it's using the same code as Bonsai, VerkleAccount is calling the context (MutableWorldstate) to retrieve the code when we call getCode
See in DiffBasedAccount
public Bytes getCode() {
if (code == null) {
code = context.getCode(address, codeHash).orElse(Bytes.EMPTY);
}
return code;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I understand, this is not something just with Verkle but Bonsai also has the same problem. Maybe nothing to do for now, but it would be better to clarify.
Right now, there's so many ways you can set code in both: using a context
, calling setCode
directly, doing a copy constructor way. I feel this has potential to create problems and at least it's hard for me to deterministically say how the code is set into an account. Why can't you have code set always at construct time? Agree needs some thought though.
But maybe it's because we're saving on object allocation and reusing *Account
objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's an optmization, we don't have to read a code if we don't need it . the idea is to read in the database only when we really want to access the code
also I don't see a potential problem with that. The context is used when you read the code from the database, after when you modify the code you will always pass via the account. the worldstate (via accumulator) will check the code in the account and save in the database
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will have in the futur to explain all of this logic in a doc (accumulator, worldstate, updater etc)
@@ -440,7 +429,15 @@ public Map<Bytes32, Bytes> getAllAccountStorage(final Address address, final Has | |||
} | |||
|
|||
private VerkleTrie createTrie(final NodeLoader nodeLoader, final Bytes32 rootHash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rootHash
unused var, can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks done
this.dataStorageConfiguration = dataStorageConfiguration; | ||
this.flatDbStrategyProvider = | ||
new VerkleFlatDbStrategyProvider(metricsSystem, dataStorageConfiguration); | ||
flatDbStrategyProvider.loadFlatDbStrategy(composedWorldStateStorage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that the FlatDbStrategyProvider
constructor not calling loadFlatDbStrategy
? Is the object of any use without anything loaded on it?
Found out later that getFlatDbStategy
needs the same argument as the load which looks weird. After checking the getter I can see that it calls loadFlatDbStrategy
. Any particular reason for that? Why can't we call load in the constructor and make sure `flatDbStrategy is never null inside? Do we need lazy initialization? Only if it makes sense to create a provider and not use it...
BTW in the case of Bonsai we construct and load sequentially all the same: https://github.com/hyperledger/besu/blob/main/ethereum/core/src/main/java/org/hyperledger/besu/ethereum/trie/diffbased/bonsai/storage/BonsaiWorldStateKeyValueStorage.java#L64-L66
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it was needed before but I think we had some refactoring of this part and it's not needed anymore in bonsai and verkle, I updated the code
flatDbStrategy.removeFlatAccountStorageValueByStorageSlotHash( | ||
composedWorldStateTransaction, accountHash, slotHash); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
putStateTrieNode
and removeStateTrieNode
are unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -164,6 +207,19 @@ public Updater putAccountInfoState(final Hash accountHash, final Bytes accountVa | |||
return this; | |||
} | |||
|
|||
public synchronized Updater putStorageValueBySlotHash( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these methods synchronized
? Then others are not. Does it make sense to have synchronized
at all? If so others are also writing to the same object as these.
Actually I suspect it's not needed since this is called to compute the root hash which should only happen at the end on a single thread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's just a copy/paste from Bonsai code, looking at the code I also don't see a need for a synchronized for the moment
@@ -107,7 +108,21 @@ public static void writeTo(final TrieLog layer, final RLPOutput output) { | |||
if (accountChange == null || accountChange.isUnchanged()) { | |||
output.writeNull(); | |||
} else { | |||
writeRlp(accountChange, output, (o, sta) -> sta.writeTo(o)); | |||
writeRlp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there's 2 TrieLogFactoryImpl
classes that share big part of the code. Maybe there's some code reuse we can do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do some refactoring and avoid as much as possible duplication for this class
noOpSegmentedTx, noOpTx, worldStateKeyValueStorage.getFlatDbStrategy())), | ||
noOpSegmentedTx, | ||
noOpTx, | ||
(VerkleStemFlatDbStrategy) worldStateKeyValueStorage.getFlatDbStrategy())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to define frontierRootHash
? From what I can see this is only used to define transaction receipts compliant with Frontier but Berlin is already overriding that. I can't see why we need to have verkle with Frontier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to keep it this way for now because I’d like to sync the mainnet with Verkle for testing purposes, and I might need this method. We can remove it later when it’s no longer useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment that you want it for that?
public boolean accountExists(final Address address) { | ||
return worldUpdater.get(address) != null; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused, I would remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why it's here 🤔 I will remove it
|
||
protected final Counter getStorageValueNotFoundInFlatDatabaseCounter; | ||
|
||
public VerkleLegacyFlatDbStrategy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could use some unit tests to make sure metrics work as expected at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did
* and code. This allows direct access to the leaves of the Verkle trie for stems, reducing | ||
* duplication of data. | ||
*/ | ||
public class VerkleStemFlatDbStrategy extends FlatDbStrategy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could use some unit tests to make sure metrics work as expected at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} else { | ||
sta.writeTo(o); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add tests for that
ImmutableVerkleSubStorageConfiguration.VerkleUnstable.builder() | ||
.stemFlatDbEnabled(true) | ||
.build(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DISABLED is not used and it seems DEFAULT == DISABLED
. Maybe remove DEFAULT
and just do:
@Value.Default
default VerkleUnstable getUnstable() {
return VerkleUnstable.DISABLED;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No blockers but it could use some cleanups
… impl Signed-off-by: Luis Pinto <[email protected]>
… impl Signed-off-by: Luis Pinto <[email protected]>
Signed-off-by: Luis Pinto <[email protected]>
Signed-off-by: Luis Pinto <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR description
This PR introduces a new strategy for managing Verkle accounts using a stem-based flat database structure. Instead of saving by account, slot, and code like in the legacy approach, this implementation stores data by stem and code. This shift allows for direct access to the leaves of the Verkle tree, significantly reducing data duplication and improving storage efficiency.
The stem-based flat database strategy is designed to align better with Verkle tree structures, enabling more streamlined storage and access patterns. This should make it more effective in scenarios that require quick access to account data without the overhead of managing redundant information.
Key Changes:
Introduces the VerkleStemFlatDbStrategy class, which handles storage using a stem-based approach.
Reduces duplication by focusing on storing data at the stem level, improving overall efficiency.
Next Steps:
Once this PR is merged, the optimization of Pedersen hash calculations will be handled in a subsequent PR.
Testing will be conducted after the pre-image is available to determine the usability of this implementation in its current state.
Considerations:
This new approach is expected to be necessary for stateless clients, but further verification will be needed for regular nodes to ensure that it meets the requirements effectively.
Fixed Issue(s)
Thanks for sending a pull request! Have you done the following?
doc-change-required
label to this PR if updates are required.Locally, you can run these tests to catch failures early:
./gradlew build
./gradlew acceptanceTest
./gradlew integrationTest
./gradlew ethereum:referenceTests:referenceTests