Aristo cull journal related stuff (#2288)
* Remove all journal related stuff

* Refactor function names journal*() => delta*(), filter*() => delta*()

* Remove `trg` field from `FilterRef`

why:
  Same as `kMap[$1]`

* Re-type FilterRef.src as `HashKey`

why:
  So it is directly comparable to `kMap[$1]`

* Move `vGen[]` field from `LayerFinalRef` to `LayerDeltaRef`

why:
  Then a separate `FilterRef` type is not needed anymore

* Rename `roFilter` field in `AristoDbRef` => `balancer`

why:
  The new name is more appropriate.

* Replace `FilterRef` by `LayerDeltaRef` type

why:
  This avoids copying into the `balancer` (see next patch set) most of
  the time. Typically, only one instance is running on the backend and
  the `balancer` is only used as a staging area before saving data (a
  rough sketch of the resulting types follows this message).

* Refactor how data is stored persistently

why:
  Avoid a useless copy when staging the `top` layer for persistent
  saving to the backend.

* Fix copyright header?
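
A rough sketch of the resulting type shapes, assembled from the bullet
points above (the placeholder definitions of `HashKey` and `VertexID`
and the elided fields are illustrative guesses, not taken from the
patch):

    type
      HashKey  = array[32, byte]  # placeholder; the real type lives in aristo_desc
      VertexID = uint64           # placeholder

      LayerDeltaRef = ref object
        src: HashKey         # re-typed so it compares directly to `kMap[$1]`
        vGen: seq[VertexID]  # moved in from `LayerFinalRef`
        # vertex and key tables elided

      AristoDbRef = ref object
        balancer: LayerDeltaRef  # renamed from `roFilter`; replaces `FilterRef`
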
mjfh authored Jun 3, 2024
1 parent 7f76586 commit f926222
Showing 55 changed files with 389 additions and 4,231 deletions.
1 change: 0 additions & 1 deletion nimbus/db/aristo.nim
@@ -48,7 +48,6 @@ export
AristoError,
AristoTxRef,
MerkleSignRef,
QidLayoutRef,
isValid

# End
90 changes: 7 additions & 83 deletions nimbus/db/aristo/README.md
@@ -25,10 +25,8 @@ Contents
+ [4.5 Leaf record payload serialisation for RLP encoded data](#ch4x5)
+ [4.6 Leaf record payload serialisation for unstructured data](#ch4x6)
+ [4.7 Serialisation of the list of unused vertex IDs](#ch4x7)
+ [4.8 Backend filter record serialisation](#ch4x8)
+ [4.9 Serialisation of a list of filter IDs](#ch4x92)
+ [4.10 Serialisation of a last saved state record](#ch4x10)
+ [4.11 Serialisation record identifier identification](#ch4x11)
+ [4.8 Serialisation of a last saved state record](#ch4x8)
+ [4.9 Serialisation record identifier tags](#ch4x9)

* [5. *Patricia Trie* implementation notes](#ch5)
+ [5.1 Database descriptor representation](#ch5x1)
@@ -372,79 +370,7 @@ be used as vertex IDs. If this record is missing, the value *(1u64,0x01)* is
assumed, i.e. the list with the single vertex ID *1*.

<a name="ch4x8"></a>
### 4.8 Backend filter record serialisation

0 +--+--+--+--+--+ .. --+
| | -- filter ID
8 +--+--+--+--+--+ .. --+--+ .. --+
| | -- 32 bytes filter source hash
40 +--+--+--+--+--+ .. --+--+ .. --+
| | -- 32 bytes filter target hash
72 +--+--+--+--+--+ .. --+--+ .. --+
| | -- number of unused vertex IDs
76 +--+--+--+--+
| | -- number of structural triplets
80 +--+--+--+--+--+ .. --+
| | -- first unused vertex ID
88 +--+--+--+--+--+ .. --+
... -- more unused vertex ID
N1 +--+--+--+--+
|| | -- flg(2) + vtxLen(30), 1st triplet
+--+--+--+--+--+ .. --+
| | -- vertex ID of first triplet
+--+--+--+--+--+ .. --+--+ .. --+
| | -- optional 32 bytes hash key
+--+--+--+--+--+ .. --+--+ .. --+
... -- optional vertex record
N2 +--+--+--+--+
|| | -- flg(2) + vtxLen(30), 2nd triplet
+--+--+--+--+
...
+--+
| | -- marker(8), 0x7d
+--+

where
+ minimum size of an empty filter is 72 bytes

+ the flg(2) represents a bit tuple encoding the serialised storage
modes for the optional 32 bytes hash key:

0 -- not encoded, to be ignored
1 -- not encoded, void => considered deleted
2 -- present, encoded as-is (32 bytes)
3 -- present, encoded as (len(1),data,zero-padding)

    + the vtxLen(30) is the number of bytes of the optional vertex record
      which has a maximum size of 2^30-2, i.e. just short of 1 GiB. The
      value 2^30-1 (i.e. 0x3fffffff) is reserved for indicating that there
      is no vertex record following, so the entry should be considered
      deleted (see the unpacking sketch after this list).

+ there is no blind entry, i.e. either flg(2) != 0 or vtxLen(30) != 0.

+ the marker(8) is the eight bit array *0111-1101*
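
Unpacking the flg(2)+vtxLen(30) word of this (now removed) format is
straightforward. A minimal sketch, assuming (as the layout above suggests)
that flg(2) occupies the two most significant bits of the 32 bit word; the
proc name is illustrative only:

        proc unpackFlgVtxLen(w: uint32): (uint8, uint32) =
          ## flg(2) in the two most significant bits, vtxLen(30) below.
          (uint8(w shr 30), w and 0x3fff_ffff'u32)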

<a name="ch4x9"></a>
### 4.9 Serialisation of a list of filter IDs

0 +-- ..
... -- some filter ID
+--+--+--+--+--+--+--+--+
| | -- last filter IDs
+--+--+--+--+--+--+--+--+
| | -- marker(8), 0x7e
+--+

where
marker(8) is the eight bit array *0111-1110*

This list is used to control the filters on the database. By holding some IDs
in a dedicated list (e.g. the latest filters) one can quickly access particular
entries without searching through the set of filters. In the current
implementation this list comes in ID pairs, i.e. the number of entries is even.

<a name="ch4x9"></a>
### 4.10 Serialisation of a last saved state record
### 4.8 Serialisation of a last saved state record

0 +--+--+--+--+--+ .. --+--+ .. --+
| | -- 32 bytes source state hash
@@ -460,7 +386,7 @@ implementation this list comes in ID pairs i.e. the number of entries is even.
marker(8) is the eight bit array *0111-1111*

<a name="ch4x10"></a>
### 4.10 Serialisation record identifier tags
### 4.9 Serialisation record identifier tags

Any of the above records can uniquely be identified by its trailing marker,
i.e. the last byte of a serialised record.
@@ -474,9 +400,7 @@ i.e. the last byte of a serialised record.
| 0110 1010 | 0x6a | RLP encoded payload | [4.5](#ch4x5) |
| 0110 1011 | 0x6b | Unstructured payload | [4.6](#ch4x6) |
| 0111 1100 | 0x7c | List of vertex IDs | [4.7](#ch4x7) |
| 0111 1101 | 0x7d | Filter record | [4.8](#ch4x8) |
| 0111 1110 | 0x7e | List of filter IDs | [4.9](#ch4x9) |
| 0111 1111 | 0x7f | Last saved state | [4.10](#ch4x10) |
| 0111 1111 | 0x7f | Last saved state | [4.8](#ch4x8) |
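
As an illustration, the trailing marker is enough to classify a raw record
before decoding it. A minimal sketch covering the remaining marker values in
the table above (the `RecordKind` enum and the proc name are illustrative,
not part of the Aristo API):

        type RecordKind = enum
          RlpPayload, UnstructuredPayload, VertexIdList, LastSavedState, Unknown

        proc recordKind(blob: openArray[byte]): RecordKind =
          ## Classify a serialised record by its trailing marker byte.
          if blob.len == 0:
            return Unknown
          case blob[^1]
          of 0x6a: RlpPayload
          of 0x6b: UnstructuredPayload
          of 0x7c: VertexIdList
          of 0x7f: LastSavedState
          else: Unknown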

<a name="ch5"></a>
5. *Patricia Trie* implementation notes
@@ -495,15 +419,15 @@ i.e. the last byte of a serialised record.
| | stack[0] | | successively recover the top layer)
| +----------+ v
| +----------+
| | roFilter | optional read-only backend filter
| | balancer | optional read-only backend filter
| +----------+
| +----------+
| | backend | optional physical key-value backend database
v +----------+

There is a three-tier access path to a key-value database entry, as in

top -> roFilter -> backend
top -> balancer -> backend

where only the *top* layer is obligatory.
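
A minimal sketch of that cascade, with plain tables standing in for the
actual layer types (all names below are illustrative):

        import std/[options, tables]

        type ThreeTierDb = object
          top: Table[string, string]               # mutable in-memory layer
          balancer: Option[Table[string, string]]  # optional read-only backend filter
          backend: Option[Table[string, string]]   # optional persistent store

        proc lookup(db: ThreeTierDb; key: string): Option[string] =
          ## Three tier access: top -> balancer -> backend.
          if key in db.top:
            return some db.top[key]
          if db.balancer.isSome and key in db.balancer.get:
            return some db.balancer.get[key]
          if db.backend.isSome and key in db.backend.get:
            return some db.backend.get[key]
          none(string)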

92 changes: 7 additions & 85 deletions nimbus/db/aristo/aristo_api.nim
@@ -13,12 +13,11 @@


import
std/[options, times],
std/times,
eth/[common, trie/nibbles],
results,
./aristo_desc/desc_backend,
./aristo_init/memory_db,
./aristo_journal/journal_get,
"."/[aristo_delete, aristo_desc, aristo_fetch, aristo_get, aristo_hashify,
aristo_hike, aristo_init, aristo_merge, aristo_path, aristo_profile,
aristo_serialise, aristo_tx, aristo_vid]
@@ -164,11 +163,11 @@ type
## transaction.
##
## If `backLevel` is `-1`, a database descriptor with empty transaction
## layers will be provided where the `roFilter` between database and
## layers will be provided where the `balancer` between database and
## transaction layers is kept in place.
##
## If `backLevel` is `-2`, a database descriptor with empty transaction
## layers will be provided without an `roFilter`.
## layers will be provided without a `balancer`.
##
## The returned database descriptor will always have transaction level one.
## If there were no transactions that could be squashed, an empty
@@ -220,31 +219,6 @@ type
## Getter, returns `true` if the argument `tx` refers to the current
## top level transaction.

AristoApiJournalGetFilterFn* =
proc(be: BackendRef;
inx: int;
): Result[FilterRef,AristoError]
{.noRaise.}
## Fetch filter from journal where the argument `inx` relates to the
## age starting with `0` for the most recent.

AristoApiJournalGetInxFn* =
proc(be: BackendRef;
fid: Option[FilterID];
earlierOK = false;
): Result[JournalInx,AristoError]
{.noRaise.}
## For a positive argument `fid`, find the filter on the journal with an ID
## not larger than `fid` (i.e. the resulting filter might be older.)
##
## If the argument `earlierOK` is passed `false`, the function succeeds
## only if the filter ID of the returned filter is equal to the argument
## `fid`.
##
## In case the argument `fid` is zero (i.e. `FilterID(0)`), the
## filter with the smallest filter ID (i.e. the oldest filter) is
## returned. In that case, the argument `earlierOK` is ignored.

AristoApiLevelFn* =
proc(db: AristoDbRef;
): int
@@ -303,7 +277,7 @@ type

AristoApiPersistFn* =
proc(db: AristoDbRef;
nxtFid = none(FilterID);
nxtSid = 0u64;
chunkedMpt = false;
): Result[void,AristoError]
{.noRaise.}
@@ -315,13 +289,9 @@ type
## backend stage area. After that, the top layer cache is cleared.
##
## Finally, the staged data are merged into the physical backend
## database and the staged data area is cleared. While performing this
## last step, the recovery journal is updated (if available.)
## database and the staged data area is cleared.
##
## If the argument `nxtFid` is passed non-zero, it will be the ID for
## the next recovery journal record. If non-zero, this ID must be greater
## than all previous IDs (e.g. block number when storing after block
## execution.)
## The argument `nxtSid` will be the ID for the next saved state record.
##
## Staging the top layer cache might fail with a partial MPT when it is
## set up from partial MPT chunks as it happens with `snap` sync
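##
## A hypothetical usage sketch (not part of this patch; assumes an
## `AristoApiRef` instance `api`, an open descriptor `db`, and a
## caller-chosen `blockNumber: uint64`):
##
##   let rc = api.persist(db, nxtSid = blockNumber)
##   if rc.isErr:
##     debugEcho "persist failed: ", rc.error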
@@ -419,8 +389,6 @@ type
hasPath*: AristoApiHasPathFn
hikeUp*: AristoApiHikeUpFn
isTop*: AristoApiIsTopFn
journalGetFilter*: AristoApiJournalGetFilterFn
journalGetInx*: AristoApiJournalGetInxFn
level*: AristoApiLevelFn
nForked*: AristoApiNForkedFn
merge*: AristoApiMergeFn
@@ -454,8 +422,6 @@ type
AristoApiProfHasPathFn = "hasPath"
AristoApiProfHikeUpFn = "hikeUp"
AristoApiProfIsTopFn = "isTop"
AristoApiProfJournalGetFilterFn = "journalGetFilter"
AristoApiProfJournalGetInxFn = "journalGetInx"
AristoApiProfLevelFn = "level"
AristoApiProfNForkedFn = "nForked"
AristoApiProfMergeFn = "merge"
@@ -472,16 +438,12 @@ type

AristoApiProfBeGetVtxFn = "be/getVtx"
AristoApiProfBeGetKeyFn = "be/getKey"
AristoApiProfBeGetFilFn = "be/getFil"
AristoApiProfBeGetIdgFn = "be/getIdg"
AristoApiProfBeGetLstFn = "be/getLst"
AristoApiProfBeGetFqsFn = "be/getFqs"
AristoApiProfBePutVtxFn = "be/putVtx"
AristoApiProfBePutKeyFn = "be/putKey"
AristoApiProfBePutFilFn = "be/putFil"
AristoApiProfBePutIdgFn = "be/putIdg"
AristoApiProfBePutLstFn = "be/putLst"
AristoApiProfBePutFqsFn = "be/putFqs"
AristoApiProfBePutEndFn = "be/putEnd"

AristoApiProfRef* = ref object of AristoApiRef
@@ -509,8 +471,6 @@ when AutoValidateApiHooks:
doAssert not api.hasPath.isNil
doAssert not api.hikeUp.isNil
doAssert not api.isTop.isNil
doAssert not api.journalGetFilter.isNil
doAssert not api.journalGetInx.isNil
doAssert not api.level.isNil
doAssert not api.nForked.isNil
doAssert not api.merge.isNil
@@ -564,8 +524,6 @@ func init*(api: var AristoApiObj) =
api.hasPath = hasPath
api.hikeUp = hikeUp
api.isTop = isTop
api.journalGetFilter = journalGetFilter
api.journalGetInx = journalGetInx
api.level = level
api.nForked = nForked
api.merge = merge
@@ -602,8 +560,6 @@ func dup*(api: AristoApiRef): AristoApiRef =
hasPath: api.hasPath,
hikeUp: api.hikeUp,
isTop: api.isTop,
journalGetFilter: api.journalGetFilter,
journalGetInx: api.journalGetInx,
level: api.level,
nForked: api.nForked,
merge: api.merge,
@@ -717,16 +673,6 @@ func init*(
AristoApiProfIsTopFn.profileRunner:
result = api.isTop(a)

profApi.journalGetFilter =
proc(a: BackendRef; b: int): auto =
AristoApiProfJournalGetFilterFn.profileRunner:
result = api.journalGetFilter(a, b)

profApi.journalGetInx =
proc(a: BackendRef; b: Option[FilterID]; c = false): auto =
AristoApiProfJournalGetInxFn.profileRunner:
result = api.journalGetInx(a, b, c)

profApi.level =
proc(a: AristoDbRef): auto =
AristoApiProfLevelFn.profileRunner:
@@ -754,7 +700,7 @@ func init*(
result = api.pathAsBlob(a)

profApi.persist =
proc(a: AristoDbRef; b = none(FilterID); c = false): auto =
proc(a: AristoDbRef; b = 0u64; c = false): auto =
AristoApiProfPersistFn.profileRunner:
result = api.persist(a, b, c)

@@ -810,12 +756,6 @@ func init*(
result = be.getKeyFn(a)
data.list[AristoApiProfBeGetKeyFn.ord].masked = true

beDup.getFilFn =
proc(a: QueueID): auto =
AristoApiProfBeGetFilFn.profileRunner:
result = be.getFilFn(a)
data.list[AristoApiProfBeGetFilFn.ord].masked = true

beDup.getIdgFn =
proc(): auto =
AristoApiProfBeGetIdgFn.profileRunner:
@@ -828,12 +768,6 @@
result = be.getLstFn()
data.list[AristoApiProfBeGetLstFn.ord].masked = true

beDup.getFqsFn =
proc(): auto =
AristoApiProfBeGetFqsFn.profileRunner:
result = be.getFqsFn()
data.list[AristoApiProfBeGetFqsFn.ord].masked = true

beDup.putVtxFn =
proc(a: PutHdlRef; b: openArray[(VertexID,VertexRef)]) =
AristoApiProfBePutVtxFn.profileRunner:
@@ -846,12 +780,6 @@
be.putKeyFn(a,b)
data.list[AristoApiProfBePutKeyFn.ord].masked = true

beDup.putFilFn =
proc(a: PutHdlRef; b: openArray[(QueueID,FilterRef)]) =
AristoApiProfBePutFilFn.profileRunner:
be.putFilFn(a,b)
data.list[AristoApiProfBePutFilFn.ord].masked = true

beDup.putIdgFn =
proc(a: PutHdlRef; b: openArray[VertexID]) =
AristoApiProfBePutIdgFn.profileRunner:
@@ -864,12 +792,6 @@
be.putLstFn(a,b)
data.list[AristoApiProfBePutLstFn.ord].masked = true

beDup.putFqsFn =
proc(a: PutHdlRef; b: openArray[(QueueID,QueueID)]) =
AristoApiProfBePutFqsFn.profileRunner:
be.putFqsFn(a,b)
data.list[AristoApiProfBePutFqsFn.ord].masked = true

beDup.putEndFn =
proc(a: PutHdlRef): auto =
AristoApiProfBePutEndFn.profileRunner: