feat: use napi-rs pubkey-index-map #7091

Merged
merged 4 commits into from
Sep 30, 2024

Conversation

twoeths
Contributor

@twoeths twoeths commented Sep 18, 2024

Motivation

  • reduce heap memory for Pubkey-Index map
  • no intermediate string creation when looking up the index of a Uint8Array pubkey

Description

  • consume the @chainsafe/pubkey-index-map napi-rs implementation

benchmark on Mac M1 is comparable to the current PubkeyIndexMap:

  get/set
    ✓ get values - 1000                                                    2808989 ops/s    356.0000 ns/op   x0.450    1313217 runs  0.707 s
    ✓ get values - naive - 1000                                            3802281 ops/s    263.0000 ns/op   x0.358    1423603 runs  0.606 s
    ✓ set values - 1000                                                    2932551 ops/s    341.0000 ns/op   x0.432    2900838 runs   1.51 s
    ✓ set values - naive - 1000                                            2136752 ops/s    468.0000 ns/op   x0.497     637433 runs  0.404 s
    ✓ get values - 1000000                                                948766.6 ops/s    1.054000 us/op   x0.568     407133 runs  0.505 s
    ✓ get values - naive - 1000000                                         1254705 ops/s    797.0000 ns/op   x0.680     627761 runs  0.606 s
    ✓ set values - 1000000                                                972762.6 ops/s    1.028000 us/op   x0.544     499266 runs  0.606 s
    ✓ set values - naive - 1000000                                        637755.1 ops/s    1.568000 us/op   x0.805     346993 runs  0.606 s

see also #7022 (comment)
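
The second motivation point can be pictured with a minimal sketch. It assumes a string-keyed map built on a plain JavaScript `Map<string, number>`; the class name `StringKeyedPubkeyIndexMap` and the helper `toHexKey` are hypothetical and are not the Lodestar or `@chainsafe/pubkey-index-map` API. In this shape, every lookup has to build a hex string from the bytes first, which is the intermediate allocation this PR aims to avoid.

```typescript
// Hypothetical helper: hex-encode the pubkey bytes so they can serve as a Map key.
function toHexKey(bytes: Uint8Array): string {
  let hex = "";
  bytes.forEach((byte) => {
    hex += (byte < 16 ? "0" : "") + byte.toString(16);
  });
  return hex;
}

// Hypothetical string-keyed map, for illustration only.
class StringKeyedPubkeyIndexMap {
  private readonly map = new Map<string, number>();

  set(pubkey: Uint8Array, index: number): void {
    this.map.set(toHexKey(pubkey), index);
  }

  // Each lookup allocates a temporary hex string before the Map is consulted.
  get(pubkey: Uint8Array): number | undefined {
    return this.map.get(toHexKey(pubkey));
  }
}

const pubkey = new Uint8Array(48).fill(0xab); // BLS pubkeys are 48 bytes long
const stringKeyed = new StringKeyedPubkeyIndexMap();
stringKeyed.set(pubkey, 7);

// A different Uint8Array instance with the same bytes resolves to the same index.
const found = stringKeyed.get(new Uint8Array(48).fill(0xab));
```

With a map that accepts the bytes directly, the `toHexKey` step and its short-lived strings would not be needed on the JavaScript side.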

Contributor

github-actions bot commented Sep 18, 2024

⚠️ Performance Alert ⚠️

Possible performance regression was detected for some benchmarks.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold.

| Benchmark suite | Current: db5d962 | Previous: cd98c23 | Ratio |
| --- | --- | --- | --- |
| getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 | 9.0531 ms/op | 2.7366 ms/op | 3.31 |
| Array.fill - length 1000000 | 8.3394 ms/op | 2.4610 ms/op | 3.39 |
| Array push - length 1000000 | 52.419 ms/op | 14.302 ms/op | 3.67 |
| phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 7.4445 ms/op | 1.4955 ms/op | 4.98 |
| Buffer.compare 123687377 | 12.803 ms/op | 3.7482 ms/op | 3.42 |
Full benchmark results
Benchmark suite Current: db5d962 Previous: cd98c23 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.0463 ms/op 1.8504 ms/op 1.11
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 57.846 us/op 38.162 us/op 1.52
BLS verify - blst 933.05 us/op 881.84 us/op 1.06
BLS verifyMultipleSignatures 3 - blst 1.2229 ms/op 1.2653 ms/op 0.97
BLS verifyMultipleSignatures 8 - blst 1.6690 ms/op 2.2057 ms/op 0.76
BLS verifyMultipleSignatures 32 - blst 5.0739 ms/op 4.4707 ms/op 1.13
BLS verifyMultipleSignatures 64 - blst 9.7137 ms/op 8.2785 ms/op 1.17
BLS verifyMultipleSignatures 128 - blst 17.747 ms/op 15.960 ms/op 1.11
BLS deserializing 10000 signatures 694.75 ms/op 628.39 ms/op 1.11
BLS deserializing 100000 signatures 7.1191 s/op 6.2640 s/op 1.14
BLS verifyMultipleSignatures - same message - 3 - blst 1.0087 ms/op 834.37 us/op 1.21
BLS verifyMultipleSignatures - same message - 8 - blst 1.2011 ms/op 1.1042 ms/op 1.09
BLS verifyMultipleSignatures - same message - 32 - blst 1.8084 ms/op 1.7002 ms/op 1.06
BLS verifyMultipleSignatures - same message - 64 - blst 2.9001 ms/op 2.5742 ms/op 1.13
BLS verifyMultipleSignatures - same message - 128 - blst 5.0871 ms/op 4.1876 ms/op 1.21
BLS aggregatePubkeys 32 - blst 20.911 us/op 18.278 us/op 1.14
BLS aggregatePubkeys 128 - blst 73.696 us/op 63.653 us/op 1.16
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 71.317 ms/op 45.506 ms/op 1.57
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 67.771 ms/op 38.289 ms/op 1.77
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 37.244 ms/op 30.478 ms/op 1.22
getSlashingsAndExits - default max 112.77 us/op 69.209 us/op 1.63
getSlashingsAndExits - 2k 355.98 us/op 236.00 us/op 1.51
proposeBlockBody type=full, size=empty 6.7436 ms/op 4.9830 ms/op 1.35
isKnown best case - 1 super set check 511.00 ns/op 487.00 ns/op 1.05
isKnown normal case - 2 super set checks 554.00 ns/op 468.00 ns/op 1.18
isKnown worse case - 16 super set checks 330.00 ns/op 473.00 ns/op 0.70
InMemoryCheckpointStateCache - add get delete 3.6640 us/op 2.6330 us/op 1.39
updateUnfinalizedPubkeys - updating 10 pubkeys 1.4196 ms/op 568.65 us/op 2.50
updateUnfinalizedPubkeys - updating 100 pubkeys 4.6168 ms/op 2.5670 ms/op 1.80
updateUnfinalizedPubkeys - updating 1000 pubkeys 56.658 ms/op 38.027 ms/op 1.49
validate api signedAggregateAndProof - struct 1.6231 ms/op 1.8025 ms/op 0.90
validate gossip signedAggregateAndProof - struct 1.6264 ms/op 1.9103 ms/op 0.85
validate gossip attestation - vc 640000 1.0784 ms/op 985.22 us/op 1.09
batch validate gossip attestation - vc 640000 - chunk 32 147.70 us/op 117.14 us/op 1.26
batch validate gossip attestation - vc 640000 - chunk 64 127.75 us/op 105.41 us/op 1.21
batch validate gossip attestation - vc 640000 - chunk 128 120.10 us/op 100.21 us/op 1.20
batch validate gossip attestation - vc 640000 - chunk 256 134.81 us/op 97.151 us/op 1.39
pickEth1Vote - no votes 1.1984 ms/op 930.13 us/op 1.29
pickEth1Vote - max votes 6.3301 ms/op 9.2127 ms/op 0.69
pickEth1Vote - Eth1Data hashTreeRoot value x2048 16.920 ms/op 18.737 ms/op 0.90
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 29.637 ms/op 26.059 ms/op 1.14
pickEth1Vote - Eth1Data fastSerialize value x2048 646.40 us/op 368.10 us/op 1.76
pickEth1Vote - Eth1Data fastSerialize tree x2048 5.0824 ms/op 4.2315 ms/op 1.20
bytes32 toHexString 857.00 ns/op 556.00 ns/op 1.54
bytes32 Buffer.toString(hex) 284.00 ns/op 418.00 ns/op 0.68
bytes32 Buffer.toString(hex) from Uint8Array 559.00 ns/op 522.00 ns/op 1.07
bytes32 Buffer.toString(hex) + 0x 278.00 ns/op 418.00 ns/op 0.67
Object access 1 prop 0.22200 ns/op 0.30800 ns/op 0.72
Map access 1 prop 0.14300 ns/op 0.30900 ns/op 0.46
Object get x1000 7.0240 ns/op 4.8930 ns/op 1.44
Map get x1000 6.7550 ns/op 5.5820 ns/op 1.21
Object set x1000 60.605 ns/op 26.472 ns/op 2.29
Map set x1000 40.056 ns/op 18.790 ns/op 2.13
Return object 10000 times 0.33480 ns/op 0.28140 ns/op 1.19
Throw Error 10000 times 3.7209 us/op 2.5401 us/op 1.46
toHex 194.69 ns/op 103.41 ns/op 1.88
Buffer.from 181.21 ns/op 93.817 ns/op 1.93
shared Buffer 114.59 ns/op 64.011 ns/op 1.79
fastMsgIdFn sha256 / 200 bytes 2.5290 us/op 1.9020 us/op 1.33
fastMsgIdFn h32 xxhash / 200 bytes 338.00 ns/op 387.00 ns/op 0.87
fastMsgIdFn h64 xxhash / 200 bytes 301.00 ns/op 447.00 ns/op 0.67
fastMsgIdFn sha256 / 1000 bytes 8.2650 us/op 5.8110 us/op 1.42
fastMsgIdFn h32 xxhash / 1000 bytes 454.00 ns/op 517.00 ns/op 0.88
fastMsgIdFn h64 xxhash / 1000 bytes 377.00 ns/op 498.00 ns/op 0.76
fastMsgIdFn sha256 / 10000 bytes 74.170 us/op 48.720 us/op 1.52
fastMsgIdFn h32 xxhash / 10000 bytes 2.1690 us/op 1.8810 us/op 1.15
fastMsgIdFn h64 xxhash / 10000 bytes 1.3720 us/op 1.3070 us/op 1.05
send data - 1000 256B messages 18.522 ms/op 10.639 ms/op 1.74
send data - 1000 512B messages 24.303 ms/op 12.658 ms/op 1.92
send data - 1000 1024B messages 40.143 ms/op 20.930 ms/op 1.92
send data - 1000 1200B messages 37.909 ms/op 12.899 ms/op 2.94
send data - 1000 2048B messages 42.692 ms/op 27.732 ms/op 1.54
send data - 1000 4096B messages 38.952 ms/op 24.031 ms/op 1.62
send data - 1000 16384B messages 91.603 ms/op 64.315 ms/op 1.42
send data - 1000 65536B messages 298.09 ms/op 247.31 ms/op 1.21
enrSubnets - fastDeserialize 64 bits 1.6420 us/op 1.2150 us/op 1.35
enrSubnets - ssz BitVector 64 bits 494.00 ns/op 512.00 ns/op 0.96
enrSubnets - fastDeserialize 4 bits 235.00 ns/op 323.00 ns/op 0.73
enrSubnets - ssz BitVector 4 bits 546.00 ns/op 499.00 ns/op 1.09
prioritizePeers score -10:0 att 32-0.1 sync 2-0 266.42 us/op 117.19 us/op 2.27
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 263.18 us/op 149.43 us/op 1.76
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 428.04 us/op 197.07 us/op 2.17
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 565.93 us/op 319.34 us/op 1.77
prioritizePeers score 0:0 att 64-1 sync 4-1 1.2163 ms/op 446.28 us/op 2.73
array of 16000 items push then shift 2.1771 us/op 1.2813 us/op 1.70
LinkedList of 16000 items push then shift 15.125 ns/op 6.5920 ns/op 2.29
array of 16000 items push then pop 173.61 ns/op 65.339 ns/op 2.66
LinkedList of 16000 items push then pop 11.423 ns/op 6.4510 ns/op 1.77
array of 24000 items push then shift 2.8511 us/op 1.8914 us/op 1.51
LinkedList of 24000 items push then shift 8.9930 ns/op 6.5360 ns/op 1.38
array of 24000 items push then pop 203.49 ns/op 136.73 ns/op 1.49
LinkedList of 24000 items push then pop 8.1660 ns/op 6.4370 ns/op 1.27
intersect bitArray bitLen 8 7.3370 ns/op 5.4530 ns/op 1.35
intersect array and set length 8 74.823 ns/op 40.600 ns/op 1.84
intersect bitArray bitLen 128 39.225 ns/op 26.640 ns/op 1.47
intersect array and set length 128 1.0329 us/op 591.97 ns/op 1.74
bitArray.getTrueBitIndexes() bitLen 128 2.5560 us/op 1.5280 us/op 1.67
bitArray.getTrueBitIndexes() bitLen 248 4.9060 us/op 3.0990 us/op 1.58
bitArray.getTrueBitIndexes() bitLen 512 9.2320 us/op 7.1400 us/op 1.29
Buffer.concat 32 items 1.2150 us/op 1.0190 us/op 1.19
Uint8Array.set 32 items 1.8610 us/op 2.2020 us/op 0.85
Buffer.copy 2.1320 us/op 2.3710 us/op 0.90
Uint8Array.set - with subarray 4.0420 us/op 2.9220 us/op 1.38
Uint8Array.set - without subarray 1.9990 us/op 2.1660 us/op 0.92
getUint32 - dataview 347.00 ns/op 400.00 ns/op 0.87
getUint32 - manual 234.00 ns/op 338.00 ns/op 0.69
Set add up to 64 items then delete first 2.4349 us/op 1.8062 us/op 1.35
OrderedSet add up to 64 items then delete first 4.0775 us/op 2.7996 us/op 1.46
Set add up to 64 items then delete last 2.8734 us/op 2.0487 us/op 1.40
OrderedSet add up to 64 items then delete last 5.4998 us/op 3.1410 us/op 1.75
Set add up to 64 items then delete middle 3.8992 us/op 2.0586 us/op 1.89
OrderedSet add up to 64 items then delete middle 7.1664 us/op 4.4819 us/op 1.60
Set add up to 128 items then delete first 7.4846 us/op 4.0249 us/op 1.86
OrderedSet add up to 128 items then delete first 10.967 us/op 6.3124 us/op 1.74
Set add up to 128 items then delete last 7.4172 us/op 3.8805 us/op 1.91
OrderedSet add up to 128 items then delete last 11.486 us/op 5.9322 us/op 1.94
Set add up to 128 items then delete middle 7.3934 us/op 3.8867 us/op 1.90
OrderedSet add up to 128 items then delete middle 18.831 us/op 11.930 us/op 1.58
Set add up to 256 items then delete first 14.938 us/op 7.8725 us/op 1.90
OrderedSet add up to 256 items then delete first 24.918 us/op 12.624 us/op 1.97
Set add up to 256 items then delete last 16.327 us/op 7.6494 us/op 2.13
OrderedSet add up to 256 items then delete last 23.214 us/op 11.767 us/op 1.97
Set add up to 256 items then delete middle 14.756 us/op 7.6054 us/op 1.94
OrderedSet add up to 256 items then delete middle 54.529 us/op 34.698 us/op 1.57
transfer serialized Status (84 B) 1.5460 us/op 1.3660 us/op 1.13
copy serialized Status (84 B) 1.4070 us/op 1.2750 us/op 1.10
transfer serialized SignedVoluntaryExit (112 B) 1.6190 us/op 1.6670 us/op 0.97
copy serialized SignedVoluntaryExit (112 B) 1.5960 us/op 1.3480 us/op 1.18
transfer serialized ProposerSlashing (416 B) 2.0110 us/op 2.0090 us/op 1.00
copy serialized ProposerSlashing (416 B) 2.0600 us/op 1.9000 us/op 1.08
transfer serialized Attestation (485 B) 2.0250 us/op 2.0880 us/op 0.97
copy serialized Attestation (485 B) 2.0320 us/op 2.0440 us/op 0.99
transfer serialized AttesterSlashing (33232 B) 2.0820 us/op 2.1590 us/op 0.96
copy serialized AttesterSlashing (33232 B) 10.093 us/op 4.5730 us/op 2.21
transfer serialized Small SignedBeaconBlock (128000 B) 3.8900 us/op 3.1140 us/op 1.25
copy serialized Small SignedBeaconBlock (128000 B) 30.560 us/op 11.335 us/op 2.70
transfer serialized Avg SignedBeaconBlock (200000 B) 4.0280 us/op 3.8590 us/op 1.04
copy serialized Avg SignedBeaconBlock (200000 B) 42.904 us/op 16.335 us/op 2.63
transfer serialized BlobsSidecar (524380 B) 5.0530 us/op 3.8800 us/op 1.30
copy serialized BlobsSidecar (524380 B) 129.11 us/op 81.961 us/op 1.58
transfer serialized Big SignedBeaconBlock (1000000 B) 4.2910 us/op 3.4240 us/op 1.25
copy serialized Big SignedBeaconBlock (1000000 B) 208.43 us/op 366.92 us/op 0.57
pass gossip attestations to forkchoice per slot 2.9794 ms/op 2.8942 ms/op 1.03
forkChoice updateHead vc 100000 bc 64 eq 0 579.69 us/op 613.26 us/op 0.95
forkChoice updateHead vc 600000 bc 64 eq 0 3.8773 ms/op 2.5123 ms/op 1.54
forkChoice updateHead vc 1000000 bc 64 eq 0 5.8805 ms/op 4.1537 ms/op 1.42
forkChoice updateHead vc 600000 bc 320 eq 0 3.3787 ms/op 2.5116 ms/op 1.35
forkChoice updateHead vc 600000 bc 1200 eq 0 3.4155 ms/op 2.6132 ms/op 1.31
forkChoice updateHead vc 600000 bc 7200 eq 0 4.7502 ms/op 3.0073 ms/op 1.58
forkChoice updateHead vc 600000 bc 64 eq 1000 11.446 ms/op 9.9439 ms/op 1.15
forkChoice updateHead vc 600000 bc 64 eq 10000 12.012 ms/op 9.5763 ms/op 1.25
forkChoice updateHead vc 600000 bc 64 eq 300000 18.114 ms/op 11.895 ms/op 1.52
computeDeltas 500000 validators 300 proto nodes 3.9334 ms/op 2.9949 ms/op 1.31
computeDeltas 500000 validators 1200 proto nodes 4.1718 ms/op 2.8998 ms/op 1.44
computeDeltas 500000 validators 7200 proto nodes 4.7012 ms/op 2.9031 ms/op 1.62
computeDeltas 750000 validators 300 proto nodes 6.7033 ms/op 4.2027 ms/op 1.59
computeDeltas 750000 validators 1200 proto nodes 7.2961 ms/op 4.2632 ms/op 1.71
computeDeltas 750000 validators 7200 proto nodes 6.6749 ms/op 4.2885 ms/op 1.56
computeDeltas 1400000 validators 300 proto nodes 13.743 ms/op 8.3256 ms/op 1.65
computeDeltas 1400000 validators 1200 proto nodes 11.964 ms/op 8.3951 ms/op 1.43
computeDeltas 1400000 validators 7200 proto nodes 11.602 ms/op 8.2188 ms/op 1.41
computeDeltas 2100000 validators 300 proto nodes 19.614 ms/op 12.432 ms/op 1.58
computeDeltas 2100000 validators 1200 proto nodes 18.219 ms/op 12.772 ms/op 1.43
computeDeltas 2100000 validators 7200 proto nodes 18.552 ms/op 12.687 ms/op 1.46
altair processAttestation - 250000 vs - 7PWei normalcase 3.0122 ms/op 2.4128 ms/op 1.25
altair processAttestation - 250000 vs - 7PWei worstcase 3.8183 ms/op 2.1688 ms/op 1.76
altair processAttestation - setStatus - 1/6 committees join 144.70 us/op 71.702 us/op 2.02
altair processAttestation - setStatus - 1/3 committees join 259.79 us/op 295.54 us/op 0.88
altair processAttestation - setStatus - 1/2 committees join 305.52 us/op 180.42 us/op 1.69
altair processAttestation - setStatus - 2/3 committees join 398.43 us/op 383.77 us/op 1.04
altair processAttestation - setStatus - 4/5 committees join 580.21 us/op 377.16 us/op 1.54
altair processAttestation - setStatus - 100% committees join 677.12 us/op 461.75 us/op 1.47
altair processBlock - 250000 vs - 7PWei normalcase 9.1927 ms/op 4.6896 ms/op 1.96
altair processBlock - 250000 vs - 7PWei normalcase hashState 30.536 ms/op 26.495 ms/op 1.15
altair processBlock - 250000 vs - 7PWei worstcase 42.396 ms/op 33.838 ms/op 1.25
altair processBlock - 250000 vs - 7PWei worstcase hashState 107.53 ms/op 64.807 ms/op 1.66
phase0 processBlock - 250000 vs - 7PWei normalcase 3.4799 ms/op 1.7563 ms/op 1.98
phase0 processBlock - 250000 vs - 7PWei worstcase 29.460 ms/op 21.313 ms/op 1.38
altair processEth1Data - 250000 vs - 7PWei normalcase 641.65 us/op 233.09 us/op 2.75
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 9.8280 us/op 4.4050 us/op 2.23
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 50.453 us/op 28.457 us/op 1.77
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 15.399 us/op 7.7580 us/op 1.98
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 12.190 us/op 5.2310 us/op 2.33
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 226.31 us/op 124.94 us/op 1.81
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.6678 ms/op 854.68 us/op 1.95
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 2.4897 ms/op 1.0833 ms/op 2.30
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 2.5073 ms/op 1.1467 ms/op 2.19
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 6.6275 ms/op 2.7043 ms/op 2.45
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 3.1488 ms/op 1.1882 ms/op 2.65
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 9.0531 ms/op 2.7366 ms/op 3.31
Tree 40 250000 create 789.25 ms/op 176.86 ms/op 4.46
Tree 40 250000 get(125000) 246.75 ns/op 106.59 ns/op 2.31
Tree 40 250000 set(125000) 3.0896 us/op 512.85 ns/op 6.02
Tree 40 250000 toArray() 38.438 ms/op 11.794 ms/op 3.26
Tree 40 250000 iterate all - toArray() + loop 35.901 ms/op 12.681 ms/op 2.83
Tree 40 250000 iterate all - get(i) 87.012 ms/op 40.093 ms/op 2.17
Array 250000 create 5.9009 ms/op 2.5364 ms/op 2.33
Array 250000 clone - spread 6.0579 ms/op 1.2515 ms/op 4.84
Array 250000 get(125000) 0.61100 ns/op 0.56300 ns/op 1.09
Array 250000 set(125000) 0.73300 ns/op 0.60700 ns/op 1.21
Array 250000 iterate all - loop 107.49 us/op 77.053 us/op 1.40
phase0 afterProcessEpoch - 250000 vs - 7PWei 129.24 ms/op 74.660 ms/op 1.73
Array.fill - length 1000000 8.3394 ms/op 2.4610 ms/op 3.39
Array push - length 1000000 52.419 ms/op 14.302 ms/op 3.67
Array.get 0.38856 ns/op 0.25061 ns/op 1.55
Uint8Array.get 0.49877 ns/op 0.33016 ns/op 1.51
phase0 beforeProcessEpoch - 250000 vs - 7PWei 26.913 ms/op 16.156 ms/op 1.67
altair processEpoch - mainnet_e81889 440.49 ms/op 282.40 ms/op 1.56
mainnet_e81889 - altair beforeProcessEpoch 27.008 ms/op 18.445 ms/op 1.46
mainnet_e81889 - altair processJustificationAndFinalization 20.776 us/op 9.7680 us/op 2.13
mainnet_e81889 - altair processInactivityUpdates 9.1003 ms/op 4.1268 ms/op 2.21
mainnet_e81889 - altair processRewardsAndPenalties 60.565 ms/op 46.934 ms/op 1.29
mainnet_e81889 - altair processRegistryUpdates 3.7280 us/op 1.9450 us/op 1.92
mainnet_e81889 - altair processSlashings 1.1940 us/op 737.00 ns/op 1.62
mainnet_e81889 - altair processEth1DataReset 634.00 ns/op 684.00 ns/op 0.93
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.0765 ms/op 992.58 us/op 2.09
mainnet_e81889 - altair processSlashingsReset 7.8820 us/op 2.1260 us/op 3.71
mainnet_e81889 - altair processRandaoMixesReset 8.2580 us/op 2.5490 us/op 3.24
mainnet_e81889 - altair processHistoricalRootsUpdate 1.2110 us/op 664.00 ns/op 1.82
mainnet_e81889 - altair processParticipationFlagUpdates 6.8340 us/op 1.5560 us/op 4.39
mainnet_e81889 - altair processSyncCommitteeUpdates 1.0150 us/op 651.00 ns/op 1.56
mainnet_e81889 - altair afterProcessEpoch 111.24 ms/op 72.806 ms/op 1.53
capella processEpoch - mainnet_e217614 1.6894 s/op 1.1971 s/op 1.41
mainnet_e217614 - capella beforeProcessEpoch 131.07 ms/op 64.615 ms/op 2.03
mainnet_e217614 - capella processJustificationAndFinalization 32.572 us/op 12.091 us/op 2.69
mainnet_e217614 - capella processInactivityUpdates 23.018 ms/op 11.824 ms/op 1.95
mainnet_e217614 - capella processRewardsAndPenalties 339.30 ms/op 258.78 ms/op 1.31
mainnet_e217614 - capella processRegistryUpdates 23.766 us/op 11.294 us/op 2.10
mainnet_e217614 - capella processSlashings 1.0620 us/op 779.00 ns/op 1.36
mainnet_e217614 - capella processEth1DataReset 794.00 ns/op 753.00 ns/op 1.05
mainnet_e217614 - capella processEffectiveBalanceUpdates 22.079 ms/op 3.2673 ms/op 6.76
mainnet_e217614 - capella processSlashingsReset 5.9620 us/op 1.4090 us/op 4.23
mainnet_e217614 - capella processRandaoMixesReset 8.4240 us/op 3.0300 us/op 2.78
mainnet_e217614 - capella processHistoricalRootsUpdate 2.0750 us/op 698.00 ns/op 2.97
mainnet_e217614 - capella processParticipationFlagUpdates 4.1860 us/op 1.7280 us/op 2.42
mainnet_e217614 - capella afterProcessEpoch 329.58 ms/op 187.65 ms/op 1.76
phase0 processEpoch - mainnet_e58758 513.25 ms/op 339.02 ms/op 1.51
mainnet_e58758 - phase0 beforeProcessEpoch 138.59 ms/op 86.057 ms/op 1.61
mainnet_e58758 - phase0 processJustificationAndFinalization 33.405 us/op 11.171 us/op 2.99
mainnet_e58758 - phase0 processRewardsAndPenalties 53.938 ms/op 36.631 ms/op 1.47
mainnet_e58758 - phase0 processRegistryUpdates 18.338 us/op 6.1400 us/op 2.99
mainnet_e58758 - phase0 processSlashings 1.0240 us/op 763.00 ns/op 1.34
mainnet_e58758 - phase0 processEth1DataReset 1.0750 us/op 718.00 ns/op 1.50
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 4.9836 ms/op 1.3628 ms/op 3.66
mainnet_e58758 - phase0 processSlashingsReset 8.8970 us/op 2.8190 us/op 3.16
mainnet_e58758 - phase0 processRandaoMixesReset 10.465 us/op 2.9030 us/op 3.60
mainnet_e58758 - phase0 processHistoricalRootsUpdate 814.00 ns/op 654.00 ns/op 1.24
mainnet_e58758 - phase0 processParticipationRecordUpdates 7.0880 us/op 2.7200 us/op 2.61
mainnet_e58758 - phase0 afterProcessEpoch 92.413 ms/op 61.266 ms/op 1.51
phase0 processEffectiveBalanceUpdates - 250000 normalcase 2.2452 ms/op 937.30 us/op 2.40
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 7.4445 ms/op 1.4955 ms/op 4.98
altair processInactivityUpdates - 250000 normalcase 20.285 ms/op 16.347 ms/op 1.24
altair processInactivityUpdates - 250000 worstcase 18.877 ms/op 15.919 ms/op 1.19
phase0 processRegistryUpdates - 250000 normalcase 12.683 us/op 5.1620 us/op 2.46
phase0 processRegistryUpdates - 250000 badcase_full_deposits 314.04 us/op 289.32 us/op 1.09
phase0 processRegistryUpdates - 250000 worstcase 0.5 139.80 ms/op 107.16 ms/op 1.30
altair processRewardsAndPenalties - 250000 normalcase 40.815 ms/op 43.265 ms/op 0.94
altair processRewardsAndPenalties - 250000 worstcase 43.584 ms/op 36.257 ms/op 1.20
phase0 getAttestationDeltas - 250000 normalcase 9.3365 ms/op 5.8463 ms/op 1.60
phase0 getAttestationDeltas - 250000 worstcase 8.0951 ms/op 6.3152 ms/op 1.28
phase0 processSlashings - 250000 worstcase 112.07 us/op 80.421 us/op 1.39
altair processSyncCommitteeUpdates - 250000 139.00 ms/op 100.38 ms/op 1.38
BeaconState.hashTreeRoot - No change 235.00 ns/op 448.00 ns/op 0.52
BeaconState.hashTreeRoot - 1 full validator 108.76 us/op 143.56 us/op 0.76
BeaconState.hashTreeRoot - 32 full validator 1.7075 ms/op 1.4851 ms/op 1.15
BeaconState.hashTreeRoot - 512 full validator 18.744 ms/op 15.293 ms/op 1.23
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 177.73 us/op 140.58 us/op 1.26
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.2769 ms/op 1.8417 ms/op 1.24
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 32.727 ms/op 24.501 ms/op 1.34
BeaconState.hashTreeRoot - 1 balances 142.68 us/op 103.70 us/op 1.38
BeaconState.hashTreeRoot - 32 balances 1.4604 ms/op 975.06 us/op 1.50
BeaconState.hashTreeRoot - 512 balances 10.262 ms/op 9.9586 ms/op 1.03
BeaconState.hashTreeRoot - 250000 balances 236.26 ms/op 170.44 ms/op 1.39
aggregationBits - 2048 els - zipIndexesInBitList 37.502 us/op 40.991 us/op 0.91
byteArrayEquals 32 58.588 ns/op 45.537 ns/op 1.29
Buffer.compare 32 19.772 ns/op 14.669 ns/op 1.35
byteArrayEquals 1024 1.7401 us/op 1.1987 us/op 1.45
Buffer.compare 1024 27.586 ns/op 22.981 ns/op 1.20
byteArrayEquals 16384 27.053 us/op 19.059 us/op 1.42
Buffer.compare 16384 219.56 ns/op 186.51 ns/op 1.18
byteArrayEquals 123687377 207.05 ms/op 144.50 ms/op 1.43
Buffer.compare 123687377 12.803 ms/op 3.7482 ms/op 3.42
byteArrayEquals 32 - diff last byte 59.357 ns/op 45.659 ns/op 1.30
Buffer.compare 32 - diff last byte 22.456 ns/op 14.990 ns/op 1.50
byteArrayEquals 1024 - diff last byte 1.7447 us/op 1.2008 us/op 1.45
Buffer.compare 1024 - diff last byte 33.399 ns/op 21.624 ns/op 1.54
byteArrayEquals 16384 - diff last byte 27.295 us/op 19.065 us/op 1.43
Buffer.compare 16384 - diff last byte 298.87 ns/op 166.93 ns/op 1.79
byteArrayEquals 123687377 - diff last byte 202.91 ms/op 144.16 ms/op 1.41
Buffer.compare 123687377 - diff last byte 8.8015 ms/op 5.2920 ms/op 1.66
byteArrayEquals 32 - random bytes 5.9860 ns/op 4.7280 ns/op 1.27
Buffer.compare 32 - random bytes 29.824 ns/op 15.032 ns/op 1.98
byteArrayEquals 1024 - random bytes 6.7850 ns/op 4.7320 ns/op 1.43
Buffer.compare 1024 - random bytes 24.598 ns/op 14.850 ns/op 1.66
byteArrayEquals 16384 - random bytes 6.3050 ns/op 4.7650 ns/op 1.32
Buffer.compare 16384 - random bytes 18.801 ns/op 14.833 ns/op 1.27
byteArrayEquals 123687377 - random bytes 7.8300 ns/op 7.5300 ns/op 1.04
Buffer.compare 123687377 - random bytes 23.560 ns/op 17.610 ns/op 1.34
regular array get 100000 times 46.806 us/op 29.502 us/op 1.59
wrappedArray get 100000 times 43.426 us/op 29.508 us/op 1.47
arrayWithProxy get 100000 times 14.334 ms/op 10.295 ms/op 1.39
ssz.Root.equals 48.552 ns/op 43.730 ns/op 1.11
byteArrayEquals 48.616 ns/op 43.341 ns/op 1.12
Buffer.compare 11.225 ns/op 9.1030 ns/op 1.23
shuffle list - 16384 els 7.0266 ms/op 5.4516 ms/op 1.29
shuffle list - 250000 els 100.26 ms/op 80.723 ms/op 1.24
processSlot - 1 slots 14.354 us/op 13.893 us/op 1.03
processSlot - 32 slots 3.4117 ms/op 2.7640 ms/op 1.23
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 45.131 ms/op 43.182 ms/op 1.05
getCommitteeAssignments - req 1 vs - 250000 vc 2.2412 ms/op 1.8163 ms/op 1.23
getCommitteeAssignments - req 100 vs - 250000 vc 4.3563 ms/op 3.5448 ms/op 1.23
getCommitteeAssignments - req 1000 vs - 250000 vc 4.7134 ms/op 3.8255 ms/op 1.23
findModifiedValidators - 10000 modified validators 313.38 ms/op 231.90 ms/op 1.35
findModifiedValidators - 1000 modified validators 223.83 ms/op 153.61 ms/op 1.46
findModifiedValidators - 100 modified validators 202.98 ms/op 144.33 ms/op 1.41
findModifiedValidators - 10 modified validators 238.33 ms/op 138.04 ms/op 1.73
findModifiedValidators - 1 modified validators 298.90 ms/op 129.35 ms/op 2.31
findModifiedValidators - no difference 281.21 ms/op 124.53 ms/op 2.26
compare ViewDUs 3.8588 s/op 3.2207 s/op 1.20
compare each validator Uint8Array 1.8521 s/op 1.6729 s/op 1.11
compare ViewDU to Uint8Array 1.5219 s/op 679.23 ms/op 2.24
migrate state 1000000 validators, 24 modified, 0 new 1.2387 s/op 834.81 ms/op 1.48
migrate state 1000000 validators, 1700 modified, 1000 new 1.3509 s/op 1.0836 s/op 1.25
migrate state 1000000 validators, 3400 modified, 2000 new 1.7839 s/op 1.2836 s/op 1.39
migrate state 1500000 validators, 24 modified, 0 new 1.2185 s/op 853.07 ms/op 1.43
migrate state 1500000 validators, 1700 modified, 1000 new 1.6077 s/op 1.0885 s/op 1.48
migrate state 1500000 validators, 3400 modified, 2000 new 2.1227 s/op 1.2930 s/op 1.64
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.6000 ns/op 5.9300 ns/op 0.94
state getBlockRootAtSlot - 250000 vs - 7PWei 745.66 ns/op 969.78 ns/op 0.77
computeProposers - vc 250000 9.5096 ms/op 6.1903 ms/op 1.54
computeEpochShuffling - vc 250000 130.09 ms/op 75.646 ms/op 1.72
getNextSyncCommittee - vc 250000 177.92 ms/op 102.17 ms/op 1.74
computeSigningRoot for AttestationData 30.781 us/op 22.530 us/op 1.37
hash AttestationData serialized data then Buffer.toString(base64) 2.1211 us/op 1.1187 us/op 1.90
toHexString serialized data 1.8387 us/op 713.84 ns/op 2.58
Buffer.toString(base64) 260.80 ns/op 131.73 ns/op 1.98
nodejs block root to RootHex using toHex 231.73 ns/op 103.20 ns/op 2.25
nodejs block root to RootHex using toRootHex 150.02 ns/op 68.267 ns/op 2.20
browser block root to RootHex using the deprecated toHexString 460.36 ns/op 194.57 ns/op 2.37
browser block root to RootHex using toHex 326.74 ns/op 157.38 ns/op 2.08
browser block root to RootHex using toRootHex 301.26 ns/op 137.96 ns/op 2.18

by benchmarkbot/action


codecov bot commented Sep 23, 2024

Codecov Report

Attention: Patch coverage is 46.15385% with 7 lines in your changes missing coverage. Please review.

Project coverage is 50.82%. Comparing base (cd98c23) to head (0edc38f).
Report is 13 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #7091      +/-   ##
============================================
- Coverage     50.84%   50.82%   -0.02%     
============================================
  Files           597      597              
  Lines         39835    39827       -8     
  Branches       2069     2060       -9     
============================================
- Hits          20256    20244      -12     
- Misses        19579    19583       +4     

@twoeths
Contributor Author

twoeths commented Sep 24, 2024

GC has been reduced with this PR. I guess that's because we don't have to convert to string for the map, thanks to the default trait implementation in Rust. This is amazing @wemeetagain

this is on a mainnet node

Screenshot 2024-09-24 at 14 17 49

@twoeths twoeths marked this pull request as ready for review September 24, 2024 07:18
@twoeths twoeths requested a review from a team as a code owner September 24, 2024 07:18
@@ -187,7 +188,7 @@ export function getStateValidatorIndex(

 // typeof id === Uint8Array
 const validatorIndex = pubkey2index.get(id);
-if (validatorIndex === undefined) {
+if (validatorIndex === null) {
Member

It's a little annoying that the return type is changed from undefined to null. Is it worth changing that?

Contributor Author

this is generated by napi-rs so I just follow the interface there https://github.com/ChainSafe/pubkey-index-map/blob/b2f2ba42850aaf131559c8790174260e996b475a/index.d.ts#L9

we can have a proxy map to return undefined instead of null to have less code change here, but I think doing that would make it a bit hard to understand in the future for the maintainer

Member

Ok, let's just keep it as is, returning number | null

Member

I also tend to like null to differentiate "intended non-result" vs "no value set" which could be a lot of other things
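
For reference, the proxy idea mentioned in this thread (and not adopted) could look roughly like the sketch below. `NullableIndexMap` and `UndefinedReturningMap` are hypothetical names, and `FakeNativeMap` is a stand-in used only so the example runs without the real package. The only detail taken from the thread is that the generated `get` returns `number | null`.

```typescript
// Assumed shape: a lookup that returns `number | null`, as described in the thread.
interface NullableIndexMap {
  get(pubkey: Uint8Array): number | null;
}

// Hypothetical wrapper that maps `null` back to `undefined` for callers
// that still compare against `undefined`.
class UndefinedReturningMap {
  constructor(private readonly inner: NullableIndexMap) {}

  get(pubkey: Uint8Array): number | undefined {
    // `??` only replaces null/undefined, so a valid index of 0 is preserved.
    return this.inner.get(pubkey) ?? undefined;
  }
}

// Stand-in for the native map, only so the sketch is self-contained.
class FakeNativeMap implements NullableIndexMap {
  private readonly entries = new Map<string, number>();

  set(pubkey: Uint8Array, index: number): void {
    this.entries.set(pubkey.join(","), index);
  }

  get(pubkey: Uint8Array): number | null {
    return this.entries.get(pubkey.join(",")) ?? null;
  }
}

const native = new FakeNativeMap();
native.set(new Uint8Array([1, 2, 3]), 0);

const wrapped = new UndefinedReturningMap(native);
const hit = wrapped.get(new Uint8Array([1, 2, 3]));
const miss = wrapped.get(new Uint8Array([9, 9, 9]));
const rawMiss = native.get(new Uint8Array([9, 9, 9]));
```

The thread settled on using `number | null` directly, so call sites compare against `null` rather than going through a wrapper like this.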

@wemeetagain
Member

Only thing we need to ensure is that @chainsafe/pubkey-index-map supports all the necessary platforms we want to support

@twoeths
Contributor Author

twoeths commented Sep 27, 2024

> Only thing we need to ensure is that @chainsafe/pubkey-index-map supports all the necessary platforms we want to support

my understanding is that @chainsafe/pubkey-index-map does not have native dependencies like @chainsafe/blst or @chainsafe/hashtree, so perhaps we're safe

would like @matthewkeil to confirm this; this is the same situation as the napi-rs work we're gonna do for epoch shuffling computation

@matthewkeil
Member

> > Only thing we need to ensure is that @chainsafe/pubkey-index-map supports all the necessary platforms we want to support
>
> my understanding is that @chainsafe/pubkey-index-map does not have native dependencies like @chainsafe/blst or @chainsafe/hashtree, so perhaps we're safe
>
> would like @matthewkeil to confirm this; this is the same situation as the napi-rs work we're gonna do for epoch shuffling computation

Platforms look ok. I would think we do not want to support musl because of the performance aspect, but I'm not sure how important that is relative to having "support for everything"... 🤷‍♂️ I think it's ok as is though.

Member

@matthewkeil matthewkeil left a comment

LGTM 🎸

@@ -164,7 +165,7 @@ export function getBeaconStateApi({
 }
 balances.push({index: id, balance: state.balances.get(id)});
 } else {
-const index = headState.epochCtx.pubkey2index.get(id);
+const index = headState.epochCtx.pubkey2index.get(fromHex(id));
Member

so nice that we don't need to stringify anymore!!! Rust for the win!
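
A rough sketch of what the changed line implies: when the API receives a hex-encoded pubkey, it is decoded to bytes once and the bytes are used as the lookup key. The `hexToBytes` function below is a hypothetical stand-in for `fromHex` (whose exact behavior is not shown in this thread), and `resolveIndex` with its `lookup` callback is illustrative rather than the real map.

```typescript
// Hypothetical stand-in for `fromHex`: decode a hex string (optional 0x prefix) to bytes.
function hexToBytes(hex: string): Uint8Array {
  const clean = hex.startsWith("0x") ? hex.slice(2) : hex;
  if (clean.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(clean)) {
    throw new Error(`Invalid hex string: ${hex}`);
  }
  const bytes = new Uint8Array(clean.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(clean.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Illustrative call site: the id arrives as a hex string, the lookup wants bytes.
function resolveIndex(
  id: string,
  lookup: (pubkey: Uint8Array) => number | null
): number | null {
  return lookup(hexToBytes(id));
}

const decoded = hexToBytes("0x0aff10");

// Toy lookup that recognizes a single 3-byte key, just to exercise the call path.
const index = resolveIndex("0x0aff10", (pk) =>
  pk.length === 3 && pk[0] === 0x0a ? 42 : null
);
```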

@@ -1,10 +1,11 @@
 import {itBench, setBenchOpts} from "@dapplion/benchmark";
-import {Map} from "immutable";
+import {Map as ImmutableMap} from "immutable";
Member

This is a good change. We should not shadow namespaces

@wemeetagain wemeetagain merged commit fe7e21b into unstable Sep 30, 2024
18 of 20 checks passed
@wemeetagain wemeetagain deleted the te/pubkey_index_map branch September 30, 2024 16:09
@twoeths
Contributor Author

twoeths commented Oct 1, 2024

metrics after ~9h of deployment on the unstable mainnet node

Screenshot 2024-10-01 at 09 18 35

both heap and gc are improved a little bit

philknows pushed a commit that referenced this pull request Oct 18, 2024
* feat: use napi-rs pubkey-index-map

* fix: export toMemoryEfficientHexStr()

* fix: state-transition CachedBeaconState unit test

* chore: remove commented code