
Deploy two hosts for benchmarking nimbus-eth1 import #194

Closed
jakubgs opened this issue Aug 20, 2024 · 21 comments
@jakubgs
Member

jakubgs commented Aug 20, 2024

The development of nimbus-eth1 is ramping up, and for that reason we will need to benchmark the process of importing the network state data, with validation, from ERA files. This process is not yet optimized, as it is in its early stages of development, which means a full import of mainnet would probably take more than a week. Despite that, we need to start measuring results in order to track progress in optimizing the import process.

This benchmarking will require two kinds of test on two hosts:

  • A long running test that lasts a week.
  • A "short" running test that lasts 24 hours.

Neither of those will finish, so both will have to be aborted; the number of blocks they manage to sync will be the measure of performance. These performance reports will have to be archived in some way; the simplest way would be to commit them to a dedicated repository. In addition to the reports gathered this way, the import process exposes a /metrics endpoint which we can scrape with Prometheus.

The two hosts can be purchased from Hetzner, since they will not be handling external connections. The storage will need to be at least 2x the size of the Mainnet ERA and ERA1 files, which is currently ~1 TB, so an additional 2 TB NVMe would suffice. Aside from that, more than 16 GB of RAM and 4 cores is enough.

Update as of 28 Oct:

The short test must begin with a template DB which contains the first 20M blocks, since measuring the import process from those blocks onward is what matters to the Nimbus team. Jacek will provide this template DB.

The long test will begin with no template DB and is also an import-only test; it usually takes around a week.

The goal is to measure time taken to complete import in both cases.

@siddarthkay
Contributor

How urgently do we need the 2 hosts?
I could either get a cheaper host from auction via Hetzner

Screenshot 2024-08-22 at 5 27 30 PM

OR

I could get a dedicated host which would be comparatively more expensive
example : https://www.hetzner.com/dedicated-rootserver/ax52/

Screenshot 2024-08-22 at 5 29 36 PM

My assumption is that the host from Auction might take longer to get compared to the dedicated one.

@siddarthkay
Contributor

After discussing with @jakubgs, I finally went ahead with the following

2 x Dedicated Server AX42
* Location: Finland, HEL1
* For Finland, HEL1, support is only available in English.
* Rescue system (English)
* 1 x Primary IPv4
* 1 x 2 TB NVMe SSD
* 8 Core CPU
* 64 GB DDR5 ECC RAM

Order Details :

Screenshot_2024-08-22_at_5 55 10_PM

Possible wait times :

Screenshot_2024-08-22_at_5 54 55_PM

@siddarthkay
Contributor

These 2 AX42 hosts have been activated by Hetzner and currently boot into the rescue system.
I'll bootstrap these hosts and add them to our inventory.

siddarthkay added a commit that referenced this issue Sep 4, 2024
This commit adds 2 hetzner AX42 hosts for eth1 benchmarking to our network.

related issue: #194
siddarthkay added a commit that referenced this issue Sep 5, 2024
This commit adds 2 hetzner AX42 hosts for eth1 benchmarking to our network.

related issue: #194
siddarthkay added a commit that referenced this issue Sep 5, 2024
This commit adds 2 hetzner AX42 hosts for eth1 benchmarking to our network.

related issue: #194
@siddarthkay
Contributor

siddarthkay commented Oct 23, 2024

I will use bench-01 for the short 24-hour test and bench-02 for the long-running 1-week test.

Next steps are as follows :

  • Get ERA1 files from nel-01.ih-eu-mda1.nimbus.mainnet and move them to bech hosts.
 [email protected]:~ % sudo du -hsc /docker/era1
 428G    /docker/era1
 428G    total
  • Get ERA files from nel-01.ih-eu-mda1.nimbus.mainnet and move them to bech hosts.
  • Get nimbus-eth1 running on bench-01.he-eu-hel1.nimbus.eth1 and bench-02.he-eu-hel1.nimbus.eth1, make sure the node is running as expected, and set nimbus_eth1_network to mainnet.
  • On bench-01, set up the template DB which contains the first 20M blocks.
  • On bench-01, set up a systemd timer to trigger an import of the state by passing --era1-dir and --era-dir, and log the time taken for the short sync to complete. The process should be terminated if it runs longer than 24 hours, and on termination we log the progress percentage of the import.
  • The terminating script on bench-01 should also replace the existing DB with the "template DB" so that when the short test is run again, it starts from the 20M-block state.
  • On bench-02, set up a systemd timer to trigger an "import" of the state by passing --era1-dir and --era-dir, and log the time taken for the long sync to complete. The process should be terminated if it runs longer than 1 week, and on termination we log the progress percentage of the import.
  • The terminating script on bench-02 should also clean up the existing DB so that when the import is run again, it starts from a clean slate.
  • Discuss the first few results with Jacek and the eth1 team on Discord.
  • Implement a process to auto-publish these results by committing them to GitHub.

@jakubgs
Member Author

jakubgs commented Oct 23, 2024

Sounds correct. Remember that the timer will have to do several things:

  1. Measure and save progress and time it took.
  2. Stop the nimbus-eth1 service.
  3. Purge already synced data.
  4. Restart the nimbus-eth1 service.

@siddarthkay
Contributor

as per @arnetheduck :

syncing = creating a state from blocks, usually sourced from the network
import = a method of syncing that reads era files instead of sourcing the blocks from the network - it's the same blocks, just in a file instead of requesting from nodes on the network

What we want to measure is the performance of turning blocks into a state; using import for this purpose eliminates the networking aspect, focusing on the block-processing component.

@siddarthkay
Contributor

siddarthkay commented Nov 9, 2024

The short benchmark was run and completed in ~12 hours:

Nov  4 23:32:38 bench-01.he-eu-hel1.nimbus.eth1 nimbus-eth1-mainnet-short-benchmark[147056]: 
INF 2024-11-04 23:32:38.837+00:00 Imported blocks                            
blockNumber=21005282 blocks=1005281 importedSlot=10223616 txs=160021843 mgas=15219979.109 
bps=24.889 tps=4328.405 mgps=376.486 avgBps=23.451 avgTps=3732.972 avgMGps=355.050 elapsed=11h54m27s

This was, however, run on a RAID 0 setup across 3 drives. RAID 0 is used precisely where enhanced performance is needed, so it inflates the results, and it also carries a risk: if any one of the 3 drives fails, the entire volume is lost. That seems too risky if we will be running benchmarks frequently, so further benchmarks will be run on devices without RAID 0.

@siddarthkay
Contributor

I made this GitHub repo to hold the CSV results exported by the short benchmark test:
https://github.com/status-im/nimbus-eth1-benchmarks

The format of the reports is still a work in progress; for now the systemd service just pushes the CSV exported by the import process to this GitHub repo.

I also made this infra role for building nimbus-eth1, cleaning up the bench-01 host, and restarting the short benchmark from a clean state:
https://github.com/status-im/infra-role-nimbus-bench-eth1

@siddarthkay
Contributor

siddarthkay commented Nov 27, 2024

Thinking further about the expected report, the following items should be covered:

  • runtime or duration of the entire benchmark
  • start block and end block
  • command used to run this benchmark
  • hardware information the benchmark was run on

The folder structure could be

short benchmark
├── 2024-11-27
│   ├── {HH:MM:SS-git-short-commit}-metrics-export.csv
│   └── {HH:MM:SS-git-short-commit}-build-environment.json
├── 2024-11-26
│   ├── {HH:MM:SS-git-short-commit}-metrics-export.csv
│   └── {HH:MM:SS-git-short-commit}-build-environment.json
└── 2024-11-25
    ├── {HH:MM:SS-git-short-commit}-metrics-export.csv
    └── {HH:MM:SS-git-short-commit}-build-environment.json

and similar for long benchmark

@jakubgs
Member Author

jakubgs commented Nov 28, 2024

I wonder if the folder structure shouldn't be 2024/11/25/{HH:MM:SS-git-short-commit}-metrics-export.csv

If we create a folder for each day, the root of the repo will become quite a long list very quickly.

@siddarthkay
Contributor

Hmm, indeed. Also, from a search standpoint, I believe the Nimbus team would be more interested in looking up the short-benchmark performance of a particular commit, so a folder per commit would not be a bad idea either; within that folder we could have various files with a timestamp identifier attached. Like this:
Like this :

short benchmark
├── {git-commit-hash}
│   ├── {ISO Timestamp}-metrics-export.csv
│   └── {ISO Timestamp}-build-environment.log
├── {git-commit-hash}
│   ├── {ISO Timestamp}-metrics-export.csv
│   └── {ISO Timestamp}-build-environment.log
└── {git-commit-hash}
    ├── {ISO Timestamp}-metrics-export.csv
    └── {ISO Timestamp}-build-environment.log

long benchmark
├── {git-commit-hash}
│   ├── {ISO Timestamp}-metrics-export.csv
│   └── {ISO Timestamp}-build-environment.log
├── {git-commit-hash}
│   ├── {ISO Timestamp}-metrics-export.csv
│   └── {ISO Timestamp}-build-environment.log
└── {git-commit-hash}
    ├── {ISO Timestamp}-metrics-export.csv
    └── {ISO Timestamp}-build-environment.log

@jakubgs
Member Author

jakubgs commented Dec 2, 2024

I think using dates in the folder structure makes for a nicer format. Using commits for folders is not great because when you run find | sort in the repo you won't get a list correctly ordered by timestamp, since commits break that ordering.
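The ordering point is easy to demonstrate: zero-padded date paths sort chronologically with a plain `find | sort`, whereas commit-hash paths sort arbitrarily. A quick illustration (the `demo` directory name is just for the example):

```shell
# Date-based paths come back chronologically from `find | sort`.
mkdir -p demo/2024/11/25 demo/2024/11/26 demo/2024/12/02
find demo -mindepth 3 -type d | sort
# demo/2024/11/25
# demo/2024/11/26
# demo/2024/12/02
```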

@arnetheduck
Member

Additional feature: when a benchmark runs, it should be compared against the previous commit using https://github.com/status-im/nimbus-eth1/blob/master/scripts/block-import-stats.py - this script compares two CSV files and outputs a comparison table as can be seen in this comment: status-im/nimbus-eth1#2413 (comment)

See also: https://github.com/status-im/nimbus-eth1/tree/master/scripts#block-import-statspy

Its output could be saved to a text file together with the other outputs.
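The comparison step might then look like this; the CSV file names and output path are placeholders, and the script path is relative to a nimbus-eth1 checkout:

```shell
# Hypothetical invocation: the script compares a baseline CSV against the
# current run's CSV and prints a comparison table, which we archive alongside
# the run's other outputs.
python3 scripts/block-import-stats.py \
  previous-run/metrics-export.csv current-run/metrics-export.csv \
  > current-run/import-stats-comparison.txt
```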

@siddarthkay
Contributor

Example of comparing current run with previous run for short benchmark :
https://github.com/status-im/nimbus-eth1-benchmarks/blob/master/short-benchmark/20241215T163401_650fec5a/build-environment.log

I will clean up all old/incomplete benchmarks from the repository now.
The repo does contain multiple reports for the same commit; they were run to test the setup of the benchmarking automation.

@siddarthkay
Contributor

siddarthkay commented Dec 27, 2024

Short benchmarking reports have been stable for a while.
A recent long benchmarking report was pushed here: status-im/nimbus-eth1-benchmarks@cd8d97c

I consider this task done, unless there are any more changes or bugs in the reports.

@siddarthkay
Contributor

siddarthkay commented Jan 2, 2025

Another requirement is redirecting the output of the block-import-stats Python script to a README.md so that it can be rendered on GitHub. This would be a per-benchmark README.

A main repo README.md is also required and could be generated like this:

cat README.tmpl > README.md
grep Time $(find . -name build-environment.log | sort -r) | sed -rn 's~./(.*)/build.*Time.*: (.*), (.*)~|[\1](\1/)|\2|\3|~p' >> README.md

Generated README.md would look like this :

# Benchmarks

Benchmarks for nimbus-eth1 .. bla bla

## Results

| Benchmark | Time | Diff |
| --- | ---: | ---: |

@jakubgs
Member Author

jakubgs commented Jan 2, 2025

grep Time $(find -name build-environment.log | sort -r) | sed -rn 's~./(.*)/build.*Time.*: (.*), (.*)~|[\1](\1/)|\2|\3|~p' > README.md

What kind of monstrosity is this? You know you can template files in bash using envsubst and plain environment variables?

@siddarthkay
Contributor

Yes, just copy-pasting what Jacek had mentioned in chat to keep track.

@arnetheduck
Member

What kind of monstrosity is this

"whatever" as long as it puts an overview table in the "top-level" readme

@siddarthkay
Contributor

The main README is generated here: https://github.com/status-im/nimbus-eth1-benchmarks/blob/master/README.md

It uses this template: https://github.com/status-im/nimbus-eth1-benchmarks/blob/master/README-TEMPLATE.md and is regenerated on each benchmark run with envsubst.

@siddarthkay
Contributor

I consider this completed.
We can open other issues to track enhancements or bug fixes if they are required.
