deployment: support comparative performance measurement #3058

erwanor · 2023-09-19T18:15:10Z

Is your feature request related to a problem? Please describe.
We currently do not have a good way to measure the impact of a change on the empirical performance of our full nodes, because every full node in the testnet cluster runs the exact same version of the software. Testing a performance improvement means spinning up a separate devnet and checking that everything still works. Because we do not have tooling to replicate workloads, this has limited analytic value and is extremely time-consuming.

Describe the solution you'd like
I would like to be able to deploy a commit/branch to a fn-2 node in the testnet cluster that is behind the public rpc load-balancer. This would allow us to:

directly compare resource usage and performance (e.g. via the Grafana dashboard) between a candidate improvement and prod
gain confidence in the quality of a change by checking resource and performance differentials between a release candidate and the production line
move closer to a performance-aware CI pipeline
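
To make the "differential" idea concrete, here is a minimal sketch of the comparison, assuming we can export samples of the same metric from the prod node and the candidate node. The function name `relative_change` and the sample values are hypothetical and purely illustrative; they are not taken from our dashboards or tooling.

```python
from statistics import median


def relative_change(baseline, candidate):
    """Relative change of the candidate median versus the baseline median.

    Negative values mean the candidate is lower (better, for latency-like metrics).
    """
    base = median(baseline)
    cand = median(candidate)
    if base == 0:
        raise ValueError("baseline median is zero; relative change is undefined")
    return (cand - base) / base


# Hypothetical samples, e.g. request latency in milliseconds.
prod_latency_ms = [100.0, 110.0, 90.0, 105.0, 95.0]
candidate_latency_ms = [80.0, 88.0, 72.0, 84.0, 76.0]

change = relative_change(prod_latency_ms, candidate_latency_ms)
print(f"candidate vs prod: {change:+.1%}")
```

The median is used here rather than the mean so that a few outlier samples do not dominate the comparison; that is a design choice for the sketch, not a requirement.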

Describe alternatives you've considered
We could build tooling to create consistent workload scripts ("read-intensive", "write-intensive", etc.) and use them to test user-defined metrics across versions of the software. However, this seems like a much larger amount of work than just deploying a new full node.
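
For reference, a rough sketch of what a "consistent" workload could mean: a seeded generator that yields the same sequence of operations on every run, so two versions of the software see identical load. Everything here is an assumption for illustration: the `PROFILES` table, the read/write ratios, and `generate_workload` are hypothetical names, and the operations are just labels rather than real RPC calls.

```python
import random
from collections import Counter

# Hypothetical operation mixes; the ratios are placeholders, not measured values.
PROFILES = {
    "read-intensive": {"read": 0.9, "write": 0.1},
    "write-intensive": {"read": 0.2, "write": 0.8},
}


def generate_workload(profile, n_ops, seed):
    """Return a reproducible list of operation kinds for the given profile."""
    weights = PROFILES[profile]
    rng = random.Random(seed)  # private RNG, so the global random state is untouched
    kinds = list(weights)
    return rng.choices(kinds, weights=[weights[k] for k in kinds], k=n_ops)


# The same profile and seed always produce the same sequence.
ops_a = generate_workload("read-intensive", 1000, seed=42)
ops_b = generate_workload("read-intensive", 1000, seed=42)
assert ops_a == ops_b
print(Counter(ops_a))
```

Replaying the generated sequence against real endpoints is the part that would take most of the work, which is why deploying an extra node looks cheaper.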

Additional context
This is not urgent; there are many milestones that we want to hit before circling back to performance improvements to the full node software.

conorsch · 2023-09-19T18:30:28Z

Good news: the new deploy logic described in #2983, and slated for use on Testnet 61, largely affords this. In fact, it was your stated use cases that motivated the rewrite. There aren't point-and-click interfaces for these deployments yet, but I can help you to:

identify the target network we want to join (e.g. public testnet, or bespoke devnet)
open a draft PR
build a container for the draft PR (this gives us the commit/branch pinning you describe)
deploy an additional set of nodes running that commit into the testnet cluster

Naturally, we'll need to be careful about compatibility, depending on the network we want to connect to, but that's always been the case. As part of #3056, I'm going to raise the number of nodes in the cluster, which will give us a bit more headroom for adding nodes.

hdevalence · 2024-02-08T23:25:06Z

Marking this as P-V2 since we should focus on performance after mainnet.

conorsch added the A-CI/CD Relates to continuous integration & deployment of Penumbra label Oct 3, 2023

hdevalence added _P-V2 Priority: after mainnet _P-low Priority: low labels Feb 8, 2024
