Prepare v0.77.3 changeset #4648

Merged
merged 12 commits into from
Jun 20, 2024

conorsch
Contributor

Describe your changes

This PR backports several fixes for inclusion in the 0.77.x release series, as v0.77.3.

Issue ticket number and link

Specifically, the relevant PRs are:

Checklist before requesting a review

  • If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason:

    Careful attention was paid to selecting only non-breaking changes, for compatibility with the running testnet chain and its nodes.

cratelyn and others added 12 commits June 20, 2024 09:00
see the module-level docs of `pd::metrics::sleep_worker`. this
introduces a submodule to `pd::metrics`, which observes tokio scheduler
latency and records it for the Prometheus exporter.

the core idea in this worker is that it will invoke
`tokio::time::sleep(..)` to put the task to sleep for one second. upon
being woken up by the scheduler, it will check the *new* wall-clock time
and calculate the actual amount of time it spent waiting. if this took
longer than expected, it increments the
`pd_async_sleep_drift_milliseconds` counter by the observed latency,
measured in milliseconds.

as the module-level docs note, this is a very useful tool to identify
when the runtime is being disrupted by blocking I/O or other expensive
computation.
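
The measurement described above can be sketched roughly as follows. This is a std-only illustration, not the code in `pd::metrics::sleep_worker`: it swaps `tokio::time::sleep` for a blocking `std::thread::sleep` and the Prometheus counter for an atomic, and it measures with a monotonic `Instant`. The names `SLEEP_DRIFT_MS`, `drift_millis`, and `observe_once` are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the `pd_async_sleep_drift_milliseconds` counter.
static SLEEP_DRIFT_MS: AtomicU64 = AtomicU64::new(0);

/// Returns how many whole milliseconds `actual` overshot `expected`,
/// or 0 if the sleep did not take longer than requested.
fn drift_millis(expected: Duration, actual: Duration) -> u64 {
    actual.saturating_sub(expected).as_millis() as u64
}

/// Sleeps once, measures how long the sleep really took, and records any
/// overshoot. Assumption: a blocking sleep is used only to keep this sketch
/// dependency-free; the worker described above awaits `tokio::time::sleep`.
fn observe_once(expected: Duration) -> u64 {
    let start = Instant::now();
    thread::sleep(expected);
    let drift = drift_millis(expected, start.elapsed());
    if drift > 0 {
        SLEEP_DRIFT_MS.fetch_add(drift, Ordering::Relaxed);
    }
    drift
}
```

Calling `observe_once(Duration::from_secs(1))` in a loop would approximate the one-second cadence; a steadily growing counter would then suggest that something is delaying wake-ups.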

pd: 🚪 `metrics` module can be `pub`

this will make it slightly nicer to expose facilities for telemetry in
the `pd` binary, without clouding the top-level namespace.

(cherry picked from commit 10421b8)
this commit moves the logic responsible for configuring dex metrics into
the dex's metrics module.

there are some lurking footguns here, so we can avoid easily forgotten
non-local reasoning and move the regex/prefix logic next to the metrics.

(cherry picked from commit da1d31e)
Pulls in the `metrics-process` crate [0] to bolt on CPU usage and other OS-level info
to the metrics emitted by pd. Doing so will support a better first-run
experience for node operators who lack a comprehensive pre-existing setup for monitoring node health,
so they can get a sense of how much load pd is under.

Adds a cursory first pass at a new dashboard, but will need to follow up
on that once the new metrics actually land on hosts that are receiving
load. Also included a reference to metrics added in #4581, as a treat.

[0] https://docs.rs/metrics-process/2.0.0/metrics_process/index.html

(cherry picked from commit 3045645)
we can't add counters or gauges named `penumbra_dex_` because of the use
of `Matcher::Prefix`. rather, we can tweak this and provide the explicit
list of metrics we want buckets for.
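
To illustrate the problem, here is a toy model of name matching. The `Matcher` enum below is a hypothetical stand-in defined for this sketch, not the exporter crate's own type, and both metric names are made up. It only shows why a prefix rule also catches counters and gauges that share the prefix, while exact-name rules do not.

```rust
/// Toy model of a metric-name matcher (hypothetical; defined for this sketch).
enum Matcher {
    Prefix(String),
    Full(String),
}

impl Matcher {
    /// Returns true if this rule applies to the metric called `name`.
    fn matches(&self, name: &str) -> bool {
        match self {
            Matcher::Prefix(p) => name.starts_with(p.as_str()),
            Matcher::Full(f) => name == f.as_str(),
        }
    }
}

/// True if any rule would apply histogram buckets to `name`.
fn gets_buckets(matchers: &[Matcher], name: &str) -> bool {
    matchers.iter().any(|m| m.matches(name))
}

/// Builds one exact-name rule per histogram that should receive buckets.
fn explicit_matchers(histograms: &[&str]) -> Vec<Matcher> {
    histograms
        .iter()
        .map(|h| Matcher::Full(h.to_string()))
        .collect()
}
```

With a single `Matcher::Prefix("penumbra_dex_".to_string())` rule, a hypothetical gauge named `penumbra_dex_example_gauge` is matched as well; with `explicit_matchers(&["penumbra_dex_example_duration_seconds"])`, only the listed histogram is.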

(cherry picked from commit 2b6621b)
## Describe your changes
This re-centers the DEX histogram buckets around the expected latency region.

## Checklist before requesting a review

- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

(cherry picked from commit 06a08f7)
Adds a TimestampByHeightRequest to SCT QueryService.

There is no corresponding migration added, so any block heights lower
than the height at which this is deployed will have missing data.
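
The consequence of skipping a migration can be pictured with a small in-memory model. This is only a sketch under assumptions: `TimestampIndex` and its methods are hypothetical names, a `BTreeMap` stands in for nonconsensus storage, and timestamps are plain Unix seconds.

```rust
use std::collections::BTreeMap;

/// Hypothetical index from block height to block timestamp (Unix seconds).
struct TimestampIndex {
    by_height: BTreeMap<u64, i64>,
}

impl TimestampIndex {
    fn new() -> Self {
        TimestampIndex {
            by_height: BTreeMap::new(),
        }
    }

    /// Records the timestamp for a block as it is processed.
    fn record(&mut self, height: u64, unix_seconds: i64) {
        self.by_height.insert(height, unix_seconds);
    }

    /// Looks up a timestamp; returns `None` for heights that were never
    /// recorded, such as blocks processed before the index existed.
    fn timestamp_at(&self, height: u64) -> Option<i64> {
        self.by_height.get(&height).copied()
    }
}
```

If recording starts at height 1000, `timestamp_at(999)` returns `None`, so callers should be prepared for missing data at older heights.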

#4522

- [ ] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

> This adds a pb API endpoint and should not affect consensus. New data
is tracked in nonconsensus storage only.

---------

Co-authored-by: turbocrime <[email protected]>
Co-authored-by: Chris Czub <[email protected]>
(cherry picked from commit 989d3a9)
…ailures (#4642)

## Describe your changes

The `pcli tx position close-all` and `pcli tx position withdraw-all`
commands were failing to plan for me when I had a significant (100+)
number of positions.

This PR splits them into chunks, currently 30 positions, and fixed the
planning failures for me.
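
A minimal sketch of the chunking idea, assuming positions can be represented by simple ids. The names `POSITIONS_PER_PLAN` and `plan_batches` are hypothetical, and `u64` stands in for pcli's real position id type.

```rust
/// Chunk size mentioned in the description above.
const POSITIONS_PER_PLAN: usize = 30;

/// Splits position ids into groups of at most `POSITIONS_PER_PLAN`,
/// so that each group can be planned as its own transaction.
fn plan_batches(position_ids: &[u64]) -> Vec<Vec<u64>> {
    position_ids
        .chunks(POSITIONS_PER_PLAN)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

For 100 positions this yields four batches of 30, 30, 30, and 10 ids.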

## Checklist before requesting a review

- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

  > pcli only

(cherry picked from commit 1693c20)
when a pcli user initializes their configuration, they provide the
`init` command with the grpc url of a fullnode, which is stored in
pcli's configuration file. by default, this is found at
`~/.local/share/pcli/config.toml`.

if that fullnode is later encountering issues, this can render many pcli
commands unusable, without a clear workaround. this is a small patch,
providing a top-level `--grpc-url` option that will override the config
file's GRPC url.

this can help users temporarily send requests to a different fullnode,
until their preferred default comes back online.
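
The override precedence can be sketched as below. This is an illustration only: `PcliConfig` and `effective_grpc_url` are hypothetical names, and pcli's actual option parsing is not shown.

```rust
/// Hypothetical subset of the values stored in pcli's config file.
struct PcliConfig {
    grpc_url: String,
}

/// Picks the URL to use for this invocation: the command-line override
/// if one was given, otherwise the URL stored in the config file.
fn effective_grpc_url<'a>(config: &'a PcliConfig, cli_override: Option<&'a str>) -> &'a str {
    cli_override.unwrap_or(config.grpc_url.as_str())
}
```

The config file itself is left untouched, so dropping the flag returns to the preferred fullnode.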

(cherry picked from commit 3f7ccbd)
This should fix an issue wherein a load balanced RPC can cause data
corruption by delivering the same block twice in the stream.
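
One way a client could guard against a repeated block is to track the last height it applied and skip anything at or below it. The sketch below shows that idea only; it is an assumption about the general technique, not a description of the actual change, and `BlockCursor` is a hypothetical name.

```rust
/// Tracks the highest block height already applied by a client.
struct BlockCursor {
    last_height: Option<u64>,
}

impl BlockCursor {
    fn new() -> Self {
        BlockCursor { last_height: None }
    }

    /// Returns true if the block at `height` should be processed, and
    /// advances the cursor. Returns false for a height at or below one
    /// that was already accepted, so a re-delivered block is skipped.
    fn accept(&mut self, height: u64) -> bool {
        match self.last_height {
            Some(last) if height <= last => false,
            _ => {
                self.last_height = Some(height);
                true
            }
        }
    }
}
```

Feeding heights 1, 2, 2, 3 through `accept` processes 1, 2, and 3 once each and ignores the second delivery of block 2.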

## Issue ticket number and link

This should close #4577.

## Checklist before requesting a review

- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

  > Just a client change

(cherry picked from commit 4245ebc)
Member

@erwanor erwanor left a comment

LGTM.

@conorsch conorsch merged commit 11ddb05 into release/v0.77.x Jun 20, 2024
11 checks passed
@conorsch conorsch deleted the metrics-fixes-for-77 branch June 20, 2024 18:11
conorsch added a commit that referenced this pull request Jun 20, 2024
conorsch added a commit that referenced this pull request Jun 20, 2024
Refs #4648. Also, given #4562, CI should be fast on this PR.
conorsch added a commit that referenced this pull request Jun 20, 2024
conorsch added a commit that referenced this pull request Jun 20, 2024
Refs #4648. Also, given #4562, CI should be fast on this PR.
conorsch added a commit that referenced this pull request Jun 20, 2024
Refs #4648. Also, given #4562, CI should be fast on this PR.
avahowell pushed a commit that referenced this pull request Jun 24, 2024
Refs #4648. Also, given #4562, CI should be fast on this PR.
Projects
Status: Done
7 participants