Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pd: 🔨 rework RootCommand::start auto-https logic #3652

Merged
merged 4 commits into from
Jan 26, 2024

Conversation

cratelyn
Copy link
Contributor

@cratelyn cratelyn commented Jan 23, 2024

👀 overview

fixes #3627.

this reorganizes the logic in pd's startup code related to automatically
managed https functionality.

🎨 background & motivation

this PR, besides cleaning up the rustls-acme-related auto-https logic, is
also interested in creating a state-of-affairs that will dovetail into
pr #3522
. in particular, this expression to start the GRPC serve given a
bound listener...

tokio::task::Builder::new()
.name("grpc_server")
.spawn(grpc_server.serve_with_incoming(tls_incoming))
.expect("failed to spawn grpc server")

...should be adjusted so as to work with an axum::Router.

⚖️ rustls-acme and tokio-rustls-acme

quoth the #3627 description, citing an earlier comment:

In the ~year since this code was written, there may be better options.
tokio-rustls-acme seems promising
- #3522 (comment)

for reference, the repositories for each live here, and here:

after some comparison, i have come to the opinion that rustls-acme will still
be adequate for our purposes. the latter is a fork of the former, but active
development appears to have continued in the former, and i did not see any
particular "must-have" features for us in the latter.

🎴 changes

this commit moves some of the auto-https related code from the main
entrypoint, into standalone functions in pd::main.

some constants are defined, to keep control flow clear and to help
facilitate the addition of future options e.g. a flag to control the
LetsEncrypt environment to use.

🚰 dropping down to axum; a brief note on future upgrades

as stated above, we want to switch to an axum::Router. this means that
we won't be able to use the AcmeConfig::incoming function. the
rustls-acme library provides some "low-level" examples this work is
based upon

we also use tonic 0.10.2 in pd, and elsewhere in the penumbra
monorepo. tonic isn't using hyper 1.x yet. this was being worked on in
hyperium/tonic#1583, continued on in hyperium/tonic#1595, and tracked in
hyperium/tonic#1579. that work also depends upon hyperium/hyper#3461.

before we can use a new version of axum-server, rustls-acme, [..],
we will need to be wait for tonic to finish its migration over to hyper 1.0, see:
hyperium/tonic#1579 (comment)

this is understandable, but i make note of this situation as a
signpost for our future selves when considering a migration to
recent versions of these libraries.

for now, it's easiest to stay in lock-step with tonic, and we can revisit
the upgrade path(s) at a future date.

===

@cratelyn cratelyn self-assigned this Jan 23, 2024
@cratelyn cratelyn changed the title pd: 🛻 auto-https via tokio_rustls_acme pd: 🔨 rework RootCommand::start auto-https logic Jan 24, 2024
@cratelyn
Copy link
Contributor Author

update 24-01-24: wrote up an update on my progress in 32b6457. this is turning out to be trickier than expected, unfortunately, but i think i understand what to do tomorrow.

 ## 👀 overview

fixes #3627.

this reorganizes the logic in pd's startup code related to automatically
managed https functionality.

 ## 🎨 background & motivation

this PR, besides cleaning up the `rustls-acme`-related auto-https logic, is
also interested in *creating a state-of-affairs that will dovetail into
pr #3522*. in particular, this expression to start the GRPC serve given a
bound listener...

```rust
tokio::task::Builder::new()
    .name("grpc_server")
    .spawn(grpc_server.serve_with_incoming(tls_incoming))
    .expect("failed to spawn grpc server")
```

...should be adjusted so as to work with an `axum::Router`.

 ### ⚖️  `rustls-acme` and `tokio-rustls-acme`

quoth the #3627 description, citing an earlier comment:

> In the ~year since this code was written, there may be better options.
> `tokio-rustls-acme` seems promising
\- <#3522 (comment)>

for reference, the repositories for each live here, and here:
- <https://github.com/FlorianUekermann/rustls-acme>
- <https://github.com/n0-computer/tokio-rustls-acme>

after some comparison, i have come to the opinion that `rustls-acme` will still
be adequate for our purposes. the latter is a fork of the former, but active
development appears to have continued in the former, and i did not see any
particular "_must-have_" features for us in the latter.

 ## 🎴 changes

this commit moves some of the auto-https related code from the `main`
entrypoint, into standalone functions in `pd::main`.

some constants are defined, to keep control flow clear and to help
facilitate the addition of future options e.g. a flag to control the
LetsEncrypt environment to use.

 ## 🚰 dropping down to `axum`; a brief note on future upgrades

as stated above, we want to switch to an `axum::Router`. this means that
we won't be able to use the `AcmeConfig::incoming` function. the
`rustls-acme` library provides some "low-level" examples this work is
based upon

- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_tokio.rs>
- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_axum.rs>

we also use `tonic` 0.10.2 in pd, and elsewhere in the penumbra
monorepo. tonic isn't using hyper 1.x yet. this was being worked on in
hyperium/tonic#1583, continued on in hyperium/tonic#1595, and tracked in
hyperium/tonic#1579. that work also depends upon hyperium/hyper#3461.

we will need to be wait for tonic to finish its migration over to hyper
1.0, see:
hyperium/tonic#1579 (comment)

this is understandable, but i make note of this situation as a
signpost for our future selves when considering a migration to
recent versions of axum-server, axum, rustls-acme, or hyper.

for now, it's easiest to stay in lock-step with tonic, and we can revisit
the upgrade path(s) at a future date.

===

Refs: #3627
Refs: #3646
Refs: #3522
@cratelyn cratelyn added A-node Area: System design and implementation for node software E-week labels Jan 25, 2024
we call into axum via axum-server for now, so we don't need this
dependency. (...yet)
@cratelyn cratelyn marked this pull request as ready for review January 25, 2024 22:05
@cratelyn cratelyn added the C-chore Codebase maintenance that doesn't fix bugs or add features, and isn't urgent or blocking. label Jan 25, 2024
Copy link
Member

@hdevalence hdevalence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks clean!

Until we land #3336 we don't have a great way to test it. Would it be possible to manually exercise this, by:

  1. Using a scratch box with a domain name to join testnet-preview, exposing pd's RPC to the world
  2. Aiming the web wallet at the resulting https URL and checking that it works? (It should be the web wallet specifically, to exercise grpc-web compat).

@conorsch
Copy link
Contributor

This works great!

Using a scratch box with a domain name to join testnet-preview, exposing pd's RPC to the world

This is exactly how we tested it together. Initially, I'd only tested pcli interacting with the auto-https endpoint, so thanks for pointing out that we also needed to exercise

Aiming the web wallet at the resulting https URL and checking that it works? (It should be the web wallet specifically, to exercise grpc-web compat).

We did that too, based on these wonderful local-env dev docs maintained by the web team: https://github.com/penumbra-zone/web/blob/main/docs/spinning-up-locally.md

There are still a few places I'd like to see some cleanup, and I'll comment on those in-line, but won't block on them. We'll revisit this code frequently in a few forthcoming tickets, so plenty of opportunity to improve as we good.

Awesome PR submission text, @cratelyn—thank you for taking the time!

crates/bin/pd/src/auto_https.rs Show resolved Hide resolved
crates/bin/pd/src/main.rs Show resolved Hide resolved
crates/bin/pd/src/main.rs Show resolved Hide resolved
@conorsch conorsch merged commit 6f5f434 into main Jan 26, 2024
7 checks passed
@conorsch conorsch deleted the katie/auto-https.issue-3627 branch January 26, 2024 19:07
cratelyn added a commit that referenced this pull request Jan 31, 2024
fixes #3119. see also, #1886.

this pulls the auto-https code (see #3627, #3652) into a standalone
library crate.
conorsch pushed a commit that referenced this pull request Jan 31, 2024
fixes #3119. see also, #1886.

this pulls the auto-https code (see #3627, #3652) into a standalone
library crate.
TalDerei pushed a commit that referenced this pull request Feb 8, 2024
fixes #3119. see also, #1886.

this pulls the auto-https code (see #3627, #3652) into a standalone
library crate.
conorsch added a commit that referenced this pull request Feb 22, 2024
This change restores the serving of CORS headers in HTTP responses by pd.
During the refactor of auto-https logic in #3652, we overlooked
that the tower [CorsLayer](https://docs.rs/tower-http/0.4.4/tower_http/cors/struct.CorsLayer.html#)
was no longer being carried through after type conversions between
tower, tonic, and axum. Simply moving the layer attachment
to the already-built axum router, rather than the previously
constructed tonic router, resolves the problem.

We now include an integration test to check for the headers,
so we don't inadvertently drop them again.

Closes #3865.
conorsch added a commit that referenced this pull request Feb 23, 2024
This change restores the serving of CORS headers in HTTP responses by pd.
During the refactor of auto-https logic in #3652, we overlooked
that the tower [CorsLayer](https://docs.rs/tower-http/0.4.4/tower_http/cors/struct.CorsLayer.html#)
was no longer being carried through after type conversions between
tower, tonic, and axum. Simply moving the layer attachment
to the already-built axum router, rather than the previously
constructed tonic router, resolves the problem.

We now include an integration test to check for the headers,
so we don't inadvertently drop them again.

Closes #3865.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-node Area: System design and implementation for node software C-chore Codebase maintenance that doesn't fix bugs or add features, and isn't urgent or blocking. E-week
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Clean up pd auto-https code
3 participants