Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for negotiating IMAP, SMTP & HTTP on 443 #255

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

s0ph0s-dog
Copy link

@s0ph0s-dog s0ph0s-dog commented Nov 25, 2024

This PR adds support for negotiating IMAP, SMTP & HTTP on the configured HTTPS port using TLS ALPN. This is intended to be useful for deploying Mox as a chatmail server. The upstream implementation of chatmail servers uses imap and smtp as the “next protocol” values for IMAP and SMTP, respectively: https://github.com/deltachat/chatmail/blob/main/cmdeploy/src/cmdeploy/nginx/nginx.conf.j2#L16-L17

(These testing instructions have been updated as of 775c2f2)

To test, configure Mox as is standard for the mox localserve mode, but make these three changes to Listeners.local:

  1.  	Submissions:
     		Enabled: true
     		Port: 1465
     		EnableOnHTTPS: true
    
  2.  	IMAPS:
     		Enabled: true
     		Port: 1993
     		EnableOnHTTPS: true
    
  3. s/1443/443/g (skip this one to verify that the warning message appears)

Then run Mox and use OpenSSL’s s_client mode to connect to the local instance:

> openssl s_client -quiet -connect -crlf localhost:443 -alpn smtp
depth=0 O = mox localserve, CN = localhost
verify error:num=18:self signed certificate
verify return:1
depth=0 O = mox localserve, CN = localhost
verify return:1
220 localhost ESMTP mox 7f5e1087d42d5d7bf57649c5c6a2155c3706ce39+modifications
HELO test
550 5.5.0 your ehlo domain does not resolve to an IP address (htqp11_GJOmHEhSs_Y03eg)
QUIT
221 2.0.0 okay thanks bye
^C
> openssl s_client -quiet -crlf -connect localhost:443 -alpn imap
depth=0 O = mox localserve, CN = localhost
verify error:num=18:self signed certificate
verify return:1
depth=0 O = mox localserve, CN = localhost
verify return:1
* OK [CAPABILITY IMAP4rev2 IMAP4rev1 ENABLE LITERAL+ IDLE SASL-IR BINARY UNSELECT UIDPLUS ESEARCH SEARCHRES MOVE UTF8=ACCEPT LIST-EXTENDED SPECIAL-USE LIST-STATUS AUTH=SCRAM-SHA-256-PLUS AUTH=SCRAM-SHA-256 AUTH=SCRAM-SHA-1-PLUS AUTH=SCRAM-SHA-1 AUTH=CRAM-MD5 ID APPENDLIMIT=9223372036854775807 CONDSTORE QRESYNC STATUS=SIZE QUOTA QUOTA=RES-STORAGE AUTH=PLAIN] mox imap
c1 STARTTLS
c1 BAD STARTTLS unrecognized syntax/command: tls already active
^C
> openssl s_client -quiet -crlf -connect localhost:443
depth=0 O = mox localserve, CN = localhost
verify error:num=18:self signed certificate
verify return:1
depth=0 O = mox localserve, CN = localhost
verify return:1
GET / HTTP/1.1
Host: localhost

HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Mon, 25 Nov 2024 07:28:00 GMT
Content-Length: 19

404 page not found
^C

As part of this change, I had to go get golang.org/x/net/http2. (The default http.Server supports HTTP2 if you leave it alone, but that built-in support is deactivated if I set TLSNextProto to a non-nil value. To ensure that Mox continues to support HTTP2, the Go documentation directs folks with "…more complex configurations…" to import x/net/http2.) Go decided that it also wanted to update a bunch of other dependencies while it was at it. This has caused the large number of dependency updates. I can revert these and attempt a more surgical addition of the http2 library, if you’d like.

There are also two major deficiencies in this code that I’d like advice on correcting:

  1. Right now, this ALPN feature is enabled when Mox is configured to provide Autoconfigure services to mail clients. I chose to do this because it was relatively straightforward to implement. However, the resulting behavior is extremely non-obvious. How would you recommend exposing the ALPN feature in the configuration?
  2. I’m not sure what the best way to expose the private serve() functions in imapserver and smtpserver are. The current implementation creates a public function called ServeConn() in each module that just calls the private serve() function with all the same arguments, but this feels redundant. Would you recommend making the serve() functions public, using public wrappers but with a more limited set of parameters, or something else entirely?

Thanks!

This PR adds support for negotiating IMAP, SMTP & HTTP on the configured HTTPS port using TLS ALPN. This is intended to be useful for deploying Mox as a chatmail server. The upstream implementation of chatmail servers uses `imap` and `smtp` as the “next protocol” values for IMAP and SMTP, respectively: https://github.com/deltachat/chatmail/blob/main/cmdeploy/src/cmdeploy/nginx/nginx.conf.j2#L16-L17

To test, configure Mox as is standard for the `mox localserve` mode, but add this block under `Listeners.local`:
```sconf
		AutoconfigHTTPS:
			Enabled: true
			Port: 1443
```

Then run Mox and use OpenSSL’s s_client mode to connect to the local instance:
```
> openssl s_client -quiet -connect localhost:1443 -alpn smtp
depth=0 O = mox localserve, CN = localhost
verify error:num=18:self signed certificate
verify return:1
depth=0 O = mox localserve, CN = localhost
verify return:1
220 localhost ESMTP mox 7f5e108+modifications
HELO test
550 5.5.0 your ehlo domain does not resolve to an IP address (htqp11_GJOmHEhSs_Y03eg)
QUIT
221 2.0.0 okay thanks bye
^C
> openssl s_client -quiet -crlf -connect localhost:1443 -alpn imap
depth=0 O = mox localserve, CN = localhost
verify error:num=18:self signed certificate
verify return:1
depth=0 O = mox localserve, CN = localhost
verify return:1
* OK [CAPABILITY IMAP4rev2 IMAP4rev1 ENABLE LITERAL+ IDLE SASL-IR BINARY UNSELECT UIDPLUS ESEARCH SEARCHRES MOVE UTF8=ACCEPT LIST-EXTENDED SPECIAL-USE LIST-STATUS AUTH=SCRAM-SHA-256-PLUS AUTH=SCRAM-SHA-256 AUTH=SCRAM-SHA-1-PLUS AUTH=SCRAM-SHA-1 AUTH=CRAM-MD5 ID APPENDLIMIT=9223372036854775807 CONDSTORE QRESYNC STATUS=SIZE QUOTA QUOTA=RES-STORAGE AUTH=PLAIN] mox imap
c1 STARTTLS
c1 BAD STARTTLS unrecognized syntax/command: tls already active
^C
> openssl s_client -quiet -crlf -connect localhost:1443
depth=0 O = mox localserve, CN = localhost
verify error:num=18:self signed certificate
verify return:1
depth=0 O = mox localserve, CN = localhost
verify return:1
GET / HTTP/1.1
Host: localhost

HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Mon, 25 Nov 2024 07:28:00 GMT
Content-Length: 19

404 page not found
^C
```

As part of this change, I had to `go get golang.org/x/net/http2`. (The default `http.Server` supports HTTP2 if you leave it alone, but that built-in support is deactivated if I set `TLSNextProto` to a non-nil value. To ensure that Mox continues to support HTTP2, [the Go documentation](https://pkg.go.dev/net/[email protected]#hdr-HTTP_2) directs folks with "…more complex configurations…" to import x/net/http2.) Go decided that it also wanted to update a bunch of other dependencies while it was at it. This has caused the large number of dependency updates. I can revert these and attempt a more surgical addition of the http2 library, if you’d like.

There are also two major deficiencies in this code that I’d like advice on correcting:
1. Right now, this ALPN feature is enabled when Mox is configured to provide Autoconfigure services to mail clients. I chose to do this because it was relatively straightforward to implement. However, the resulting behavior is extremely non-obvious. **How would you recommend exposing the ALPN feature in the configuration?**
2. I’m not sure what the best way to expose the private `serve()` functions in `imapserver` and `smtpserver` are. The current implementation creates a public function called `ServeConn()` in each module that just calls the private `serve()` function with all the same arguments, but this feels redundant. **Would you recommend making the `serve()` functions public, using public wrappers but with a more limited set of parameters, or something else entirely?**

Thanks!
@mjl-
Copy link
Owner

mjl- commented Nov 25, 2024

Nice, that's quick.

About enabling the ALPN feature: What about a new field in the configs for "submissions" and "imaps", to enable it on the https port. https://github.com/mjl-/mox/blob/v0.0.13/config/config.go#L168 and IMAPS just below. "EnableOnHTTPS bool" or something, with comment explaining ALPN.
It would enable this on any https on port 443 for that listener. And should probably give an error if there is no https on that listener. The http/web.go code is collecting servers on different ports, with tls settings. I think there could be a step about adding to NextProtos at the end of gathering all the servers.

About the serve() functions. I think we can make them public for now. We can change it later, we don't have to guarantee that those functions are completely stable. We should pass in the parameters from the listener. That's why it would be good to enable this ALPN feature in each submissions/imaps section of the listener config. The parameters should come from that listener.

Adding golang.org/x/net/http2 and the dependency updates are fine. I thought we could perhaps just use a regular tls listeners, and accept connections ourselves and look at the nextproto to call the right function. But that's probably not the best approach. The net/http Server has methods for shutting down etc. I don't think we're using them at the moment, but may in the future. And there may be additional http/2 state management in the http2 package that we would miss if handling connections ourselves.

I saw the TODO where NextProtos is set. We should probably clone the TLS config before modifying, and only append to nextprotos, not overwrite it. The config may come from the autotls/ package, with adds the acme alpn nextproto for verification.

@s0ph0s-dog
Copy link
Author

Nice, that's quick.

I started with the one I thought might be easiest, to introduce myself to the codebase :P

About enabling the ALPN feature: What about a new field in the configs for "submissions" and "imaps", to enable it on the https port. https://github.com/mjl-/mox/blob/v0.0.13/config/config.go#L168 and IMAPS just below. "EnableOnHTTPS bool" or something, with comment explaining ALPN. […] We should pass in the parameters from the listener.

Yeah, that's what I was thinking about too—I hesitated because the listener configurations for IMAP and Submissions both need to somehow inform the HTTPS listener to change its behavior, which is a bit architecturally messy. What do you think about this plan:

  1. Update smtpserver.Listen() and imapserver.Listen() to return a struct containing the options for the serve function, or nil if ALPN negotiation is not enabled.
  2. Pass the two structs (or nils) returned by the IMAP/SMTP listeners into http.Listen() as parameters (lines 57–59 of serve.go), so that it can configure the underlying TLS socket correctly.

The http/web.go code is collecting servers on different ports, with tls settings. I think there could be a step about adding to NextProtos at the end of gathering all the servers. […] I saw the TODO where NextProtos is set. We should probably clone the TLS config before modifying, and only append to nextprotos, not overwrite it. The config may come from the autotls/ package, with adds the acme alpn nextproto for verification.

Ahhh, okay! Makes sense. If my suggestion above for returning parameters from the IMAP and SMTP listeners to pass into the HTTP listener sounds good, I can move the NextProtos modification into the HTTP server's listen1() function. It already clones the TLS configuration, so that would ensure I don't waste memory cloning it more than necessary. Also, I think that would let most of the changes be pulled out into a separate function called by listen1(), which feels very tidy and organized.

About the serve() functions. I think we can make them public for now. We can change it later, we don't have to guarantee that those functions are completely stable.

Actually, rather than returning a struct, the IMAP and SMTP listeners could return a function pointer. Then that function needs only to ask for the TLS configuration and the network connection, and it could contain the other arguments configured by the listener as part of its closure. Then the serve() functions wouldn't need to be public. Does that fit into Mox's established coding style and architectural conventions well, or should I avoid that kind of functional-programming-inspired abstraction and make the two functions public?

Adding golang.org/x/net/http2 and the dependency updates are fine. I thought we could perhaps just use a regular tls listeners, and accept connections ourselves and look at the nextproto to call the right function. But that's probably not the best approach. The net/http Server has methods for shutting down etc. I don't think we're using them at the moment, but may in the future. And there may be additional http/2 state management in the http2 package that we would miss if handling connections ourselves.

Yeah, HTTP/2 is a complicated protocol with several interlinks into TLS—doing the negotiation manually is an excellent way to introduce subtle, bizarre bugs.

@mjl-
Copy link
Owner

mjl- commented Dec 6, 2024

What do you think about this plan

Yes, that plan makes sense.

Actually, rather than returning a struct, the IMAP and SMTP listeners could return a function pointer. Then that function needs only to ask for the TLS configuration and the network connection, and it could contain the other arguments configured by the listener as part of its closure. Then the serve() functions wouldn't need to be public. Does that fit into Mox's established coding style and architectural conventions well, or should I avoid that kind of functional-programming-inspired abstraction and make the two functions public?

At first reading, just exporting the serve functions seems more mox-like. But if that has complications, the other option could be better.

@s0ph0s-dog
Copy link
Author

s0ph0s-dog commented Dec 7, 2024

I chose to go with the Listen()-returns-a-function approach because it keeps related code close together. Should something need to change about the serve() function, the other caller is in the same file a few dozen lines up. If instead I were to have the Listen()ers return structs, the structs themselves and also the call site over in http/web.go need to change whenever serve() changes, which is more tedious and spaghetti-like.

(I also updated the testing instructions in the main PR description.)

How does this look?

  • Are the variable names sufficiently clear?
  • Does the documentation text explain the feature well?
  • Is the new warning message too verbose?

@mjl-
Copy link
Owner

mjl- commented Dec 7, 2024

Approach looks good. Documentation looks sufficient, and warnings aren't too
verbose (I do usually try to keep them a single sentence, so I can keep it
all lower case, but sometimes that's just not enough). Some styling nits
later on.

Rebasing will cause some trouble (sorry) with the recent addition of TLS
client cert authentication. The imapserver & smtpserver serve() functions now
always expect a plain tcp connection, on which they're doing tls explicitly
(so they can set up a tls.Config that is bound to the "conn" type, for
setting the "account" and "username" fields during authentication in the tls
"verify client" callback). The "tls bool" parameter to the serve() functions
now means "start a tls server". I think we may need another flag "this is
already a tls conn, don't start another tls server".

TLS client cert auth won't work with the https-alpn connections. I don't
think such support has a use case at the moment. The smtpserver/imapserver
and https tls config code could be reworked in the future to also make tls
client cert auth work for https-alpn connections.

Some nits:

  • The slog attributes are usually all lowercase and a bit shorter. I would
    rename "listenerName" to just "listener", and "SubmissionsEnableOnHTTPS" to
    "submissionhttps".
  • log := mlog.New("smtp-alpn", nil), the log name is usually the package
    name, so it would be "smtpserver".

It could also be useful to track in logging & metrics that a connection came
in over https. Perhaps an attribute on https-alpn connections. And protocols
like "imaphttps" and "submissionhttps" in the metrics.

It would be useful to add some tests. Maybe in the integration test, enabling
it on one listener, testing the https alpn connection works on that listener
and not on the other. You're welcome to look into that, but I'm OK doing it
too.

The solution I chose was to avoid setting up a TLS connection if one was already provided to the `serve()` functions.
Also this required a new attribute on the IMAP and SMTP connection structs, and made it easier to avoid using the reflection tools.
@s0ph0s-dog
Copy link
Author

Approach looks good. Documentation looks sufficient, and warnings aren't too verbose.

Great!

Rebasing will cause some trouble (sorry) with the recent addition of TLS client cert authentication. I think we may need another flag "this is already a tls conn, don't start another tls server".

I implemented this as a "viaHTTPS" flag, because then that also makes the protocol metrics & logging easier.

  • The slog attributes are usually all lowercase and a bit shorter. I would
    rename "listenerName" to just "listener", and "SubmissionsEnableOnHTTPS" to
    "submissionhttps".
  • log := mlog.New("smtp-alpn", nil), the log name is usually the package
    name, so it would be "smtpserver".

I think both of these are fixed!

It could also be useful to track in logging & metrics that a connection came in over https. Perhaps an attribute on https-alpn connections. And protocols like "imaphttps" and "submissionhttps" in the metrics.

Yep! I originally fixed the merge conflict by using reflection to avoid setting up TLS when the connection is already TLS, but the "viaHTTPS" method enabled the logging/metrics to be easy to implement, and also meant I didn't need reflection.

It would be useful to add some tests. Maybe in the integration test, enabling it on one listener, testing the https alpn connection works on that listener and not on the other. You're welcome to look into that, but I'm OK doing it too.

I'm taking a look at this now—I've found the integration tests, but I'm not yet sure I understand how they work. I'll keep poking at it, and let you know if I'm not able to figure it out soon.

@s0ph0s-dog
Copy link
Author

I've now figured out why the example integration test didn't make any sense at first: the integration tests are meant to be run in Docker via the makefile, and the code under test is in a separate Docker container from the integration_test executable. That explains a lot, and now I think I have enough pieces to be able to write an integration test for this feature.

@s0ph0s-dog
Copy link
Author

Alright, how does the test I added look?

@mjl-
Copy link
Owner

mjl- commented Dec 25, 2024

Tests looks good, especially with also testing that the alpn access doesn't work if not enabled.
The request for the mta-sts endpoint isn't entirely working though. http.Get() only returns a non-nil err if writing the request or reading the response fails, not if the status is not 2xx. The response code will be 404 (as I noticed when adding that check to the test), because the url isn't correct, it should be of the form https://mta-sts.<recipientdomain>/.... But the recipient domain is mox1.example, but that domain is also in use with another container.

Anyway, as this looked quite ready to merge, I took another good look, and realized the alpn imap/smtp helper is used for all listener, regardless of in which listener it was configured. I fear this will cause trouble future development, where we would have to change that code and behaviour. So I decided to try to change the https/alpn setup to make it per-listener. Attached is a diff with those changes compare to this latest MR.

It has a few more changes:

  • In config, rename EnableOnHTTPS to EnabledOnHTTPS (add "d"), to match the current "Enabled" bool field.
  • Instead of warning that EnabledOnHTTPS is set but there is no https server on port 443, just cause EnabledOnHTTPS to start listening for https on port 443.
  • In smtpserver, reverted the metric tracking. I pointed to it before, but I think the added logging for new connections is enough after all.
  • Removed snippet from moxacmepebblealpn.sh for integration test that wrote DNS records that would conflict with the non-alpn container, or aren't needed.

Could you have a look at the diff and check if it looks reasonable?
Are there any more changes we would need before getting this in?

https-alpn-mail.diff.txt

@s0ph0s-dog
Copy link
Author

Tests looks good, especially with also testing that the alpn access doesn't work if not enabled. The request for the mta-sts endpoint isn't entirely working though. http.Get() only returns a non-nil err if writing the request or reading the response fails, not if the status is not 2xx. The response code will be 404 (as I noticed when adding that check to the test), because the url isn't correct, it should be of the form https://mta-sts.<recipientdomain>/.... But the recipient domain is mox1.example, but that domain is also in use with another container.

Ah, oops! I think it shouldn't matter for the purpose of the test, since all it really needs to do is get any valid HTTP response (instead of IMAP or SMTP), but I agree that it would definitely be better to have a 200 response from the correct domain.

Anyway, as this looked quite ready to merge, I took another good look, and realized the alpn imap/smtp helper is used for all listener, regardless of in which listener it was configured. I fear this will cause trouble future development, where we would have to change that code and behaviour. So I decided to try to change the https/alpn setup to make it per-listener. Attached is a diff with those changes compare to this latest MR.

I looked through the patch and I think I see what you mean. Thank you for catching that!

It has a few more changes:

  • In config, rename EnableOnHTTPS to EnabledOnHTTPS (add "d"), to match the current "Enabled" bool field.
  • Instead of warning that EnabledOnHTTPS is set but there is no https server on port 443, just cause EnabledOnHTTPS to start listening for https on port 443.
  • In smtpserver, reverted the metric tracking. I pointed to it before, but I think the added logging for new connections is enough after all.
  • Removed snippet from moxacmepebblealpn.sh for integration test that wrote DNS records that would conflict with the non-alpn container, or aren't needed.

Could you have a look at the diff and check if it looks reasonable? Are there any more changes we would need before getting this in?

It looks reasonable to me! The tests pass, but I haven't managed to successfully test it with a Delta Chat client yet. Part of this is that I forgot I needed to implement autoconfigure, but the main issue is that I can't get Delta Chat to talk to a Mox server which presents a certificate signed by a private certificate authority. After spending the second half of today digging into it, I realized it's because the Delta Chat core library uses the Mozilla CA certificate list exclusively, with no override options. This means I can't test with a private CA without recompiling the Delta Chat client.

Identifying this issue has exhausted me for the day, so I'm going to put it aside and decide how to approach client testing again when I revisit it with fresh eyes tomorrow. In the meantime, I'll push your patch and also the code I wrote to make autoconfigure output the port 443 information.

Also, merry Christmas/happy holidays!

@s0ph0s-dog
Copy link
Author

After sleeping on it, I realized that I don't need the test Mox instance to be publicly accessible in order to obtain a certificate for the domain! I've now set up a test environment where I can have Delta Chat Desktop talk to Mox. I have not yet been able to convince Delta Chat Desktop to use the ALPN connection, so I'm working with the Delta Chat folks to figure that part out.

However, I've also discovered that the https-alpn-mail patch you linked above breaks the ability to view Mox webpages in Safari and Firefox. I don't yet understand why this happens, but any attempt results in this error in the Mox logs:

l=info m="https error" err="http: TLS handshake error from 192.168.64.1:57881: tls: client
 requested unsupported application protocols ([h2 http/1.1])" pkg=http pkg=net/http

I've added more logging in http/web.go to verify that all of the HTTP listeners are being configured with h2 in the ALPN list, and I confirmed with Wireshark that the requests from Safari/Firefox are sending two separate protocol names, rather than one long protocol name (h2 and http/1.1 vs. h2 http/1.1).

I'm currently working on testing with Delta Chat in a branch from before I applied your patch (but with cherry-picked autoconfig changes). If you have time and motivation to investigate the h2 ALPN thing, I'd appreciate it! If not, I can figure it out after I work out how to get Delta Chat to use this feature (finally).

@mjl-
Copy link
Owner

mjl- commented Dec 29, 2024

If you have time and motivation to investigate the h2 ALPN thing, I'd appreciate it!

I found the problem. I think it still worked when ACME was enable, but not
with manual TLS certs. With ACME enabled, the TLS config got alpn h2 &
http/1.1 early from the ACME TLS config. Without ACME, the call to
http2.ConfigureServer would add h2 & http/1.1 to the NextProtos of the
tlsconfig, but on a clone, while the call to tls.NewListener was already done
on the tlsconfig from before the clone. So the config used by tls.NewListener
didn't get h2 & http/1.1 in NextProtos.

I fixed it by cloning all TLS configs at an earlier moment. Then we don't
have to clone it between tls.NewListener and http2.ConfigureServer.

There is also a small change for setting "kind" "acme-tls-alpn-01", I can't
see how the previous code would do anything. I'm going to commit that snippet
separately.

fix-non-acme-alpn.diff.txt

[The specification](https://www.bucksch.org/1/projects/thunderbird/autoconfiguration/config-file-format.html#Placeholders) allows the email address to be omitted from the request, and the server should instead send `%EMAILADDRESS%` in place of it in the XML response. If the client doesn’t provide an email address, this code uses the domain of the request, with any `autoconfig.` prefix removed, to look up the correct server configuration.
@s0ph0s-dog
Copy link
Author

Okay, I've gotten Delta Chat Desktop to talk to Mox over 443 with ALPN, and it all works!

Thank you for investigating the HTTP2 thing—I'll test that patch now and push everything once I've tested.

@s0ph0s-dog
Copy link
Author

Yep, testing shows that it all works now! Thank you for your help!

If you're alright with the code I've added to the autoconfig system, then this is ready to merge :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants