Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple domains support #3870

Open
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

BlockListed
Copy link
Contributor

@BlockListed BlockListed commented Sep 9, 2023

Fixes #2690

Very WIP PR, I just want some feedback about my approach for now.

I am planning to go with the allowed domains approach.

Overview:

  • We create 2 Hashmaps, which map the Host header to either Domain or Origin.
  • We create a hashmap, which maps the Host header to a combination of Domain and Origin.

Limitations:

  • All domains have to have the same path, because otherwise we would need one web server instance for each different path.

Future:
- Change JWT system to create tokens, which work for a single domain.

@BlackDex
Copy link
Collaborator

BlackDex commented Sep 9, 2023

My first question would be, why change Host to Domain?
That is not how the header is called.

And a dev tip, install and activate pre-commit, that will help fixing issues before committing 😄

@BlockListed
Copy link
Contributor Author

BlockListed commented Sep 9, 2023

first question would be, why change Host to Domain? That is not how the header is called.

And a dev tip, install and activate pre-commit, that will help fixing issues before committing 😄

Host

Because it definitely isn't a host.

format!("{protocol}://{host}")

This is a base URL, but you could say, that domain might also not be the correct term. My reasoning was, that domain is the term used for the config.
It is used when creating URLs and it was very confusing. (for me at least)
I'd have no problem changing it back tho.

Pre-Commit

I ran cargo clippy, I know it's definitely not finished yet, but I'm not gonna put in all that work and find out at the end, that my approach is fully incompatible with the vision/goals of the project maintainers.

@stefan0xC

This comment was marked as resolved.

@BlockListed
Copy link
Contributor Author

BlockListed commented Sep 9, 2023

Because it definitely isn't a host.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Host

It does not store a host. It stores a base url.

If it stored a host the extractor would be:

let headers = ...;

if let Some(host) = headers.get_one("X-Forwarded-Host") {
    host
} else if let Some(host) headers.get_one("Host") {
    host
} else {
    panic!("invalid http request");
}

But it's not, the extractor gets a URL from a combination of protocol, host and path (from config.domain()).

It doesn't store developer.mozilla.org (from Host: developer.mozilla.org) it would store https://developer.mozilla.org.

@BlackDex
Copy link
Collaborator

BlackDex commented Sep 9, 2023

It definitely isn't a domain, a domain is the last 2 (or 3) parts.
So, a domain is vaultwarden.org, a host is vault.vaultwarden.org, so what we are receiving is actually a host.

Edit: so actually, our config is wrong too 😅

@BlockListed
Copy link
Contributor Author

BlockListed commented Sep 9, 2023

Ot definitely isn't a domain, a domain is the last 2 (or 3) parts. So, a domain is vaultwarden.org, a host is vault.vaultwarden.org, so what we are receiving is actually a host.

I know, but the precedent is set in config.rs

/// Domain URL |> This needs to be set to the URL used to access the server, including 'http[s]://'
/// and port, if it's different than the default. Some server functions don't work correctly without this value
domain:                 String, true,   def,    "http://localhost".to_string();

This definitely isn't a domain, but that's what the precedent says.

We combine a host with a protocol and possibly a path into a URL.

@BlackDex
Copy link
Collaborator

BlackDex commented Sep 9, 2023

Ot definitely isn't a domain, a domain is the last 2 (or 3) parts. So, a domain is vaultwarden.org, a host is vault.vaultwarden.org, so what we are receiving is actually a host.

I know, but the precedent is set in config.rs

/// Domain URL |> This needs to be set to the URL used to access the server, including 'http[s]://'
/// and port, if it's different than the default. Some server functions don't work correctly without this value
domain:                 String, true,   def,    "http://localhost".to_string();

This definitely isn't a domain, but that's what the precedent says.

I understand, but the location where you changed it has nothing to do with config, but with the headers.

@BlockListed
Copy link
Contributor Author

BlockListed commented Sep 9, 2023

Ot definitely isn't a domain, a domain is the last 2 (or 3) parts. So, a domain is vaultwarden.org, a host is vault.vaultwarden.org, so what we are receiving is actually a host.

I know, but the precedent is set in config.rs

/// Domain URL |> This needs to be set to the URL used to access the server, including 'http[s]://'
/// and port, if it's different than the default. Some server functions don't work correctly without this value
domain:                 String, true,   def,    "http://localhost".to_string();

This definitely isn't a domain, but that's what the precedent says.

I understand, but the location where you changed it has nothing to do with config, but with the headers.

So then the correct term would be BaseURL right??
Because what the Host / Domain struct stores is definitely not a host.

I renamed it, because in the optimal situation (the domain is configured) the struct Stores the value from the domain configuration variable.

But all I really want to know is if this approach (looking up the domain by host), not implementation, vibes with the project vision.

@BlackDex
Copy link
Collaborator

BlackDex commented Sep 9, 2023

But, maybe we need to wait for the whole picture. But because we didn't do it correctly in one place, doesn't mean we shouldn't at others or improve. Else we should just make it more clear using comments for other developers maybe. But i would prefer to use the right naming where possible when we adjust the code.

@BlackDex
Copy link
Collaborator

BlackDex commented Sep 9, 2023

There are some issues. Mails can be sent without any HTTP connection, so there must be one main host/domain in the config which will be used in those cases. Using a different host/domain for mails when there is a HTTP connection will become confusing.

So, the only location where i think it will be useful for are MFA and attachments/sends.

@BlockListed
Copy link
Contributor Author

BlockListed commented Sep 9, 2023

There are some issues. Mails can be sent without any HTTP connection, so there must be one main host/domain in the config which will be used in those cases. Using a different host/domain for mails when there is a HTTP connection will become confusing.

So, the only location where i think it will be useful for are MFA and attachments/sends.

That's why I added pub fn main_domain(&self) -> String to Config, it should probably be more clear in the configuration though which domain is the main one. (Currently the first one)

@BlockListed
Copy link
Contributor Author

BlockListed commented Sep 9, 2023

Checks are passing now, but I still have some cleanup to be done. Probably will do it tomorrow, after that it should be ready for review.

To do (at least):

  • Remove purposeful breaking changes
  • Re-add domain_origin / domain_path options
  • Do proper error handling for HostInfo extractor

Behaviours which may need to change (I think they're fine)

  • Emails always use the first domain
  • JWT issuer is always the first domain's origin

@BlockListed BlockListed changed the title WIP: Multiple domains support Multiple domains support Sep 9, 2023
@BlockListed
Copy link
Contributor Author

From my perspective I would say this is ready to review.

@BlockListed
Copy link
Contributor Author

BlockListed commented Oct 7, 2023

@BlackDex Sorry for the ping, but is anything gonna happen with this PR? Because if not, it should probably just be closed.

@BlackDex
Copy link
Collaborator

BlackDex commented Oct 7, 2023

It has to wait for a check. As i mentioned in a few other pr's I'm busy with other stuff (for this project). But i just don't have much time currently.

Just be patient and my comments (or other people's comments) will follow eventually.

@BlackDex
Copy link
Collaborator

I still think naming of a few items is still off and could be changed to better match the RFC naming.
It's true that DOMAIN is probably not the correct name here either, and HOST is also wrong, and it should actually be URL or even better ABSOLUTE_URL.

Some items could be moved to config.rs to prevent calling functions every time to extract the right value.
And, i just noticed we actually should not allow configuring the domain via the admin interface since we use that for static lazy loaded JWT Issuers, which should be changed if the domain is changed, or the main domain is switched.

So all in all a lot of items to address i think.

@BlockListed If you still want to work on this let me know and i will add more specific comments to the code.
Else i tend to close this PR for now.

@BlockListed
Copy link
Contributor Author

I still think naming of a few items is still off and could be changed to better match the RFC naming. It's true that DOMAIN is probably not the correct name here either, and HOST is also wrong, and it should actually be URL or even better ABSOLUTE_URL.

Some items could be moved to config.rs to prevent calling functions every time to extract the right value. And, i just noticed we actually should not allow configuring the domain via the admin interface since we use that for static lazy loaded JWT Issuers, which should be changed if the domain is changed, or the main domain is switched.

So all in all a lot of items to address i think.

@BlockListed If you still want to work on this let me know and i will add more specific comments to the code. Else i tend to close this PR for now.

Yeah, I'd still be willing to work on the PR.

Copy link
Collaborator

@BlackDex BlackDex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, have added some comments on some items which i think are mainly an issue.

But in general i think this needs to be revisited on how to solve this.

We want to prevent as much as possible to calculate the the full URL like https://domain.com/vault or https://vault.domain.com or the origin all the time during every call.

For the JWT/MFA this probably need to be dynamic too.
If i use WebAuthn or Duo, i would want to be able to login using those methods via all allowed domains, not only the main one.

Sending emails (especially the ones automated) can of course only be from the main domain. But some of course could be from the domain which the user logged-in on is using. While I'm not fully sure that is what we want though. Force the main domain here does sound more logical in general.

Some quick thoughts maybe.
A hashmap which could quickly match a domain to an origin
Mabye also for path, to match/fetch it for the specific domain.

Also, i have not checked the actual functionally.

@@ -12,6 +12,7 @@ use rocket::{
Catcher, Route,
};

use crate::auth::HostInfo;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved a few lines down to line 21.

@@ -988,7 +1008,7 @@ fn validate_config(cfg: &ConfigItems) -> Result<(), Error> {
}

/// Extracts an RFC 6454 web origin from a URL.
fn extract_url_origin(url: &str) -> String {
pub fn extract_url_origin(url: &str) -> String {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not make this public and use this function at all the other places.
It probably is better for the main_domain to have an extra entry in the config like domain_origin, so main_domain_origin probably. That way it is only generated once when the config reloads and not all the time.

Also, within config.rs it might best be that we change auto to gen to ensure it is generated during config load.

Comment on lines +1022 to +1037
fn extract_origins(urls: &str) -> String {
let mut origins = urls
.split(',')
.map(extract_url_origin)
// TODO add itertools as dependency maybe
.fold(String::new(), |mut acc, origin| {
acc.push_str(&origin);
acc.push(',');
acc
});

// Pop trailing comma
origins.pop();

origins
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you are doing here. It looks like it is only generating the domain_origin value which in the end looks like to never be used only to validate if the amount when split matches domain. Which should, since it is generated. Maybe it doesn't right now because the value is set to auto, so i think it would be best to set it to gen and always generate the value. But, it should then be renamed to something like domain_origin_main probably, and only extract the main element?

All other logic should probably be done within the HostInfo i think.

if embed_images {
"cid:".to_string()
} else {
let domain = domains.split(',').next().expect("Domain missing");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When creating a domain_main variable which will hold the value for of the first item in the list, there should be no need for splitting here any more, and just directly use that special value.

/// Domain path |> Domain URL path (in https://example.com:8443/path, /path is the path)
domain_path: String, false, auto, |c| extract_url_path(&c.domain);
/// MUST be the same for all domains.
domain_path: String, false, auto, |c| extract_url_path(c.domain.split(',').next().expect("Missing domain"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be different per domain of course. So assuming the path only from the first domain is probably not the best solution here.

it might be that for example:

  • https://shared.domain.local/vault
  • https://vault.domain.tld

are configured, or vice-versa. Which would either never add /vault or always add /vault and sometimes this would be an error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have added several log:: entries here. log:: can be removed since those macro's should be callable directly.

But logging in this location could slow down the application a lot. Even though it only should do this during debug, it will do this for every call which will do something with that Struct.
If it is really needed to have this info here, i would then suggest to place it under trace! instead.

@Timshel
Copy link
Contributor

Timshel commented Aug 29, 2024

Hey

So I'm the contributor maintaining the OpenID Connect PR (#3899) and I'm wondering how both feature would interact.

For the SSO, the config is global (As opposed to official Bitwarden where it's per Organization) since I believe it makes more sense in the context of smaller deployment.

The easiest solution would probably be that all domains still use the same SSO, but I'm guessing some user will want more integration between both feature.

On the other end integrating both feature at the organization level would make it closer to how it works in Bitwarden and might allow reusing things like Domain Verification (which appear heavily linked to SSO but probably not 100% since it's in a separate config section).
But this would mean quite a lot of added complexity and maintainers would have to decide if it makes sense for the project.

@BlackDex
Copy link
Collaborator

@Timshel I actually think it might be useful for SSO/OIDC.
During my tests with Keycloak i had some issues at first, mostly because of this.
Even though i added both my testing domains in the Keycloak config, it only worked on the one configured in my config, and not on the one i was using that moment to test stuff.

If an environment is not designed for multiple domains, then admins just need to configure a single domain and not multiple domains of course.

@BlockListed
Copy link
Contributor Author

Yeah I've got some stuff on my plate right now. Will try to start making changes on the weekend. Sorry for not making that more clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple domains support, per organization
4 participants