Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix various performance issues #724

Closed
wants to merge 1 commit into from
Closed

Conversation

4e554c4c
Copy link
Contributor

This commit includes various fixes that I have found working on bouncer-networks. It includes several performance benefits—such as changing Server to be an Arc<str> which removes much of the copying overhead. I have also made sure that there is only one copy of the server config in the Halloy struct, which was previously duplicated for seemingly no reason.
Still, several performance issues go un-addressed. such as the duplication of server configs between threads (should this also be an Arc<config>?) and the copying of client structs.

I have also made some changes to the client. The capability negotiation process has been streamlined, and I've eliminated some branching with preference toward pattern matching. I've also fixed a slight error with halloy, making sure that it detects the end of registration correctly (matching other clients).

I have also made changes to the Cargo.toml in the data crate. This is to fix some compilation issues when building that crate in isolation. I would like this fixed so that it'd be easier to run data crate tests in isolation.
Ideally I think that each of the sub-crates should be tested individually in CI—but that's out of scope for this PR.

In my opinion, this resolves #640

pub struct Server(String);
#[serde(transparent)]
pub struct Server {
name: Arc<str>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to change this to a named struct instead of a tuple struct, since future PRs will add fields here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't implement Serialize / Deserialize for this if we expect it to become stateful / hold additional data as this will break out serialization format in a few places.

I'd prefer we keep it a newtype for now (with the arc is fine), and if in the future if we want to add fields, we can rename this to server::Name and have some new Server struct that takes over that role

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately servers and server names are traeted as the same thing basically all throughout the codebase. The config is treated as the de-facto location where servers are stored (with everything else being cloned from there) so we kind of have to start changing things here unless we change the entire codebase

Doing things this way makes the bouncer networks change a fairly modest ~100 patch before bouncer-networks specific logic which i thought would be better than totally refactoring everything

data/src/client.rs Outdated Show resolved Hide resolved
Copy link
Member

@casperstorm casperstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks for a great PR @4e554c4c.
I'd like to see if @tarkah has any comments before we merge it.

@4e554c4c 4e554c4c mentioned this pull request Jan 24, 2025
data/src/client.rs Outdated Show resolved Hide resolved
This commit includes various fixes that I have found working on
bouncer-networks. It includes several performance benefits—such as
changing `Server` to be an `Arc<str>` which removes much of the copying
overhead. I have also made sure that there is only one copy of the
server config in the `Halloy` struct, which was previously duplicated
for seemingly no reason.
Still, several performance issues go un-addressed. such as the
duplication of server configs between threads (should this also be an
`Arc<config>`?) and the copying of `client` structs.

I have also made some changes to the client. The capability negotiation
process has been streamlined, and I've eliminated some branching with
preference toward pattern matching. I've also fixed a slight error with
halloy, making sure that it detects the end of registration correctly
(matching other clients).

I have also made changes to the `Cargo.toml` in the `data` crate. This
is to fix some compilation issues when building that crate in isolation.
I would like this fixed so that it'd be easier to run `data` crate tests
in isolation.
Ideally I think that each of the sub-crates should be tested
individually in CI—but that's out of scope for this PR.

In my opinion, this resolves squidowl#640
@4e554c4c
Copy link
Contributor Author

updated

I experimented with some other things (in particular using HashSet operations to update caps) that didn't end up so well due to struct field borrowing issues...
Eventually I think it'd be nice to move this into a Cap struct that handles most of the logic, since that would simplify a lot (especially because the logic for CAP NEW and CAP LS should basically be equivalent).

@4e554c4c 4e554c4c requested a review from andymandias January 28, 2025 16:36
Copy link
Collaborator

@andymandias andymandias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, and I didn't find any issues in testing.

@@ -1168,11 +1103,18 @@ impl Client {
// Updated actual nick
let nick = ok!(args.first());
self.resolved_nick = Some(nick.to_string());
}
// end of registration (including ISUPPORT) is indicated by either RPL_ENDOFMOTD or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the link on the next line it reads as though RPL_WELCOME through RPL_ENDOFMOTD or ERR_NOMOTD is technically all post-registration. It doesn't really matter what that series of messages is called though, this update is correct; we were utilizing RPL_ISUPPORT parameters in channel parsing for sending JOINs, but feature advertisements come after RPL_WELCOME (and before RPL_ENDOFMOTD or ERR_NOMOTD).

Comment on lines 333 to -336
let removed_servers = self
.servers
.servers()
.keys()
.filter(|server| !updated.servers.contains(server))
.cloned()
.collect::<Vec<_>>();

self.servers = updated.servers.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally not a fan of this. Config should be immutable and exactly what we loaded from the disk. We add / remove from this map when servers are disconnected or added via irc:// url and is our source of truth for what is currently connected, not what they've configured

Instead, we should change config.server to be a normal HashMap so it can't be conflated with Halloy.servers / server::Map and misused. We can convert to server::Map initializing Halloy from config or updating here.

Copy link
Contributor Author

@4e554c4c 4e554c4c Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would be ok with this if there were only one Config, but server configs are copied everywhere which is a huge pain for keeping things in sync.

Not to mention that the current logic deletes all servers which are not in the config whenever the config updates, including dynamically added servers, so these two structs are essentially treated as equal except in some edge cases

Copy link
Member

@tarkah tarkah Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would be ok with this if there were only one Config, but server configs are copied everywhere which is a huge pain for keeping things in sync.

The only thing accessing Config.servers is us cloning it to Halloy.servers. And the only thing Halloy.servers is used by is logic for diffing when new servers are added (via irc://) or removed (via config refresh) and then it's our source of truth for which server subscriptions we maintain (active IRC connections).

But by using Config.servers as our source of truth, we have to mutate the users configuration and we also lose control of how to manage runtime servers outside the context of "configuration"

pub struct Server(String);
#[serde(transparent)]
pub struct Server {
name: Arc<str>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't implement Serialize / Deserialize for this if we expect it to become stateful / hold additional data as this will break out serialization format in a few places.

I'd prefer we keep it a newtype for now (with the arc is fine), and if in the future if we want to add fields, we can rename this to server::Name and have some new Server struct that takes over that role

@4e554c4c
Copy link
Contributor Author

4e554c4c commented Feb 4, 2025

closing this until v2

@4e554c4c 4e554c4c closed this Feb 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rfc: refactoring Server
4 participants