Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use behavior stacking for handler implementation #146

Merged
merged 3 commits into from
Jan 5, 2025

Conversation

mruoss
Copy link
Contributor

@mruoss mruoss commented Dec 18, 2024

Hey Mat

This is a refactoring proposal which addresses #144. Rather than "merging" the behaviors of GenServer and ThousandIsland.Handler, we can stack them. I.e. implement the GenServer behavior inside ThousandIsland.Handler and delegate necessary bits to the actual handler implementing only the behaviorThousandIsland.Handler.

Though minimal, the changes sre not backward compatible as we change the interface of handle_info.

Wdyt?

@mruoss
Copy link
Contributor Author

mruoss commented Dec 18, 2024

Note: Benjamin Wilde published a video about this pattern he calls "Behaviour Stacking". This also makes implementing BYO handlers much easier as you can delegate to ThousandIsland.Handler where needed.

Also note: Right now only handle_info is delegated to the handler but can also extend the behavior with handle_call and friends

@mtrudel
Copy link
Owner

mtrudel commented Dec 19, 2024

First off, I think the idea of simplifying the using block to something smaller by factoring out to plain ol' functions on ThousandIsland.Handler is a good one; it's easier to read, isolates the macro complexity, and allows for reuse when people do end up rolling their own. I like that part.

What I'm less in favour of is the idea of making the user's handler not be the GensServer. the wrapper pattern used here (where ThousandIsland.Handler is the one implementing the GenServer callbacks and ThousandIsland.Handler is basically duck typed to GenServer) looks an awful lot like how eg kafka_ex (used to?) do things, and it was a massive pain in the ass when you needed to be a 'real' GenServer. Considering things like OTP debug facilities, facilities like hot code reloading, even error formatting; if we want to cover all those cases then we need to plumb facilities for them through the Handler interface. That's not the kind of fun I want to be having here. The handler module itself needs to be the one calling use GenServer and the module being passed to start_link. Everything else is going to be nightmare of corner cases.

That being said, we could probably get a good chunk of the way there with a combination of defdelegateing to ThousandIsland.Handler and calling helper functions. That's a tack I'm 100% in favour of

@mruoss
Copy link
Contributor Author

mruoss commented Dec 19, 2024

I hear you and I will update the PR. After all that's all the related ticket is about.

For the sake of conversation and my understanding: All that use GenServer does is @behaviour GenServer, a child spec and a few default implementations for the handle_* callbacks, no? What stops us from delegating the GenServer's callbacks required for OTP debug facilities to the Hanlder module (i.e. add the necessary functions to the the ThousandIsland.Handler behviour)?
The upside would a uniform api (eg. {:noreply, {socket, state}}vs.{:continue, state}` results). In terms of processes, nothing changes.

@mtrudel
Copy link
Owner

mtrudel commented Dec 19, 2024

All that use GenServer does is @behaviour GenServer, a child spec and a few default implementations for the handle_* callbacks, no?

I mean yes, but that's a bit of a load bearing 'All'. There's tricky things like subtle before_compile hook ordering, all the hooks that :sys uses for debugging and suspend, etc. It's all doable for sure, but also squarely outside the mandate of the library, as nice as it would be to have a consistent returning API for handle_info et al.

@mruoss
Copy link
Contributor Author

mruoss commented Dec 19, 2024

There's tricky things like #105

This fallback is not needed anymore because of the consistent API ;)

all the hooks that :sys uses for debugging and suspend

But... the process still is a GenServer. The only thing that changes is where the functions are defined which called by the GenServer module (or the erlang version of it). By forwarding format_status to the handler I would assume you get all you had before, no?

Anyway I don't want to waste your time on this. I can refactor the MR as I said. The code will get be a bit messier because in this approach we really have to keep up the pattern matching so the callbacks don't get overwritten.

@mtrudel
Copy link
Owner

mtrudel commented Dec 19, 2024

Anyway I don't want to waste your time on this.

Nonono, this is great! I just want to avoid having to steward code that's duplicating GenServer (to paraphrase Virdig, I don't want to write an ad hoc informally-specified bug-ridden slow implementation of half of GenServer :)

I can refactor the MR as I said. The code will get be a bit messier because in this approach we really have to keep up the pattern matching so the callbacks don't get overwritten.

Stepping back for a moment, I think the ordered list of goals here is to:

  1. Make Handler's implementation be as reusable as possible by folks who are rolling their own Handlers
  2. Reduce to amount of stuff that's defined inline inside the __using__ block
  3. See what we can do to improve the ergonomics of GenServer's handle_* calls, viz a viz their return types

The constraints are:

  1. Not copying GenServer's internal behaviour (but delegating to it is fine!)
  2. Have the resulting process be a GenServer, not just duck typed to look like one
  3. Do our best to avoid breaking changes

To be clear, I'd love to see the above happen (and greatly appreciate your effort in this!). Just so we're on the same page, can you just quickly lay out the overview of 'I can refactor the MR as I said'?

@mruoss
Copy link
Contributor Author

mruoss commented Dec 19, 2024

Nonono, this is great!

okay, cool. I like this, too.

Agree on goals and contraints.

So what I am trying to say is that I think the current version of the MR is still within the constraints (besides BC, see below):

  1. Not copying GenServer's internal behaviour (but delegating to it is fine!)

My approach delegates from ThousandIsland.Handler to actual Handler, your apprach (iiuc) is the opposite direction. My point being: It's just functions calling functions. Let's choose the direction that makes most sense.

  1. Have the resulting process be a GenServer, not just duck typed to look like one

Exactly. The process must be a GenServer. No additional processes (i.e. no message sending, just function calls)

  1. Do our best to avoid breaking changes

My does introduce a breaking change in order to uniform the API. But it doesn't have to! handle_info can be delegated 1:1 wo keep the API as is.

Just so we're on the same page, can you just quickly lay out the overview of 'I can refactor the MR as I said'?

The alternative approach would be to keep the function declarations inside the using macro the same as they were but move their body / implementation out to ThousandIsland.Handler and call / delegate to them. But this way we can't "simlify" the handle_infos (we still have to keep the pattern matching because these functions end up in the actual handler)

@mtrudel
Copy link
Owner

mtrudel commented Dec 19, 2024

My approach delegates from ThousandIsland.Handler to actual Handler, your apprach (iiuc) is the opposite direction. My point being: It's just functions calling functions. Let's choose the direction that makes most sense.

I've always had it in my head that the user's Handler is the thing that's directly interacting with all of the GenServer machinery in start_link etc. This has always just seemed easier to understand ('the use ThousandIsland.Handler macro just injects some start_link/init functions and a few tightly pattern matched handle_info({:tcp,....}) calls into your own module. There's no magic'). So from that perspective I've always thought it would be sensible to try and slim down the __using__ macro to only inject the minimum needed and delegate to concrete functions on ThousandIsland.Handler for actual implementation. But in trying to sketch out what that would actually look like just now, I realize that we end up just smearing a bunch of mostly related logic between the __using__ block and the concrete functions. I'm not sure that's an improvement.

On the other hand, going the other way & having ThousandIsland.Handler be the thing that directly interacts with all of the GenServer machinery would mean that users's Handlers are 'wrapped' and are dependent on ThousandIsland.Handler passing through all of the relevant functions (and having to make some concessions about whether it's ThousndIsland.Handler or the user Handler's state that gets displayed in things like :sys.get_state etc). Having lived through this pattern as a user of kafka_ex, it's actually a real pain to deal with.

So both extremes are kinda lousy. I wonder if something like the 'modular macros' approach that Phoenix uses for Endpoint would be a good compromise? We'd be able to keep most of the logic encapsulated in tight helper macros within ThousandIsland.Handler, but still have them lexically declared within the user's Handler via the __using__ macro. Folks rolling their own handlers could pick and choose which ones they'd need by calling the underlying modular macros directly

WDYT?

@mruoss
Copy link
Contributor Author

mruoss commented Dec 19, 2024

Okay I see. In my approach I had LiveView in mind where I believe (I don't claim understanding what's going on in detail) that Phoenix.LiveView.Channel has the use GenServer i.e. interacts directly with GenServer while the LiveView has its own behaviour (with mount and handle_event etc.). But it's exactly how you describe it. You wrap the handler functions and I guess the handler module only gets passed a subset of the actual GenServer state.

I don't have experience with kafka_ex and can't relate to the pain you went through with it and don't understand it at this point. Maybe I have to learn the hard way at some point.

But yes, I guess using 'modular macros` could be a way to go. It might be a bit strange to use those macros to create a BYO handler but as you said, there are supposedly not many of those use cases. Guess I need to think this approach through a bit more.

The downside of having the logic in macros is that you blow up your code as macros are injected into all useing modules. But that's probably not such a big deal in this case as I guess you don't end up implementing many different TCP connection handlers in one application.

@mruoss
Copy link
Contributor Author

mruoss commented Dec 21, 2024

I have refactored the code now to use the modular macros approach. I've moved two helper functions out of using though. wdyt?

@mruoss mruoss force-pushed the handler-refactoring branch from cf850a6 to 01b48fd Compare December 21, 2024 12:35
@mtrudel
Copy link
Owner

mtrudel commented Dec 21, 2024

This looks great! Assuming you're good with this standing as is, I'm going to add a bit of documentation to the top (mostly to remember what exactly our design goals were here!) over the next few days and merge.

@mruoss mruoss force-pushed the handler-refactoring branch from 01b48fd to 863c68a Compare December 21, 2024 20:17
@mruoss
Copy link
Contributor Author

mruoss commented Dec 21, 2024

Yeah I'm totally fine with this. But Credo has a problem with the complexity level of genserver_impl/0... Fixed the other two Credo issues and leave it to you for the docs.

@mruoss mruoss force-pushed the handler-refactoring branch from 863c68a to e7e2a47 Compare December 26, 2024 12:24
@mruoss
Copy link
Contributor Author

mruoss commented Dec 26, 2024

In case you're interested what I want to use this for: https://gist.github.com/mruoss/57384a7775ef75324bbe9552ecdf69f9

Just playing around in a livebook for now.

@mtrudel mtrudel merged commit 830f571 into mtrudel:main Jan 5, 2025
9 checks passed
@mtrudel
Copy link
Owner

mtrudel commented Jan 5, 2025

Released as 1.3.8, sorry for the delay!

@mtrudel
Copy link
Owner

mtrudel commented Jan 5, 2025

I've been mulling over the idea of behaviour stacking since this PR came in and I think I may be changing my tune a bit. I think doubling down on this model could potentially facilitate a couple of different (and to date largely abstract) ideas I've had for improvements to Thousand Island. There are some places where we'd need to be really careful to not get in our own way too much (mostly around how much of system behaviours such as GenServer we end up having to intermediate), but I feel like it's solvable and definitely worth exploring further.

First some background, mostly for me to try and organize my thinking here:

  • Last year I reworked Bandit's HTTP/2 stack to use a GenServer model for stream processes (my initial implementation implemented HTTP/2 streams via the Task process model). This was prep work for adding WebSockets over HTTP/2 (RFC 8441) support to Bandit (still a work in progress), since when a stream process 'upgrades' itself to a WebSocket connection it needs to have a GenServer shape.

  • I've also been coming to appreciate the distinction between the singular 'connection' that a client makes to the server at a TCP level, and the multiple distinct 'requests' that it issues on this connection (either via HTTP/2 streams or HTTP/1 keepalives). As part of this I've been trying to think about ways that this distinction could be facilitated by Thousand Island, since there are many aspects (network framing, request process supervision, others) which are common enough to be solved generically.

  • Bandit has always supported the idea of swappable Handlers (this is how we support HTTP/1 and HTTP/2 on the same connection, and how we handle HTTP/1 sourced WebSocket upgrades). This is done within Bandit today via the DelegatingHandler handler, but it's always felt like generic enough behaviour that it should be part of Thousand Island. But, given the current 'non-stacked' Handler pattern, it's not been possible.

  • I've also come to realize that major version bumps aren't that big of a deal for Thousand Island due to it being used mostly transitively, so we don't have to be held back by 'what is' and can think a lot more freely about 'what could be'. Major version bumps are basically a deal breaker for Bandit, but they're much easier to swallow here.

With that out of the way, here's some potential ideas for how this could look:

  • The concrete parts of the Handler module go from being macro'd in to being regular functions on a bog standard module.
  • The behavioural parts of the Handler module become a much smaller macro that's basically just default implementations
  • The Handler behaviour grows to encapsulate handle_call, handle_cast, handle_info and the rest of the GenServer family. Messages that are received by the 'wrapper' Handler get pushed straight through to the user's Handler implementation.
  • We unwrap the internal state in these calls so that users don't have to worry about the {socket, state} tuple structure
  • We add {:switch, new_handler, new_state} return values to all functions which allow the handler module to be redefined on the fly

That in and of itself is enough to tie up most of the loose ends I've been trying to find a home for in Bandit, and would simplify things quite a bit. On the other hand, we'd be on the hook for staying up to date with changes to the GenServer behaviour.

I'm also thinking about some more far flung ideas to abstract framing / connection multiplexing & supervision away as well, but I figure this is a good place to call for a discussion. WDYT?

@mruoss
Copy link
Contributor Author

mruoss commented Jan 6, 2025

Good old Christmas break gives one plenty of time to think on such things, right? :)

First let me share my opinion on semantic versioning since you mentioned that major version bumps are a deal breaker for Bandit. I'm a big fan of semantic versioning and I have the feeling that library maintainers are sometimes a bit too afraid to make version bumps. Don't get me wrong, I think breaking changes should be avoided if at all possible and I'm super impressed by the Elixir core team still being on 1.x.
But sometimes after many thoughts you come to the conclusion that a breaking change is worth it so why not bump the major version? For me a major version bump automatically makes me look for breaking changes in the changelog. And I'm more than happy if I find that there is a clear and easy migration path and I gladly replace my ~> 1.0 to a ~> 2.0. I think there is nothing to be afraid of there. As a maintainer of something like this (speaking of Bandit), you can always help a project like Phoenix with a PR to migrate to a newer major version.

Now about the behaviour stacking, I have already shared most of my thoughts. I can't say I have a ton of experience using it - after all I've discovered this only some weeks ago.

[...] we'd be on the hook for staying up to date with changes to the GenServer behaviour.

Yes. But breaking changes on GenServer are highly unlikely. As for non-breaking changes (let's imagine a new callback being introduced) imo it wouldn't be the end of the world if the behaviour doesn't support these right away. If community has a need for it, they are one PR away from adding a "proxy callback".

Also, ThousandIsland is your lib. Your code. It should also be your behaviour (even though it mimics the GenServer behaviour). To me defining your own behaviour is much more transparent than pointing to the GenServer behaviour from a documentation point of view. If I use ThousandIsland.Handler, implementing its behaviour feels more natural than implementing GenServer's behaviour.

But maybe weigh in some other opinions on this...

@mruoss
Copy link
Contributor Author

mruoss commented Jan 6, 2025

Note that on the k8s port-forwarding playground (Gist I linked above), I've gone with full behaviour stacking to get a bit of a feeling for it.

@mtrudel
Copy link
Owner

mtrudel commented Jan 6, 2025

1.3.9 just went out with a tiny dialyzer fix that only became apparent when I tried to integrate this update into Bandit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants