Dev: Comments on design
This page describes the underlying reasoning for some of the design choices made during maddy development.
This page is a WIP. The initial design discussions are scattered across the issue tracker, IRC channel logs, etc. I'm slowly collecting the main facts here to help new contributors understand why things are done this way.
mod•u•lar (adj.)
- a self-contained unit or item that can be combined or interchanged with others like it to create different shapes or designs.
-- https://www.thefreedictionary.com/modular
"Module", in terms of maddy, is an implementation of replaceable functionality that may be used by other code.
Maddy exposes the technique called "dependency injection" to the user. This allows supporting a large number of possible configurations with a reasonably simple framework. Essentially, maddy becomes a set of building blocks that can be connected together in arbitrary ways to achieve the wanted functionality.
Many people cite the "overflexibility" of existing e-mail stacks (e.g. postfix+dovecot) as a disadvantage. We believe that flexibility is not a problem in itself, but certain outcomes of a poorly thought-out "flexible" design are:
- Multiple ways to do the same thing lead to confusion about differences.
- Unexpected interactions between components lead to confusion and make troubleshooting harder.
So here is how these problems are addressed in maddy's design:
- Avoid having multiple ways to do the same thing: design a single 80% good way to do it and be done with it.
- Draw a clear boundary between components and explain how they interact.
- Keep it simple, stupid.
Looking up modules by name in a single registry was added to reduce the maintenance burden once the number of modules implementing similar functionality becomes large. A module doesn't need to keep track of all modules that can interact with it; it simply looks up the requested module by name and checks whether it can be used (e.g. whether it implements a certain interface).
There is a single namespace due to the Principle of Least Surprise: no matter where you use the name, it will refer to the same entity.
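The lookup-then-check idea can be sketched in Go roughly as follows. This is a minimal illustration, not maddy's actual API: the names Module, DeliveryTarget, Register, LookupDeliveryTarget and the two example module types are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Module is the minimal contract every registered instance satisfies.
// Hypothetical: maddy's real interfaces are not reproduced here.
type Module interface {
	Name() string
}

// DeliveryTarget is an example capability a caller may require.
type DeliveryTarget interface {
	Deliver(msg string) error
}

// registry is the single namespace: one name always maps to one instance.
var registry = map[string]Module{}

// ErrUnknown is returned when no module is registered under a name.
var ErrUnknown = errors.New("unknown module")

// Register adds a module and rejects duplicate names.
func Register(m Module) error {
	if _, dup := registry[m.Name()]; dup {
		return fmt.Errorf("module name %q is already in use", m.Name())
	}
	registry[m.Name()] = m
	return nil
}

// LookupDeliveryTarget finds a module by name and checks, via a type
// assertion, whether it can be used as a delivery target.
func LookupDeliveryTarget(name string) (DeliveryTarget, error) {
	m, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("%w: %s", ErrUnknown, name)
	}
	dt, ok := m.(DeliveryTarget)
	if !ok {
		return nil, fmt.Errorf("module %s cannot be used as a delivery target", name)
	}
	return dt, nil
}

// sqlStorage is a hypothetical module that supports delivery.
type sqlStorage struct {
	name      string
	delivered []string
}

func (s *sqlStorage) Name() string { return s.name }
func (s *sqlStorage) Deliver(msg string) error {
	s.delivered = append(s.delivered, msg)
	return nil
}

// rdnsCheck is a hypothetical module that does not support delivery.
type rdnsCheck struct{ name string }

func (c *rdnsCheck) Name() string { return c.name }

func main() {
	_ = Register(&sqlStorage{name: "local_mailboxes"})
	_ = Register(&rdnsCheck{name: "rdns"})

	if _, err := LookupDeliveryTarget("local_mailboxes"); err == nil {
		fmt.Println("local_mailboxes: usable as a delivery target")
	}
	if _, err := LookupDeliveryTarget("rdns"); err != nil {
		fmt.Println("rdns:", err)
	}
}
```

The caller never enumerates the modules it is compatible with; it only states which interface it needs.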
The concept of "module instances" (aka configuration blocks) was initially introduced to address efficiency concerns. Initially, components of a single module were isolated and had no way to interact with each other. For example, the part of a storage implementation responsible for message delivery had no way to notify the part responsible for providing mail access to clients about a new message. The solution was to put them back together. Now that a single object manages a given resource, that resource can be used more efficiently by sharing information between the contexts where it is used.
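As a rough Go sketch of that state sharing, assume a single hypothetical Storage object that serves both the delivery side and the client access side. The type and its methods are invented for illustration and do not mirror maddy's storage code.

```go
package main

import (
	"fmt"
	"sync"
)

// Storage is a hypothetical module instance that owns one message store
// and is used by both delivery and client access code.
type Storage struct {
	mu        sync.Mutex
	messages  []string
	listeners []chan string
}

// Subscribe is what the mail access side would call to learn about
// new messages.
func (s *Storage) Subscribe() <-chan string {
	ch := make(chan string, 8)
	s.mu.Lock()
	s.listeners = append(s.listeners, ch)
	s.mu.Unlock()
	return ch
}

// Deliver is what the delivery side would call. Because both sides share
// one object, it can notify subscribers directly.
func (s *Storage) Deliver(msg string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.messages = append(s.messages, msg)
	for _, ch := range s.listeners {
		select {
		case ch <- msg:
		default: // listener is slow; skip the notification in this sketch
		}
	}
}

func main() {
	st := &Storage{}
	updates := st.Subscribe()
	st.Deliver("hello")
	fmt.Println("client notified about:", <-updates)
}
```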
However, the module system is sometimes more generic than needed. The "state sharing" explained above is important only for modules that wrap some sort of resource (e.g. a database with messages). There are also modules that don't care about where and how they are used (e.g. dumb message filters based on a single static rule). Configuration syntax that defines them at the top level is a little too verbose. There is no need to share state, so there is no need to have only one instance of the require_matching_rdns module, for example. However, users would want to use different configurations for require_matching_rdns, so we can't just have a single require_matching_rdns object and ditch the whole module system altogether.
The current solution is to allow creating new module instances right where they are used. Such module instances are not registered globally and have no name associated with them, so they can be used only where they are defined in the configuration.
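One way to picture the difference between named and inline instances is the Go sketch below. Everything in it is an assumption made for illustration: the Ref type, the Resolve function, the "action" option and the "named" map are hypothetical and do not describe how maddy parses its configuration.

```go
package main

import "fmt"

// Check is a hypothetical interface for a message check.
type Check interface {
	Describe() string
}

// rdnsCheck stands in for a stateless check such as require_matching_rdns.
// The "action" option is made up for this sketch.
type rdnsCheck struct{ action string }

func (c *rdnsCheck) Describe() string {
	return "require_matching_rdns action=" + c.action
}

// named holds instances defined at the top level of the configuration.
var named = map[string]Check{}

// Ref is how a directive points at a module: either by the name of a
// top-level block or with an inline definition.
type Ref struct {
	Name   string            // set for references to named instances
	Inline map[string]string // set for inline definitions
}

// Resolve returns the instance a reference points to.
func Resolve(r Ref) (Check, error) {
	if r.Inline != nil {
		// Inline definition: a new private instance that is never
		// added to the global map and has no name.
		return &rdnsCheck{action: r.Inline["action"]}, nil
	}
	c, ok := named[r.Name]
	if !ok {
		return nil, fmt.Errorf("unknown module instance %q", r.Name)
	}
	return c, nil
}

func main() {
	named["strict_rdns"] = &rdnsCheck{action: "reject"}

	shared, _ := Resolve(Ref{Name: "strict_rdns"})
	private, _ := Resolve(Ref{Inline: map[string]string{"action": "quarantine"}})

	fmt.Println(shared.Describe())
	fmt.Println(private.Describe())
	fmt.Println("globally registered instances:", len(named))
}
```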
- https://github.com/foxcpp/maddy/issues/15 (initial proposal of modular design)
- https://github.com/foxcpp/maddy/issues/42 (inline module definitions)
Initially, the idea was to have the minimal possible number of lines in a configuration for what users want in most cases. So, basically, if you don't tell maddy what to do, it will do something you probably want it to do (get SMTP/IMAP listeners, SQL DB for messages, DKIM/DMARC/etc. verification, and so on: all that mess for a typical mail server setup). This idea is based on how Caddy configuration works, where it is enough to specify the domain name to have it obtain a TLS certificate for you and configure HTTPS.
There was an argument on IRC about how this contradicts the principle of least surprise: you can't know what maddy is going to do by just looking at its configuration. E-mail is a much more complex technology than the Web (HTTP). For example, strict DNS checks enabled by default may cause unexpected delivery failures.
So the conclusion we came to: it is important for the configuration file to provide a complete description of server behavior in all situations to make sure there are no surprises.
- IRC logs of ##emersion channel (TODO, get them)
Check and modificator interfaces were designed with "generic" implementations in mind. That is, implementations that can apply multiple independent transformations (e.g. think of a check module implementing a milter protocol client).
Putting things into separate interfaces would add a lot of boilerplate to keep track of what is implemented and what is not (possibly with a big runtime cost). Putting things into separate modules would negate the benefits of "state sharing" (a single Unix socket connection pool for a milter client, for example) explained in the previous section.
A check or modificator is not used directly. Instead, a new "state" object is created for each message. Most implementations will not do anything special with it (the StatelessCheck wrapper was created to hide the boilerplate required for state object management), but implementations that need to maintain internal state will. As stated before, these interfaces were designed with "generic" implementations in mind. A possible milter protocol client implementation would need to maintain a separate connection handle for each handled message. This is tricky (and error-prone) to do when there is no per-message context object, which is why the CheckState interface exists.
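The per-message state pattern can be sketched in Go as follows. The method sets are simplified assumptions and not maddy's real Check and CheckState interfaces. milterLikeCheck, StatelessFunc and run are hypothetical names; StatelessFunc only mimics the spirit of the StatelessCheck wrapper.

```go
package main

import "fmt"

// CheckState is the per-message object. Simplified for this sketch.
type CheckState interface {
	CheckSender(from string) error
	Close() error
}

// Check creates a fresh state object for every message.
type Check interface {
	NewMessage(msgID string) (CheckState, error)
}

// milterLikeCheck is a hypothetical check that needs one connection
// handle per message.
type milterLikeCheck struct{ opened int }

type milterState struct {
	conn string // stand-in for a per-message connection handle
}

func (c *milterLikeCheck) NewMessage(msgID string) (CheckState, error) {
	c.opened++
	return &milterState{conn: "conn-for-" + msgID}, nil
}

func (s *milterState) CheckSender(from string) error {
	if from == "" {
		return fmt.Errorf("%s: empty sender", s.conn)
	}
	return nil
}

func (s *milterState) Close() error {
	s.conn = "" // release the per-message handle
	return nil
}

// StatelessFunc adapts a plain function to the Check interface so that
// simple checks do not have to write state management boilerplate.
type StatelessFunc func(from string) error

type statelessState struct{ f StatelessFunc }

func (f StatelessFunc) NewMessage(string) (CheckState, error) {
	return statelessState{f: f}, nil
}
func (s statelessState) CheckSender(from string) error { return s.f(from) }
func (statelessState) Close() error                    { return nil }

// run shows how a caller would use any Check for one message.
func run(c Check, msgID, from string) error {
	st, err := c.NewMessage(msgID)
	if err != nil {
		return err
	}
	defer st.Close()
	return st.CheckSender(from)
}

func main() {
	fmt.Println(run(&milterLikeCheck{}, "msg1", ""))
	accept := StatelessFunc(func(from string) error { return nil })
	fmt.Println(run(accept, "msg2", "user@example.org"))
}
```

The caller treats both kinds of checks the same way; only the stateful one actually keeps something in its state object.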