TLS: use fewer inotify instances and/or be more resilient to them not working #2513
Comments
@regevran please assign |
Is it possible the inotify instances are leaking somehow when running the tests via test.py? |
No. The problem is that we use a few per shard - one for each file referenced, typically 3-4 per TLS config, with up to 3-4 TLS configs in scylla.yaml - then run 8 shards, and then run tests in parallel. This adds up: 4 files × 4 configs × 8 shards is already 128 inotify instances for a single run, so with the ludicrously small default settings we hit the ceiling. I've done some work to maybe make the TLS usage in scylla use a shard-0-only solution for TLS inotify, but I got sidetracked. In any case, however you design it, it will be slightly hackish... |
…ose rebuild

Refs #2513

Adds a more advanced callback type, which takes the actual reloading builder as an argument (into which new files are loaded) and allows proper future-waiting in the callback. Exposes certificate rebuilding (via the builder) to allow a "manual", quick reload of certs.

The point of these seemingly small changes is to allow client code to, for example, limit actual reloadable_certs (and by extension inotify watches) to shard 0 (or whatever), and simply use this as a trigger for a manual reload of the other shards.

Note: we cannot do any magical "shard-0-only" file monitor in the objects themselves, not without making the certs/builders sharded or similarly stored (which contradicts the general design of light objects, copyable between shards, etc). But with this, in a calling app in which certs _are_ held in a sharded manner, we can fairly easily delegate non-shard-0 ops in a way that fits that topology.

Note: a builder can be _called_ from any shard (as long as it is safe in its originating shard), but the objects returned are only valid on the current shard. Similarly, it is safe to share the reloading builder across shards _in the callback_, since rebuilding is blocked for the duration of the call.

Closes #2573
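As an illustration of the usage pattern this enables, here is a rough sketch. It is illustrative only: the shard_creds/creds_holder scaffolding is invented for the example, and the extended-callback and rebuild() signatures are paraphrased from the commit message above rather than copied from seastar/net/tls.hh. Shard 0 owns the only reloadable (inotify-backed) credentials, and its callback triggers a manual rebuild on every other shard:

```cpp
#include <unordered_set>

#include <seastar/core/future.hh>
#include <seastar/core/sharded.hh>
#include <seastar/core/smp.hh>
#include <seastar/net/tls.hh>

using namespace seastar;

// Hypothetical per-shard holder: every shard keeps its own credentials
// object, since built credentials are only valid on the shard that
// created them.
struct shard_creds {
    tls::credentials_builder builder;                  // copied to each shard
    shared_ptr<tls::certificate_credentials> creds;
    future<> stop() { return make_ready_future<>(); }
};

sharded<shard_creds> creds_holder;  // assumed start()ed and populated elsewhere

// Run on shard 0 only: build the one reloadable credentials object, so
// inotify instances exist on a single shard. The extended-callback and
// rebuild() signatures below are approximations of what the commit
// message describes, not verified Seastar API.
future<> start_shard0_watcher() {
    auto& h = creds_holder.local();
    return h.builder.build_reloadable_certificate_credentials(
        [](const tls::credentials_builder& reloaded,   // builder with new files loaded
           const std::unordered_set<sstring>& files,   // what changed
           std::exception_ptr ep) -> future<> {
            if (ep) {
                return make_ready_future<>();          // keep old certs; log in real code
            }
            // Sharing the reloading builder across shards is safe for the
            // duration of the callback (rebuilding is blocked meanwhile),
            // so fan a "manual" rebuild out to all other shards.
            return smp::invoke_on_others(0, [&reloaded] {
                return reloaded.rebuild(*creds_holder.local().creds);
            });
        }).then([&h](shared_ptr<tls::certificate_credentials> c) {
            h.creds = std::move(c);
        });
}
```

This mirrors the notes in the commit message: the reloading builder is shared across shards only while the callback runs, while each shard's credentials object stays strictly local to its own shard.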
6f39b89 is merged in Seastar, what else do we need? |
The issue cannot really be solved at the Seastar level. 6f39b89 does, however, make it possible for calling code to avoid shard-multiplication of the inotify usage. |
…ds' from Calle Wilund

Refs scylladb/seastar#2513

Reloadable certificates use inotify instances. On a loaded test (CI) server, we've seen cases where we literally run out of capacity. This patch uses the extended callback and reload capability of Seastar TLS to create actual reloadable certificate objects only on shard 0 for our main TLS points (encryption only does TLS on shard 0 already).

Closes #22425

* github.com:scylladb/scylladb:
  alternator: Make server peering sharded and reuse reloadable certs
  messaging_service: Share reloadability of certificates across shards
  redis/controller: Reuse shard 0 reloadable certificates for all shards
  controller: Reuse shard 0 reloadable certificates for all shards
  generic_server: Allow sharing reloadability of certificates across shards
#22425 was merged. It might fix this issue "enough"... |
The correct link is scylladb/scylladb#22425. However, this link is to a patch in Scylla, not in Seastar. I was under the impression (which is why I opened this issue in the Seastar repository) that it was Seastar which is opening these inotify instances, and that it was Seastar that does it on every shard. It seems the solution in scylladb/scylladb#22425 was to only load the certificates on shard 0 because "encryption only does TLS on shard 0 already". Is this fact determined by Seastar or Scylla? |
The problem cannot really be solved at the Seastar level, since the objects in question do not have any "shard-sharing" attributes - they are not sharded<> objects, or even services of any kind. In fact, they have no knowledge of their counterparts/copies across shards. #2573 adds some support for users of the objects to reduce the inotify footprint (typically by doing the actual inotify listening only on shard 0) by delegating some event processing to smp callbacks or whatnot, and scylladb/scylladb#22425 uses it at the Scylla level. |
Recently some ScyllaDB test runs failed (scylladb/scylladb#21199) with a "Too many open files" error.

It turns out that Seastar's TLS implementation internally uses inotify to automatically recognize when the certificate files have changed. The error message is misleading: for inotify_init(), an EMFILE errno (which gets printed as "Too many open files") does not refer to the limit on the number of open files, but to a separate per-user limit on the number of inotify instances. This limit is configured in /proc/sys/fs/inotify/max_user_instances and is often fairly low - e.g., on my Fedora 40 it defaults to just 128.

It appears that Seastar creates an inotify instance for each shard, the ScyllaDB test framework ran many tests in parallel, and the result was running out of inotify instances. The ScyllaDB testers easily solved this problem by increasing /proc/sys/fs/inotify/max_user_instances, but I'm worried that this problem can hit other Seastar users as well, who won't be aware that Seastar TLS even uses inotify, or that /proc/sys/fs/inotify/max_user_instances is so low.

I want to propose that we consider two options, perhaps even both: use fewer inotify instances, and/or be more resilient when creating them fails.

CC @elcallio
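To see why the error message misleads, here is a minimal standalone reproducer (plain Linux C++, no Seastar), assuming the default limits of 128 inotify instances per user and a higher RLIMIT_NOFILE: it exhausts the instance limit while holding far fewer than the allowed number of open file descriptors, and still gets EMFILE, printed as "Too many open files":

```cpp
#include <sys/inotify.h>
#include <cerrno>
#include <cstdio>

// Create inotify instances until the kernel refuses. With the default
// fs.inotify.max_user_instances of 128, this fails long before the
// process's open-file limit (RLIMIT_NOFILE, typically 1024) is reached,
// yet the errno is EMFILE - the same value strerror() renders as
// "Too many open files".
int main() {
    int n = 0;
    while (inotify_init1(IN_NONBLOCK) >= 0) {
        ++n;  // fds deliberately leaked; the process exits right after
    }
    std::perror("inotify_init1");
    std::printf("created %d inotify instances before failing (EMFILE: %s)\n",
                n, errno == EMFILE ? "yes" : "no");
    return 0;
}
```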