[RFC] [Security Manager Replacement] Run Opensearch Plugins as separate systemd service #16753

Open
kumargu opened this issue Dec 2, 2024 · 7 comments
Labels
Meta (Meta issue, not directly linked to a PR)

Comments

@kumargu
Contributor

kumargu commented Dec 2, 2024

Please describe the end goal of this project

#16729 proposes to strengthen the OpenSearch core security model via additional systemd configuration, such as limiting access to sockets and files. An extension of that sandboxing would be to run some plugins as separate systemd units (that is, separate processes), each with its own restrictive systemd config. This is akin to the Security Manager having plugin-level security policies. It would also allow some plugins to run with elevated privileges without elevating the privileges of core. Note that only some plugins would move to this architecture: plugins that are performance sensitive and fairly trusted would continue to work as they do today.

The overall idea is to expose a secure REST server within OpenSearch core, with plugin <-> core interactions going over secure, fast, bidirectional IPC. Such IPC could run over Unix domain sockets, which are fast and lightweight, can be modelled to use POSIX permissions to lock down access to the file descriptor (FD) associated with the socket, and let the server side request information such as the credentials and PID of a client before it can fully connect.
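To make the IPC idea concrete, here is a minimal sketch (not a proposed implementation) of a Unix domain socket server that restricts the socket file with POSIX permissions and reads the peer's PID/UID/GID through `SO_PEERCRED` before replying. It is Linux-only, and the socket name `plugin.sock`, the `demo` function, and the "same UID is allowed" policy are assumptions made purely for illustration.

```python
import os
import socket
import struct
import tempfile


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end (Linux only)."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize("3i"))
    return struct.unpack("3i", raw)


def demo():
    # mkdtemp creates the directory with mode 0700, so other users cannot
    # even reach the socket path.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "plugin.sock")  # hypothetical socket name

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    os.chmod(path, 0o600)  # only the owning user may connect
    server.listen(1)

    # In the real design the client would be a plugin in another process;
    # here both ends live in one process to keep the sketch self-contained.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)

    conn, _ = server.accept()
    pid, uid, gid = peer_credentials(conn)

    # Illustrative policy (an assumption): accept only peers with our own UID.
    allowed = uid == os.getuid()
    conn.sendall(b"ok" if allowed else b"denied")
    reply = client.recv(16)

    for s in (conn, client, server):
        s.close()
    os.unlink(path)
    os.rmdir(workdir)
    return reply, pid, uid


if __name__ == "__main__":
    print(demo())
```

A real server would more likely compare the peer UID against the dedicated user of each plugin's systemd unit rather than its own UID.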

High-level advantages:

[1] Plugin-level, fine-grained systemd configs

[2] Restrict resource usage such as RSS/CPU/FD/threads at the plugin level (as needed by a plugin). Currently it is very tricky to set resource limits within the OpenSearch core systemd unit.

[3] The plugins could themselves run in a sandboxed environment such as Docker, or leverage polyglot sandboxing in GraalVM, bringing better isolation between plugins and OpenSearch core.

[4] Better telemetry injected at the new REST layer to measure call volume for individual plugins and throttle any DDoS attack at the REST layer.

[5] This model allows building plugins natively in any language (such as Rust for high-performance and memory-safe use cases).
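As a rough illustration of [1] and [2], a per-plugin unit might look something like the fragment below. The unit name, user, and `ExecStart` path are hypothetical, and the limit values are arbitrary placeholders; the directives themselves are standard systemd options, but which ones a given plugin needs would have to be worked out case by case.

```ini
# /etc/systemd/system/opensearch-plugin-example.service
# (hypothetical unit name, user, and paths, for illustration only)
[Unit]
Description=Example OpenSearch plugin running out of process
After=opensearch.service

[Service]
User=opensearch-plugin-example
ExecStart=/usr/share/opensearch-plugin-example/bin/run

# Sandboxing for this plugin only; core keeps its own settings
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
PrivateDevices=yes
# Allow only Unix domain sockets, matching the IPC model above
RestrictAddressFamilies=AF_UNIX

# Per-plugin resource limits (placeholder values)
# MemoryMax caps the unit's cgroup memory, not RSS alone
MemoryMax=512M
CPUQuota=50%
LimitNOFILE=1024
TasksMax=64

[Install]
WantedBy=multi-user.target
```

A plugin that needs elevated privileges could relax individual directives in its own unit without touching the unit for core.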

Supporting References

#1687

Issues

#16634

Related component

Other

kumargu added the Meta and untriaged labels Dec 2, 2024
@reta
Collaborator

reta commented Dec 2, 2024

@kumargu this is what extensions were supposed to become; please check this meta issue #1422. AFAIK, the extensions work has stalled :(

@kumargu
Contributor Author

kumargu commented Dec 2, 2024

Agreed that extensions are pretty similar to this idea, but I feel extensions are considerably more ambitious and feature rich, for example allowing communication outside of a node. AFAIK, some of the performance hits of extensions were the reason the work stalled (anyone, please correct me).

I want to advertise this proposal as simpler and more focused: one of the replacements for the Security Manager, while remaining performant.
[1] No SDK, simple IPC communications
[2] No out-of-node communications
[3] Not mandatory for all plugins; plugins that need to be performant and are fairly trusted will continue to work as they are.

@dblock
Member

dblock commented Dec 2, 2024

I am with @reta: this proposal is exactly a subset of (or identical to) extensions. I think it's a variation of opensearch-project/opensearch-sdk-java#688.

The interface between core service and extensions was designed to be anything, including IPC as you propose here. You could take https://github.com/opensearch-project/opensearch-sdk-java and replace the transport to achieve exactly what you propose. The REST server is already implemented.

> [1] No SDK, simple IPC communications
> [2] No out-of-node communications
> [3] Not mandatory for all plugins; plugins that need to be performant and are fairly trusted will continue to work as they are.

All this is available in extensions design.

> AFAIK, some of the performance hits of extensions were the reason the work stalled (anyone, please correct me).

While this is theoretically true, because out-of-process is slower than in-process, it was never demonstrated. On the other hand, as described in https://opensearch.org/blog/introducing-extensions-for-opensearch/, we were able to achieve a cost reduction of 33% per data node, with performance matching that of the anomaly-detection plugin.

@kumargu
Contributor Author

kumargu commented Dec 2, 2024

Thanks, dB.

> All this is available in extensions design.

Great. I didn't know about this. In fact, that reduces the work considerably.

Next, I would like to get an opinion: how do we feel about the proposal? Do we think we are headed in the right direction by reviving the extensions work to strengthen security in the absence of the Security Manager?

I think the other advantages of extensions, such as cost benefits, can be brought in later.

@dblock
Member

dblock commented Dec 2, 2024

> Next, I would like to get an opinion: how do we feel about the proposal? Do we think we are headed in the right direction by reviving the extensions work to strengthen security in the absence of the Security Manager?

I think it's worth experimenting with to see whether what I say is true!

Also cc: @saratvemulapalli and @dbwiddis who did a lot of work in this area and will have opinions.

@kumargu
Contributor Author

kumargu commented Dec 2, 2024

Keeping @rmuir in the loop!

@dblock
Member

dblock commented Jan 6, 2025

[Catch All Triage - 1, 2, 3, 4, 5]

Projects
Status: New

3 participants