Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin multi-instance support #1600

Draft
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

morrone
Copy link
Collaborator

@morrone morrone commented Feb 7, 2025

This builds on @tom95858's PR #1575. Still a work in progress, but it is worth people seeing where things stand.

The general approach is to:

  • Add a new, optional, "plugin=" field to the load command. The existing "name=" is now interpreted as the plugin instance name, but when the "plugin=" field is missing, it is also the plugin name. As opposed to Multi-instance plugin configuration support #1575, this means "plugin=" only needs to be added to the load command, and "name=" will be consistent across the load/config/start/term commands (it was not clear how Multi-instance plugin configuration support #1575 was going to handle that for config/start/term).
  • Two new plugin calls are added: instance_create() and instance_destroy(). instance_create() is called in the load command (after get_plugin()). instance_destroy() is called in the term command (after the term() call).
  • term() is marked as DEPRECATED, but kept for now for backwards compatibility.
  • get_plugin() should now only return the API structure, and nothing else. All other initialization should happen in instance_create(). But legacy plugins can keep doing what they do for now.
  • API functions now take a "void *context" pointer instead of "struct ldmsd_plugin *". This context pointer is the one returned by instance_create(), and it is freed by calling instance_destroy(). NOTE: To enable more backwards compatibility, for legacy plugins that do not implement instance_create()/instance_destroy(), the context pointer will point to what was returned by get_plugin(). Eventually we can drop that legacy support.

tom95858 and others added 11 commits January 15, 2025 17:55
* Converted plugins to configuration objects
* Added multi-instance support to samplers:
  * procnetdev2
  * meminfo
  * avro_kafka store
  * csv store
  * sos store
* Converted all samplers to support new plugin infrastructure
* Changed configuration object reference counting to use ref_t
In before all the multi-instance code, the "name" field in the configuration
was the name of the plugin. But since each plugin had only a single instance,
we could also consider "name" the instance name.

Instead of adding and "instance" attribute, we add a "plugin" attribute.

"name" becomes the instance name. If "plugin" is not specified, "name"
is also the plugin name.
@morrone morrone marked this pull request as draft February 7, 2025 01:07
@morrone morrone requested a review from tom95858 February 7, 2025 01:07
@morrone morrone force-pushed the morrone-multi-instance2 branch from 26b11e6 to 80c98da Compare February 7, 2025 04:37
@morrone morrone force-pushed the morrone-multi-instance2 branch 2 times, most recently from 600d005 to 6d0ca86 Compare February 7, 2025 05:14
@morrone morrone force-pushed the morrone-multi-instance2 branch from 6d0ca86 to 3d2e040 Compare February 7, 2025 05:36
@morrone morrone force-pushed the morrone-multi-instance2 branch from c55f4c4 to 1bfa0eb Compare February 7, 2025 21:40
@morrone
Copy link
Collaborator Author

morrone commented Feb 7, 2025

As of commit 1bfa0eb, things are basically working for multi instance samplers at least. I converted the "filesingle" plugin to work with the new multi-instance scheme, and load/config/sample/term all seem to be working (even interleaving command from different instances).

Lots of work still to do, but at least a working proof-of-concept.

@morrone morrone added this to the v4.5.1 milestone Feb 8, 2025
@baallan
Copy link
Collaborator

baallan commented Feb 10, 2025

@morrone re term(). I understand that destroying an instance is not, in the long run, what term should be. on the other hand I do see a role for a plugin_destroy (the artist formally known as term) in a system where plugins are loaded, unloaded, and loaded again with a replacement class definition without shutting down a daemon. In portability, I see plugin_destroy being distinct from instance destroy or a linker specific dtor.

Basically, if we have a load command, should we not have an unload command?
Is it there and i missed it?

@morrone
Copy link
Collaborator Author

morrone commented Feb 10, 2025

@morrone re term(). I understand that destroying an instance is not, in the long run, what term should be. on the other hand I do see a role for a plugin_destroy (the artist formally known as term) in a system where plugins are loaded, unloaded, and loaded again with a replacement class definition without shutting down a daemon. In portability, I see plugin_destroy being distinct from instance destroy or a linker specific dtor.

Basically, if we have a load command, should we not have an unload command? Is it there and i missed it?

There is still an "unload" command in this code. It results in calling the plugin's term() first (for legacy plugins), and then calls instance_destroy().

I've been debating the value of adding "put_plugin()" to mirror "get_plugin()". Something to note:

In Tom's code, and carried over into mine, each "load" command results in get_plugin() being called again. Each instance is started by a "load". This is redundant in my code (but not Tom's) because get_plugin() only returns the API of the plugin in my code. No state is to be created in get_plugin() (except in legacy code that will eventually be removed). So I could add a "put_plugin()" for symmetry, but it would just be called in the unload after term() (deprecated) and instance_destroy(). It wouldn't likely do anything.

In other words, "get_plugin()" is currently called one every load command, and going through the entire plugin search and dlopen() process. It happens for every instance. So you shouldn't need to call unload() to start using a new version of a plugin that is already in use.

And actually, I have in mind a plan to get rid of get_plugin() altogether. I don't know if folks will accept it in time for 4.5.1 though. But my plan would be to get rid of get_plugin() and just have plugins implement and export published ldmsd API function names, such as:

  • ldmsd_plugin_instance_create()
  • ldmsd_plugin_instance_destroy()
  • ldmsd_plugin_config()
  • ldmsd_plugin_usage()
  • ldmsd_plugin_type()
  • ldmsd_plugin_sample()

And so on.

If that happens, then there is no nead for get_plugin() or put_plugin(). At load command time, the dlopen() happens, and available symbols are dlsym mapped. Then instance_create() is called. At unload command time, instance_destroy() is called, and dlclose() happens.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants