-
Notifications
You must be signed in to change notification settings - Fork 13
Plugin Development
How to write a plugin.
For information on getting set up to run snewpdag, see the Installation wiki page.
It's recommended that you fork snewpdag
, develop code in your branch.
Then, when you're ready to push your commits, push it to the branch, and make a pull request.
import logging
import numpy as np
import healpy as hp
from snewpdag.dag import Node
The plugin derives from the Node
class.
If your plugin takes arguments on construction, define them in the __init__
method, and then forward the rest of the
keyword arguments to the parent initializer. In the example below, myarg
is a required argument, while optarg
is
an optional argument.
class MyPlugin(Node):
def __init__(self, myarg, **kwargs):
self.myarg = myarg
if 'optarg' in kwargs:
self.optarg = kwargs.pop('optarg')
self.map = {}
super().__init__(**kwargs)
Here I've also initialized the instance variable map
, which will be used to hold data from the plugin's inputs.
The reason this is useful is because alerts typically come from one experiment, but calculations often involve the
data of more than one. So an alert will trigger re-calculation based on the updated information from the
alerting experiment, which will of course still depend on data previously sent by the other experiments.
There are four normal actions, and a fifth one for handling custom (or undefined) actions:
- alert: this is for handling an individual experiment sending new information.
- revoke: this is for handling an individual experiment revoking its previous information.
- reset: this is for revoking all experiments' data. This is used when running MC trials, with a reset clearing the data associated with a single pseudo-experiment.
-
report: this prompts accumulating plugins to produce a summary or plot. Like
reset
, this is to be used in MC trials. - other: this handles actions which aren't any of the above four. By default, it prints an error message.
To implement an action, override the method in your plugin. Each method takes the data payload as an argument:
def alert(self, data):
logging.info("{0}: alert called".format(self.name))
The return value of the action method determines what happens next. If you return False
, the action is "consumed" by the plugin, i.e., it doesn't notify any downstream nodes. If you don't return anything, an error message is printed and the action is consumed.
If you return True
, downstream plugins are notified using the same action verb. The history
field if the payload is also updated by appending the name of your plugin. A (shallow) copy of the payload is also saved in self.last_data
.
You can also return a modified payload, in which case downstream plugins are notified using that payload. If the payload omits the action
field, the current action verb will be inserted. The history will also be updated by appending the name of your plugin to whatever history is already recorded in the payload. This is useful if your plugin combines data from multiple inputs, as you should save the history as a tuple of input histories. As in the case for True
, a shallow copy of the payload is saved in self.last_data
.
If your plugin only observes one other node, or the calculation it performs only depends on the input it's given, you won't need to do anything special to keep the history of your calculation in order. Just return True
or False
as above, or, if you've modified the payload, leave the history
field alone. The Node
infrastructure will add the name of the current node to the end of the history before notifying downstream observers.
If your plugin combines the results from different inputs, then you should combine the histories so someone looking at the history later can at least have a stab at reconstructing what happened. First, you'll probably want to record the payloads as they come in:
self.map[self.last_source] = data.copy()
self.map[self.last_source]['history'] = data['history'].copy()
self.map[self.last_source]['valid'] = True
These will make copies of the payload and the history (if you're wondering why you need to copy both, see the "Digression on payloads" below). I also keep a flag which indicates whether the payload is valid, in case an experiment decides to revoke its data later on.
Let's say you've done the calculation and modified the payload. Then you update the history as follows:
hlist = []
for k in self.map:
if self.map[k]['valid']:
hlist.append(self.map[k]['history'])
data['history'].combine(hlist)
return data
This replaces the history (which initially just contains the history of the current alert) with a history which combines the histories of all the contributing alerts. Node
will then append the name of the current node to the end before notifying downstream observers.
You're done with the code. Now add your class to plugins/__init__.py
so it can be found by others.
This is a note on how payloads are copied. The main messages:
- The payload takes the form of a python dictionary.
- Go ahead and add, delete, and replace fields in the dictionary to your heart's content.
- Don't modify the contents of an underlying list, dictionary, set, or byte array.
Now for the nerdy details, if you're interested (skip if it's not the case yet).
The action methods described here are called with their own, individual shallow copy of the payload. You can then modify the contents in the payload and use it to notify downstream plugins, without affecting the payloads of plugins which aren't downstream. The problem is that if the shallow copy isn't made, modifications you make may then show up in the payloads of plugins which aren't downstream, such as those observing the same upstream plugin but which happen to be executed later; the order of execution of these "sibling" plugins isn't guaranteed.
The other benefit of the shallow copy is that only references to constituent objects are copied, so if one of your dictionary elements contains a long list, the shallow copy only copies the reference to the list rather than the whole list itself. This saves memory and time. However, this also means that any mutable elements can be modified underneath the payload, and these changes would be reflected in the payloads of subsequent (but not necessarily downstream) plugins. For reference, the mutable types in python are list
, dict
, set
, and byte array
. (This problem doesn't exist for immutable types like basic types, strings, and tuples.) On the other hand, there is little need to change these payload elements, so it's probably best to just leave them alone. Instead, do your calculations and just add to the payload.
Be careful to check the validity of the data in the payload before you use it.
With unit tests, you can construct contrived tests of your plugin by itself. This is better done in isolation, rather than by seeing it malfunction among a lot of other calculations.
Go to the tests
directory and create a file named something like test_myplugin.py
. Import the following:
import unittests
from snewpdag.dag.app import configure, inject
Of course, you may also want to import numpy
and healpy
as well.
Then define a test case, in which you then write the individual tests. The usual pattern is to set up the plugin as a one-node DAG, construct an input data dictionary, and inject it into the DAG. Then, extract the last data with which the plugin notified its children (of which there aren't any, but the last data is stored anyway), and make the usual test assertions against it.
class TestMyPlugin(unittest.TestCase):
spec = [ # specification for an array of nodes
{ # each node is specified with a dictionary
'name': 'Node1', # a name by which the node is referenced
'class': 'MyPlugin', # the class
'kwargs': { 'myarg': 42 } # initialization arguments (see MyPlugin.__init__)
}
]
nodes = configure(spec) # construct the DAG
data = [ # an array of alerts
{ # each alert is specified with a dictionary
'name': 'Node1', # the node to be notified with this alert
'action': 'alert',
}
]
inject(nodes, data) # send data through the DAG
tdata = nodes['Node1'].last_data # results of last calculation
self.assertEqual(tdata['action'], 'alert')
Note that in the data
dictionary, you don't have to give a history
field, since there isn't any history at the point of injection. The notify
method constructs a history field if there wasn't one there before. However, if your plugin needs to identify its sources, you'll need to pretend it comes from some input node:
data = [ { 'name': 'Node1', 'action': 'alert', 'history': ('Input1',) },
{ 'name': 'Node1', 'action': 'alert', 'history': ('Input2',) } ]
inject(nodes, data)
tdata = nodes['Node1'].last_data
self.assertEqual(tdata['history'], ( ('Input1',), ('Input2',), 'Node1' ) )
Note that the histories are tuples, which is why the extraneous-looking commas are necessary.
When you've written your test cases, you can use the command python -m unittest snewpdag.tests.test_myplugin
from the root of your snewpdag
directory. When you want it to become a regular part of the unittests, add it to the Makefile
until the test
target.
You can also set up a full graph. Examples can be found in the snewpdag/data
subdirectory, such as test-liq-config.py
.
If you find python-style dictionaries too fiddly to edit (as probably most will), you can also use a csv file which you can edit using a spreadsheet program. See test-csv.csv
in the data subdirectory.
Further instructions will be written...
When your plugin is ready, push your branch to the SNEWS repository and make a pull request.
This section describes features which were useful in the past, but have been superseded by (hopefully) easier ways to do things. The information is kept here for reference.
Note: this section describes the general update
method, which is more finicky and requires the developer to
handle more details. It's easier and preferred to use the individual action methods, described above.
Instead of using the action methods, the developer can instead override the update
method.
The signature is update(self, data)
, where data
is the payload with which to update the calculation.
The payload dictionary will have the following fields:
field | type | description |
---|---|---|
action | string | alert action, either alert or revoke
|
history | tuple | list of plugin names. The last will be the source of the data. |
other | other fields to be transmitted |
def update(self, data):
action = data['action'] # the alert action, which is either 'alert' or 'revoke'
source = data['history'][-1] # the alert source
if action == 'alert':
# get set up for new data
# update the data for this source and flag it as valid
self.map[source] = data
self.map[source]['valid'] = True
elif action == `revoke':
# get set up to revoke data (data from other experiments remains valid)
# flag the data as invalid
if source in self.map:
self.map[source]['valid'] = False
else:
logging.error('[{}] revoke received for unknown source {}'.format(self.name, source))
return
else:
logging.error('[{}] unknown action {}'.format(self.name, action)
At this point, start constructing the result of the calculation, which will be passed to the next plugin.
Here I've called it ndata
. Then check to see if there's any valid data with which to compute,
in which case the action for the next plugin will be alert
. If all the data for this plugin has
been revoked, then one can pass revoke
to the next plugin.
Remember to check the data from each source is valid before you use it.
To notify the downstream plugins, you need an action_verb
(usually alert
or revoke
), a history,
and ndata
. Since most plugins will combine the results of
previous plugins, the history will take the form of a nested tuple. (Don't add the name of your plugin;
that will be done when you call the notify
method to transmit your results to downstream plugins.)
hlist = []
for k in self.map:
v = self.map[k]
if v['valid']:
hlist.append(v['history'])
# notify
self.notify(action_verb, tuple(hist), ndata)
(You can also copy the history from the input data into ndata
, and then pass in None
in
place of the history tuple. notify()
will then append the current plugin name to the
history in ndata
. But this is prone to error, so this feature will probably disappear soon.)
You're done with the code. Now add your class to plugins/__init__.py
so it can be found by others.