Major release: streamparse runner, component API updates, Python logging support and more
This is a major release that introduces several potentially breaking API
changes, hence the major version advancement to 1.0.0. Please read the changes
below as well as our migration wiki
guide before upgrading.
streamparse runner
- Added a new runner for topology components and, as a result, a new way to
define topologies (extension to the existing Clojure DSL). More info can be
found in the docs and in the 1.0.0 migration guide.
- Quickstart projects have been updated to now have nested directories
"src/bolts" and "src/spouts".
Component API updates
- Added auto_anchor, auto_ack and auto_fail flags to base Bolt class. See
docs for detailed descriptions of these flags and migration page for info on
how to safely upgrade your bolts.
- BasicBolt is now deprecated.
- Class var BatchingBolt.SECS_BETWEEN_BATCHES renamed to
secs_between_batches since this isn't a constant, just a setting.
- Spout.emit and Bolt.emit now return a list of the task IDs a tuple was
emitted to unless the need_task_ids kwarg is set to False.
- Spout.emit_many and Bolt.emit_many now return a two-dimensional list of
the task IDs each emitted tuple was sent to unless the need_task_ids kwarg is
set to False.
- BatchingBolt does not return task IDs due to concurrency issues that will
be addressed in a future release.
- Spouts and bolts now have the following instance variables which users are
free to use. These are initialized before the call to initialize():
  - _topology_name - name of the topology when submitted to Storm.
  - _task_id - task ID of the current component in the topology.
  - _component_name - the name of the current component as defined in the
  Clojure definition (e.g. "my-bolt").
  - _debug - the topology.debug setting (configured using the sparse --debug
  flag).
  - _storm_conf - the entire config dict received on initial handshake.
  - _context - the entire context dict received on initial handshake.
- BatchingBolt threads (main and _batcher) now have more descriptive names.
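
To make the new emit return values concrete, here is a minimal sketch of the shapes described above. It does not import streamparse: StubBolt is a hypothetical stand-in, the task IDs are made up, and returning None when need_task_ids is False is an assumption, since these notes only say that task IDs are not returned in that case.

```python
class StubBolt:
    """Hypothetical stand-in for a bolt; NOT the streamparse implementation."""

    # Made-up task IDs; a real Storm topology assigns these at runtime.
    _downstream_task_ids = [4, 7]

    def emit(self, tup, need_task_ids=True):
        # One emitted tuple -> a flat list of the task IDs it was sent to.
        if not need_task_ids:
            return None  # assumption: nothing comes back when IDs are not requested
        return list(self._downstream_task_ids)

    def emit_many(self, tuples, need_task_ids=True):
        # Several tuples -> one inner list of task IDs per emitted tuple.
        if not need_task_ids:
            return None  # assumption, as above
        return [self.emit(tup) for tup in tuples]


bolt = StubBolt()
single = bolt.emit(["hello", 1])
many = bolt.emit_many([["hello", 1], ["world", 2]])
skipped = bolt.emit(["hello", 1], need_task_ids=False)
```

Code that previously ignored the return value of emit should keep working; code that relied on it being empty may need need_task_ids=False.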
IPC
- Significant code cleanup around how we read from and write to stdin and
stdout. This is much more stable, with no breaking changes for users since
they should never interface with these methods directly.
Logging
- Full support for Python logging has now been added. Lots of info in the docs
on this and the migration guide. Logging config settings added to new
quickstart projects.
- print statements are now properly sent to the component's log file. Feel
free to add print statements for handy debugging or, better yet, create a
logger for nicely formatted messages.
- Full support for Storm log levels will follow once STORM-414 is merged in.
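
As a rough illustration of the "create a logger" suggestion, the sketch below uses only the standard library logging module. The logger name and format string are hypothetical, and the in-memory buffer stands in for whatever destination your project's logging config actually specifies.

```python
import io
import logging

# Hypothetical component name; in a real project this would match your bolt.
logger = logging.getLogger("word_count_bolt")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep this example's output confined to our handler

# Write to an in-memory buffer so the example is self-contained.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

logger.info("processed %d tuples", 42)
logger.debug("below the INFO threshold, so this is dropped")

log_output = buffer.getvalue()
```

Compared with print, a logger adds the level and component name to each message and lets you filter by severity without touching the call sites.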
Administration
- Added pre and post submit hooks in fabric and invoke for topologies to
enable users to run arbitrary code (e.g. send IRC message) before and after
topologies are submitted (info in docs).
- sparse tail command now requires the -n <topology> argument as it will tail
only the logs for a specific topology and environment.
- Added remove_logs fab task which users can optionally hook up to a
pre_submit hook to clear out Python logs.
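
A submit hook could look roughly like the sketch below. The function names and the (topology_name, env_name, env_config) argument list are assumptions made for illustration, as are the "workers" config key and all example values; check the docs for the actual hook signature before relying on any of this.

```python
# fabfile.py-style sketch. Hook names and arguments are ASSUMPTIONS,
# not a confirmed streamparse signature.

def pre_submit(topology_name, env_name, env_config):
    """Runs before submission; returns a description of what it would do."""
    # A real hook might invoke the remove_logs task here to clear Python logs.
    return "clearing logs for {} in {}".format(topology_name, env_name)


def post_submit(topology_name, env_name, env_config):
    """Runs after submission; builds a notification message."""
    # Actually sending the message (e.g. to IRC) is omitted to keep this runnable.
    return "{} submitted to {} ({} workers)".format(
        topology_name, env_name, len(env_config.get("workers", []))
    )


# Hypothetical values, for illustration only.
config = {"workers": ["storm1.example.com", "storm2.example.com"]}
before = pre_submit("wordcount", "prod", config)
after = post_submit("wordcount", "prod", config)
```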
Testing
- Better test support added for all our components; more improvements to come
here.