streamparse 2.0.0

@dan-blanchard dan-blanchard released this 16 Jun 20:27

This release adds a lot of new functionality (e.g., additional subcommands), but it also makes backward-incompatible changes to some features that few people used.

⚠️ API BREAKING CHANGES ⚠️

  • BatchingBolt now uses tick tuples instead of a separate timer thread. This is an API-breaking change, as BatchingBolt.secs_between_batches is now BatchingBolt.ticks_between_batches. You will also need to run your Storm topology with topology.tick.tuple.freq.secs set to how frequently you want the ticks to occur. Read the docs for more details. (#125, #137)
  • streamparse fabric and invoke tasks have been moved to sparse sub-commands:
      • fab remove_logs ➡️ sparse remove_logs
      • fab tail_logs ➡️ sparse tail
      • fab activate_env is no longer necessary, as all commands that need the fabric environment modified do this automatically.
      • fab create_or_update_virtualenvs ➡️ sparse update_virtualenv (note the switch to the singular, since this only ever worked on a single virtualenv at a time)
      • inv jar_for_deploy ➡️ sparse jar
      • inv list_topologies ➡️ sparse list
      • inv kill_topology ➡️ sparse kill
      • inv run_local_topology ➡️ sparse run
      • inv submit_topology ➡️ sparse submit
      • inv tail_topology ➡️ sparse tail
      • inv visualize_topology ➡️ sparse visualize
      • inv prepare_topology has been removed because the commands that relied on it (sparse run, sparse submit, and sparse jar) all call streamparse.util.prepare_topology automatically.
  • The streamparse.ext package has been removed and so have the streamparse.ext.fabric and streamparse.ext.invoke modules.
      • streamparse.ext.util ➡️ streamparse.util
      • Users should no longer do from streamparse.ext.fabric import * and from streamparse.ext.invoke import * in their projects' fabfile.py and tasks.py files. pre_submit and post_submit hooks will be executed automatically even without this.
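
For anyone updating scripts or CI jobs, the task renames listed above can be captured in a small lookup table. This is only a convenience sketch: the mapping is copied from the list, while the names TASK_MIGRATION and migrate_task are hypothetical and not part of streamparse. It does not translate task arguments, which may differ between the old tasks and the new sub-commands.

```python
# Old Fabric/Invoke task -> new sparse sub-command, copied from the list above.
# None marks tasks that were dropped without a direct replacement.
TASK_MIGRATION = {
    "fab remove_logs": "sparse remove_logs",
    "fab tail_logs": "sparse tail",
    "fab activate_env": None,
    "fab create_or_update_virtualenvs": "sparse update_virtualenv",
    "inv jar_for_deploy": "sparse jar",
    "inv list_topologies": "sparse list",
    "inv kill_topology": "sparse kill",
    "inv run_local_topology": "sparse run",
    "inv submit_topology": "sparse submit",
    "inv tail_topology": "sparse tail",
    "inv visualize_topology": "sparse visualize",
    "inv prepare_topology": None,
}


def migrate_task(old_task):
    """Return the sparse sub-command for an old task, or None if it was removed.

    Hypothetical helper for illustration; raises KeyError for unknown tasks.
    """
    return TASK_MIGRATION[old_task]
```

A script could call migrate_task("inv kill_topology") to find the replacement and treat a None result as "delete this step".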

Major enhancements

  • sparse run now runs indefinitely by default (#122)
  • Added Bolt.process_tick(tup) method for processing tick tuples (#116, #124)
  • Added sparse worker_uptime and sparse stats commands for getting information about running Storm topologies and their workers. (#17, #52)
  • --ackers and --workers can now be specified as separate arguments to sparse submit and sparse run, instead of just using --par. (#74, #97)
  • Bolt.emit_many() is now deprecated and will be removed in streamparse 3.0. Please just call Bolt.emit() repeatedly instead. (#66)
  • Added lots of documentation about how topologies work and how to get started with streamparse. (#26, #103)
  • Added conda recipe template for building a streamparse conda package. (#105)
  • SSH tunnels are no longer required for kill, list, and submit commands (#96, #98, #112).
  • env.use_ssh_config is True by default now (#54)
  • Can now deploy/build simple JARs in addition to Uber-JARs. This speeds up sparse submit for pure Python projects. (#106)
  • Added sparse jar, sparse remove_logs, and sparse update_virtualenv commands to replace old Fabric and Invoke tasks.
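
To illustrate how tick-driven batching differs from a timer thread, here is a toy model of the idea behind Bolt.process_tick and BatchingBolt.ticks_between_batches. It is a sketch under assumptions, not streamparse's implementation: the TickBatcher class, its internals, and the grouping rule are hypothetical, and it runs without Storm or streamparse installed.

```python
from collections import defaultdict


class TickBatcher:
    """Toy stand-in for a tick-driven batching bolt (not streamparse code)."""

    def __init__(self, ticks_between_batches=1):
        self.ticks_between_batches = ticks_between_batches
        self._tick_counter = 0
        self._batches = defaultdict(list)
        self.processed = []  # (key, tuples) pairs, recorded for inspection

    def group_key(self, tup):
        # Hypothetical grouping rule: batch by the tuple's first value.
        return tup[0]

    def process(self, tup):
        # Regular tuples are only buffered; nothing is flushed here.
        self._batches[self.group_key(tup)].append(tup)

    def process_tick(self, tup=None):
        # Count ticks and flush once enough of them have arrived.
        self._tick_counter += 1
        if self._tick_counter >= self.ticks_between_batches:
            self._tick_counter = 0
            for key, tups in sorted(self._batches.items()):
                self.process_batch(key, tups)
            self._batches.clear()

    def process_batch(self, key, tups):
        self.processed.append((key, tups))


batcher = TickBatcher(ticks_between_batches=2)
batcher.process(("a", 1))
batcher.process(("b", 2))
batcher.process_tick()  # first tick: nothing is flushed yet
batcher.process(("a", 3))
batcher.process_tick()  # second tick: both groups are flushed
```

In this model, if ticks arrive once per second (topology.tick.tuple.freq.secs set to 1), a ticks_between_batches of 2 would flush batches roughly every two seconds.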

Minor enhancements

  • Removed dependency on docopt and switched to using argparse for command-line arguments. Now sub-commands all have their own detailed --help switches (e.g., sparse run --help) and sparse --help will list all of the available commands with a brief description of what they do. (#115, #152)
  • Added first pieces of support for a Python DSL for defining topologies (#84) as part of a grander vision to move away from Clojure (#136). Please note that the DSL cannot actually be used yet, because the utility that translates it into something Storm understands has not been written.
  • Overhauled unit tests to separate and simplify IPC testing (#41, #47).
  • Added documentation on using an unofficial version of Storm (#142)
  • Added support for Tox (#128)
  • Updated spouts and bolts to allow Python tuples to be emitted. (#119)
  • Switched to using Travis Docker containers for building (#90)
  • Made updating virtualenvs optional by checking whether requirements.txt exists (#60)
  • Created a new storm subpackage, which will be split off into its own package (pystorm) for version 3.0 of streamparse. This contains all of the IPC/Multi-Lang related code. In the future streamparse will just be a collection of utilities for managing Storm topologies/clusters.
  • Moved a lot of code from the Spout and Bolt classes into the Component parent class to cut down on code duplication.
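
The per-sub-command --help behavior mentioned above is what argparse sub-parsers provide out of the box. The following sketch shows the general pattern only; it is not the actual sparse parser. The program name sparse-demo, the help strings, and the default values are made up for illustration, and only the --ackers and --workers option names are taken from these notes.

```python
import argparse


def build_parser():
    """Build a toy CLI with sub-commands that each get their own --help."""
    parser = argparse.ArgumentParser(prog="sparse-demo")  # hypothetical name
    subparsers = parser.add_subparsers(dest="command")

    # Each add_parser call creates an independent parser with its own options.
    run = subparsers.add_parser("run", help="Run a topology locally.")
    run.add_argument("--ackers", type=int, default=1,
                     help="Number of ackers (default value is an assumption).")
    run.add_argument("--workers", type=int, default=1,
                     help="Number of workers (default value is an assumption).")

    kill = subparsers.add_parser("kill", help="Kill a running topology.")
    kill.add_argument("--name", help="Topology name (illustrative option).")
    return parser


parser = build_parser()
args = parser.parse_args(["run", "--ackers", "2", "--workers", "4"])
```

With this layout, the top-level help lists every sub-command with its one-line description, and passing --help after a sub-command name prints only that sub-command's options.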

Bugfixes

  • Fixed multithreaded emitting (#101, #133)
  • sparse commands that use lein underneath now display output from lein immediately. (#109)
  • Fixed typo in config name to get maxbytes (#110)
  • We now reset Bolt.current_tups even when receiving a heartbeat (#107)
  • Spout.emit_many() works again (#144)
  • sparse tail tails all machines now. (#104)

Contributors for this release (by number of commits)

Thanks to all our contributors!