Releases: PrefectHQ/prefect

The Release is Bright and Full of Features

07 May 20:44
c39aa11

Changelog

0.5.3

Released May 7, 2019

Features

  • Add new Storage and Environment specifications - #936, #956

Enhancements

  • Flow now has optional storage keyword - #936
  • Flow environment argument now defaults to a CloudEnvironment - #936
  • Queued states accept start_time arguments - #955
  • Add new Bytes and Memory storage classes for local testing - #956, #961
  • Add new LocalEnvironment execution environment for local testing - #957
  • Add new Aborted state for Flow runs which are cancelled by users - #959
  • Added an execute-cloud-flow CLI command for working with cloud deployed flows - #971
  • Add new flows.run_on_schedule configuration option for affecting the behavior of flow.run - #972
  • Allow for Tasks with manual_only triggers to be root tasks - #667
  • Allow compression of serialized flows - #993
  • Allow for serialization of user written result handlers - #623
  • Allow for state to be serialized in certain triggers and cache validators - #949
  • Add new filename keyword to flow.visualize for automatically saving visualizations - #1001
  • Add new LocalStorage option for storing Flows locally - #1006
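
The compression enhancement (#993) can be pictured with a generic round-trip sketch. The entry does not say which codec or encoding Prefect uses, so the zlib plus base64 combination and the helper names compress_payload / decompress_payload are assumptions made for illustration only.

```python
import base64
import json
import zlib


def compress_payload(obj: dict) -> str:
    """Serialize a dict to JSON, compress it, and return ASCII-safe text."""
    raw = json.dumps(obj, sort_keys=True).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decompress_payload(text: str) -> dict:
    """Reverse compress_payload and return the original dict."""
    raw = zlib.decompress(base64.b64decode(text.encode("ascii")))
    return json.loads(raw.decode("utf-8"))


# A stand-in for a serialized flow; the real schema is not reproduced here.
serialized_flow = {
    "name": "example-flow",
    "tasks": [{"name": "task-{}".format(i)} for i in range(50)],
}
compressed = compress_payload(serialized_flow)
restored = decompress_payload(compressed)
```

Repetitive structures such as long task lists tend to compress well, which is why the text form above ends up shorter than the plain JSON.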

Task Library

  • None

Fixes

  • Fix Docker storage not pulling correct flow path - #968
  • Fix run_flow loading to decode properly by using cloudpickle - #978
  • Fix Docker storage for handling flow names with spaces and weird characters - #969
  • Fix non-deterministic issue with mapping in the DaskExecutor - #943

Breaking Changes

  • Remove flow.id and task.id attributes - #940
  • Removed old WIP environments - #936
    (Note: Changes from #936 regarding environments don't break any Prefect code because environments weren't used yet outside of Cloud.)
  • Update flow.deploy and client.deploy to use set_schedule_active kwarg to match Cloud - #991
  • Removed Flow.generate_local_task_ids() - #992

Contributors

  • None

Unredacted: The 0.5.2 Release

19 Apr 15:50
c6d115e

0.5.2

Released April 19, 2019

Features

  • Implement two new triggers that allow for specifying bounds on the number of failures or successes - #933
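
The idea behind a bounded trigger (#933) can be sketched without Prefect itself. This is a minimal illustration, assuming a trigger is a callable that inspects upstream states and returns a boolean; the name bounded_failures, the at_least / at_most parameters, and the use of plain strings as states are all hypothetical and not taken from the release.

```python
from typing import Callable, Iterable, Optional


def bounded_failures(
    at_least: Optional[int] = None, at_most: Optional[int] = None
) -> Callable[[Iterable[str]], bool]:
    """Build a trigger that passes when the failure count is within bounds."""

    def trigger(upstream_states: Iterable[str]) -> bool:
        # Count upstream states that represent a failure.
        failed = sum(1 for state in upstream_states if state == "Failed")
        if at_least is not None and failed < at_least:
            return False
        if at_most is not None and failed > at_most:
            return False
        return True

    return trigger


# Hypothetical usage: tolerate at most one upstream failure.
tolerate_one = bounded_failures(at_most=1)
```

A bound on successes would follow the same shape with the comparison applied to successful states instead.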

Enhancements

  • DaskExecutor(local_processes=True) supports timeouts - #886
  • Calling Secret.get() from within a Flow context raises an informative error - #927
  • Add new keywords to Task.set_upstream and Task.set_downstream for handling keyed and mapped dependencies - #823
  • Downgrade default logging level to "INFO" from "DEBUG" - #935
  • Add start times to queued states - #937
  • Add is_submitted to states - #944
  • Introduce new ClientFailed state - #938

Task Library

  • Add task for sending Slack notifications via Prefect Slack App - #932

Fixes

  • Fix issue with timeouts behaving incorrectly with unpickleable objects - #886
  • Fix issue with Flow validation being performed even when eager validation was turned off - #919
  • Fix issue with downstream tasks with all_failed triggers running if an upstream Client call fails in Cloud - #938

Breaking Changes

  • Remove prefect make user config from cli commands - #904
  • Change set_schedule_active keyword in Flow deployments to set_schedule_inactive to match Cloud - #941

Contributors

  • None

It Takes a Village

04 Apr 20:33
71829f4

0.5.1

Released April 4, 2019

Features

  • API reference documentation is now versioned - #270
  • Add S3ResultHandler for handling results to / from S3 buckets - #879
  • Add ability to use Cached states across flow runs in Cloud - #885

Enhancements

  • Bump to latest version of pytest (4.3) - #814
  • Client.deploy accepts optional build kwarg for avoiding building Flow environment - #876
  • Bump distributed to 1.26.1 for enhanced security features - #878
  • Local secrets automatically attempt to load secrets as JSON - #883
  • Add task logger to context for easily creating custom logs during task runs - #884
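
The "attempt to load secrets as JSON" behavior (#883) follows a common try-then-fall-back pattern. The sketch below shows that pattern only; load_secret_value is a hypothetical helper, not Prefect's implementation.

```python
import json
from typing import Any


def load_secret_value(raw: str) -> Any:
    """Try to parse a secret as JSON; fall back to the raw string."""
    try:
        return json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return raw
```

One side effect of this pattern is that any string that happens to be valid JSON is converted, so a secret stored as "42" comes back as an integer rather than a string.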

Task Library

  • Add ParseRSSFeed for parsing a remote RSS feed - #856
  • Add tasks for working with Docker containers and images - #864
  • Add task for creating a BigQuery table - #895

Fixes

  • Only checkpoint tasks if running in cloud - #839, #854
  • Adjusted small flake8 issues for names, imports, and comparisons - #849
  • Fix bug preventing flow.run from properly using cached tasks - #861
  • Fix tempfile usage in flow.visualize so that it runs on Windows machines - #858
  • Fix issue caused by a bug in Python 3.5.2, restoring Python 3.5.2 compatibility - #857
  • Fix issue in which GCSResultHandler was not pickleable - #879
  • Fix issue with automatically converting callables and dicts to tasks - #894

Breaking Changes

  • Change the call signature of Dict task from run(**task_results) to run(keys, values) - #894
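
The two call signatures named in this breaking change can be compared side by side. Only the signatures come from the entry; the function bodies and the names run_old / run_new are assumptions for illustration.

```python
from typing import Any, Dict, Hashable, List


def run_old(**task_results: Any) -> Dict[str, Any]:
    """Old style: keys arrive as keyword names, so they must be strings."""
    return dict(task_results)


def run_new(keys: List[Hashable], values: List[Any]) -> Dict[Hashable, Any]:
    """New style: keys and values arrive as two parallel lists."""
    return dict(zip(keys, values))
```

One visible consequence of the new shape is that keys no longer have to be valid Python keyword names. Whether that motivated the change is not stated in the entry.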

Contributors

Open Source Launch!

24 Mar 17:10
d1bd136

0.5.0

Released March 24, 2019

Features

  • Add checkpoint option for individual Tasks, as well as a global checkpoint config setting for storing the results of Tasks using their result handlers - #649
  • Add defaults_from_attrs decorator to easily construct Tasks whose attributes serve as defaults for Task.run - #293
  • Environments follow new hierarchy (PIN-3) - #670
  • Add OneTimeSchedule for one-time execution at a specified time - #680
  • flow.run is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
  • Pre-populate prefect.context with various formatted date strings during execution - #704
  • Add ability to overwrite task attributes such as "name" when calling tasks in the functional API - #717
  • Release Prefect Core under the Apache 2.0 license - #762
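
The defaults_from_attrs entry above (#293) describes a decorator that lets instance attributes act as defaults for run arguments. The following is a simplified re-creation of that idea, not Prefect's code: the name defaults_from_attrs_sketch, the keyword-only handling, and the Greeter class are all hypothetical.

```python
import functools
from typing import Any, Callable, Optional


def defaults_from_attrs_sketch(*attr_names: str) -> Callable:
    """Fill in arguments that were omitted (or passed as None) from self."""

    def decorator(method: Callable) -> Callable:
        @functools.wraps(method)
        def wrapper(self, **kwargs: Any) -> Any:
            # Keyword arguments only, to keep the sketch short.
            for name in attr_names:
                if kwargs.get(name) is None:
                    kwargs[name] = getattr(self, name)
            return method(self, **kwargs)

        return wrapper

    return decorator


class Greeter:
    def __init__(self, greeting: str = "hello") -> None:
        self.greeting = greeting

    @defaults_from_attrs_sketch("greeting")
    def run(self, name: str = "world", greeting: Optional[str] = None) -> str:
        return "{} {}".format(greeting, name)
```

Calling Greeter().run() uses the attribute set at construction time, while Greeter().run(greeting="hi") overrides it for that call only.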

Enhancements

  • Refactor all State objects to store fully hydrated Result objects which track information about how results should be handled - #612, #616
  • Add google.cloud.storage as an optional extra requirement so that the GCSResultHandler can be exposed better - #626
  • Add a start_time check for Scheduled flow runs, similar to the one for Task runs - #605
  • Project names can now be specified for deployments instead of IDs - #633
  • Add a createProject mutation function to the client - #633
  • Add timestamp to auto-generated API docs footer - #639
  • Refactor Result interface into Result and SafeResult - #649
  • The manual_only trigger will pass if resume=True is found in context, which indicates that a Resume state was passed - #664
  • Added DockerOnKubernetes environment (PIN-3) - #670
  • Added Prefect docker image (PIN-3) - #670
  • defaults_from_attrs now accepts a splatted list of arguments - #676
  • Add retry functionality to flow.run(on_schedule=True) for local execution - #680
  • Add helper_fns keyword to ShellTask for pre-populating helper functions to commands - #681
  • Convert a few DEBUG level logs to INFO level logs - #682
  • Added DaskOnKubernetes environment (PIN-3) - #695
  • Load context from Cloud when running flows - #699
  • Add Queued state - #705
  • flow.serialize() will always serialize its environment, regardless of build - #696
  • flow.deploy() now raises an informative error if your container cannot deserialize the Flow - #711
  • Add _MetaState as a parent class for states that modify other states - #726
  • Add flow keyword argument to Task.set_upstream() and Task.set_downstream() - #749
  • Add is_retrying() helper method to all State objects - #753
  • Allow for state handlers which return None - #753
  • Add daylight saving time support for CronSchedule - #729
  • Add idempotency_key and context arguments to Client.create_flow_run - #757
  • Make EmailTask more secure by pulling credentials from secrets - #706
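
The entry for #757 only names the new idempotency_key argument. As general background, an idempotency key lets a caller repeat a request without creating a duplicate. The sketch below illustrates that general concept with a hypothetical in-memory service; it does not describe how Prefect Cloud handles the key.

```python
import uuid
from typing import Dict, Optional


class InMemoryRunService:
    """Hypothetical stand-in for a backend that creates flow runs."""

    def __init__(self) -> None:
        self._runs_by_key: Dict[str, str] = {}

    def create_flow_run(
        self, flow_id: str, idempotency_key: Optional[str] = None
    ) -> str:
        # A repeated key returns the run created by the first call.
        if idempotency_key is not None and idempotency_key in self._runs_by_key:
            return self._runs_by_key[idempotency_key]
        run_id = str(uuid.uuid4())
        if idempotency_key is not None:
            self._runs_by_key[idempotency_key] = run_id
        return run_id


service = InMemoryRunService()
first = service.create_flow_run("flow-1", idempotency_key="nightly-2019-03-24")
retry = service.create_flow_run("flow-1", idempotency_key="nightly-2019-03-24")
other = service.create_flow_run("flow-1")
```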

Task Library

  • Add GCSUpload and GCSDownload for uploading / retrieving string data to / from Google Cloud Storage - #673
  • Add BigQueryTask and BigQueryInsertTask for executing queries against BigQuery tables and inserting data - #678, #685
  • Add FilterTask for filtering out lists of results - #637
  • Add S3Download and S3Upload for interacting with data stored on AWS S3 - #692
  • Add AirflowTask and AirflowTriggerDAG tasks to the task library for running individual Airflow tasks / DAGs - #735
  • Add OpenGitHubIssue and CreateGitHubPR tasks for interacting with GitHub repositories - #771
  • Add Kubernetes tasks for deployments, jobs, pods, and services - #779
  • Add Airtable tasks - #803
  • Add Twitter tasks - #803
  • Add GetRepoInfo for pulling GitHub repository information - #816

Fixes

  • Fix edge case in doc generation in which some Exceptions' call signature could not be inspected - #513
  • Fix bug in which exceptions raised within flow runner state handlers could not be sent to Cloud - #628
  • Fix issue wherein heartbeats were not being called on a fixed interval - #669
  • Fix issue wherein code blocks inside of method docs couldn't use **kwargs - #658
  • Fix bug in which Prefect-generated Keys for S3 buckets were not properly converted to strings - #698
  • Prevent the line printed after a Docker Environment push/pull from overwriting the progress bar - #702
  • Fix issue with JinjaTemplate not being pickleable - #710
  • Fix issue with creating secrets from JSON documents using the Core Client - #715
  • Fix issue with deserialization of JSON secrets unnecessarily calling json.loads - #716
  • Fix issue where IntervalSchedules didn't respect daylight saving time after serialization - #729

Breaking Changes

  • Remove the BokehRunner and associated webapp - #609
  • Rename ResultHandler methods from serialize / deserialize to write / read - #612
  • Refactor all State objects to store fully hydrated Result objects which track information about how results should be handled - #612, #616
  • Client.create_flow_run now returns a string instead of a GraphQLResult object to match the API of deploy - #630
  • flow.deploy and client.deploy require a project_name instead of an ID - #633
  • Upstream state results now take precedence for task inputs over cached_inputs - #591
  • Rename Match task (used inside control flow) to CompareValue - #638
  • Client.graphql() now returns a response with up to two keys (data and errors). Previously the data key was automatically selected - #642
  • ContainerEnvironment was changed to DockerEnvironment - #670
  • The environment from_file was moved to utilities.environments - #670
  • Removed start_tasks argument from FlowRunner.run() and check_upstream argument from TaskRunner.run() - #672
  • Remove support for Python 3.4 - #671
  • flow.run is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
  • Remove make_return_failed_handler as flow.run now returns all task states - #693
  • Refactor Airflow migration tools into a single AirflowTask in the task library for running individual Airflow tasks - #735
  • name is now required on all Flow objects - #732
  • Separate installation "extras" packages into multiple, smaller extras - #739
    ...
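
Because Client.graphql() now returns both data and errors (#642), callers have to select the data themselves. A minimal sketch of such a helper follows, assuming the response behaves like a plain dict; unwrap_graphql_response is a hypothetical name, and raising RuntimeError is one possible policy rather than anything the release prescribes.

```python
from typing import Any, Dict


def unwrap_graphql_response(response: Dict[str, Any]) -> Any:
    """Raise if the response carries errors, otherwise return its data."""
    errors = response.get("errors")
    if errors:
        raise RuntimeError("GraphQL request failed: {}".format(errors))
    return response.get("data")
```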

Version 0.4.1

31 Jan 19:57
976b205

Major Features

  • Add ability to run scheduled flows locally via on_schedule kwarg in flow.run() - #519
  • Allow tasks to specify their own result handlers, ensure inputs and outputs are stored only when necessary, and ensure no raw data is sent to the database - #587

Minor Features

  • Allow for building ContainerEnvironments locally without pushing to registry - #514
  • Make mapping more robust when running children tasks multiple times - #541
  • Always prefer cached_inputs over upstream states, if available - #546
  • Add hooks to FlowRunner.initialize_run() for manipulating task states and contexts - #548
  • Improve state-loading strategy for Prefect Cloud - #555
  • Introduce on_failure kwarg to Tasks and Flows for user-friendly failure callbacks - #551
  • Include scheduled_start_time in context for Flow runs - #524
  • Add GitHub PR template - #542
  • Allow flows to be deployed to Prefect Cloud without a project id - #571
  • Introduce serialization schemas for ResultHandlers - #572
  • Add new metadata attribute to States for managing user-generated results - #573
  • Add new 'JSONResultHandler' for serializing small bits of data without external storage - #576
  • Use JSONResultHandler for all Parameter caching - #590

Fixes

  • Fixed flow.deploy() attempting to access a nonexistent string attribute - #503
  • Ensure all logs make it to the logger service in deployment - #508, #552
  • Fix a situation where Paused tasks would be treated as Pending and run - #535
  • Ensure errors raised in state handlers are trapped appropriately in Cloud Runners - #554
  • Ensure unexpected errors raised in FlowRunners are robustly handled - #568
  • Fixed non-deterministic errors in mapping caused by clients resolving futures of other clients - #569
  • Older versions of Prefect will now ignore fields added by newer versions when deserializing objects - #583
  • Result handler failures now result in clear task run failures - #575
  • Fix issue deserializing old states with empty metadata - #590
  • Fix issue serializing cached_inputs - #594

Breaking Changes

  • Move prefect.client.result_handlers to prefect.engine.result_handlers - #512
  • Removed inputs kwarg from TaskRunner.run() - #546
  • Moves the start_task_ids argument from FlowRunner.run() to Environment.run() - #544, #545
  • Convert timeout kwarg from timedelta to integer - #540
  • Remove timeout kwarg from executor.wait - #569
  • Serialization of States will ignore any result data that hasn't been processed - #581
  • Removes VersionedSchema in favor of implicit versioning: serializers will ignore unknown fields and the create_object method is responsible for recreating missing ones - #583
  • Convert and rename CachedState to a successful state named Cached, and also remove the superfluous cached_result attribute - #586
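
Code that previously passed a timedelta as timeout (#540) needs a conversion step. The helper below is a small sketch of one way to do it; timeout_in_seconds is a hypothetical name, and treating the integer as whole seconds is an assumption, since the entry does not state the unit.

```python
from datetime import timedelta
from typing import Union


def timeout_in_seconds(timeout: Union[int, timedelta]) -> int:
    """Accept either form and return an integer (assumed to mean seconds)."""
    if isinstance(timeout, timedelta):
        return int(timeout.total_seconds())
    return int(timeout)
```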

Version 0.4.0

08 Jan 16:08
b54022f

Major Features

  • Add support for Prefect Cloud - #374, #406, #473, #491
  • Add versioned serialization schemas for Flow, Task, Parameter, Edge, State, Schedule, and Environment objects - #310, #318, #319, #340
  • Add ability to provide ResultHandlers for storing private result data - #391, #394, #430
  • Support depth-first execution of mapped tasks and tracking of both the static "parent" and dynamic "children" via Mapped states - #485

Minor Features

  • Add new TimedOut state for task execution timeouts - #255
  • Use timezone-aware dates throughout Prefect - #325
  • Add description and tags arguments to Parameters - #318
  • Allow edge key checks to be skipped in order to create "dummy" flows from metadata - #319
  • Add new names_only keyword to flow.parameters - #337
  • Add utility for building GraphQL queries and simple schemas from Python objects - #342
  • Add links to downloadable Jupyter notebooks for all tutorials - #212
  • Add to_dict convenience method for DotDict class - #341
  • Refactor requirements to a custom ini file specification - #347
  • Refactor API documentation specification to toml file - #361
  • Add new SQLite tasks for basic SQL scripting and querying - #291
  • Executors now pass map_index into the TaskRunners - #373
  • All schedules support start_date and end_date parameters - #375
  • Add DateTime marshmallow field for timezone-aware serialization - #378
  • Adds ability to put variables into context via the config - #381
  • Adds new client.deploy method for adding new flows to the Prefect Cloud - #388
  • Add id attribute to Task class - #416
  • Add new Resume state for resuming from Paused tasks - #435
  • Add support for heartbeats - #436
  • Add new Submitted state for signaling that Scheduled tasks have been handled - #445
  • Add ability to add custom environment variables and copy local files into ContainerEnvironments - #453
  • Add set_secret method to Client for creating and setting the values of user secrets - #452
  • Refactor runners into CloudTaskRunner and CloudFlowRunner classes - #431
  • Added functions for loading default engine classes from config - #477

Fixes

  • Fixed issue with GraphQLResult reprs - #374
  • CronSchedule produces expected results across daylight saving time transitions - #375
  • utilities.serialization.Nested properly respects marshmallow.missing values - #398
  • Fixed issue in capturing unexpected mapping errors during task runs - #409
  • Fixed issue in flow.visualize() so that mapped flow states can be passed and colored - #387
  • Fixed issue where IntervalSchedule was serialized at "second" resolution, not lower - #427
  • Fixed issue where SKIP signals were preventing multiple layers of mapping - #455
  • Fixed issue with multi-layer mapping in flow.visualize() - #454
  • Fixed issue where Prefect Cloud cached_inputs weren't being used locally - #434
  • Fixed issue where Config.set_nested would have an error if the provided key was nested deeper than an existing terminal key - #479
  • Fixed issue where state_handlers were not called for certain signals - #494

Breaking Changes

  • Remove NoSchedule and DateSchedule schedule classes - #324
  • Change serialize() method to use schemas rather than custom dict - #318
  • Remove timestamp property from State classes - #305
  • Remove the custom JSON encoder library at prefect.utilities.json - #336
  • flow.parameters now returns a set of parameters instead of a dictionary - #337
  • Renamed to_dotdict -> as_nested_dict - #339
  • Moved prefect.utilities.collections.GraphQLResult to prefect.utilities.graphql.GraphQLResult - #371
  • SynchronousExecutor now does not do depth first execution for mapped tasks - #373
  • Renamed prefect.utilities.serialization.JSONField -> JSONCompatible, removed its max_size feature, and no longer automatically serialize payloads as strings - #376
  • Renamed prefect.utilities.serialization.NestedField -> Nested - #376
  • Renamed prefect.utilities.serialization.NestedField.dump_fn -> NestedField.value_selection_fn for clarity - #377
  • Local secrets are now pulled from secrets in context instead of _secrets - #382
  • Remove Task and Flow descriptions, Flow project & version attributes - #383
  • Changed Schedule parameter from on_or_after to after - #396
  • Environments are immutable and return dict keys instead of str; some arguments for ContainerEnvironment are removed - #398
  • environment.run() and environment.build(); removed the flows CLI and replaced it with a top-level CLI command, prefect run - #400
  • The set_temporary_config utility now accepts a single dict of multiple config values, instead of just a key/value pair, and is located in utilities.configuration - #401
  • Bump click requirement to 7.0, which changes underscores to hyphens at CLI - #409
  • IntervalSchedule rejects intervals of less than one minute - #427
  • FlowRunner returns a Running state, not a Pending state, when flows do not finish - #433
  • Remove the task_contexts argument from FlowRunner.run() - #440
  • Remove the leading underscore from Prefect-set context keys - #446
  • Removed throttling tasks within the local cluster - #470
  • Even start_tasks will not run before their state's start_time (if the state is Scheduled) - #474
  • DaskExecutor's "processes" keyword argument was renamed "local_processes" - #477
  • Removed the mapped and map_index kwargs from TaskRunner.run(). These values are now inferred automatically - #485
  • The upstream_states dictionary used by the Runners only includes State values, not lists of States. The use case that required lists of States is now covered by the Mapped state. - #485
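
The one-minute floor on IntervalSchedule (#427) amounts to a simple validation step. The sketch below shows such a check in isolation; validate_interval and its error message are hypothetical and not copied from Prefect.

```python
from datetime import timedelta


def validate_interval(interval: timedelta) -> timedelta:
    """Reject intervals shorter than one minute; return valid ones unchanged."""
    if interval < timedelta(minutes=1):
        raise ValueError("Interval must be at least one minute.")
    return interval
```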

Version 0.3.3

30 Oct 16:22
bf420ea

Major Features

  • Refactor FlowRunner and TaskRunner into modular Runner pipelines - #260, #267
  • Add configurable state_handlers for FlowRunners, Flows, TaskRunners, and Tasks - #264, #267
  • Add gmail and slack notification state handlers w/ tutorial - #274, #294
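
State handlers (#264, #267) can be thought of as callbacks invoked on each state transition. The sketch below assumes a handler receives the object, the old state, and the proposed new state, and returns the state to use. That calling convention, the string states, and the apply_state_handlers helper are assumptions for illustration rather than a copy of the library's code.

```python
from typing import Callable, List, Tuple

Handler = Callable[[object, str, str], str]


def apply_state_handlers(
    obj: object, old_state: str, new_state: str, handlers: List[Handler]
) -> str:
    """Call each handler in order; each may replace the proposed new state."""
    for handler in handlers:
        new_state = handler(obj, old_state, new_state)
    return new_state


transitions: List[Tuple[str, str]] = []


def record_transition(obj: object, old_state: str, new_state: str) -> str:
    # Observe the transition without changing it.
    transitions.append((old_state, new_state))
    return new_state


def retry_on_failure(obj: object, old_state: str, new_state: str) -> str:
    # Replace a failure with a retry.
    return "Retrying" if new_state == "Failed" else new_state


final_state = apply_state_handlers(
    "my-task", "Running", "Failed", [record_transition, retry_on_failure]
)
```

A notification handler, such as the gmail and slack ones listed above, would fit the record_transition shape: observe the change, send a message, and return the state unchanged.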

Minor Features

  • Add a new method flow.get_tasks() for easily filtering flow tasks by attribute - #242
  • Add new JinjaTemplateTask for easily rendering jinja templates - #200
  • Add new PAUSE signal for halting task execution - #246
  • Add new Paused state corresponding to PAUSE signal, and new pause_task utility - #251
  • Add ability to timeout task execution for all executors except DaskExecutor(processes=True) - #240
  • Add explicit unit test to check Black formatting (Python 3.6+) - #261
  • Add ability to set local secrets in user config file - #231, #274
  • Add is_skipped() and is_scheduled() methods for State objects - #266, #278
  • Adds now() as a default start_time for Scheduled states - #278
  • Signal classes now pass arguments to underlying State objects - #279
  • Run counts are tracked via Retrying states - #281

Fixes

  • Flow consistently raises if passed a parameter that doesn't exist - #149

Breaking Changes

  • Renamed scheduled_time -> start_time in Scheduled state objects - #278
  • TaskRunner.check_for_retry no longer checks for Retry states without start_time set - #278
  • Swapped the position of result and message attributes in State initializations, and started storing caught exceptions as results - #283

Version 0.3.2

02 Oct 19:36
3d78398

Major Features

  • Local parallelism with DaskExecutor - #151, #186
  • Resource throttling based on tags - #158, #186
  • Task.map for mapping tasks - #186
  • Added Airflow utility for importing Airflow DAGs as Prefect Flows - #232

Minor Features

  • Use Netlify to deploy docs - #156
  • Add changelog - #153
  • Add ShellTask - #150
  • Base Task class can now be run as a dummy task - #191
  • New return_failed keyword to flow.run() for returning failed tasks - #205
  • Minor changes to flow.visualize() for visualizing mapped tasks and coloring nodes by state - #202
  • Added new flow.replace() method for swapping out tasks within flows - #230
  • Add debug kwarg to DaskExecutor for optionally silencing dask logs - #209
  • Update BokehRunner for visualizing mapped tasks - #220
  • Env var configuration settings are typed - #204
  • Implement map functionality for the LocalExecutor - #233

Fixes

  • Fix issue with Versioneer not picking up git tags - #146
  • DotDicts can have non-string keys - #193
  • Fix unexpected behavior in assigning tags using contextmanagers - #190
  • Fix bug in initialization of Flows with only edges - #225
  • Remove "bottleneck" when creating pipelines of mapped tasks - #224

Breaking Changes

  • Runner refactor - #221
  • Cleaned up signatures of TaskRunner methods - #171
  • Locally, Python 3.4 users cannot run the more advanced parallel executors (DaskExecutor) - #186

Version 0.3.1

06 Sep 16:52
1772d4e

Major Features

  • Support for user configuration files - #195

Minor Features

  • None

Fixes

  • Let DotDicts accept non-string keys - #193, #194

Breaking Changes

  • None

Version 0.3.0

20 Aug 14:09
813f17d

Major Features

  • BokehRunner - #104, #128
  • Control flow: ifelse, switch, and merge - #92
  • Set state from reference_tasks - #95, #137
  • Add flow Registry - #90
  • Output caching with various cache_validators - #84, #107
  • Dask executor - #82, #86
  • Automatic input caching for retries, manual-only triggers - #78
  • Functional API for Flow definition
  • State classes
  • Signals to transmit State

Minor Features

  • Add custom syntax highlighting to docs - #141
  • Add bind() method for tasks to call without copying - #132
  • Cache expensive flow graph methods - #125
  • Docker environments - #71
  • Automatic versioning via Versioneer - #70
  • TriggerFail state - #67
  • State classes - #59

Fixes

  • None

Breaking Changes

  • None