Releases: octue/octue-sdk-python

Speed up event replaying

17 Jul 14:17
84b418a

Contents (#669)

Enhancements

  • Skip non-result event validation if only result is required
  • Add ability to skip event validation in event handlers
  • Make diagnostics log messages more consistent
  • Allow instantiation of Diagnostics, Topic, Subscription, and GoogleCloudPubSubEventHandler without cloud credentials

Refactoring

  • Update from deprecated datetime.datetime.utcnow method
  • Use cached_property in Service
  • Remove unused attributes on MockService and Runner
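For readers making the same `datetime.datetime.utcnow` update in their own services, the standard-library replacement is a timezone-aware call. This is a minimal sketch of that general pattern; the release notes do not say which form the SDK itself adopted.

```python
from datetime import datetime, timedelta, timezone

# Deprecated since Python 3.12: datetime.utcnow() returns a naive datetime (tzinfo is None).
# The timezone-aware replacement attaches UTC explicitly.
now = datetime.now(timezone.utc)

# If existing code relies on naive UTC datetimes, the timezone can be dropped explicitly.
naive_now = now.replace(tzinfo=None)
```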

Testing

  • Implement MockSubscription.delete

Update child emulator and improve manifest dataset download

16 Jul 16:25
b3cfa00

Contents (#668)

IMPORTANT: There are 2 breaking changes.

Enhancements

  • 💥 BREAKING CHANGE: Update ChildEmulator to use EventReplayer, support schema-compliant events and attributes, and support heartbeats and delivery acknowledgement events. This significantly simplifies the emulator
  • 💥 BREAKING CHANGE: Remove ChildEmulator.from_file
  • Download manifest datasets to same directory by default

Refactoring

  • Move ServicePatcher into its own module

Upgrade instructions

💥 Update `ChildEmulator` to use `EventReplayer` and full events

Give events (including attributes) that satisfy the service communication schema to child emulators.

💥 Remove `ChildEmulator.from_file`

Load the JSON file separately and pass the events into the ChildEmulator constructor.
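A minimal sketch of this migration, assuming the file contains a JSON array of schema-compliant events. The helper name `load_emulator_events`, the file layout, and the `events` keyword shown in the comment are assumptions for illustration, not confirmed parts of the SDK.

```python
import json


def load_emulator_events(path):
    """Read a list of events from a JSON file (hypothetical helper, not part of octue)."""
    with open(path) as f:
        events = json.load(f)

    # Assumption: the file holds a JSON array of events rather than a wrapping object.
    if not isinstance(events, list):
        raise ValueError(f"Expected a JSON array of events in {path!r}.")

    return events


# Assumed usage (check the ChildEmulator signature for the actual argument name):
#
#     emulator = ChildEmulator(events=load_emulator_events("events.json"))
```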

Enable question chaining

15 Jul 14:24
3dc86aa

Summary

This release makes major improvements to event handling and question auditing. Some of the main changes are:

  • Questions are now automatically associated with their parent question and the question that originated them, however deep they are in a question tree
  • Events are ordered by datetime by the event backend, not the SDK
  • Better feedback is provided when asking questions in parallel
  • You can specify the event store to use
  • Log message contexts have been slimmed down without losing any information, and events are replayable with no context (good for smaller screens)
  • Various public classes and functions are faster and easier to use
  • Question retries have the same question UUID

Contents (#660)

IMPORTANT: There are 6 breaking changes.

New features

Events

  • 💥 BREAKING CHANGE: Add parent_question_uuid, originator_question_uuid, originator and retry_count event attributes
  • Avoid redelivery of questions by checking the event store on delivery

Event handlers

  • Add ability to not include service metadata in logs in event handlers
  • Enable EventReplayer to handle question events
  • Add RegisteredTemporaryDirectory class, use it when downloading datasets, and add ability to delete them at end of analysis

Enhancements

Resources

  • 💥 BREAKING CHANGE: Make datasets recursive by default in Dataset
  • Log a warning if a dataset is empty at instantiation

Services

  • 💥 BREAKING CHANGE: Remove name argument from Service and provide an SRUID to Child internal service instead of a name
  • Improve logging of errors, retries, and threading in Child.ask_multiple
  • Order pub/sub messages by datetime using ordering key and remove order event attribute
  • Set question UUIDs in advance in Child.ask_multiple

Subscriptions

  • Allow existing subscriptions in create_push_subscription
  • Give feedback on (un)successful push subscription creation in CLI

Questions and events

  • Remove unnecessary sender argument from get_events and make getting the tail of events the default
  • Allow retried questions to have the same UUID
  • Allow explicit question retries by using retry_count attribute
  • Return empty list from get_events if no events for question

Service configuration

  • Allow setting of event store table ID and delete_local_files in service configuration
  • Use envvar to specify service configuration location by default
  • Add overrides option to Runner.from_configuration

Other

  • Log warning when PYTHONUNBUFFERED envvar is unset
  • Remove "analysis-" from start of question UUIDs in log context

Fixes

  • 💥 BREAKING CHANGE: Return question UUID alongside error from Child.ask_multiple for failed questions
  • Set analysis ID at start of Runner.run
  • Emit correct logs when no diagnostics available with octue get-diagnostics
  • Fix deserialisation of events in get_events
  • Use (meta-)generation agnostic retry strategy with cloud storage
  • Return correct question UUIDs with failed questions from Child.ask_multiple
  • Avoid logging that app failed when it didn't when uploading diagnostics
  • Allow setting of max_workers when CPU count is indeterminate
  • Disable delete_local_files by default

Operations

  • Update event handler and its bigquery table

Dependencies

  • Loosen Sphinx and other docs package ranges
  • Remove unneeded db-dtypes package
  • Make google-cloud-bigquery a mandatory dependency
  • Upgrade google-cloud-secret-manager

Refactoring

Event handlers

  • 💥 BREAKING CHANGE: Remove redundant datetime from delivery ack and heartbeat events
  • 💥 BREAKING CHANGE: Rename originator event attribute to parent
  • Factor out finalising and cleaning up in Runner
  • Move service accounts into separate terraform file
  • Cache metadata against datafile/dataset instead of path
  • Rename python3.9 dockerfile to reflect its python version

Upgrade instructions

  • Add recursive=False to Dataset instantiations
  • Update all services in your service network to use octue>=0.56.0
  • Use version 0.6.1 of the event handler or above and a correspondingly up-to-date BigQuery table.
  • Swap the internal_service_name argument for internal_sruid argument to Child.__init__ and provide a valid SRUID
  • Instances of Service can no longer be given names. Please give them a valid SRUID instead.
  • To get the unraised exception from a failed answer returned by Child.ask_multiple, access the zeroth element of that answer, e.g. if the third question failed (index 2):
    answers = Child.ask_multiple(*questions)
    exception, question_uuid = answers[2]
    
  • Service.received_events, AbstractEventHandler.handled_events, and Child.received_events now include event attributes instead of just the event. These attributes/properties now return a list of dictionaries with the keys {"event", "attributes"}, where what was previously returned is now mapped to the "event" key.
  • Stop providing the recipient argument to EventReplayer and GoogleCloudPubSubEventHandler - it's now automatically acquired from each event's attributes
  • Stop passing the skip_missing_events_after argument to EventReplayer and GoogleCloudPubSubEventHandler
  • Stop using the awaiting_missing_event and time_since_missing_event properties on the event handlers
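For the `received_events` and `handled_events` change in the list above, code that relied on the old return value can recover it by taking the "event" key of each entry. A minimal sketch using hand-written sample data; the event and attribute contents here are illustrative, not real SDK output.

```python
# Sample of the new structure: each entry holds the event and its attributes.
received_events = [
    {
        "event": {"kind": "delivery_acknowledgement"},
        "attributes": {"question_uuid": "hypothetical-uuid"},
    },
    {
        "event": {"kind": "result", "output_values": [1, 2, 3]},
        "attributes": {"question_uuid": "hypothetical-uuid"},
    },
]

# What was previously returned directly is now under the "event" key.
events_only = [record["event"] for record in received_events]

# The attributes are available alongside each event if needed.
question_uuids = {record["attributes"]["question_uuid"] for record in received_events}
```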

Use updated `twined`

07 May 17:15
450a280

Contents (#650)

IMPORTANT: There is 1 breaking change.

Enhancements

  • Include received invalid data in flask app error messages
  • Allow any iterable for Dataset files argument

Fixes

  • Ensure order argument is given in Service.send_exception
  • Add workaround for apparent bug in getting local metadata file's absolute path
  • Remove (now-) unnecessary json decoding in get_events

Dependencies

  • 💥 BREAKING CHANGE: Drop support for python3.7
  • Use twined==0.5.5
  • Update to black==24.4.2

Testing

  • Remove unnecessary test
  • Update asynchronous deployment test to accept numpy array as output values

Upgrade instructions

💥 Drop support for python3.7

Upgrade to python>=3.8 to keep using octue.

Use twined version 0.5.5 to unpin jsonschema package

01 May 17:15
CHO: Add inter-service compatibility metadata

skipci

Improve async event retrieval workflow

23 Apr 12:50
146d5ba

Contents (#647)

IMPORTANT: There are 2 breaking changes.

Enhancements

  • 💥 BREAKING CHANGE: Return question UUID from Child.ask
  • Deserialise manifests from events in get_event
  • Raise error if no events found when calling get_events

Fixes

  • Use correct base image for python3.11 dockerfile
  • Return schema-compliant events and attributes from get_events

Operations

  • Import missing APIs into terraform config
  • Deploy version 0.5.0 of event handler cloud function and update event store schema
  • Update actions/setup-python to version 5

Dependencies

  • 💥 BREAKING CHANGE: Make db-dtypes and google-cloud-bigquery optional
  • Upgrade gunicorn to avoid vulnerability
  • Loosen numpy dependency

Testing

  • Test retrieving results from real asynchronous question
  • Run tests with python3.10 (python3.9 isn't available on macos-latest for arm64)

Other

  • Add DOI badge to readme

Upgrade instructions

💥 Return question UUID from `Child.ask`

Instead of writing answer = Child.ask(...), write answer, question_uuid = Child.ask(...) (and the same for ChildEmulator)

💥 Make `db-dtypes` and `google-cloud-bigquery` optional

To keep using the get_events function, add the bigquery optional extra to your installation command e.g. poetry install -E bigquery.

Switch to event-driven infrastructure and improve support for asynchronous questions

11 Apr 17:47
6999b22

Summary

This pull request:

  • Makes the SDK fully event-driven by using a single topic to emit/consume events
  • Substantially refactors the event handler to facilitate asynchronous event retrieval
  • Adds the ability to get and replay events from a BigQuery store

Contents (#632)

IMPORTANT: There are 4 breaking changes.

New features

  • 💥 BREAKING CHANGE: Use single topic per workspace (#639)
  • Add get_events function for retrieving events asynchronously from BigQuery
  • Add EventReplayer class to replay asynchronously-retrieved events
  • Add Manifest.download method

Enhancements

  • Get subscription project name from topic by default
  • Improve asking of asynchronous questions via Child.ask
  • Return download path from Dataset.download
  • Include question UUID in delivery acknowledgement log message
  • Improve handling of invalid events
  • Add datetime and uuid attributes to all events

Fixes

  • Await successful publishing of question messages
  • Fix api_access_endpoint usage in mock_generate_signed_url

Operations

  • Add test BigQuery dataset, cloud function, and IAM roles to terraform config
  • Switch to reusable workflows where possible

Dependencies

  • Add google-cloud-bigquery
  • Upgrade coolname
  • Add db-dtypes for converting bigquery rows to dataframes

Refactoring

  • 💥 BREAKING CHANGE: Rename x.received_messages to x.received_events
  • 💥 BREAKING CHANGE: Rename record_messages parameters to record_events
  • 💥 BREAKING CHANGE: Update ChildEmulator to use event* instead of message*
  • Factor out making minimal dictionary
  • Factor out creating push subscription
  • Factor out emitting question event in Service.ask
  • Factor out event handlers and related logic from OrderedMessageHandler into new AbstractEventHandler
  • Move validation module into octue.cloud.events subpackage
  • Rename OrderedMessageHandler to GoogleCloudPubSubEventHandler
  • Rename "message" to "event" in event handler classes
  • Rename GooglePubSubHandler to GoogleCloudPubSubHandler

Chores

  • Update licence year to 2024

Testing

  • Simplify various tests

Upgrade instructions

  • Update all services in your service network to this version of octue or later (0.53.0+)
  • Replace any usages of the received_messages methods with received_events
  • Replace any usages of the record_messages parameters with record_events
  • Replace the word message with event in usages of ChildEmulator methods (apart from in the case of monitor_message)

Warn about messages with duplicate message numbers

06 Feb 15:56
4366d66

Contents (#627)

Enhancements

  • Warn about messages with duplicate message numbers

Make event handling faster and resilient to missing events

05 Feb 17:29
2e6c23f

Contents (#625)

IMPORTANT: There is 1 breaking change.

Enhancements

  • Allow setting of maximum number of workers for parallel questions in Child.ask_multiple
  • Pull up to 50 messages from answer subscriptions at once instead of 1
  • Allow skipping of any missing message after a 10s delay in OrderedMessageHandler
  • Suppress name/namespace override warning if the value is the same in the environment and service configuration file
  • Speed up event validation by caching service communication JSON schema
  • 💥 BREAKING CHANGE: Extract SRUID for child logs context from subscription in message handler

Fixes

  • Exit early from message pulling if heartbeat check fails
  • Make Manifest.update_dataset_paths method thread-safe

Refactoring

  • Factor out multiple checks of package version in message handler

Testing

  • Improve message handling tests by not mocking _pull_and_enqueue_available_messages method and removing MockMessagePuller

Upgrade instructions

💥 Extract SRUID for child logs from subscription in message handler

This removes the service_name argument from Service.wait_for_answer. If you were using this argument, simply remove it; logs from children shown in a parent will now have the full and correct SRUID automatically.

Publish answers to question topic

09 Jan 18:28
c4b41f6

Summary

This pull request removes the use of answer topics by publishing answer messages to the service revision (formerly known as question) topic and filtering subscriptions to only receive a) questions or b) response messages to a specific question. This speeds up the question asking process, reduces cloud infrastructure requirements and the permissions surface, and allows us to avoid topic number limits.

Also added is validation of all messages and their attributes against a new publicly available schema. This ensures services are communicating as they should and opens up the possibility of writing services in other languages and creating emulators.

As this, by itself, constitutes an inter-service communication breaking change, we've taken the opportunity to reduce the complexity of the codebase by removing backwards compatibility patches for service communication (i.e. we've grouped multiple breaking changes together into one).

Contents (#603)

IMPORTANT: There are 7 breaking changes.

New features

  • Validate messages and their attributes against new service communication schema (see #614 for changelog - it was merged into this branch)
  • Allow diagnostics (formerly known as crash diagnostics) to always be switched on for a service

Enhancements

  • 💥 BREAKING CHANGE: Publish responses to questions to the service revision (question) topic instead of creating a separate answer topic
  • 💥 BREAKING CHANGE: Store message number in message attributes instead of in message data
  • 💥 BREAKING CHANGE: Remove question UUID from log record message body
  • 💥 BREAKING CHANGE: Remove inter-service communication backwards compatibility code
  • 💥 BREAKING CHANGE: Make input and output values and manifest optional
  • 💥 BREAKING CHANGE: Replace boolean allow_save_diagnostics_data_on_crash argument with string/enum save_diagnostics argument in Service.ask and related methods
  • Add ability to filter subscriptions
  • Add question UUID attribute to all messages
  • Send more possible errors to parent in Service.answer
  • Add kind field to question messages
  • Add sender_type attribute to all messages
  • Add ability to instantiate Runner from service/app configurations

Fixes

  • Stop double-JSON-encoding output manifests

Dependencies

  • Update octue version in template apps' dependencies

Refactoring

  • 💥 BREAKING CHANGE: Rename crash diagnostics to diagnostics
  • Group message attributes in Service._send_message and MockMessage under explicit attributes argument
  • Make OrderedMessageHandler._waiting_messages attribute public
  • Rename various message attributes

Testing

  • Store mock pub/sub messages against subscriptions instead of topics
  • Add missing type field to emulated Pub/Sub questions

Operations

  • Fix add-issues-to-octue-board workflow
  • Stop automatically building docker images for registry in release workflow
  • Add ReadTheDocs config file to fix documentation building

Upgrade instructions

💥 Update all Octue services in your network to use this version of octue so they're still able to communicate. Postpone upgrading until you can upgrade all services simultaneously.

💥 Replace allow_save_diagnostics_data_on_crash with save_diagnostics set to one of these values: "SAVE_DIAGNOSTICS_OFF", "SAVE_DIAGNOSTICS_ON_CRASH", or "SAVE_DIAGNOSTICS_ON"
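A small sketch of how an existing boolean setting might be translated during migration. The mapping (True to the crash-only value, False to off) is our reading of the value names rather than something the release notes state, and `to_save_diagnostics` is a hypothetical helper, not part of the SDK.

```python
def to_save_diagnostics(allow_save_diagnostics_data_on_crash):
    """Translate the old boolean flag into one of the new string values (hypothetical helper)."""
    if allow_save_diagnostics_data_on_crash:
        return "SAVE_DIAGNOSTICS_ON_CRASH"
    return "SAVE_DIAGNOSTICS_OFF"


# The third value, "SAVE_DIAGNOSTICS_ON", has no boolean equivalent and must be chosen explicitly.
save_diagnostics = to_save_diagnostics(True)
```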

💥 Crash diagnostics rename:

  • Use the octue get-diagnostics CLI command instead of the octue get-crash-diagnostics command
  • Rename crash_diagnostics_cloud_path in your service configurations to diagnostics_cloud_path