Releases: octue/octue-sdk-python
Speed up event replaying
Contents (#669)
Enhancements
- Skip non-result event validation if only result is required
- Add ability to skip event validation in event handlers
- Make diagnostics log messages more consistent
- Allow instantiation of `Diagnostics`, `Topic`, `Subscription`, and `GoogleCloudPubSubEventHandler` without cloud credentials
Refactoring
- Update from deprecated `datetime.datetime.utcnow` method
- Use `cached_property` in `Service`
- Remove unused attributes on `MockService` and `Runner`
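For context on the `datetime.datetime.utcnow` item above: the method is deprecated as of Python 3.12 because it returns a naive datetime. A minimal sketch of the usual timezone-aware replacement follows; it shows the standard library idiom and is not necessarily the exact code used in the SDK.

```python
import datetime

# Deprecated since Python 3.12: returns a naive datetime (tzinfo is None).
# naive_now = datetime.datetime.utcnow()

# Timezone-aware replacement from the standard library.
aware_now = datetime.datetime.now(datetime.timezone.utc)

print(aware_now.tzinfo)  # prints "UTC"
print(aware_now.isoformat())
```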
Testing
- Implement `MockSubscription.delete`
Update child emulator and improve manifest dataset download
Contents (#668)
IMPORTANT: There are 2 breaking changes.
Enhancements
- 💥 BREAKING CHANGE: Update `ChildEmulator` to use `EventReplayer`, support schema-compliant events and attributes, and support heartbeats and delivery acknowledgement events. This significantly simplifies the emulator
- 💥 BREAKING CHANGE: Remove `ChildEmulator.from_file`
- Download manifest datasets to same directory by default
Refactoring
- Move `ServicePatcher` into its own module
Upgrade instructions
💥 Update `ChildEmulator` to use `EventReplayer` and full events
Give events (including attributes) that satisfy the service communication schema to child emulators.
💥 Remove `ChildEmulator.from_file`
Load the JSON file separately and pass the events into the `ChildEmulator` constructor.
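A minimal sketch of that replacement for `ChildEmulator.from_file`, under stated assumptions: the file is assumed to hold a JSON object with an `"events"` key, `load_events` is a hypothetical helper rather than part of the SDK, and the `events` keyword argument and import path in the commented-out lines are assumptions to check against the SDK version you have installed.

```python
import json
import os
import tempfile


def load_events(path):
    """Load a list of events from a JSON file (hypothetical helper, not part of the SDK)."""
    with open(path) as f:
        data = json.load(f)

    # Assumption: the file holds a JSON object with an "events" key.
    return data["events"]


# Demonstration with a throwaway file. The event content is a placeholder
# and is not guaranteed to satisfy the service communication schema.
with tempfile.TemporaryDirectory() as temporary_directory:
    path = os.path.join(temporary_directory, "child.json")

    with open(path, "w") as f:
        json.dump({"events": [{"event": {"kind": "result"}, "attributes": {}}]}, f)

    events = load_events(path)

# Assumption: the constructor accepts the events via an `events` keyword argument.
# from octue.cloud.emulators import ChildEmulator
# child_emulator = ChildEmulator(events=events)
```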
Enable question chaining
Summary
This release makes major improvements to event handling and question auditing. Some of the main changes are:
- Questions are now automatically associated with their parent question and the question that originated them, however deep they are in a question tree
- Events are ordered by datetime by the event backend, not the SDK
- Better feedback is provided when asking questions in parallel
- You can specify the event store to use
- Log message contexts have been slimmed down without losing any information, and events are replayable with no context (good for smaller screens)
- Various public classes and functions are faster and easier to use
- Question retries have the same question UUID
Contents (#660)
IMPORTANT: There are 6 breaking changes.
New features
Events
- 💥 BREAKING CHANGE: Add `parent_question_uuid`, `originator_question_uuid`, `originator` and `retry_count` event attributes
- Avoid redelivery of questions by checking the event store on delivery
Event handlers
- Add ability to not include service metadata in logs in event handlers
- Enable `EventReplayer` to handle question events
- Add `RegisteredTemporaryDirectory` class, use it when downloading datasets, and add ability to delete them at end of analysis
Enhancements
Resources
- 💥 BREAKING CHANGE: Make datasets recursive by default in `Dataset`
- Log a warning if a dataset is empty at instantiation
Services
- 💥 BREAKING CHANGE: Remove `name` argument from `Service` and provide an SRUID to `Child` internal service instead of a name
- Improve logging of errors, retries, and threading in `Child.ask_multiple`
- Order pub/sub messages by datetime using ordering key and remove `order` event attribute
- Set question UUIDs in advance in `Child.ask_multiple`
Subscriptions
- Allow existing subscriptions in `create_push_subscription`
- Give feedback on (un)successful push subscription creation in CLI
Questions and events
- Remove unnecessary `sender` argument from `get_events` and make getting the tail of events the default
- Allow retried questions to have the same UUID
- Allow explicit question retries by using `retry_count` attribute
- Return empty list from `get_events` if no events for question
Service configuration
- Allow setting of event store table ID and `delete_local_files` in service configuration
- Use envvar to specify service configuration location by default
- Add `overrides` option to `Runner.from_configuration`
Other
- Log warning when `PYTHONUNBUFFERED` envvar is unset
- Remove "analysis-" from start of question UUIDs in log context
Fixes
- 💥 BREAKING CHANGE: Return question UUID alongside error from `Child.ask_multiple` for failed questions
- Set analysis ID at start of `Runner.run`
- Emit correct logs when no diagnostics available with `octue get-diagnostics`
- Fix deserialisation of events in `get_events`
- Use (meta-)generation agnostic retry strategy with cloud storage
- Return correct question UUIDs with failed questions from `Child.ask_multiple`
- Avoid logging that app failed when it didn't when uploading diagnostics
- Allow setting of `max_workers` when CPU count is indeterminate
- Disable `delete_local_files` by default
Operations
- Update event handler and its bigquery table
Dependencies
- Loosen `Sphinx` and other docs package ranges
- Remove unneeded `db-dtypes` package
- Make `google-cloud-bigquery` a mandatory dependency
- Upgrade `google-cloud-secret-manager`
Refactoring
Event handlers
- 💥 BREAKING CHANGE: Remove redundant datetime from delivery ack and heartbeat events
- 💥 BREAKING CHANGE: Rename `originator` event attribute to `parent`
- Factor out finalising and cleaning up in `Runner`
- Move service accounts into separate terraform file
- Cache metadata against datafile/dataset instead of path
- Rename python3.9 dockerfile to reflect its python version
Upgrade instructions
- Add `recursive=False` to `Dataset` instantiations
- Update all services in your service network to use `octue>=0.56.0`
- Use version `0.6.1` of the event handler or above and a correspondingly up-to-date BigQuery table.
- Swap the `internal_service_name` argument for the `internal_sruid` argument to `Child.__init__` and provide a valid SRUID
- Instances of `Service` can no longer be given names. Please give them a valid SRUID instead.
- To get the unraised exception from a failed answer returned by `Child.ask_multiple`, access the zeroth element e.g. if the third question failed: `answers = Child.ask_multiple(*questions)` followed by `exception, question_uuid = answers[2]`
- `Service.received_events`, `AbstractEventHandler.handled_events`, and `Child.received_events` now include event attributes instead of just the event. These attributes/properties now return a list of dictionaries with the keys {"event", "attributes"}, where what was previously returned is now mapped to the "event" key.
- Stop providing the `recipient` argument to `EventReplayer` and `GoogleCloudPubSubEventHandler` - it's now automatically acquired from each event's attributes
- Stop passing the `skip_missing_events_after` argument to `EventReplayer` and `GoogleCloudPubSubEventHandler`
- Stop using the `awaiting_missing_event` and `time_since_missing_event` properties on the event handlers
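The new shape of `received_events` and `handled_events` described in the list above can be sketched as follows. The list here is a hand-written stand-in with placeholder event and attribute contents, not output captured from the SDK; only the `"event"` and `"attributes"` keys come from the release notes.

```python
# Hand-written stand-in for what `received_events` returns after upgrading.
# The event and attribute contents are placeholders.
received_events = [
    {"event": {"kind": "delivery_acknowledgement"}, "attributes": {"question_uuid": "abc"}},
    {"event": {"kind": "result", "output_values": [1, 2, 3]}, "attributes": {"question_uuid": "abc"}},
]

# Recover the pre-upgrade shape (a list of bare events).
events_only = [item["event"] for item in received_events]

# The attributes are now available alongside each event.
question_uuids = {item["attributes"]["question_uuid"] for item in received_events}

print(events_only[-1]["kind"])  # prints "result"
```

Code that previously iterated over bare events can apply the `item["event"]` lookup at the point of use, which keeps the rest of the calling code unchanged.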
Use updated `twined`
Contents (#650)
IMPORTANT: There is 1 breaking change.
Enhancements
- Include received invalid data in flask app error messages
- Allow any iterable for `Dataset` `files` argument
Fixes
- Ensure `order` argument is given in `Service.send_exception`
- Add workaround for apparent bug in getting local metadata file's absolute path
- Remove (now-) unnecessary json decoding in `get_events`
Dependencies
- 💥 BREAKING CHANGE: Drop support for python3.7
- Use `twined==0.5.5`
- Update to `black==24.4.2`
Testing
- Remove unnecessary test
- Update asynchronous deployment test to accept numpy array as output values
Upgrade instructions
💥 Drop support for python3.7
Upgrade to `python>=3.8` to keep using `octue`.
Use twined version 0.5.5 to unpin jsonschema package
CHO: Add inter-service compatibility metadata skipci
Improve async event retrieval workflow
Contents (#647)
IMPORTANT: There are 2 breaking changes.
Enhancements
- 💥 BREAKING CHANGE: Return question UUID from `Child.ask`
- Deserialise manifests from events in `get_event`
- Raise error if no events found when calling `get_events`
Fixes
- Use correct base image for `python3.11` dockerfile
- Return schema-compliant events and attributes from `get_events`
Operations
- Import missing APIs into terraform config
- Deploy version `0.5.0` of event handler cloud function and update event store schema
- Update `actions/setup-python` to version 5
Dependencies
- 💥 BREAKING CHANGE: Make `db-dtypes` and `google-cloud-bigquery` optional
- Upgrade `gunicorn` to avoid vulnerability
- Loosen `numpy` dependency
Testing
- Test retrieving results from real asynchronous question
- Run tests with `python3.10` (`python3.9` isn't available on `macos-latest` for `arm64`)
Other
- Add DOI badge to readme
Upgrade instructions
💥 Return question UUID from `Child.ask`
Instead of writing `answer = Child.ask(...)`, write `answer, question_uuid = Child.ask(...)` (and the same for `ChildEmulator`)
💥 Make `db-dtypes` and `google-cloud-bigquery` optional
To keep using the `get_events` function, add the `bigquery` optional extra to your installation command e.g. `poetry install -E bigquery`.
Switch to event-driven infrastructure and improve support for asynchronous questions
Summary
This pull request:
- Makes the SDK fully event-driven by using a single topic to emit/consume events
- Majorly refactors the event handler to facilitate asynchronous event retrieval
- Adds the ability to get and replay events from a BigQuery store
Contents (#632)
IMPORTANT: There are 4 breaking changes.
New features
- 💥 BREAKING CHANGE: Use single topic per workspace (#639)
- Add `get_events` function for retrieving events asynchronously from BigQuery
- Add `EventReplayer` class to replay asynchronously-retrieved events
- Add `Manifest.download` method
Enhancements
- Get subscription project name from topic by default
- Improve asking of asynchronous questions via `Child.ask`
- Return download path from `Dataset.download`
- Include question UUID in delivery acknowledgement log message
- Improve handling of invalid events
- Add `datetime` and `uuid` attributes to all events
Fixes
- Await successful publishing of question messages
- Fix `api_access_endpoint` usage in `mock_generate_signed_url`
Operations
- Add test BigQuery dataset, cloud function, and IAM roles to terraform config
- Switch to reusable workflows where possible
Dependencies
- Add `google-cloud-bigquery`
- Upgrade `coolname`
- Add `db-dtypes` for converting bigquery rows to dataframes
Refactoring
- 💥 BREAKING CHANGE: Rename `x.received_messages` to `x.received_events`
- 💥 BREAKING CHANGE: Rename `record_messages` parameters to `record_events`
- 💥 BREAKING CHANGE: Update `ChildEmulator` to use `event*` instead of `message*`
- Factor out making minimal dictionary
- Factor out creating push subscription
- Factor out emitting question event in `Service.ask`
- Factor out event handlers and related logic from `OrderedMessageHandler` into new `AbstractEventHandler`
- Move `validation` module into `octue.cloud.events` subpackage
- Rename `OrderedMessageHandler` to `GoogleCloudPubSubEventHandler`
- Rename "message" to "event" in event handler classes
- Rename `GooglePubSubHandler` to `GoogleCloudPubSubHandler`
Chores
- Update licence year to 2024
Testing
- Simplify various tests
Upgrade instructions
- Update all services in your services network to this version of `octue` or later (`0.53.0`+)
- Replace any usages of the `received_messages` methods with `received_events`
- Replace any usages of the `record_messages` parameters with `record_events`
- Replace the word `message` with `event` in usages of `ChildEmulator` methods (apart from in the case of `monitor_message`)
Warn about messages with duplicate message numbers
Make event handling faster and resilient to missing events
Contents (#625)
IMPORTANT: There is 1 breaking change.
Enhancements
- Allow setting of maximum number of workers for parallel questions in `Child.ask_multiple`
- Pull up to 50 messages from answer subscriptions at once instead of 1
- Allow skipping of any missing message after a 10s delay in `OrderedMessageHandler`
- Suppress name/namespace override warning if the value is the same in the environment and service configuration file
- Speed up event validation by caching service communication JSON schema
- 💥 BREAKING CHANGE: Extract SRUID for child logs context from subscription in message handler
Fixes
- Exit early from message pulling if heartbeat check fails
- Make `Manifest.update_dataset_paths` method thread-safe
Refactoring
- Factor out multiple checks of package version in message handler
Testing
- Improve message handling tests by not mocking `_pull_and_enqueue_available_messages` method and removing `MockMessagePuller`
Upgrade instructions
💥 Extract SRUID for child logs from subscription in message handler
This removes the `service_name` argument from `Service.wait_for_answer`. If you were using this argument, simply remove it; logs from children shown in a parent will now have the full and correct SRUID automatically.
Publish answers to question topic
Summary
This pull request removes the use of answer topics by publishing answer messages to the service revision (formerly known as question) topic and filtering subscriptions to only receive a) questions or b) response messages to a specific question. This speeds up the question asking process, reduces cloud infrastructure requirements and the permissions surface, and allows us to avoid topic number limits.
Also added is validation of all messages and their attributes against a new publicly available schema. This ensures services are communicating as they should and opens up the possibility of writing services in other languages and creating emulators.
As this, by itself, constitutes an inter-service communication breaking change, we've taken the opportunity to reduce the complexity of the codebase by removing backwards compatibility patches for service communication (i.e. we've grouped multiple breaking changes together into one).
Contents (#603)
IMPORTANT: There are 7 breaking changes.
New features
- Validate messages and their attributes against new service communication schema (see #614 for changelog - it was merged into this branch)
- Allow diagnostics (formerly known as crash diagnostics) to always be switched on for a service
Enhancements
- 💥 BREAKING CHANGE: Publish responses to questions to the service revision (question) topic instead of creating a separate answer topic
- 💥 BREAKING CHANGE: Store message number in message attributes instead of in message data
- 💥 BREAKING CHANGE: Remove question UUID from log record message body
- 💥 BREAKING CHANGE: Remove inter-service communication backwards compatibility code
- 💥 BREAKING CHANGE: Make input and output values and manifest optional
- 💥 BREAKING CHANGE: Replace boolean `allow_save_diagnostics_data_on_crash` argument with string/enum `save_diagnostics` argument in `Service.ask` and related methods
- Add ability to filter subscriptions
- Add question UUID attribute to all messages
- Send more possible errors to parent in `Service.answer`
- Add `kind` field to question messages
- Add `sender_type` attribute to all messages
- Add ability to instantiate `Runner` from service/app configurations
Fixes
- Stop double-JSON-encoding output manifests
Dependencies
- Update `octue` version in template apps' dependencies
Refactoring
- 💥 BREAKING CHANGE: Rename crash diagnostics to diagnostics
- Group message attributes in `Service._send_message` and `MockMessage` under explicit `attributes` argument
- Make `OrderedMessageHandler._waiting_messages` attribute public
- Rename various message attributes
Testing
- Store mock pub/sub messages against subscriptions instead of topics
- Add missing `type` field to emulated Pub/Sub questions
Operations
- Fix `add-issues-to-octue-board` workflow
- Stop automatically building docker images for registry in `release` workflow
- Add ReadTheDocs config file to fix documentation building
Upgrade instructions
💥 Update all Octue services in your network to use this version of `octue` so they're still able to communicate. Postpone upgrading until you can upgrade all services simultaneously.
💥 Replace `allow_save_diagnostics_data_on_crash` with `save_diagnostics` set to one of these values: "SAVE_DIAGNOSTICS_OFF", "SAVE_DIAGNOSTICS_ON_CRASH", or "SAVE_DIAGNOSTICS_ON"
💥 Crash diagnostics rename:
- Use the `octue get-diagnostics` CLI command instead of the `octue get-crash-diagnostics` command
- Rename `crash_diagnostics_cloud_path` in your service configurations to `diagnostics_cloud_path`
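If you hold service configurations as dictionaries (for example after parsing a configuration file), the key rename above could be applied with a small helper like the sketch below. `rename_diagnostics_key` is a hypothetical helper, not part of the SDK, and the service name and bucket path are placeholders.

```python
def rename_diagnostics_key(service_configuration):
    """Return a copy of a configuration dictionary with the old diagnostics key renamed.

    Hypothetical helper for illustration; not part of the SDK.
    """
    updated = dict(service_configuration)

    if "crash_diagnostics_cloud_path" in updated:
        updated["diagnostics_cloud_path"] = updated.pop("crash_diagnostics_cloud_path")

    return updated


# Placeholder configuration values.
old_configuration = {
    "name": "my-service",
    "crash_diagnostics_cloud_path": "gs://my-bucket/diagnostics",
}

new_configuration = rename_diagnostics_key(old_configuration)
print(new_configuration)
```

The helper copies the dictionary before changing it, so the original configuration is left untouched and configurations that already use the new key pass through unchanged.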