RFC: DSS Events #102

52 changes: 52 additions & 0 deletions rfcs/text/0000-dss-event-journaling.md
@@ -0,0 +1,52 @@
### DCP PR:

***Leave this blank until the RFC is approved** then the **Author(s)** must create a link between the assigned RFC number and this pull request in the format:*

`[dcp-community/rfc#](https://github.com/HumanCellAtlas/dcp-community/pull/<PR#>)`

# RFC: Event Journaling and Replay for the Data Store (DSS)

## Summary

This RFC proposes DSS event journaling and replay API endpoints.

## Author(s)

* [Brian Hannafious](mailto:[email protected])

## Shepherd
***Leave this blank.** This role is assigned by DCP PM to guide the **Author(s)** through the RFC process.*

*Recommended format for Shepherds:*

`[Name](mailto:[email protected])`

## Motivation

The DSS currently provides an event subscription service. Events are triggered when a bundle is created, tombstoned, or
deleted, and may be filtered by applying [JMESPath](http://jmespath.org/) to the
[metadata](https://github.com/HumanCellAtlas/metadata-schema) documents contained in the bundle. The DSS should also
provide chronological, JMESPath-filterable event replay.

### User Stories

* As a new DSS subscriber, I would like to replay the events prior to my subscription activation date.
* As an existing DSS subscriber, I would like to recover from protracted downtime of my service.
* As a DSS subscriber, I would like to test changes to my JMESPath subscriptions against specific bundles.

## Detailed Design
> **Review comment (Collaborator):** Similar comment to the one that I left in #101 - this Detailed Design appears to lack details.
>
> **Reply (Member Author):** For now I'll reference my comment to your comment in #101: #101 (comment)


The DSS will provide two new API endpoints:
1. `GET /events`, accepting `replica`, `from_date`, and `to_date`. This will return a paged listing of signed URLs
pointing to event journals.
1. `GET /event`, accepting `replica`, `bundle_uuid`, and `bundle_version`. This will return the JSON
document produced during the bundle event.
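
As an illustration of how a client might address the two endpoints, here is a minimal Python sketch that only builds the request URLs. The host `dss.example.org`, the `/v1` prefix, the helper names, and the date and version string formats are assumptions for illustration; this RFC does not specify them, nor the shape of the paged response.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption); the RFC does not name a deployment host or API prefix.
DSS_BASE = "https://dss.example.org/v1"


def events_url(replica, from_date, to_date, base=DSS_BASE):
    """Build a GET /events URL listing event journals for a date range on one replica."""
    query = urlencode({"replica": replica, "from_date": from_date, "to_date": to_date})
    return f"{base}/events?{query}"


def event_url(replica, bundle_uuid, bundle_version, base=DSS_BASE):
    """Build a GET /event URL for the event document of a single bundle version."""
    query = urlencode({
        "replica": replica,
        "bundle_uuid": bundle_uuid,
        "bundle_version": bundle_version,
    })
    return f"{base}/event?{query}"


if __name__ == "__main__":
    # Hypothetical parameter values, shown only to illustrate the call shape.
    print(events_url("aws", "2019-01-01", "2019-02-01"))
    print(event_url("aws", "00000000-0000-0000-0000-000000000000", "2019-01-01T000000.000000Z"))
```

A subscriber recovering from downtime would presumably page through the `GET /events` listing, download each journal from its signed URL, and process the events in order.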

The event data for a single event will be the JSON metadata document currently produced for the DSS JMESPath event
subscription service.

Due to the (currently unbounded) size of the JSON metadata document, new events will be stored directly on object storage
as single objects. An offline daemon will compile and compress events into journals, as needed. Event history will be
maintained indefinitely.
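
One possible shape for the journal compilation step is sketched below in Python. The journal format used here (gzip-compressed, newline-delimited JSON) and the function names are assumptions, since this RFC does not define a journal format. The caller is assumed to pass events already in chronological order.

```python
import gzip
import json


def compile_journal(events):
    """Pack event documents into one gzip-compressed, newline-delimited JSON blob.

    `events` is an iterable of JSON-serializable documents, assumed to be in
    chronological order. json.dumps escapes embedded newlines, so each event
    occupies exactly one line.
    """
    lines = [json.dumps(event, sort_keys=True) for event in events]
    return gzip.compress("\n".join(lines).encode("utf-8"))


def read_journal(blob):
    """Inverse of compile_journal: return the list of event documents in order."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.split("\n") if line]


if __name__ == "__main__":
    # Hypothetical event documents; real events carry the bundle metadata document.
    journal = compile_journal([
        {"bundle_uuid": "a", "event_type": "CREATE"},
        {"bundle_uuid": "a", "event_type": "TOMBSTONE"},
    ])
    print(len(read_journal(journal)))
```

Storing one event per line keeps replay simple: a reader can stream a journal and apply a JMESPath filter to each document without loading the whole journal into memory.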

### Unresolved Questions