Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/cttso icav2 step function branch #145

Merged
merged 17 commits into from
May 7, 2024

Conversation

alexiswl
Copy link
Member

@alexiswl alexiswl commented Mar 10, 2024

Summary

This AWS step functions performs the following actions

  • Copies a set of fastq files to a structured location (via the icav2 copy batch utility)
  • Uploads a samplesheet to the same structured location
  • Invokes the ctTSO v2 pipeline calling the structured location as the run folder

Dependent PRs

TODO List

  • Implement stateful database to track analysis IDs and runs
  • sfn to handle 'state-change' in analysis run (and update Database)
  • External event handling on state change
  • Raise internal event on terminal state change
  • Fix Technical tags, should have it's own statefunctions steps

@alexiswl alexiswl added the feature New feature label Mar 10, 2024
@alexiswl alexiswl self-assigned this Mar 10, 2024
@alexiswl alexiswl linked an issue Mar 10, 2024 that may be closed by this pull request
Copy link
Member

@victorskl victorskl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, too.

Yup, same, same. Needa integrate with outer CDK and, perhaps you can break it down pending TODOs as another follow up PR of the same issue implementation story. Leave it with you.!

Copy link
Member

@reisingerf reisingerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor questions, nothing that should stop/delay this.

@alexiswl
Copy link
Member Author

Minor questions, nothing that should stop/delay this.

Yep I'd like to pivot to bclconvert-interop-qc which is a tiny pipeline service to be triggered after the successful completion of the bssh fastq copy manager service

The interop qc workflow is shown here - https://github.com/umccr/cwl-ica/releases/tag/bclconvert-interop-qc%2F1.3.1--1.19__202403122095003

This will mean I can focus on (without accidentally deploying 100 cttso v2 workflows)

  1. Event handling
  2. Database handling, updating states, entries etc.

We can then move this logic into here (#145) as well.

@victorskl victorskl self-requested a review April 18, 2024 21:32
Copy link
Member

@victorskl victorskl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for update, Alexis. I will go through the README, today.

(And, I will come back to this TSOv2 branch once basespace upload service branch #186 review done in tick.)

@victorskl victorskl marked this pull request as draft April 19, 2024 00:15
@victorskl
Copy link
Member

Put this branch into draft mode until we finalise

@alexiswl
Copy link
Member Author

WIll prioritise this one over today and tomorrow, shouldn't take more than an hour to get at least the stateless architecture of this PR up to scratch on being part of the Orcabus pipeline

@alexiswl alexiswl force-pushed the feature/cttso-icav2-step-function-branch branch from d9cf223 to 621c63d Compare April 29, 2024 00:21
@alexiswl alexiswl marked this pull request as ready for review April 29, 2024 00:23
@victorskl
Copy link
Member

victorskl commented Apr 29, 2024

.. eta; will review tomorrow (30/04/24)

@victorskl
Copy link
Member

reviewing

Copy link
Member

@victorskl victorskl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with review notes:

  • add 1 note
  • skip BsshIcav2FastqCopyManagerStack (already reviewed Feature/bssh icav2 fastq copy manager #144)
  • skip TypeScript case (already discussed)
  • skip Lambda unit test for later (already discussed)
  • check to cdk synth success
  • heads-up to cdk-nag fail with AwsSolutions-IAM5

The AWS Step functions takes in the following parameters:

* project_id
* sample_id (should match the sample_id in the samplesheet, ideally the lab metadata library id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No action need for this comment. At least for now.)

Just for my note; I said what I need to note here as reviewer. I wanted to say something about this. But. I am torn at the same time. Thinking about argument that invites as side-effect after 3/4 steps. 😭

Perhaps, we just go by documentation that the notion of sample_id is context specific. Such that in which context we are referring this ID to...

This makes me; a quick go and glimpse at our MetadataManager and, feel settle/calm that there are no traces of sample_id notion there, yet.

TL;DR sample_id notion is context specific. It only gets meaningful/complete by its context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, sample_id will match the library id, for v2 samplesheets we strip out everything except the library id, including _topup and _rerun so sample_id and library_id will be synonymous, however, sample_id is the parameter name into the cttso library, can update this to library_id if you want here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, the upstream tso v2 pipeline analysis input would keep it as sample_id at icav2 endpoint, right? This pipeline is not in our control. (sign) If folks happen to look at icav2 endpoint directly, this analysis input would still be confused those users, I think. (Hence, reason I am torn)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having said, we can also make a call to harmonise and consistent among ourselves at our system interfaces (especially those system entry point expose to user). i.e. We use library_id throughout our schema and input json formats.

Each services would translate as such mapping need as internal logic when appropriate.

Perhaps, we can establish this dev SOP/guideline, I reckon. What you think.?

detailType: ['WorkflowRunStateChange'],
/*
FIXME - nothing is set in stone yet
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, all good. Thanks for experimenting here. This is valuable explore.

@alexiswl alexiswl force-pushed the feature/cttso-icav2-step-function-branch branch 3 times, most recently from 27fe524 to c70d94c Compare May 1, 2024 22:48
@victorskl
Copy link
Member

Would be good to merge this first, before we are following up on WRSC schema alignment, Alexis.

@alexiswl
Copy link
Member Author

alexiswl commented May 7, 2024

Already need to resolve some conflicts again.

Will try align the stack as best I can with respect to the workflowRunStateChange schema and will merge from there.

@alexiswl alexiswl force-pushed the feature/cttso-icav2-step-function-branch branch from c70d94c to 13a53fd Compare May 7, 2024 10:43
@alexiswl alexiswl merged commit 99d6e36 into main May 7, 2024
2 checks passed
@alexiswl alexiswl deleted the feature/cttso-icav2-step-function-branch branch May 7, 2024 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement TSO v2 ILMN pipeline orchestration - baseline
3 participants