feat(data-migrate): data mover task #726

Merged
merged 3 commits into from
Nov 29, 2024

Conversation

mmalenic
Member

@mmalenic mmalenic commented Nov 28, 2024

Implements the first task from #721.

Changes

  • Adds a data mover Fargate task which can be triggered by a step function.
    • Opted for the simpler approach of running aws s3 sync twice, followed by aws s3 rm, to perform the move.
  • The permissions required for the bucket being read from and the bucket being written to are potentially separate.
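
The move described above could be sketched roughly as follows. This is an illustration only, not the task's actual code: the function names `build_move_commands` and `move` are hypothetical, and it assumes the second `aws s3 sync` serves as a catch-up pass for anything the first pass missed before the source is deleted.

```python
import subprocess


def build_move_commands(source: str, destination: str) -> list[list[str]]:
    """Return the AWS CLI invocations for a sync, sync, rm style move.

    `source` and `destination` are S3 URIs. Names and structure here are
    assumptions for illustration, not the merged implementation.
    """
    sync = ["aws", "s3", "sync", source, destination]
    return [
        sync,                                        # first pass: copy everything
        list(sync),                                  # second pass: copy anything missed
        ["aws", "s3", "rm", source, "--recursive"],  # finally remove the source
    ]


def move(source: str, destination: str, dry_run: bool = True) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in build_move_commands(source, destination):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises on a non-zero exit code, so the rm step
            # never runs if either sync fails.
            subprocess.run(command, check=True)
```

Calling `move(source, destination)` with the default `dry_run=True` only prints the three commands, which makes it easy to review a batch before anything is deleted.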

Related to infrastructure PR umccr/infrastructure#508

Todo

  • Consume an ArchiveData event to trigger the step function and publish an ArchiveComplete event back to the event bus.

@mmalenic mmalenic self-assigned this Nov 28, 2024
@mmalenic mmalenic added feature New feature platform labels Nov 28, 2024
@mmalenic mmalenic linked an issue Nov 28, 2024 that may be closed by this pull request
2 tasks
Member

@victorskl victorskl left a comment

LGTM

@alexiswl
Member

Once you're ready, @mmalenic, these folders are the first batch that needs archiving:

s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210215_A01052_0032_AHVJVVDSXY/20241122b5d7dc67/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210301_A01052_0035_BHT7C2DSXY/20241122bc56a405/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210305_A01052_0036_AHT7CFDSXY/20241122bde0dd86/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210325_A01052_0038_AHYGWMDSXY/20241122e44acacb/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210331_A01052_0040_AH32HFDSX2/2024112235796bc3/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210331_A01052_0041_BHYMHFDSXY/20241123953d6380/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210416_A01052_0042_BH2YK2DSX2/202411223002b111/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210422_A01052_0045_AH2YNVDSX2/20241122a2dded68/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210506_A01052_0049_BH57YLDSX2/2024112257d54306/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210610_A00130_0160_AH57L5DSX2/20241122c0721f83/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210624_A01052_0052_AH7KFMDSX2/202411228dfd9db7/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210628_A00130_0163_BH7KTMDSX2/2024112212e4a4d3/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210701_A01052_0054_BH7KH7DSX2/202411220863c972/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210702_A00130_0164_AH7KVHDSX2/2024112208b80b65/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210702_A00130_0165_BH7KFWDSX2/2024112250d0164f/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210708_A00130_0166_AH7KTJDSX2/202411220b409482/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210722_A01052_0056_AHGJT7DSX2/20241122b6ae11b0/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210729_A00130_0167_BHGJWLDSX2/2024112227dd5a18/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210805_A01052_0057_BHGKM2DSX2/20241122544f5f7d/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210820_A01052_0058_AHGJM3DSX2/202411221fca6a3e/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210830_A00130_0168_AHGKVWDSX2/20241122266dd762/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210830_A00130_0169_BHGKLNDSX2/20241122a2796c4f/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210903_A00130_0170_AHGKJ7DSX2/20241122bd8d6e76/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210909_A00130_0171_BHGKN7DSX2/202411223c6cd5f8/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210923_A00130_0175_AHGKJNDSX2/20241122a784e0c2/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/210923_A00130_0176_BHH5JFDSX2/2024112221b1895d/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211007_A00130_0177_BHLGGCDSX2/20241122ad872c7c/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211007_A00130_0178_AHLFVWDSX2/20241122ef529091/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211007_A01052_0059_AHLGG3DSX2/202411228efae78e/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211007_A01052_0060_BHLGFYDSX2/20241122bb266761/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211014_A00130_0179_AHLFYJDSX2/2024112235b955b8/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211014_A00130_0180_BHLGF7DSX2/202411222ecbacaa/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211021_A01052_0061_BHLH5VDSX2/2024112214c7b8fe/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211101_A01052_0062_AHLG2LDSX2/202411223f5d8266/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211104_A00130_0181_AHWC25DSX2/2024112280c59663/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211119_A00130_0183_AHWCGCDSX2/20241122537097ce/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211119_A00130_0184_BHWCLMDSX2/2024112283399530/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211125_A00130_0185_AHWC2HDSX2/202411224414cf7c/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211125_A00130_0186_BHWF3MDSX2/20241122d859625a/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211129_A00130_0187_AHWMHWDSX2/20241122760cd166/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211129_A00130_0188_BHWCY3DSX2/202411227b748e05/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211202_A00130_0189_BHWLGFDSX2/2024112284ee7ecd/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211202_A00130_0190_AHWKTKDSX2/202411222dbc6204/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211206_A00130_0191_AHWKNJDSX2/20241122f7b36ffa/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211206_A00130_0192_BH2WT3DMXY/2024112295e7e026/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211210_A00130_0193_BHWLKNDSX2/202411221ec382b5/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211210_A00130_0194_AHWLHHDSX2/2024112269f3473e/
s3://pipeline-prod-cache-503977275616-ap-southeast-2/byob-icav2/production/ora-compression/211220_A00130_0195_BHWKCCDSX2/2024112276ee8cda/

@mmalenic
Member Author

Sure! I'll need someone to help me set up umccr/infrastructure#508 first.

@victorskl
Member

victorskl commented Nov 29, 2024

Alright, let's do this before the end of the week. @reisingerf @alexiswl, please approve.
Marko, I will apply infra 508 after this is merged.

@alexiswl
Member

Alright, let's do this before the end of the week.

Ambitious. I think Marko still wanted to put an event notification at the end of the job.

@mmalenic
Member Author

Happy to do it without event notification, as it's not necessary for now?

@victorskl
Member

without event notification

Huh, that's what I thought, as Flo told me last Wednesday, i.e. the event parts are for later...

I thought you guys caught up yesterday in the office.

Happy either way.

@mmalenic
Member Author

Happy to progress tonight or next week, will merge after approval.

@mmalenic mmalenic merged commit 41efd8c into main Nov 29, 2024
6 checks passed
@mmalenic mmalenic deleted the feat/data-mover branch November 29, 2024 05:40
@reisingerf
Member

I don't think we need the event part for now.

The first goal is to get the current BYOB data into the archive asap.
It's scriptable, which is a good enough fit for the one-time migration and the batched / reviewed nature of the current procedure.

Triggering via events may be useful for future automation.

I am not sure about exit / completion events....
If it's using step functions, we have a notification rule set up already (to report failures to Slack). In the longer term it would be good to have a status / monitor for non-workflow jobs (along the lines of the status changes in the workflow manager), but I think this is still a while away. I am not sure if we should (mis-)use the WorkflowManager for that or build something else.
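
For context, a failure notification rule of this kind is typically driven by an EventBridge event pattern on Step Functions execution status changes. The sketch below shows such a pattern as a Python dict, together with a small local check. It is an assumption about the general shape, not the rule actually deployed here; `FAILURE_PATTERN` and `is_failure_event` are hypothetical names.

```python
# EventBridge pattern for Step Functions executions that did not succeed.
# The source, detail-type and status values are the ones AWS emits; which
# statuses the existing rule actually matches is an assumption.
FAILURE_PATTERN = {
    "source": ["aws.states"],
    "detail-type": ["Step Functions Execution Status Change"],
    "detail": {"status": ["FAILED", "TIMED_OUT", "ABORTED"]},
}


def is_failure_event(event: dict) -> bool:
    """Local check mirroring FAILURE_PATTERN, handy for unit tests."""
    return (
        event.get("source") in FAILURE_PATTERN["source"]
        and event.get("detail-type") in FAILURE_PATTERN["detail-type"]
        and event.get("detail", {}).get("status")
        in FAILURE_PATTERN["detail"]["status"]
    )
```

Because the data mover runs under a step function, a rule like this would cover its failures without any mover-specific completion events.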

@victorskl
Member

@mmalenic Marko, please leave the move with us. Flo and I will script it around your step function and trigger the executions. We'll keep you posted.
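
Scripting around the step function might look roughly like the sketch below, which turns a list of S3 folder URIs into one JSON input per execution. It is illustrative only: the helper names, the `source` / `destination` input fields, and the idea of keeping the same key prefix in the destination bucket are all assumptions, not the step function's actual contract.

```python
import json
from urllib.parse import urlsplit


def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key/prefix/' into (bucket, 'key/prefix/')."""
    parts = urlsplit(uri.strip())
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


def build_execution_inputs(uris: list[str], destination_bucket: str) -> list[str]:
    """Build one JSON input string per folder to move.

    The field names below are hypothetical; adjust them to whatever the
    state machine actually expects.
    """
    inputs = []
    for uri in uris:
        if not uri.strip():
            continue  # skip blank lines in a batch file
        bucket, prefix = parse_s3_uri(uri)
        inputs.append(
            json.dumps(
                {
                    "source": f"s3://{bucket}/{prefix}",
                    "destination": f"s3://{destination_bucket}/{prefix}",
                }
            )
        )
    return inputs
```

Each resulting string could then be passed as the `input` of a Step Functions `StartExecution` call (for example via boto3's `start_execution`), one execution per folder, which keeps the batch easy to review before it is triggered.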

@reisingerf
Member

I'd run something like this.
Happy to run it next week if we all agree on it.

ora-data-move-batch1.txt

@victorskl
Member

The build takes a while. Let's run it next week.

Successfully merging this pull request may close these issues.

Add data mover service
4 participants