Image transcriptomics support #327

Closed
sampierson opened this issue Mar 13, 2019 · 1 comment

@sampierson
Member

This is a heads-up as to what is coming in the future.

At GA, the DCP will likely demonstrate a proof of concept for image-based transcriptomics using the SpaceTx-Allen dataset, which is 600 GB in size.

Validation

We still need to confirm whether a validator has been written for image files. If there is one, note that Upload currently has a size limit of 1 TB for file validation: the aggregate size of all files listed in a single validation request cannot exceed 1 TB.

Aside: we need to look at the number of validation requests assigned to each batch worker. We could easily blow the 1TB limit if several validation requests are assigned to the same server.
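
To make the concern in the aside concrete, here is a minimal sketch of a capacity check that could run before assigning another validation request to a worker. Everything in it is hypothetical: `ValidationRequest`, `fits_on_worker`, and the decimal interpretation of the 1 TB limit are illustration-only assumptions, not names or values taken from the Upload codebase.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the 1 TB limit is treated as 10**12 bytes for illustration.
SCRATCH_LIMIT_BYTES = 10**12


@dataclass
class ValidationRequest:
    """Hypothetical model of one validation request and the files it lists."""
    request_id: str
    file_sizes: List[int]  # size of each listed file, in bytes

    @property
    def total_bytes(self) -> int:
        # Aggregate size of all files listed in this single request.
        return sum(self.file_sizes)


def fits_on_worker(assigned: List[ValidationRequest],
                   new_request: ValidationRequest,
                   limit: int = SCRATCH_LIMIT_BYTES) -> bool:
    """Return True if adding new_request keeps the worker within the limit."""
    used = sum(r.total_bytes for r in assigned)
    return used + new_request.total_bytes <= limit


if __name__ == "__main__":
    gb = 10**9
    spacetx = ValidationRequest("spacetx-allen", [600 * gb])
    other = ValidationRequest("other", [500 * gb])
    # Each request fits alone, but not both on the same worker.
    print(fits_on_worker([], spacetx))
    print(fits_on_worker([spacetx], other))
```

The point of the sketch is that the per-request limit alone is not enough: two requests that each pass the 1 TB check can still exceed the volume together if they land on the same server.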

If there is a validator written for images, Upload, as currently designed, can support the SpaceTx-Allen dataset.

However, the imaging team also tells us: "Imaging datasets are estimated to be up to 50 TB in size, based on literature and existing example datasets." Upload cannot validate images of that size at this time. Off the top of my head, there are several tactics we could use to overcome this limitation:

  • Increase the size of the volume attached to the validator AMI to 50 TB - unappealing.
  • Allocate, attach, format and mount volume more dynamically (see Volume Management: Evaluate BatchIt #25).
  • Modify the system to accommodate streaming validators.
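
As a rough illustration of the streaming option in the last bullet, the sketch below validates a file by reading it in fixed-size chunks, so the worker never needs local storage for the whole file. It is an assumption-laden example rather than a description of how Upload validators work: `iter_chunks`, `stream_validate`, the chunk size, and the choice of a SHA-256 checksum as the "validation" are all hypothetical.

```python
import hashlib
import io
from typing import BinaryIO, Iterator, Tuple

# Assumption: 8 MiB chunks; the right size would depend on the data source.
CHUNK_SIZE = 8 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def stream_validate(stream: BinaryIO, expected_sha256: str,
                    chunk_size: int = CHUNK_SIZE) -> Tuple[bool, int]:
    """Check a stream against an expected SHA-256 without staging it on disk.

    Returns (checksum_matches, total_bytes_read). Only one chunk is held
    in memory at a time.
    """
    digest = hashlib.sha256()
    total = 0
    for chunk in iter_chunks(stream, chunk_size):
        digest.update(chunk)
        total += len(chunk)
    return digest.hexdigest() == expected_sha256, total


if __name__ == "__main__":
    data = b"example image bytes" * 1000
    expected = hashlib.sha256(data).hexdigest()
    # io.BytesIO stands in for a remote object being streamed.
    print(stream_validate(io.BytesIO(data), expected, chunk_size=4096))
```

A validator written this way needs memory proportional to the chunk size rather than the dataset size, which is what would remove the dependency on a 50 TB volume. It only helps for checks that can be computed in a single pass over the data.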
@parthshahva
Contributor

@sampierson As it stands, no work is required for the current approach to ingest/upload SpaceTx data. Closing this ticket; a new ticket will be created if this changes.
