How do we record cricksaw data? #71

Open
ablot opened this issue Jan 20, 2022 · 3 comments

Comments

@ablot
Member

ablot commented Jan 20, 2022

The cricksaw "raw" data coming out of the microscope can contain:

  • unstitched raw data (ideally should not be kept on CAMP, at least not uncompressed)
  • stitched data at 100% with a folder per channel
  • potentially stitched data at lower resolution (for instance 50% of a 1x1 um acquisition, to do cell detection at 2x2 um)
  • a downsampled data folder with isometric tiffs (10, 25 and/or 50 um voxels)

What do we add on flexilims? A single record for the main brainsaw folder? Or multiple datasets, one for each channel/resolution/downsampled data?

@znamensk
Member

Might make sense to keep the raw data as a separate entry for archiving purposes?

@ablot
Member Author

ablot commented May 12, 2023

The best might be to not have any flexilims entries for the raw data. Often multiple brains are processed at once and end up compressed and archived together (if the non-stitched data is kept at all).

For stitched data, one option is to not have anything specific in flexiznam but instead manually create an adapted entity when needed.
An example solution for both cricksaw and cellfinder datasets is here: https://github.com/znamlab/cricksaw-analysis/blob/dev/cricksaw_analysis/cricksaw_to_flexilims.py

One output entity example is here:
https://flexylims.thecrick.org/flexilims/sample/show?sampleId=645e740b7ddb34517470c869

This has the advantage of keeping the very specific code that parses log files for relevant metadata out of flexiznam. It makes automatic discovery of cricksaw datasets using "from_folder" impossible, but I'm not sure that is a needed feature.
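
To make the "manual" option concrete, here is a rough sketch of the kind of helper that could live outside flexiznam. It is not the code from the linked script. The folder naming (`stitchedImages_<percent>` with one sub-folder per channel), the function name `describe_stitched_data` and the shape of the returned dict are all assumptions made for illustration, and the upload to flexilims is deliberately left out.

```python
from pathlib import Path
import re


def describe_stitched_data(root):
    """Collect stitched resolutions and their channels under `root`.

    Assumption: stitched data sits in folders named
    ``stitchedImages_<percent>`` with one sub-folder per channel.
    """
    root = Path(root)
    attributes = {"path": str(root), "stitched": {}}
    for folder in sorted(root.glob("stitchedImages_*")):
        match = re.fullmatch(r"stitchedImages_(\d+)", folder.name)
        if match is None or not folder.is_dir():
            continue  # ignore anything that does not follow the assumed naming
        channels = sorted(p.name for p in folder.iterdir() if p.is_dir())
        # key: resolution in percent, value: list of channel folder names
        attributes["stitched"][int(match.group(1))] = channels
    return attributes
```

The returned dict could then be attached as attributes to a single manually created entity, once someone has checked that the data look right.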

@ablot
Member Author

ablot commented May 12, 2023

And here is an example cellfinder dataset created by the other function:
https://flexylims.thecrick.org/flexilims/sample/show?sampleId=645d0d4c7ddb34517470c7e0

I think that manually adding them once the process has been checked (it did run and the results make sense) is probably more useful than batch detection with a yaml, like we do for 2p.

The alternative is to move these functions into new dataset subclasses. That would make the datasets autodetectable, but it might start to create too many subclasses for everything.
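
For comparison, the subclass route could look roughly like the sketch below. The `Dataset` base class here is a stand-in written for this example, not flexiznam's actual class, and `StitchedDataset`, its `dataset_type` value and the `stitchedImages_*` folder naming are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Dataset:
    """Stand-in base class for this sketch, NOT flexiznam's Dataset."""

    path: Path
    dataset_type: str = "generic"
    extra_attributes: dict = field(default_factory=dict)


@dataclass
class StitchedDataset(Dataset):
    """Hypothetical subclass: one dataset per stitched resolution folder."""

    dataset_type: str = "cricksaw_stitched"

    @classmethod
    def from_folder(cls, folder):
        """Return one dataset per ``stitchedImages_*`` sub-folder (assumed name)."""
        found = []
        for sub in sorted(Path(folder).glob("stitchedImages_*")):
            if sub.is_dir():
                channels = sorted(p.name for p in sub.iterdir() if p.is_dir())
                found.append(cls(path=sub, extra_attributes={"channels": channels}))
        return found
```

Each extra data type (cellfinder output, downsampled stacks, ...) would need its own subclass with its own `from_folder`, which is where the "too many subclasses" worry comes from.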

Any opinion @znamensk ?
