CAREamics as a community partner #100

melisande-c · 2024-10-17T15:35:18Z

We would like to add CAREamics as a community partner to the bioimage model zoo!

About

CAREamics is a PyTorch library aimed at simplifying the use of Noise2Void and its many variants and cousins (CARE, Noise2Noise, N2V2, P(P)N2V, HDN, muSplit etc.).

Resources

CAREamics has functions to export to and load from bioimage model zoo archive files, which lets users easily package models for upload to the zoo.

Maintenance

CAREamics has a permanent engineering team, and we are committed to staying compatible with the bioimage model zoo.

Links

CAREamics organisation: https://github.com/CAREamics
CAREamics source code: https://github.com/CAREamics/careamics

FynnBe · 2024-11-08T20:44:39Z

Let's get CAREamics onboard!

Here you can find details on the technical steps of how to become a community partner (I will help you complete these steps):
https://github.com/bioimage-io/collection?tab=readme-ov-file#add-community-partner

Add your info to the bioimageio_collection_config.json, analogous to this example
Add a compatibility check script
Add a compatibility check workflow

melisande-c · 2024-11-12T11:20:04Z

Hi @FynnBe, thanks for getting back to this, I will make the PR adding the CAREamics info to the json file shortly.

For the compatibility script, is there a way to test it is working as expected before we make a PR? I gather we just need to save a compatibility report file using the CompatibilityReport class, is there anything else we need to do?

FynnBe · 2024-11-12T12:21:21Z

No, that's pretty much it. You can also use the CompatibilityReport(TypedDict) as a typed dict in your code if you have issues with the dependencies of the collection_backoffice.
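A minimal sketch of the typed-dict route, for anyone who wants to avoid the collection_backoffice dependencies. The field names in `ReportSketch`, the `write_report` helper, and the file naming scheme are all assumptions made for illustration; check the real CompatibilityReport class for the actual fields before relying on this.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, List, Literal, Optional, TypedDict


class ReportSketch(TypedDict):
    """Hypothetical stand-in for the report; field names are assumptions."""

    tool: str
    status: Literal["passed", "failed", "not-applicable"]
    error: Optional[str]
    details: Any
    links: List[str]


def write_report(report: ReportSketch, output_folder: Path, resource_id: str) -> Path:
    """Write one report as JSON. The file naming scheme is illustrative only."""
    output_folder.mkdir(parents=True, exist_ok=True)
    path = output_folder / f"{resource_id}_{report['tool']}.json"
    path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return path


# Demo: write a report to a temporary folder and read it back.
demo_report: ReportSketch = {
    "tool": "careamics_0.0.0",  # placeholder version string
    "status": "passed",
    "error": None,
    "details": None,
    "links": [],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = write_report(demo_report, Path(tmp) / "reports", "example-model")
    demo_name = demo_path.name
    demo_content = json.loads(demo_path.read_text(encoding="utf-8"))
```

Because a TypedDict is a plain dict at runtime, the report serialises with `json.dumps` directly and needs no extra dependencies.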

FynnBe · 2024-11-12T13:04:40Z

Small update:
I have refactored the ilastik example to provide script_utils.check_tool_compatibility:

collection/scripts/script_utils.py

Lines 48 to 72 in 5327dac

    
    def check_tool_compatibility(
        tool_name: str,
        tool_version: str,
        *,
        all_version_path: Path,
        output_folder: Path,
        check_tool_compatibility_impl: Callable[
            [str, str], Union[CompatibilityReportDict, "CompatibilityReport"]
        ],
        applicable_types: Set[str],
    ):
        """helper to implement tool compatibility checks

        Args:
            tool_name: name of the tool (without version), e.g. "ilastik"
            tool_version: version of the tool, e.g. "1.4"
            all_versions_path: Path to the `all_versions.json` file.
            output_folder: Folder to write compatibility reports to.
            check_tool_compatibility_impl:
                Function accepting two positional arguments:
                URL to an rdf.yaml, SHA-256 of that rdf.yaml.
                And returning a compatibility report.
            applicable_types: Set of resource types
                **check_tool_compatibility_impl** is applicable to.
        """

This simplifies the compatibility script a partner needs to write (see the ilastik example): now little more than an analogue of

collection/scripts/check_compatibility_ilastik.py

Lines 16 to 25 in 5327dac

    
    def check_compatibility_ilastik_impl(
        rdf_url: str,
        sha256: str,
    ) -> CompatibilityReportDict:
        """Create a `CompatibilityReport` for a resource description.

        Args:
            rdf_url: URL to the rdf.yaml file
            sha256: SHA-256 value of **rdf_url** content
        """

is needed to implement the compatibility check.

Hope this makes things easier now and more maintainable in the future!
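For orientation, a partner-side callback with the same two-positional-argument shape might look roughly like the sketch below. Everything here is an assumption: the function name, the `validate` hook, and the report keys are hypothetical, and the real report fields should be taken from CompatibilityReportDict.

```python
from functools import partial
from typing import Any, Callable, Dict


def check_compatibility_careamics_impl(
    rdf_url: str,
    sha256: str,
    *,
    validate: Callable[[str], None],
    tool: str = "careamics",
) -> Dict[str, Any]:
    """Run `validate` on the resource and turn the outcome into a report dict.

    `sha256` is accepted to match the expected signature; this sketch does not use it.
    The report keys below are assumed, not taken from the real report class.
    """
    try:
        validate(rdf_url)  # hypothetical hook: raises if the resource is incompatible
    except Exception as e:
        return {"tool": tool, "status": "failed", "error": str(e), "details": None, "links": []}
    return {"tool": tool, "status": "passed", "error": None, "details": None, "links": []}


def _always_ok(rdf_url: str) -> None:
    """Stand-in validator that accepts everything."""


def _always_broken(rdf_url: str) -> None:
    """Stand-in validator that rejects everything."""
    raise ValueError("config could not be parsed")


# Bind the keyword-only hook so the result takes exactly (rdf_url, sha256).
impl_ok = partial(check_compatibility_careamics_impl, validate=_always_ok)
impl_bad = partial(check_compatibility_careamics_impl, validate=_always_broken)

ok_report = impl_ok("https://example.com/rdf.yaml", "0" * 64)
bad_report = impl_bad("https://example.com/rdf.yaml", "0" * 64)
```

Catching the exception inside the callback means one broken resource produces a failed report instead of aborting the whole run.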

melisande-c · 2024-11-12T14:10:53Z

CAREamics will need to check that a CAREamics config.yaml is also included and able to instantiate our pydantic classes; to save me looking through source code, what is the best way to retrieve the url for this file?

Additional question: should we also check the model is loadable (i.e. has compatible architecture), or will downloading model weights be too costly/time consuming?

FynnBe · 2024-11-12T16:18:27Z

I suppose your models add this config.yaml using the attachments field then?
And there is some additional information in the rdf.yaml under config.careamics to indicate its presence?
Either way you probably want to go through a ModelDescr object (returned by bioimageio.spec.load_model_description(rdf_url)).

You should use bioimageio.spec to download the models (e.g. by simply loading them with model = bioimageio.spec.load_model_description(rdf_url)). This way all files will be cached.

If you want to only deal with v0_5.ModelDescr you can simply check the model.format_version attribute.
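A rough sketch of that version gate. It assumes `format_version` is a string such as "0.5.3", and it uses stand-in objects so it runs without bioimageio.spec installed; in a real script `model` would come from `bioimageio.spec.load_model_description(rdf_url)`. The `applicability` helper and its return values are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def applicability(model: Any) -> str:
    """Return "applicable" for 0.5.x model descriptions, else "not-applicable"."""
    version = str(getattr(model, "format_version", ""))
    return "applicable" if version.startswith("0.5.") else "not-applicable"


# Stand-ins for loaded model descriptions (assumed attribute layout).
new_model = SimpleNamespace(format_version="0.5.3")
old_model = SimpleNamespace(format_version="0.4.10")

new_result = applicability(new_model)
old_result = applicability(old_model)
```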

melisande-c · 2024-11-12T16:48:03Z

> I suppose your models add this config.yaml using the attachments field then?
> And there is some additional information in the rdf.yaml under config.careamics to indicate its presence?

Yep it is added in attachments field, but there is no additional info in the rdf file.

> You should use bioimageio.spec to download the models (e.g. by simply loading them with model = bioimageio.spec.load_model_description(rdf_url)). This way all files will be cached.

So this means, in regards to my previous question, I will have access to the model weights and so I might as well check that the model architecture is compatible?

FynnBe · 2024-11-12T17:03:32Z

> Yep it is added in attachments field, but there is no additional info in the rdf file.

hmm.. config.yaml is not a very unique name. This might lead to confusion. Maybe you could consider renaming this file (in the context of model descriptions) and/or referencing the careamics_config.yaml under config.careamics to know what file to look for (then you could name it arbitrarily).
If these files are not hundreds of lines long you could also just insert it into the rdf.yaml at config.careamics.

> So this means, in regards to my previous question, I will have access to the model weights and so I might as well check that the model architecture is compatible?

yes, you should in fact. Ideally even run one training iteration (not epoch) and an inference test. CI only has CPU, but the time limit is pretty generous and we could ensure not to test everything at once if this becomes a bottleneck.

melisande-c · 2024-11-12T17:39:18Z

Our config files can be ~60 lines long. We already have 3 models uploaded with a separate configuration file, and I would rather not check for both cases, so if we change how we export to BMZ I would like to update these existing models.

In case we do not insert the CAREamics config into the rdf.yaml file, what extra info needs to be added under config.careamics? The file name is already included in the attachments section.

melisande-c · 2024-11-12T18:16:10Z

For developing the script, I would like to test locally, how can I get access to an example rdf_url? (from one of the uploaded CAREamics models).

FynnBe · 2024-11-12T18:16:13Z

> Our config files can be ~60 lines long. We already have 3 models uploaded with a separate configuration file, and I would rather not check for both cases, so if we change how we export to BMZ I would like to update these existing models.

yeah, updating 3 models isn't a big deal 👍

> In case we do not insert the CAREamics config into the rdf.yaml file, what extra info needs to be added under config.careamics? The file name is already included in the attachments section.

short answer: nothing.
long answer: nothing if you make the file name careamics specific. if you do not then you rely on no other tool ever attaching a config.yaml. In addition I find the config.yaml name confusing as we have the config field inside the rdf.yaml already... Therefore I suggest to go with some version of careamics_config.yaml. Then there is no need to specify that the ubiquitous config.yaml is a careamics config file under config.careamics.
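With a tool-specific file name, locating the config among the attachments reduces to a name match. The sketch below works on plain source strings (URLs or relative paths) so it does not depend on how attachments are represented in a loaded model description. The exact name `careamics_config.yaml` and the helper `find_careamics_config` are assumptions for illustration.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional
from urllib.parse import urlparse

CONFIG_NAME = "careamics_config.yaml"  # assumed name, not finalised in this thread


def find_careamics_config(sources: Iterable[str], name: str = CONFIG_NAME) -> Optional[str]:
    """Return the first attachment source whose file name equals `name`, else None."""
    for src in sources:
        # urlparse strips any query string; for a bare relative path, .path is the path itself.
        if PurePosixPath(urlparse(src).path).name == name:
            return src
    return None


attachments = [
    "https://example.com/model/cover.png",
    "https://example.com/model/careamics_config.yaml?download=1",
]
found = find_careamics_config(attachments)
missing = find_careamics_config(["config.yaml", "README.md"])
```

A generic `config.yaml` is deliberately not matched here, which mirrors the point that the generic name cannot be attributed to a particular tool.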

FynnBe · 2024-11-12T18:17:39Z

> For developing the script, I would like to test locally, how can I get access to an example rdf_url? (from one of the uploaded CAREamics models).

hmm.. there are a few options. first to mind: search for the model id in https://uk1s3.embassy.ebi.ac.uk/public-datasets/bioimage.io/all_versions.json
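One way to do that search without knowing the file's schema is to walk the parsed JSON and collect every object that carries the model id as a value. The sketch below runs on a made-up in-memory sample; the keys in `sample` are invented and do not reflect the real layout of all_versions.json. For the real file, the parsed result of `json.load` on the downloaded content would be passed in instead.

```python
from typing import Any, Dict, Iterator


def find_entries(node: Any, model_id: str) -> Iterator[Dict[str, Any]]:
    """Yield every dict in a nested JSON structure with a string value equal to model_id."""
    if isinstance(node, dict):
        if any(isinstance(v, str) and v == model_id for v in node.values()):
            yield node
        for value in node.values():
            yield from find_entries(value, model_id)
    elif isinstance(node, list):
        for item in node:
            yield from find_entries(item, model_id)


# Made-up sample; key names are illustrative only.
sample = {
    "entries": [
        {"id": "some-model", "versions": [{"source": "https://example.com/some-model/rdf.yaml"}]},
        {"id": "other-model", "versions": []},
    ]
}
hits = list(find_entries(sample, "some-model"))
```

Printing the matching objects shows whatever URL and hash fields the file provides for that model, without hard-coding key names.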

melisande-c · 2024-11-20T15:15:14Z

Hi @FynnBe, now that I have the compatibility script written, could you give me a few pointers on the CI workflow config? I've had a look at the ilastik one, but it would be good to have an overview. So I obviously need to set up the environment, installing CAREamics and dependencies, then I need to generate the reports using my check_careamics_compatibility.py script and finally the reports should be uploaded using scripts/upload_reports.py and I can copy and paste the S3 environment variables?

FynnBe · 2024-11-23T19:16:37Z

> I can copy and paste the S3 environment variables

I don't know exactly what you mean here, but you don't need to do anything with the S3 env variables. They are already set up in this repo, so your workflow can just use them.

Essentially the careamics workflow can look pretty much exactly like the ilastik one. You can give it a go if you like, otherwise I can make a draft end of next week or so and have you fill in the details there.

melisande-c · 2024-11-25T08:31:50Z

Hi @FynnBe,

All I meant was I should copy how the upload-reports job is set up in the ilastik compatibility check workflow. But if you could get me started with a template that would be much appreciated !

Just to let you know we are working on a couple of things before we make a new release and update the models already in the bmz; the models are not going to pass the checks I wrote until this happens. So no rush 😄. (We are reviewing and updating the information in our generated README, and adding cover generation).

melisande-c mentioned this issue Nov 12, 2024

Feat(community partners): add CAREamics to json #109

Merged

melisande-c mentioned this issue Nov 12, 2024

For BMZ export: rename config.yml or include in rdf.yml CAREamics/careamics#269

Closed
