Update submodules to the latest changes #212

Merged
merged 8 commits into ilastik:main from update-dependencies on Oct 11, 2024

Conversation

thodkatz
Collaborator

@thodkatz thodkatz commented Jul 2, 2024

Hey!

This is a small PR that updates the submodules to the latest changes (`git submodule update --remote --merge`).
There were no conflicts, and the tests pass without any source code changes.

When I initially cloned the repo and ran the tests:

make devenv
mamba activate tiktorch-server-env
python -m pytest tests/

I have stumbled upon this issue pytorch/pytorch#123097.

To resolve this, I have added `- mkl!=2024.1` to the environment.yml file.
The same problem has occurred for ilastik too, and the pin is already listed there: https://github.com/ilastik/ilastik/blob/main/dev/environment-dev.yml#L72

I have also fixed a broken link :)

@thodkatz thodkatz force-pushed the update-dependencies branch from b72ccd3 to c589515 on July 2, 2024 17:19
@thodkatz
Collaborator Author

thodkatz commented Jul 2, 2024

I realised that I updated the submodules but missed the step of actually installing them with conda run -n $(TIKTORCH_ENV_NAME) pip install . ./vendor/core-bioimage-io-python ./vendor/spec-bioimage-io.
So this is work in progress :)

@thodkatz thodkatz changed the title Update submodules to the latest changes WIP: Update submodules to the latest changes Jul 2, 2024
@thodkatz
Collaborator Author

thodkatz commented Jul 9, 2024

[Sketch: new_tiktorch (Excalidraw diagram)]

I tried to create a sketch of what has changed. One of the issues I encountered while updating the spec and core bioimageio repos was the change from the v4 to the v5 spec, and specifically the redesign of the axis and shape concepts.

v4

Input tensors could have axes with an explicit size (a plain number), which yields an explicit shape, e.g. (1, 2, 3, 4), or a parameterized shape defined by a minimum shape and a step (ParameterizedShape).

Output tensors could likewise have axes with explicit sizes, or an implicit shape defined by a reference tensor, a scale, and an offset (ImplicitOutputShape).
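To make the arithmetic concrete, here is a minimal sketch of how v4 sizes resolve to numbers (plain Python, no bioimageio imports; the implicit-output formula follows my reading of the 0.4 spec, size = reference_size * scale + 2 * offset):

# Illustrative only: how the v4 shape concepts resolve to concrete sizes.

def parameterized_size(min_size: int, step: int, n: int) -> int:
    # ParameterizedShape: any size of the form min + n * step is valid.
    return min_size + n * step

def implicit_output_size(reference_size: int, scale: float, offset: int) -> int:
    # ImplicitOutputShape: the output size is derived from a reference tensor.
    return int(reference_size * scale) + 2 * offset

# An input axis with min=10, step=2 admits sizes 10, 12, 14, ...
print([parameterized_size(10, 2, n) for n in range(3)])              # [10, 12, 14]
# An output axis with scale=0.5 and offset=0 relative to a reference axis of size 12:
print(implicit_output_size(reference_size=12, scale=0.5, offset=0))  # 6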

v5

Instead of defining a parameterized shape for inputs or an implicit shape for outputs, this information is encapsulated per axis.

For example, a tensor can have 4 axes, and each axis can have a parameterized size (the same concept as ParameterizedShape), defined as ParameterizedSize, or a referenced size (the same concept as ImplicitOutputShape), defined as SizeReference.

Also, more axis types have been added that fall into neither the parameterized nor the referenced category, as shown in the sketch.
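For example, one axis list can mix both concepts (a minimal sketch based on the v0_5 objects used in the tests further down; identifiers such as "other_input" are placeholders):

from bioimageio.spec.model import v0_5
from bioimageio.spec.model.v0_5 import AxisId

# In v5 the size information lives on each axis of the tensor description.
axes = [
    v0_5.BatchAxis(),
    v0_5.ChannelAxis(channel_names=["channel1", "channel2"]),
    # parameterized size: valid sizes are 10, 12, 14, ... (min + n * step)
    v0_5.SpaceInputAxis(id=AxisId("x"), size=v0_5.ParameterizedSize(min=10, step=2)),
    # referenced size: this axis must match axis "x" of the tensor "other_input"
    v0_5.SpaceInputAxis(id=AxisId("y"), size=v0_5.SizeReference(tensor_id="other_input", axis_id="x")),
]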

Issue

So an input tensor with one axis whose size is a ParameterizedSize and another whose size is a SizeReference cannot be mapped to either ParameterizedShape or ImplicitOutputShape, and this breaks the ModelInfo interface and the ModelSession that provide the abstraction between bioimageio and ilastik.

Conclusion

I am not sure whether I should proceed with a redesign of the abstraction layer, or how that abstraction should look. For example, could ModelInfo expose an interface with only explicit shapes, i.e. plain numbers, so that it doesn't have to deal with all the implications of how axis sizes and shapes are modelled by the spec? That way it would actually be an abstraction, instead of the coupled interface it has now.
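To illustrate the idea, a purely hypothetical sketch (the names and fields are made up and not part of any existing interface):

from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class TensorInfo:
    name: str
    axes: Tuple[str, ...]   # e.g. ("batch", "channel", "x", "y")
    shape: Tuple[int, ...]  # already-resolved, explicit sizes only

@dataclass(frozen=True)
class ExplicitModelInfo:
    inputs: List[TensorInfo]
    outputs: List[TensorInfo]

# All spec-specific logic (ParameterizedSize, SizeReference, ...) would be resolved
# once when building this object, so consumers such as ModelSession only see numbers.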

@thodkatz thodkatz marked this pull request as draft August 7, 2024 12:52
@thodkatz thodkatz changed the title WIP: Update submodules to the latest changes Update submodules to the latest changes Aug 7, 2024
@thodkatz thodkatz force-pushed the update-dependencies branch from c589515 to 629893b on August 26, 2024 23:39
@thodkatz thodkatz marked this pull request as ready for review August 26, 2024 23:40
@thodkatz
Collaborator Author

thodkatz commented Aug 26, 2024

As we saw in the PR attempting to implement the GPU prompting, reasoning on v4 of the spec is redundant and error-prone (#213 (comment)). Thus, I have finally updated tiktorch to work with spec v5.

  • The way we test has been refactored. Instead of relying on test files, we mock the dependencies with objects (see the sketch after this list). This lets us focus on the interface tiktorch actually needs, and it simplifies the process as well.
  • Use spec v5 objects, and replace the Sample defined in tiktorch with the bioimageio one.
  • Regarding testing the validation of shapes and axes mentioned in Check CUDA out of memory #213 (review): the validation functionality is exercised via the tests for the forward-pass use case (here). I am not sure we should have separate tests, since what we actually want is for service requests to fail if the input is invalid, so I think it should be exercised implicitly via the request inputs.
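A rough sketch of the mocking approach (adapted from the test code quoted further down; the actual fixtures differ):

from unittest.mock import create_autospec

from bioimageio.spec.model import v0_5
from bioimageio.spec.model.v0_5 import AxisId

# Mock a tensor description instead of loading a real model file.
mocked_input = create_autospec(v0_5.InputTensorDescr)
mocked_input.id = "input1"
mocked_input.axes = [
    v0_5.BatchAxis(),
    v0_5.ChannelAxis(channel_names=["channel1", "channel2"]),
    v0_5.SpaceInputAxis(id=AxisId("x"), size=10),
    v0_5.SpaceInputAxis(id=AxisId("y"), size=v0_5.ParameterizedSize(min=10, step=2)),
]

mocked_model = create_autospec(v0_5.ModelDescr)
mocked_model.inputs = [mocked_input]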

Collaborator

@k-dominik k-dominik left a comment


Hey @thodkatz, this is really great. Awesome how much you could also simplify the code with the update!

I really like that the models are now fully mocked. Probably speeds up the tests, too.
I wish we would retain maybe one "integration" test with a real model. For me this also serves documentation purposes -> what does realistic usage look like?

I'm currently investigating test failures that seem to be osx-only 🙄
Edit: the errors on osx are due to the different default start method for processes, spawn. It turns out mock objects cannot be pickled, resulting in `_pickle.PicklingError: args[0] from __newobj__ args has the wrong class` errors. Changing it to fork resolves this, but apparently fork is not safe to use on osx.
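A minimal repro of the pickling problem (sketch; the exact error message differs between mock types):

import multiprocessing as mp
from unittest.mock import MagicMock

def target(obj):
    print(obj)

if __name__ == "__main__":
    mp.set_start_method("spawn")  # the default on macOS and Windows
    p = mp.Process(target=target, args=(MagicMock(),))
    p.start()  # spawn pickles the args to send them to the child -> pickling error
    p.join()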

Makefile Outdated
Comment on lines 31 to 32
install: clean
	conda run -n $(TIKTORCH_ENV_NAME) pip install -e $(ALL_PACKAGES)
Collaborator

1. pip installs additional packages

Currently this will install a few additional packages via pip:

Successfully installed
  annotated-types-0.7.0
  bioimageio.core-0.6.7
  bioimageio.spec-0.5.3.post4
  distro-1.9.0
  dnspython-2.6.1
  email-validator-2.2.0
  fire-0.6.0
  loguru-0.7.2
  pooch-1.8.2
  pydantic-2.8.2
  pydantic-core-2.20.1
  pydantic-settings-2.4.0
  python-dotenv-1.0.1
  ruyaml-0.91.0
  termcolor-2.4.0
  tiktorch-23.11.0

the bioimageio ones and tiktorch are expected here of course, but having pip install additional packages is often not safe (in this instance it looks mostly benign); in general it is better to have the environment solved fully by conda. I would suggest adding these to environment.yml and double-checking that during dev-env creation pip does not install anything other than tiktorch and the vendored packages.

2. I don't think pip install -e with more than one path works as expected.

The local packages are installed, but only the first package is installed as editable; the rest end up in the site-packages folder and are not editable.

Collaborator Author

@thodkatz thodkatz Aug 29, 2024


  1. Yep, thanks for noticing :) I will add them to the environment.yml

2. I was running into this problem (pypa/pip#11467) when trying to use pip install -e <submodule path>; I couldn't find a workaround, so we use pip install <submodule path>. The issue with the latter is that it creates a build directory. If the build directory isn't cleaned, subsequent attempts to re-install new commits of the submodule won't be picked up. That's why I added the clean target and separated the install-submodules step as well. I am not sure if I am missing something.

Collaborator Author

@thodkatz thodkatz Aug 29, 2024


Btw, for the first one, is it good practice to declare indirect dependencies? All the packages installed by pip come from the dependencies of the bioimageio packages. I could update the conda env file to use:

- pip:
    - -e .

But I'm sorry, I actually didn't get the point of the first one.

But it is true that in setup.py we also have

        "grpcio>=1.31",
        "numpy",
        "protobuf",
        "pyyaml",
        "xarray",

which could certainly be part of the env if I am not mistaken; I am not sure why they were declared in setup.py in the first place.

.gitignore Outdated (resolved)
Makefile Outdated (resolved)
@@ -9,7 +9,7 @@
from typing import Any, List, Optional, Type, TypeVar
from uuid import uuid4

from bioimageio.core.resource_io import nodes
from bioimageio.spec.model import v0_5
Collaborator

phew... Not sure my take here is correct, but does this mean we limit ourselves to 0.5 models only? I don't think there are many 0.5 models in the wild currently. I'm not really against limiting tiktorch to these only, but it will have consequences for when we can switch to the new versions in ilastik (once the models in the zoo have been updated to 0.5). Since this PR removes the 0.4 functionality, it's obvious how much of a simplification it is to only support 0.5 :).

Collaborator Author

@thodkatz thodkatz Aug 28, 2024


That's a good point. But regarding the discussion we had for #215, one of the major points for doing that is that bioimageio is responsible for backwards compatibility, right? So in theory a v0_4 model could provide a (partial) v0_5 view, and we wouldn't have to maintain abstraction layers over bioimageio ourselves. I am not sure if I misunderstood something. Otherwise, we would again have to introduce something similar to ModelInfo to make v4 and v5 work with tiktorch.

Collaborator

@k-dominik k-dominik Aug 28, 2024


yes, with the library one could get a 0.5 from a 0.4 :)

Comment on lines 118 to 243
"""

mocked_input1 = create_autospec(v0_5.InputTensorDescr)
mocked_input1.id = "input1"
mocked_input1.axes = [
    v0_5.BatchAxis(),
    v0_5.ChannelAxis(channel_names=["channel1", "channel2"]),
    v0_5.SpaceInputAxis(id=AxisId("x"), size=10),
    v0_5.SpaceInputAxis(id=AxisId("y"), size=v0_5.SizeReference(tensor_id="input3", axis_id="x")),
]

mocked_input2 = create_autospec(v0_5.InputTensorDescr)
mocked_input2.id = "input2"
mocked_input2.axes = [
    v0_5.BatchAxis(),
    v0_5.ChannelAxis(channel_names=["channel1", "channel2"]),
    v0_5.SpaceInputAxis(id=AxisId("x"), size=v0_5.ParameterizedSize(min=10, step=2)),
    v0_5.SpaceInputAxis(id=AxisId("y"), size=v0_5.ParameterizedSize(min=10, step=5)),
]

mocked_input3 = create_autospec(v0_5.InputTensorDescr)
mocked_input3.id = "input3"
mocked_input3.axes = [
    v0_5.BatchAxis(),
    v0_5.ChannelAxis(channel_names=["channel1", "channel2"]),
    v0_5.SpaceInputAxis(id="x", size=v0_5.SizeReference(tensor_id="input2", axis_id="x")),
    v0_5.SpaceInputAxis(id="y", size=10),
]
Collaborator

amazing what's possible in 0.5... input1.y <- input3.x <-- input2.x

good to have a non-trivial test case :)

Collaborator Author

I was very skeptical about this when implementing the _realize_size_reference functionality :) I am not sure if it is overkill.

Collaborator Author

Not valid anymore, see #212 (comment)

@@ -39,9 +38,7 @@ def has_work(self):
        return self._pipeline.max_num_iterations and self._pipeline.max_num_iterations > self._pipeline.iteration_count

    def forward(self, input_tensors):
Collaborator

Suggested change
-    def forward(self, input_tensors):
+    def forward(self, input_tensors: Sample):

at least in my pyright setup this is not inferred.

class InputTensorValidator:
    def __init__(self, input_specs: List[nodes.InputTensor]):
        self._input_specs = input_specs
class SampleValidator:
Collaborator

Could you maybe clarify what this class is intended for? I see it can take input descriptions, but also output descriptions. _get_spec seems to imply one can only check input tensors. So it's an InputSampleValidator?!

Collaborator Author

Yep, sorry, the name should be InputSampleValidator and it should accept only the input tensor descriptions.
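i.e. something along these lines (sketch only):

from typing import List

from bioimageio.spec.model import v0_5

class InputSampleValidator:
    def __init__(self, specs: List[v0_5.InputTensorDescr]):
        self._specs = specs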

    def __init__(self, input_specs: List[nodes.InputTensor]):
        self._input_specs = input_specs
class SampleValidator:
    def __init__(self, specs: Union[List[v0_5.InputTensorDescr], List[v0_5.OutputTensorDescr]]):
Collaborator

are the types here as intended? Or should it be List[Union[v0_5.InputTensorDescr, v0_5.OutputTensorDescr]]?
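For reference, the two annotations accept different things (illustrative only):

from typing import List, Union

from bioimageio.spec.model import v0_5

# a list that is either all input descriptions or all output descriptions
HomogeneousSpecs = Union[List[v0_5.InputTensorDescr], List[v0_5.OutputTensorDescr]]

# a single list that may mix input and output descriptions
MixedSpecs = List[Union[v0_5.InputTensorDescr, v0_5.OutputTensorDescr]]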

Comment on lines 39 to 49
def valid_model_request(device_ids=None):
    ret = inference_pb2.CreateModelSessionRequest(
        model_blob=inference_pb2.Blob(content=b""), deviceIds=device_ids or ["cpu"]
    )
    return ret
Collaborator

is an empty model_blob part of a valid ModelSessionRequest?

Collaborator Author

It has changed in this commit 16ccbe4 :)

@thodkatz thodkatz force-pushed the update-dependencies branch from d84c70f to 16ccbe4 on August 29, 2024 08:13
@thodkatz
Collaborator Author

thodkatz commented Aug 29, 2024

By reproducing the issue that @k-dominik mentioned in #212 (review) with mp.set_start_method("spawn") in conftest.py, I have attempted to redesign the way we test.

The idea of using mocks seems convenient, but I noticed that it can have some pitfalls. For example, when mocking v0_5.ModelDescr, nothing stops you from patching in axes that are not valid or not supported. One such case was my false impression that a SizeReference object can point to another SizeReference. So I have simplified the logic that handles size references (16ccbe4#diff-6aceb0322a663f6cc7ec8ab717637f2f1a0cab3c8a6a57c57f0c18903d8d2a36L68); a sketch of the simplified resolution follows.
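Roughly, the simplified resolution looks like this (a sketch that ignores scale/offset handling; the real implementation is in the linked diff):

from typing import Dict, Tuple, Union

from bioimageio.spec.model import v0_5

# (tensor_id, axis_id) -> concrete size, for axes whose size is already known
KnownSizes = Dict[Tuple[str, str], int]

def resolve_size(size: Union[int, v0_5.ParameterizedSize, v0_5.SizeReference], known: KnownSizes) -> int:
    if isinstance(size, int):
        return size
    if isinstance(size, v0_5.ParameterizedSize):
        return size.min  # smallest valid size
    if isinstance(size, v0_5.SizeReference):
        # a reference must point to an axis with a concrete (non-reference) size
        return known[(str(size.tensor_id), str(size.axis_id))]
    raise ValueError(f"unsupported size: {size!r}")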

Another thing I noticed is that previously the main process created a prediction pipeline object before starting the child process (16ccbe4#diff-6aceb0322a663f6cc7ec8ab717637f2f1a0cab3c8a6a57c57f0c18903d8d2a36L144), and this object was then transferred to the child process. This is redundant: since we already have a bytes object (model_bytes), we can construct either a ModelDescr or a PredictionPipeline from it directly, instead of deserializing in the main process, serializing again, and then deserializing once more in the child process.

TODO:

  • add tests with v4 ModelDescr to check the backwards compatibility that bioimageio provides

@thodkatz
Collaborator Author

thodkatz commented Sep 3, 2024

I have included tests for simple use cases of v4 models, and I added parameterization for different weight formats, e.g. pytorch_state_dict and torchscript.

While trying to avoid files and to create objects from scratch, I stumbled upon a few issues when working with v4 models. I managed to find a workflow that works, though :)

I have created an issue in bioimageio.spec. I could certainly work on it if we want to support v4.


codecov bot commented Sep 17, 2024

Codecov Report

Attention: Patch coverage is 71.15385% with 30 lines in your changes missing coverage. Please review.

Project coverage is 64.77%. Comparing base (fb680c1) to head (7877797).
Report is 9 commits behind head on main.

Files with missing lines              Patch %    Lines
tiktorch/proto/inference_pb2.py       4.00%      24 Missing ⚠️
tiktorch/server/session/process.py    89.28%     6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #212      +/-   ##
==========================================
- Coverage   65.50%   64.77%   -0.74%     
==========================================
  Files          40       40              
  Lines        2267     2183      -84     
==========================================
- Hits         1485     1414      -71     
+ Misses        782      769      -13     


@thodkatz thodkatz force-pushed the update-dependencies branch 10 times, most recently from 91e34c8 to eb2b18b on September 17, 2024 14:39
@thodkatz
Collaborator Author

For the failing Windows tests, I have submitted an issue in bioimageio.spec: bioimage-io/spec-bioimage-io#633

Collaborator

@k-dominik k-dominik left a comment


Really great, I had only small comments.

Great to have code now to generate various models - if I want to create a model in the future, I'll probably look here for minimal examples :)

Also okay to merge now, without fixing bioimageio first (it will need fixing before i2k though).

"""
Mocked bioimageio prediction pipeline with three inputs single output
"""
test_tensor1 = np.arange(1 * 2 * 10 * 10, dtype="float32").reshape(1, 2, 10, 10)
Collaborator

out of paranoia I'd avoid square test tensors (for fear that something somewhere might transpose and I wouldn't notice...)

Collaborator Author

Fixed in 42210b7 :)

def test_call_predict_valid_explicit(self, grpc_stub, bioimage_model_explicit_siso):
    model_bytes, expected_output = bioimage_model_explicit_siso
    model = grpc_stub.CreateModelSession(valid_model_request(model_bytes))
    arr = xr.DataArray(np.arange(2 * 10 * 10).reshape(1, 2, 10, 10), dims=("batch", "channel", "x", "y"))
Collaborator

for convenience the fixture could also return the input that produces the expected output (just because right now one has to check the code in order to build arr).

Collaborator Author

@thodkatz thodkatz Oct 10, 2024


Returning the expected_output is, I think, what brought the confusion. The thing is that each fixture has a hardcoded dummy network defined in conftest.py, and for simplicity this network returned a fixed output. But there isn't an actual input that produces it. The test should be responsible for creating a valid input.

So, to avoid this confusion, I have changed the fixture to return the network (73e4fb4). Now, from the test, we need to define a valid input and use the network to get the expected output, asserting that Predict returns the same output.

But we still need to check how the input axes were created for each bioimageio fixture in order to create valid inputs. I think that is part of the testing, though, and we need to do it for the actual validation anyway.

What do you think :) ? @k-dominik

@thodkatz
Collaborator Author

thodkatz commented Sep 20, 2024

The fix to resolve the spec issue was a minor one; I have opened a PR for it: bioimage-io/spec-bioimage-io#634. Thanks @k-dominik for the review! I will address the comments :)

@thodkatz thodkatz force-pushed the update-dependencies branch from 73e4fb4 to e5f786a on October 11, 2024 14:14
- bioimageio.spec==0.5.3.post4
- bioimageio.core==0.6.7
- For non-unix systems, when launching a new process, `spawn` can be used instead of `fork`. With `spawn`, the parent process's memory isn't copied to the child as it is with `fork`; arguments are pickled instead. Mock objects can't be serialized, and tests were failing because of this.
- The `PredictionPipeline` object was created in the main process before the child process was started, then serialized to be transferred to the child, and deserialized there again. This is redundant, since the `PredictionPipeline` is initially created from a bytes value, so both the parent and the child process can construct the same `PredictionPipeline` from those bytes.
- Weights are parameterized for pytorch and torchscript workflows
Fixtures are bioimageio models with dummy networks that do simple
operations such as add one to the input. To test that, and to know what
a particular model does, we encode it within the name of the fixture
e.g. modelAddOne
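A minimal sketch of such a dummy network (illustrative; the actual fixtures live in the test conftest):

import torch

class AddOne(torch.nn.Module):
    """Dummy network: makes the model's behaviour trivially predictable in tests."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + 1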
@thodkatz thodkatz force-pushed the update-dependencies branch from e5f786a to 7877797 on October 11, 2024 17:48
@thodkatz thodkatz merged commit c5ac7d6 into ilastik:main Oct 11, 2024
7 of 9 checks passed