
Fix model architecture for deployment to ONNX #234

Merged: 36 commits into pytorch_implementation on May 29, 2023

Conversation

@ziw-liu (Contributor) commented May 12, 2023

Addresses most of #214.

Also includes fixes to the inference module.

I'm opening this now for code visibility, but it should probably be merged after #232 (this branch is based on that one).

@ziw-liu added the bug (Something isn't working), enhancement (New feature or request), and inference (using and sharing the models) labels on May 12, 2023
@ziw-liu marked this pull request as ready for review on May 15, 2023 21:52
mattersoflight and others added 5 commits on May 17, 2023:

* sync log metrics
* example of using more GPUs
* this was accidentally tracked

A Member commented on the docs line "The main command for inference is:":

The behavior of the inference CLI needs to change; I get the following error when trying to run inference:

```
(lit-monai) [shalin.mehta@gpu-a-2 microDL]$ python micro_dl/cli/torch_inference_script.py --config /hpc/projects/CompMicro/projects/virtual_staining/models/phase2nuclei018/test/inference_config_processed_hek_no_perturb_512_512.yml
Traceback (most recent call last):
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/cli/torch_inference_script.py", line 201, in <module>
    main(args.config, args.gpu, args.gpu_mem_frac)
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/cli/torch_inference_script.py", line 190, in main
    torch_predictor = torch_inference_utils.TorchPredictor(
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/inference/inference.py", line 59, in __init__
    self.network_z_depth = self.network_config["in_stack_depth"]
KeyError: 'in_stack_depth'
```

I think we are now linking `network_z_depth` and `in_stack_depth` in `train.py`.
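
For illustration, a minimal sketch of a backward-compatible lookup that would tolerate both key names (the key names come from the traceback above; the helper itself is hypothetical, not the actual fix in this PR):

```python
# Hypothetical helper, not from this PR: accept either config key name.
def get_stack_depth(network_config: dict) -> int:
    # Newer training configs write "in_stack_depth"; older ones
    # wrote "network_z_depth". Accept either to stay compatible.
    for key in ("in_stack_depth", "network_z_depth"):
        if key in network_config:
            return network_config[key]
    raise KeyError(
        "network config must define 'in_stack_depth' or 'network_z_depth'"
    )
```

With such a fallback, `TorchPredictor.__init__` could call `get_stack_depth(self.network_config)` instead of indexing the key directly.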

Another Member commented:

Let's switch to the Lightning CLI's `predict` subcommand after we merge this PR.
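
For context, that would look roughly like the sketch below (the module and datamodule class names and import paths are hypothetical placeholders; the `LightningCLI` entry point and its `predict` subcommand are standard Lightning):

```python
# Sketch of a LightningCLI entry point; class names are placeholders.
from lightning.pytorch.cli import LightningCLI

from micro_dl.light.module import VSUNetModule  # hypothetical import path
from micro_dl.light.data import HCSDataModule   # hypothetical import path

if __name__ == "__main__":
    # Exposes fit/validate/test/predict subcommands, e.g.:
    #   python cli.py predict --ckpt_path model.ckpt --config predict.yml
    LightningCLI(VSUNetModule, HCSDataModule)
```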

A Member commented on the docs heading `## Exporting models to onnx`:

The deployment script uses the inference module, and I am seeing the same error as above:

```
(lit-monai) [shalin.mehta@gpu-a-2 microDL]$ python micro_dl/cli/onnx_export_script.py --model_path /hpc/projects/CompMicro/projects/virtual_staining/models/phase2nuclei018/lightning_logs/20230514-003340/checkpoints/epoch=29-step=4350.ckpt --stack_depth 5 --export_path /hpc/projects/CompMicro/projects/virtual_staining/models/phase2nuclei018/deployment/20230514-003340_epoch29.onnx
Initializing model in pytorch...
Traceback (most recent call last):
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/cli/onnx_export_script.py", line 177, in <module>
    main(args)
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/cli/onnx_export_script.py", line 164, in main
    export_model(model_dir, model_name, args.stack_depth, args.export_path)
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/cli/onnx_export_script.py", line 99, in export_model
    torch_predictor = inference.TorchPredictor(
  File "/hpc/mydata/shalin.mehta/code/microDL/micro_dl/inference/inference.py", line 59, in __init__
    self.network_z_depth = self.network_config["in_stack_depth"]
KeyError: 'in_stack_depth'
```
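
For reference, once a checkpoint is loaded, the core of such an export typically reduces to a single `torch.onnx.export` call; a minimal sketch (the input shape, opset version, and dynamic axes below are assumptions, not necessarily what `onnx_export_script.py` does):

```python
import torch

def export_model_to_onnx(model: torch.nn.Module, stack_depth: int, export_path: str):
    """Trace the model with a dummy (N, C, Z, Y, X) input and save ONNX."""
    model.eval()
    # Assumed input layout: batch, channel, z-stack, height, width.
    dummy_input = torch.rand(1, 1, stack_depth, 512, 512)
    torch.onnx.export(
        model,
        dummy_input,
        export_path,
        export_params=True,
        opset_version=17,  # assumed; must match the deployment runtime
        input_names=["input"],
        output_names=["output"],
        # Let batch size and XY extent vary at deployment time.
        dynamic_axes={
            "input": {0: "batch", 3: "height", 4: "width"},
            "output": {0: "batch", 3: "height", 4: "width"},
        },
    )
```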

@mattersoflight (Member) left a review:

@ziw-liu with the change in the training config, the inference and export CLIs are throwing an error that seems easy to fix. Please fix both. Feel free to update the inference stage to use the `predict` subcommand of the Lightning CLI. That may impose some directory structure, but that is a good thing! We can handle the refactor of the inference CLI in the next PR if that is easier for you.

@ziw-liu (Contributor, Author) commented May 27, 2023

@mattersoflight I was already moving inference into Lightning (#236) while you were reviewing this PR. Can we merge this as-is?

@mattersoflight merged commit 00375b0 into pytorch_implementation on May 29, 2023 (1 check failed)
ziw-liu added a commit that referenced this pull request on May 30, 2023