Error Running MONAI Spleen CT Segmentation Bundle for version>1.2.0 When using MONAI FL Module #8330

Open
Zilinghan opened this issue Feb 6, 2025 · 0 comments

Describe the bug
I am trying to use MONAI's Spleen CT Segmentation Bundle with the MONAI FL module, and it fails for monai>1.2.0. The same script works with 1.2.0 but raises an error with 1.3.0 and 1.4.0.
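
To check quickly whether an environment falls in the version range reported here, a small helper along these lines may be useful. It is only a sketch: `parse_version`, `is_reported_affected`, and `installed_monai_version` are hypothetical names, and the "affected" rule simply encodes what this report observed (1.2.0 works, 1.3.0 and 1.4.0 fail), not a confirmed regression range.

```python
from importlib import metadata
from itertools import takewhile


def parse_version(text):
    """Turn '1.4.0' into (1, 4, 0), dropping local suffixes such as '+cu121'."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_reported_affected(version_text):
    """True for versions newer than 1.2.0, the last version this report saw working."""
    return parse_version(version_text) > (1, 2, 0)


def installed_monai_version():
    """Return the installed monai version string, or None if monai is not installed."""
    try:
        return metadata.version("monai")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_monai_version()
    if found is None:
        print("monai is not installed in this environment")
    else:
        print(f"monai {found}: reported affected = {is_reported_affected(found)}")
```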

To Reproduce
Steps to reproduce the behavior:

  1. I downloaded the bundle using the following commands:
JOB_NAME=job
python3 -m monai.bundle download --name "spleen_ct_segmentation" --version "0.4.6" --bundle_dir ./${JOB_NAME}/app/config
  1. I downloaded the data using the following script:
# download_spleen_dataset.py
import argparse

from monai.apps.utils import download_and_extract


def download_spleen_dataset(filepath, output_dir):
    url = "https://msd-for-monai.s3-us-west-2.amazonaws.com/Task09_Spleen.tar"
    download_and_extract(url=url, filepath=filepath, output_dir=output_dir)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--filepath",
        "-f",
        type=str,
        help="the file path of the downloaded compressed file.",
        default="./data/Task09_Spleen.tar",
    )
    parser.add_argument(
        "--output_dir", "-o", type=str, help="target directory to save extracted files.", default="./data"
    )
    args = parser.parse_args()
    download_spleen_dataset(args.filepath, args.output_dir)

and then ran the following commands:

JOB_NAME=job
python download_spleen_dataset.py
sed -i "s|/workspace/data/Task09_Spleen|${PWD}/data/Task09_Spleen|g" ${JOB_NAME}/app/config/spleen_ct_segmentation/configs/train.json
  1. I installed monai via:
pip install "monai[all]==1.4.0"  # also tried "monai[all]==1.3.0" and "monai[all]==1.2.0"
  1. The test script I ran (with python test.py) is:
# test.py
from monai.fl.client.monai_algo import MonaiAlgo
from monai.fl.utils.constants import ExtraItems

monai_algo = MonaiAlgo(
    bundle_root='./job/app/config/spleen_ct_segmentation',
    send_weight_diff=False,
)

monai_algo.initialize(
    extra={
        ExtraItems.CLIENT_NAME: "Client",
    }
)

model = monai_algo.get_weights()
metric = monai_algo.evaluate(model)
print(metric.metrics)
monai_algo.train(model)
new_model = monai_algo.get_weights()
metric = monai_algo.evaluate(new_model)
print(metric.metrics)
  1. I got the following error with monai 1.3.0 and 1.4.0 (1.2.0 works fine):
2025-02-06 14:55:23,959 - INFO - Setting logging properties based on config: job/app/config/spleen_ct_segmentation/configs/logging.conf.
2025-02-06 14:55:24,020 - INFO - Initialized Client.
2025-02-06 14:55:24,025 - INFO - Returning current weights.
2025-02-06 14:55:24,025 - INFO - Load Client weights...
2025-02-06 14:55:24,025 - INFO - Converted 148 global variables to match 148 local variables.
2025-02-06 14:55:24,026 - INFO - 'dst' model updated: 148 of 148 variables.
2025-02-06 14:55:24,031 - INFO - Start Client evaluating...
2025-02-06 14:55:24,031 - ignite.engine.engine.SupervisedEvaluator - INFO - Engine run resuming from iteration 0, epoch 0 until 1 epochs
2025-02-06 14:55:26,275 - ignite.engine.engine.SupervisedEvaluator - ERROR - Current run is terminating due to exception: 'image_meta_dict'
2025-02-06 14:55:26,275 - ERROR - Exception: 'image_meta_dict'
Traceback (most recent call last):
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 1069, in _run_once_on_dataset_as_gen
    self._fire_event(Events.ITERATION_COMPLETED)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 425, in _fire_event
    func(*first, *(event_args + others), **kwargs)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/metrics_saver.py", line 124, in _get_filenames
    meta_data = self.batch_transform(engine.state.batch)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/utils.py", line 199, in _wrapper
    ret = [data[0][k] if first else [i[k] for i in data] for k in _keys]
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/utils.py", line 199, in <listcomp>
    ret = [data[0][k] if first else [i[k] for i in data] for k in _keys]
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/utils.py", line 199, in <listcomp>
    ret = [data[0][k] if first else [i[k] for i in data] for k in _keys]
KeyError: 'image_meta_dict'
2025-02-06 14:55:27,522 - ignite.engine.engine.SupervisedEvaluator - ERROR - Engine run is terminating due to exception: 'image_meta_dict'
2025-02-06 14:55:27,522 - ERROR - Exception: 'image_meta_dict'
Traceback (most recent call last):
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 959, in _internal_run_as_gen
    epoch_time_taken += yield from self._run_once_on_dataset_as_gen()
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 1087, in _run_once_on_dataset_as_gen
    self._handle_exception(e)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 636, in _handle_exception
    self._fire_event(Events.EXCEPTION_RAISED, e)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 425, in _fire_event
    func(*first, *(event_args + others), **kwargs)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/stats_handler.py", line 202, in exception_raised
    raise e
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 1069, in _run_once_on_dataset_as_gen
    self._fire_event(Events.ITERATION_COMPLETED)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/ignite/engine/engine.py", line 425, in _fire_event
    func(*first, *(event_args + others), **kwargs)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/metrics_saver.py", line 124, in _get_filenames
    meta_data = self.batch_transform(engine.state.batch)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/utils.py", line 199, in _wrapper
    ret = [data[0][k] if first else [i[k] for i in data] for k in _keys]
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/utils.py", line 199, in <listcomp>
    ret = [data[0][k] if first else [i[k] for i in data] for k in _keys]
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/monai/handlers/utils.py", line 199, in <listcomp>
    ret = [data[0][k] if first else [i[k] for i in data] for k in _keys]
KeyError: 'image_meta_dict'
...
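
The failing expression in the traceback is a plain dictionary lookup over the batch, so the error can be illustrated without MONAI at all. The sketch below copies that indexing pattern into a hypothetical `lookup_keys` function and runs it on a batch that has no `image_meta_dict` entry. `FakeImage` and the file name are made up for illustration. Reading metadata from a `.meta` attribute, as MONAI's `MetaTensor` exposes it, is shown only as one possible direction; whether it applies to this bundle's configuration has not been verified here.

```python
def lookup_keys(data, keys, first=False):
    """Stand-in for the indexing expression shown at handlers/utils.py line 199."""
    return [data[0][k] if first else [item[k] for item in data] for k in keys]


class FakeImage:
    """Hypothetical tensor-like object that carries its own metadata."""

    def __init__(self, filename):
        self.meta = {"filename_or_obj": filename}


# A decollated batch whose items have no separate 'image_meta_dict' entry.
batch = [{"image": FakeImage("spleen_1.nii.gz"), "label": None}]

try:
    lookup_keys(batch, ["image_meta_dict"])
except KeyError as err:
    missing = err.args[0]
    print(f"KeyError: {missing!r}")

# Assumed alternative: take the metadata from the image object itself.
meta = [item["image"].meta for item in batch]
print(meta[0]["filename_or_obj"])
```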

Expected behavior

python test.py runs through without any errors.
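
To narrow down where the missing key is requested, it may help to list every line in the bundle's config files that mentions `image_meta_dict`. This is a minimal sketch, not part of MONAI: `find_key_references` is a hypothetical helper, and the directory is the `configs` folder implied by the `sed` command in the reproduction steps.

```python
from pathlib import Path


def find_key_references(config_dir, needle="image_meta_dict"):
    """Return (file name, line number, stripped line) for each line mentioning `needle`."""
    root = Path(config_dir)
    if not root.is_dir():
        return []
    hits = []
    for path in sorted(root.glob("*.json")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((path.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Path assumed from the reproduction steps above.
    config_dir = "./job/app/config/spleen_ct_segmentation/configs"
    for name, number, line in find_key_references(config_dir):
        print(f"{name}:{number}: {line}")
```

Any hits point at the config entries that still expect a separate metadata dictionary in the batch.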

Screenshots
N/A

Environment

Output of python -c 'import monai; monai.config.print_debug_info()':

================================
Printing MONAI config...
================================
MONAI version: 1.4.0
Numpy version: 1.26.4
Pytorch version: 2.3.1+cu121
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 46a5272196a6c2590ca2589029eed8e4d56ff008
MONAI __file__: /eagle/tpc/<username>/conda_envs/appfl/lib/python3.10/site-packages/monai/__init__.py

Optional dependencies:
Pytorch Ignite version: 0.4.11
ITK version: 5.4.0
Nibabel version: 5.3.2
scikit-image version: 0.24.0
scipy version: 1.14.1
Pillow version: 10.3.0
Tensorboard version: 2.18.0
gdown version: 5.2.0
TorchVision version: 0.18.1+cu121
tqdm version: 4.67.1
lmdb version: 1.6.2
psutil version: 5.9.8
pandas version: 2.2.3
einops version: 0.8.0
transformers version: 4.40.2
mlflow version: 2.19.0
pynrrd version: 1.1.1
clearml version: 1.17.0

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies


================================
Printing system config...
================================
System: Linux
Linux version: SUSE Linux Enterprise Server 15 SP5
Platform: Linux-5.14.21-150500.55.49-default-x86_64-with-glibc2.31
Processor: x86_64
Machine: x86_64
Python version: 3.10.14
Process name: pt_main_thread
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: []
Num physical CPUs: 32
Num logical CPUs: 64
Num usable CPUs: 64
CPU usage (%): [1.5, 1.2, 1.1, 1.1, 0.9, 0.8, 0.8, 0.8, 1.3, 1.1, 0.8, 0.9, 5.0, 3.2, 1.0, 1.1, 1.0, 0.9, 0.9, 1.1, 0.9, 1.0, 0.9, 0.9, 0.9, 1.0, 0.9, 1.1, 6.2, 0.9, 0.9, 1.0, 1.1, 1.2, 1.3, 35.3, 1.3, 1.0, 1.0, 0.9, 0.9, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0, 1.4, 1.0, 0.9, 0.9, 1.0, 0.8, 1.1, 1.1, 1.1, 0.9, 0.8, 0.9, 0.9, 0.9, 0.9, 0.9, 0.8]
CPU freq. (MHz): 2788
Load avg. in last 1, 5, 15 mins (%): [0.7, 1.2, 1.0]
Disk usage (%): 0.8
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 503.2
Available memory (GB): 490.8
Used memory (GB): 4.7

================================
Printing GPU config...
================================
Num GPUs: 4
Has CUDA: True
CUDA version: 12.1
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 8902
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: NVIDIA A100-SXM4-40GB
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 108
GPU 0 Total memory (GB): 39.4
GPU 0 CUDA capability (maj.min): 8.0
GPU 1 Name: NVIDIA A100-SXM4-40GB
GPU 1 Is integrated: False
GPU 1 Is multi GPU board: False
GPU 1 Multi processor count: 108
GPU 1 Total memory (GB): 39.4
GPU 1 CUDA capability (maj.min): 8.0
GPU 2 Name: NVIDIA A100-SXM4-40GB
GPU 2 Is integrated: False
GPU 2 Is multi GPU board: False
GPU 2 Multi processor count: 108
GPU 2 Total memory (GB): 39.4
GPU 2 CUDA capability (maj.min): 8.0
GPU 3 Name: NVIDIA A100-SXM4-40GB
GPU 3 Is integrated: False
GPU 3 Is multi GPU board: False
GPU 3 Multi processor count: 108
GPU 3 Total memory (GB): 39.4
GPU 3 CUDA capability (maj.min): 8.0
