Multiprocessing error in visibility phase #243

Open
PawtingDev opened this issue Jan 16, 2024 · 1 comment

@PawtingDev

I followed the instructions in dataset.md to process THuman2.0.
The rendering phase works fine with python -m scripts.render_batch -debug -headless.
However, the visibility phase, run with python -m scripts.visibility_batch -debug, fails:

(dev) pawting@pc0809:/media/pawting/SN640/hello_worlds/ICON$ python -m scripts.visibility_batch_mod -debug
Start Visibility Computing thuman2 with 36 views.
Output dir: ./debug/thuman2_36views
  0%|                                                                                    | 0/2 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch_mod.py", line 36, in visibility_subject
    smpl_verts = torch.from_numpy(rescale_fitted_body.vertices).to(device).float()
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/site-packages/torch/cuda/__init__.py", line 284, in _lazy_init
    raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch_mod.py", line 122, in <module>
    for _ in tqdm(
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

I'm not familiar with multiprocessing. Maybe it's related to calling .to(device) in subprocesses?

I've tried the suggestion here to force 'spawn' as the start method, but it doesn't work:

(dev) pawting@pc0809:/media/pawting/SN640/hello_worlds/ICON$ python -m scripts.visibility_batch -debug
Start Visibility Computing thuman2 with 36 views.
Output dir: ./debug/thuman2_36views
  0%|                                                                                    | 0/2 [00:06<?, ?it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch.py", line 25, in visibility_subject
    gpu_id = queue.get()
NameError: name 'queue' is not defined. Did you mean: 'Queue'?
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/media/pawting/SN640/hello_worlds/ICON/scripts/visibility_batch.py", line 97, in <module>
    for _ in tqdm(
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/home/pawting/anaconda3/envs/dev/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
NameError: name 'queue' is not defined
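
One possible reading of the NameError above: with the 'spawn' start method each worker starts a fresh interpreter and re-imports the script, so a queue that is only created in the parent's main block never exists in the workers. Below is a minimal sketch of one way around that, assuming the script hands out GPU ids through a shared queue (not verified against the actual visibility_batch.py). The queue is passed to every worker through the pool's initializer. The names init_worker and visibility_subject_stub, the subject ids, and the GPU id list are placeholders, not ICON's code.

```python
import multiprocessing as mp

# Filled in per worker by init_worker(). Defining the name at module level
# means it exists even in a freshly spawned interpreter.
_gpu_queue = None


def init_worker(gpu_queue):
    """Runs once in each worker process and stores the shared queue."""
    global _gpu_queue
    _gpu_queue = gpu_queue


def visibility_subject_stub(subject):
    """Hypothetical stand-in for visibility_subject(): borrow a GPU id, work, return it."""
    gpu_id = _gpu_queue.get()
    try:
        # Real code would create its CUDA tensors here, e.g. on f"cuda:{gpu_id}".
        return subject, gpu_id
    finally:
        # Hand the GPU id back so the next task can use it.
        _gpu_queue.put(gpu_id)


def main():
    ctx = mp.get_context("spawn")  # CUDA cannot be re-initialized after fork
    gpu_queue = ctx.Queue()
    for gpu_id in [0]:  # assumed: a single GPU with id 0
        gpu_queue.put(gpu_id)

    subjects = ["0000", "0001"]  # placeholder subject ids
    with ctx.Pool(processes=1, initializer=init_worker,
                  initargs=(gpu_queue,)) as pool:
        for subject, gpu_id in pool.imap_unordered(visibility_subject_stub, subjects):
            print(subject, "ran on GPU", gpu_id)


if __name__ == "__main__":
    main()
```

Under 'spawn' the pool creation has to stay behind the if __name__ == "__main__": guard, because every worker imports the script again and would otherwise try to start its own pool.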

@RichardChen20

RichardChen20 commented Oct 23, 2024

I have the same problem. I couldn't figure it out, so I modified the code to process the subjects one by one in a single process:

for sub in tqdm(subjects):
    visibility_subject(
        subject=sub,
        dataset=args.dataset,
        save_folder=current_out_dir,
        rotation=args.num_views,
        debug=args.debug,
    )
