
Failed to load flux.1-dev with enable_sequential_cpu_offload and use_fp8_t5_encoder (4090) #407

Open
WeiboXu opened this issue Dec 24, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@WeiboXu

WeiboXu commented Dec 24, 2024

command:
CUDA_VISIBLE_DEVICES=4,5 torchrun --nproc_per_node=2 examples/flux_example.py --model /models/sfast_model/FLUX.1-dev --height 512 --width 512 --no_use_resolution_binning --pipefusion_parallel_degree 2 --ulysses_degree 1 --num_inference_steps 2 --warmup_steps 0 --prompt "A small dog" --tensor_parallel_degree 1 --use_fp8_t5_encoder --enable_sequential_cpu_offload

environment:
cuda: 12.2
Driver Version: 535.146.02
python: 3.10.12
torch: 2.5.1
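
Roughly, the two flags translate to the following in the example script (simplified sketch; only the enable_sequential_cpu_offload(gpu_id=local_rank) call is verbatim from the traceback below, the optimum-quanto calls and variable names are approximations of how --use_fp8_t5_encoder is handled):

from optimum.quanto import freeze, qfloat8, quantize

if engine_args.use_fp8_t5_encoder:
    # After this, the T5 weights are quanto WeightQBytesTensor subclasses.
    quantize(pipe.text_encoder_2, weights=qfloat8)
    freeze(pipe.text_encoder_2)

if engine_args.enable_sequential_cpu_offload:
    # accelerate later tries to rebuild those quantized parameters on "meta"
    # and fails with the TypeError shown in the log.
    pipe.enable_sequential_cpu_offload(gpu_id=local_rank)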

log:
W1224 02:55:39.414000 5378 torch/distributed/run.py:793]
W1224 02:55:39.414000 5378 torch/distributed/run.py:793] *****************************************
W1224 02:55:39.414000 5378 torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W1224 02:55:39.414000 5378 torch/distributed/run.py:793] *****************************************
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.2+cu121 with CUDA 1201 (you have 2.5.1+cu124)
Python 3.10.13 (you have 3.10.12)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.2+cu121 with CUDA 1201 (you have 2.5.1+cu124)
Python 3.10.13 (you have 3.10.12)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
/usr/local/lib/python3.10/dist-packages/xformers/triton/softmax.py:30: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
@custom_fwd(cast_inputs=torch.float16 if _triton_softmax_fp16_enabled else None)
/usr/local/lib/python3.10/dist-packages/xformers/triton/softmax.py:87: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(
/usr/local/lib/python3.10/dist-packages/xformers/ops/swiglu_op.py:107: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
def forward(cls, ctx, x, w1, b1, w2, b2, w3, b3):
/usr/local/lib/python3.10/dist-packages/xformers/ops/swiglu_op.py:128: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(cls, ctx, dx5):
/usr/local/lib/python3.10/dist-packages/xformers/triton/softmax.py:30: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
@custom_fwd(cast_inputs=torch.float16 if _triton_softmax_fp16_enabled else None)
/usr/local/lib/python3.10/dist-packages/xformers/triton/softmax.py:87: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(
/usr/local/lib/python3.10/dist-packages/xformers/ops/swiglu_op.py:107: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
def forward(cls, ctx, x, w1, b1, w2, b2, w3, b3):
/usr/local/lib/python3.10/dist-packages/xformers/ops/swiglu_op.py:128: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(cls, ctx, dx5):
WARNING 12-24 02:55:43 [args.py:326] Distributed environment is not initialized. Initializing...
DEBUG 12-24 02:55:43 [parallel_state.py:179] world_size=-1 rank=-1 local_rank=-1 distributed_init_method=env:// backend=nccl
WARNING 12-24 02:55:43 [args.py:326] Distributed environment is not initialized. Initializing...
DEBUG 12-24 02:55:43 [parallel_state.py:179] world_size=-1 rank=-1 local_rank=-1 distributed_init_method=env:// backend=nccl
INFO 12-24 02:55:43 [config.py:120] Ring degree not set, using default value 1
INFO 12-24 02:55:43 [config.py:120] Ring degree not set, using default value 1
INFO 12-24 02:55:43 [config.py:164] Pipeline patch number not set, using default value 2
INFO 12-24 02:55:43 [config.py:164] Pipeline patch number not set, using default value 2
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 2/2 [00:05<00:00, 2.85s/it]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 2/2 [00:05<00:00, 2.86s/it]
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]You set add_prefix_space. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]You set add_prefix_space. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 43%|█████████████████████▊ | 3/7 [00:00<00:01, 3.20it/s]=====checkpoint_file: /models/sfast_model/FLUX.1-dev/vae/diffusion_pytorch_model.safetensors
Loading pipeline components...: 100%|███████████████████████████████████████████████████| 7/7 [00:01<00:00, 5.72it/s]
WARNING 12-24 02:56:12 [runtime_state.py:63] Model parallel is not initialized, initializing...
Loading pipeline components...: 57%|█████████████████████████████▏ | 4/7 [00:01<00:00, 3.12it/s]=====checkpoint_file: /models/sfast_model/FLUX.1-dev/vae/diffusion_pytorch_model.safetensors
Loading pipeline components...: 100%|███████████████████████████████████████████████████| 7/7 [00:01<00:00, 5.19it/s]
WARNING 12-24 02:56:12 [runtime_state.py:63] Model parallel is not initialized, initializing...
INFO 12-24 02:56:12 [base_pipeline.py:292] Transformer backbone found, paralleling transformer...
INFO 12-24 02:56:12 [base_pipeline.py:292] Transformer backbone found, paralleling transformer...
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.0.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.1.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.2.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.3.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.4.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.0.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.5.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.1.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.6.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.2.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.7.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.3.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.4.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.8.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.5.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.9.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.6.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.7.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.10.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.8.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.11.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.9.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.12.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.10.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.11.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.13.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.12.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.14.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.13.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.14.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.15.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.15.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.16.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.16.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.17.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.17.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.18.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping transformer_blocks.18.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.19.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.20.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.0.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.21.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.1.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.22.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.2.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.23.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.3.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.24.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.4.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.25.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.5.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.26.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.6.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 1] Wrapping single_transformer_blocks.27.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.7.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.8.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_model.py:83] [RANK 0] Wrapping single_transformer_blocks.9.attn in model class FluxTransformer2DModel with xFuserAttentionWrapper
INFO 12-24 02:56:12 [base_pipeline.py:343] Scheduler found, paralleling scheduler...
INFO 12-24 02:56:12 [base_pipeline.py:343] Scheduler found, paralleling scheduler...
[rank1]: Traceback (most recent call last):
[rank1]: File "/workspace/xDiT/examples/flux_example.py", line 96, in
[rank1]: main()
[rank1]: File "/workspace/xDiT/examples/flux_example.py", line 46, in main
[rank1]: pipe.enable_sequential_cpu_offload(gpu_id=local_rank)
[rank1]: File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 1151, in enable_sequential_cpu_offload
[rank1]: cpu_offload(model, device, offload_buffers=offload_buffers)
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/big_modeling.py", line 205, in cpu_offload
[rank1]: attach_align_device_hook(
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 518, in attach_align_device_hook
[rank1]: attach_align_device_hook(
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 518, in attach_align_device_hook
[rank1]: attach_align_device_hook(
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 518, in attach_align_device_hook
[rank1]: attach_align_device_hook(
[rank1]: [Previous line repeated 4 more times]
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 509, in attach_align_device_hook
[rank1]: add_hook_to_module(module, hook, append=True)
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 161, in add_hook_to_module
[rank1]: module = hook.init_hook(module)
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 308, in init_hook
[rank1]: set_module_tensor_to_device(module, name, "meta")
[rank1]: File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py", line 365, in set_module_tensor_to_device
[rank1]: new_value = param_cls(new_value, requires_grad=old_value.requires_grad).to(device)
[rank1]: TypeError: WeightQBytesTensor.__new__() missing 6 required positional arguments: 'axis', 'size', 'stride', 'data', 'scale', and 'activation_qtype'
[rank0]: Traceback (most recent call last):
[rank0]: File "/workspace/xDiT/examples/flux_example.py", line 96, in
[rank0]: main()
[rank0]: File "/workspace/xDiT/examples/flux_example.py", line 46, in main
[rank0]: pipe.enable_sequential_cpu_offload(gpu_id=local_rank)
[rank0]: File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 1151, in enable_sequential_cpu_offload
[rank0]: cpu_offload(model, device, offload_buffers=offload_buffers)
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/big_modeling.py", line 205, in cpu_offload
[rank0]: attach_align_device_hook(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 518, in attach_align_device_hook
[rank0]: attach_align_device_hook(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 518, in attach_align_device_hook
[rank0]: attach_align_device_hook(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 518, in attach_align_device_hook
[rank0]: attach_align_device_hook(
[rank0]: [Previous line repeated 4 more times]
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 509, in attach_align_device_hook
[rank0]: add_hook_to_module(module, hook, append=True)
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 161, in add_hook_to_module
[rank0]: module = hook.init_hook(module)
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 308, in init_hook
[rank0]: set_module_tensor_to_device(module, name, "meta")
[rank0]: File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py", line 365, in set_module_tensor_to_device
[rank0]: new_value = param_cls(new_value, requires_grad=old_value.requires_grad).to(device)
[rank0]: TypeError: WeightQBytesTensor.__new__() missing 6 required positional arguments: 'axis', 'size', 'stride', 'data', 'scale', and 'activation_qtype'
[rank0]:[W1224 02:56:13.235291809 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
E1224 02:56:13.579000 5378 torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 5443) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 355, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 919, in main
run(args)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 910, in run
elastic_launch(
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 138, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

examples/flux_example.py FAILED

Failures:
[1]:
time : 2024-12-24_02:56:13
host : l117-11-p-ga
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 5444)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2024-12-24_02:56:13
host : l117-11-p-ga
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 5443)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
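
The failure pattern can be reproduced without xDiT or quanto. accelerate's set_module_tensor_to_device rebuilds each parameter with param_cls(new_value, requires_grad=...), which assumes the parameter class takes only (data, requires_grad); a quantized parameter subclass such as quanto's WeightQBytesTensor needs extra constructor arguments, so the call raises the TypeError above. A minimal standalone illustration (the class below is a stand-in, not quanto code):

import torch

class FakeQuantizedParameter(torch.nn.Parameter):
    # Stand-in for a quantized tensor subclass whose constructor needs
    # extra metadata (scale, axis, ...), as WeightQBytesTensor does.
    def __new__(cls, data, scale, requires_grad=False):
        t = super().__new__(cls, data, requires_grad=requires_grad)
        t.scale = scale
        return t

p = FakeQuantizedParameter(torch.zeros(2, 2), scale=0.1)
param_cls = type(p)
try:
    # Same reconstruction pattern as accelerate/utils/modeling.py line 365.
    param_cls(p.data, requires_grad=p.requires_grad).to("meta")
except TypeError as err:
    print("reconstruction fails:", err)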

@feifeibear
Collaborator

This issue might be caused by a mismatch between the versions of the libraries you're using, or by an incompatibility in the model's configuration.

Could you please check your diffusers version? We recommend diffusers>=0.32.0.dev for Flux.

@WeiboXu
Author

WeiboXu commented Dec 26, 2024

Here is my diffusers version:

diffusers 0.32.1

@feifeibear feifeibear changed the title Failed to load flux.1-dev with enable_sequential_cpu_offload using 4090 Failed to load flux.1-dev with enable_sequential_cpu_offload and use_fp8_t5_encoder Dec 26, 2024
@feifeibear feifeibear changed the title Failed to load flux.1-dev with enable_sequential_cpu_offload and use_fp8_t5_encoder Failed to load flux.1-dev with enable_sequential_cpu_offload and use_fp8_t5_encoder (4090) Dec 26, 2024
@feifeibear
Collaborator

I am using Diffusers version 0.31.0.

I have successfully managed to run pp=2 with CPU offloading, but I hit exactly the same error when using both enable_sequential_cpu_offload and use_fp8_t5_encoder at the same time.

In short, the combination of these two features appears to trigger this error. I would appreciate any insights or suggestions on how to resolve the compatibility issue.
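
One workaround that might be worth trying (untested sketch, assuming the wrapped pipe exposes the underlying diffusers components and that the fp8 T5 encoder fits in GPU memory) is to offload only the non-quantized components with accelerate's cpu_offload and keep text_encoder_2 resident on the GPU instead of hooking it:

from accelerate import cpu_offload

device = f"cuda:{local_rank}"
# Offload only the components that are not quanto-quantized.
for component in (pipe.transformer, pipe.vae, pipe.text_encoder):
    cpu_offload(component, execution_device=device)
# Leave the fp8-quantized T5 encoder on the GPU so accelerate never tries
# to rebuild its WeightQBytesTensor parameters.
pipe.text_encoder_2.to(device)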

@feifeibear feifeibear added the bug Something isn't working label Dec 26, 2024