black-forest-labs/FLUX.1-dev runs very slowly: it takes about 15 minutes to generate a 1344x768 (w x h) image. Has anyone experienced the same, or is it just me?
import torch
from pathlib import Path
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(args.model, torch_dtype=torch.bfloat16)
# pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU. Remove this if you have enough GPU power
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
pipe.to(torch.float16)  # casting here instead of in the pipeline constructor because doing so in the constructor loads all models into CPU memory at once
prompt = args.prompt
image = pipe(
prompt,
height=args.height,
width=args.width,
guidance_scale=0.0,
num_inference_steps=args.num_inference_steps,
max_sequence_length=512,
generator=torch.Generator("cpu").manual_seed(0)
).images[0]
Path(args.output).parent.mkdir(parents=True, exist_ok=True)
image.save(args.output)
For reference, args.num_inference_steps = 50.
The 24 GB of VRAM should be just enough to keep the transformer model fully in VRAM, which means you can use pipe.enable_model_cpu_offload() instead of pipe.enable_sequential_cpu_offload(). Sequential offload streams the weights to the GPU one submodule at a time, and that transfer overhead is what makes each step so slow. You may not even need the VAE slicing/tiling.
I.e.:
pipe=FluxPipeline.from_pretrained(args.model, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU. Remove this if you have enough GPU power
prompt = args.prompt
image = pipe(
prompt,
height=args.height,
width=args.width,
guidance_scale=0.0,
num_inference_steps=args.num_inference_steps,
max_sequence_length=512,
generator=torch.Generator("cpu").manual_seed(0)
).images[0]
Path(args.output).parent.mkdir(parents=True, exist_ok=True)
image.save(args.output)
If that still uses more VRAM than is available (check Task Manager), you can look into quantizing the model.
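For example, the transformer (by far the largest component of FLUX.1-dev) can be loaded in 4-bit NF4 through the diffusers bitsandbytes integration. This is a minimal sketch, assuming a recent diffusers release with bitsandbytes installed and reusing the args.model path from your script; the T5 text encoder can be quantized the same way if you need more headroom:

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, BitsAndBytesConfig

# Quantize only the transformer; it dominates memory usage.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    args.model,
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(args.model, transformer=transformer, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # text encoders and VAE can still be offloaded to CPU between uses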