Lumina 2 support was just added to ComfyUI: https://comfyanonymous.github.io/ComfyUI_examples/lumina2/
The text model is based on Gemma 2 2B, which makes it pretty good at understanding prompts. Judging by the examples on the project page, you can even give instructions to the model.
Concerning the implementation in krita-ai-diffusion, it should be pretty straightforward. Using 4.0 CFG, the DPM++ 2M sampler, and the simple or beta scheduler with 20 steps gives good results. It works with the nodes already used by the plugin: `CLIPTextEncode`, `SamplerCustomAdvanced`, etc. The workflow example from ComfyUI adds a `ModelSamplingAuraFlow` node with a shift of 3.0, but if you remove this node, that value seems to be used by default. A sketch of the graph is below.
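For reference, here is a minimal sketch of that graph in ComfyUI's API ("prompt") format with the settings above. I'm using the plain `KSampler` node for brevity rather than `SamplerCustomAdvanced`, and assuming the all-in-one checkpoint loads through `CheckpointLoaderSimple`; the checkpoint filename and prompt texts are placeholders, not confirmed details.

```python
import json

# Minimal ComfyUI API-format graph: CFG 4.0, DPM++ 2M, simple scheduler, 20 steps.
# Connections are [source_node_id, output_index].
graph = {
    "1": {"class_type": "CheckpointLoaderSimple",          # outputs: MODEL, CLIP, VAE
          "inputs": {"ckpt_name": "lumina_2.safetensors"}},  # placeholder filename
    "2": {"class_type": "ModelSamplingAuraFlow",           # optional: 3.0 seems to be the default shift
          "inputs": {"model": ["1", 0], "shift": 3.0}},
    "3": {"class_type": "CLIPTextEncode",                  # positive prompt
          "inputs": {"clip": ["1", 1], "text": "a photo of a cat"}},
    "4": {"class_type": "CLIPTextEncode",                  # negative prompt (CFG > 1 makes this usable)
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "5": {"class_type": "EmptyLatentImage",                # dimensions must be multiples of 16
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "6": {"class_type": "KSampler",
          "inputs": {"model": ["2", 0], "positive": ["3", 0], "negative": ["4", 0],
                     "latent_image": ["5", 0], "seed": 0, "steps": 20, "cfg": 4.0,
                     "sampler_name": "dpmpp_2m", "scheduler": "simple", "denoise": 1.0}},
    "7": {"class_type": "VAEDecode",
          "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
    "8": {"class_type": "SaveImage",
          "inputs": {"images": ["7", 0], "filename_prefix": "lumina2"}},
}

# Submit as {"prompt": graph} to a running ComfyUI server's /prompt endpoint.
print(json.dumps({"prompt": graph}, indent=2))
```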
I just had a minor problem when upscaling, because it seems the image size must be a multiple of 16, not 8 as I assumed.
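If anyone hits the same thing, a tiny helper along these lines (my own sketch, not something from the plugin) keeps dimensions on the 16-pixel grid before building the latent:

```python
def align16(x: int) -> int:
    """Round a dimension up to the next multiple of 16 (Lumina 2 seems to
    require 16, where SD 1.5/SDXL only needed multiples of 8)."""
    return ((x + 15) // 16) * 16

print(align16(1000))  # -> 1008
```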
I've tested it a little: the images are less aesthetic than Flux out of the box, but the model plus text encoder is much, much lighter (almost as fast as SDXL on my system), and it can handle CFG and thus negative prompts. Prompt adherence seems on par with the T5-XXL text encoder.
Tried it out a bit yesterday; it looks promising. With all the quantization and optimization Flux has by now, it didn't actually run any faster for me though, and the images don't look as good. Let's hope it gets some investment. On paper it looks a lot like what I think many have been looking for (a straight upgrade to SDXL: good license, around the same size, but a modern architecture/TE/VAE); whether it's easy to train remains to be seen.