Testing unet0 #2
htoyryla started this conversation in Show and tell
-
After several hours of training, the model produces quite nice interpretations of coloured characters, with no signs of overfitting.
-
One of the alternative model architectures supported, called unet0 here, is the original resnet-based Unet from lucidrains' repo. With some modifications I got it working, and have now trained it for perhaps an hour on 128px images of characters in various colours.
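For orientation, here is a minimal sketch of how such a model can be set up and trained with lucidrains' denoising-diffusion-pytorch; this is not the modified unet0 code itself, and the hyperparameters and dataset path are illustrative assumptions.

```python
# Minimal training sketch using lucidrains' denoising-diffusion-pytorch.
# Not the modified unet0 code; hyperparameters and path are assumptions.
from denoising_diffusion_pytorch import Unet, GaussianDiffusion, Trainer

model = Unet(
    dim=64,
    dim_mults=(1, 2, 4, 8),   # channel multipliers per resolution level
)

diffusion = GaussianDiffusion(
    model,
    image_size=128,           # matches the 128px character images
    timesteps=1000,           # number of diffusion steps
)

trainer = Trainer(
    diffusion,
    'path/to/character/images',   # hypothetical dataset folder
    train_batch_size=16,
    train_lr=8e-5,
    train_num_steps=100_000,
)

trainer.train()
```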
What it produces right now, when sampled:
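Plain unconditional samples can be drawn roughly like this (a sketch, reusing the `diffusion` object from the snippet above):

```python
# Unconditional sampling sketch; reuses `diffusion` from the training snippet.
import torchvision

samples = diffusion.sample(batch_size=16)   # (16, 3, 128, 128), values in [0, 1]
torchvision.utils.save_image(samples, 'samples.png', nrow=4)
```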
When used to transform a photo of myself at 512px, CLIP-guided, I get:
Thus, quite minimal training can already be useful for transforming or maybe even generating images, in particular when guided by CLIP or other means (such as style transfer from VGG feature maps).
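For readers unfamiliar with the mechanism, the sketch below shows the usual CLIP-guidance idea: at each reverse-diffusion step, the current image estimate is scored against a text prompt with CLIP, and the image is nudged down the gradient of that loss. This is only an illustration of the general recipe, assuming OpenAI's `clip` package; the prompt, the guidance scale, and the [-1, 1] value range are assumptions, not the exact code used here.

```python
# Sketch of the generic CLIP-guidance recipe; not the exact code used here.
# Assumes OpenAI's `clip` package; prompt and guidance scale are assumptions.
import torch
import torch.nn.functional as F
import clip

device = 'cuda'
clip_model, _ = clip.load('ViT-B/32', device=device, jit=False)
clip_model = clip_model.eval().float()   # fp32 keeps the gradients simple

text = clip.tokenize(['a colourful abstract portrait']).to(device)  # hypothetical prompt
text_features = clip_model.encode_text(text).detach()

# CLIP's input normalization constants
CLIP_MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
CLIP_STD = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

def clip_loss(image01):
    # image01: (B, 3, H, W) in [0, 1]; resize and normalize for CLIP.
    x = F.interpolate(image01, size=(224, 224), mode='bilinear', align_corners=False)
    x = (x - CLIP_MEAN) / CLIP_STD
    image_features = clip_model.encode_image(x)
    sim = torch.cosine_similarity(image_features, text_features, dim=-1)
    return (1.0 - sim).mean()

def clip_guide(x_t, guidance_scale=500.0):
    # One guidance nudge, applied inside each reverse-diffusion step:
    # move x_t down the gradient of the CLIP loss w.r.t. the image.
    x = x_t.detach().requires_grad_(True)
    loss = clip_loss((x + 1) / 2)        # assumes x_t lives in [-1, 1]
    grad = torch.autograd.grad(loss, x)[0]
    return (x_t - guidance_scale * grad).detach()
```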
What if we use a fresh untrained model instead? Let's try.
It works, but the effect is somewhat similar to using CLIP to guide a pixel matrix directly. So it appears that even minimal training in denoising makes a big difference when using CLIP guidance.
For comparison, let's use the minimally trained model, CLIP-guided but without the constraints set by the seed image.
I am actually a bit impressed. The result is not really cohesive, but that is not to be expected given the cutout mechanism used with CLIP. Still, it works, and at the same time reflects the stylistic elements set by the training (dots and series of dots), even if not the dataset itself (characters).
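The cutout mechanism mentioned above is, in the usual recipe, roughly the following: CLIP scores many random crops of the image rather than the whole frame, which rewards local detail but only weakly constrains global layout, matching the lack of cohesion noted here. A sketch, with illustrative crop sizes and counts:

```python
# Sketch of the random-cutout trick commonly used with CLIP guidance.
# Crop sizes and counts are illustrative assumptions.
import torch
import torch.nn.functional as F

def make_cutouts(image, num_cutouts=16, cut_size=224):
    _, _, h, w = image.shape
    cutouts = []
    for _ in range(num_cutouts):
        size = int(torch.empty(1).uniform_(0.5, 1.0).item() * min(h, w))
        y = torch.randint(0, h - size + 1, ()).item()
        x = torch.randint(0, w - size + 1, ()).item()
        crop = image[:, :, y:y + size, x:x + size]
        cutouts.append(F.interpolate(crop, size=(cut_size, cut_size),
                                     mode='bilinear', align_corners=False))
    return torch.cat(cutouts)   # (B * num_cutouts, 3, cut_size, cut_size)
```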