thumbnail-stable-diffusion

I fine-tuned Stable Diffusion on YouTube thumbnails paired with their video titles. I was hoping to create a tool that helps YouTubers brainstorm with diffusion models to create eye-catching thumbnails. The results for kids' channels and cartoonish thumbnails were particularly cool. However, I found that Stable Diffusion had a hard time generating photorealistic thumbnails.
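As a rough illustration of how thumbnail/title pairs could be prepared for fine-tuning, here is a minimal sketch that writes one JSON line per image. It assumes the Hugging Face `datasets` image-folder convention (a `metadata.jsonl` file with a `file_name` field plus a caption column, here called `text`); this README does not say which data format or training script was actually used, and the file names below are hypothetical.

```python
import json


def build_metadata_lines(pairs):
    """Return one JSON line per (thumbnail file name, video title) pair.

    Assumption: the caption column is named "text"; rename it to match
    whatever the training script expects.
    """
    lines = []
    for file_name, title in pairs:
        # Collapse repeated whitespace so captions are consistent.
        title = " ".join(title.split())
        if not title:
            # Skip thumbnails that have no usable title.
            continue
        lines.append(json.dumps({"file_name": file_name, "text": title}))
    return lines


# Hypothetical file names; the titles are taken from the examples in this README.
pairs = [
    ("thumb_0001.jpg", "baby in a space suit"),
    ("thumb_0002.jpg", "  swimming in   the pool "),
    ("thumb_0003.jpg", ""),
]
metadata_lines = build_metadata_lines(pairs)
print("\n".join(metadata_lines))
```

Writing `metadata_lines` to a `metadata.jsonl` file next to the images would give a folder that an image-folder style loader can read as image/caption pairs.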

Here are some thumbnails generated with the fine-tuned model. In every figure, the first row shows images generated by the original Stable Diffusion (not fine-tuned) for the video title. The remaining rows are outputs of the model fine-tuned on the channel's data:

Title: animal figurines from fruits:

drawing

Title: baby in a space suit:

drawing

Title: baby is in a spaceship:

drawing

Title: blowing candles on birthday cake:

drawing

Title: playing with baloons:

drawing

Title: playing with other kids:

drawing

Title: running in a green garden:

drawing

Title: sad cocomelon baby:

drawing

Title: swimming in the pool:

drawing

Title: Birthday at the farm song with cocomelon:

drawing

Another interesting observation was that while Stable Diffusion is being fine-tuned, the loss usually does not decrease quickly, and when it does decrease, the model may be overfitting to the data. Here is a figure that shows how the generated images change over the course of fine-tuning:

Title: Birthday at the farm song with cocomelon:

drawing

The first row is without fine-tuning, and the following rows show later stages of fine-tuning. As seen above, the images from row 3 onward are identical to the training images for the exact same title.
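One way to catch this kind of memorization automatically, rather than by eye, is to compare each generated image against the training set and flag near-duplicates. The sketch below is only an illustration: it assumes images are already resized to the same shape and flattened into lists of pixel values in [0, 1], and it uses plain pixel-space mean squared error with an arbitrary threshold. None of this comes from the original experiment, and a perceptual or embedding-based similarity would likely be more robust.

```python
def mean_squared_error(a, b):
    """Mean squared difference between two equally sized pixel lists."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def nearest_training_image(generated, training_set):
    """Return (index, mse) of the training image closest to `generated`."""
    best_index, best_mse = None, float("inf")
    for index, train_image in enumerate(training_set):
        mse = mean_squared_error(generated, train_image)
        if mse < best_mse:
            best_index, best_mse = index, mse
    return best_index, best_mse


def looks_memorized(generated, training_set, threshold=1e-3):
    """Flag a generated image that is nearly identical to a training image.

    Assumption: `threshold` is a hand-picked value for pixels in [0, 1].
    """
    _, mse = nearest_training_image(generated, training_set)
    return mse <= threshold


# Toy 4-pixel "images" standing in for real thumbnails.
training_set = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.2, 0.4, 0.6, 0.8],
]
near_copy = [0.21, 0.4, 0.6, 0.79]  # almost the same as the third image
novel = [0.5, 0.5, 0.5, 0.5]        # not close to any training image

print(looks_memorized(near_copy, training_set))
print(looks_memorized(novel, training_set))
```

Running a check like this on samples from each saved checkpoint would give a simple signal for when to stop fine-tuning, before outputs start reproducing training thumbnails.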