This project demonstrates the implementation of the Neural Style Transfer (NST) algorithm, which merges the content of one image with the style of another to create a novel artistic image. The algorithm, created by Gatys et al. (2015), utilizes a pre-trained convolutional network (VGG-19) to achieve this transfer. The project covers the complete process, from setting up the environment to generating the final stylized image.
- Packages
- Problem Statement
- Transfer Learning
- Neural Style Transfer
- Solving the Optimization Problem
- Test With Your Own Image
- Results
First, we import the necessary packages and dependencies for Neural Style Transfer. These include libraries for image processing, numerical computations, and deep learning.
Neural Style Transfer (NST) is an optimization technique in deep learning that combines two images, a content image (C) and a style image (S), to create a generated image (G) that has the content of C and the style of S.
We use the VGG-19 model, pre-trained on the ImageNet database, to build the NST algorithm. This model is used to extract features from the content and style images. Transfer learning leverages the pre-trained model's ability to capture intricate features, saving time and computational resources.
To match the content of the generated image G to the content image C, we choose an activation layer in the middle of the VGG network. Shallower layers tend to detect low-level features (like edges), while deeper layers detect higher-level features (like shapes), so a middle layer captures a mix of both.
The content cost function ensures that the content in the generated image G closely matches the content of image C. It does this by minimizing the difference between the feature representations of G and C in the chosen layer.
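As a minimal sketch of the content cost described above, the NumPy function below compares two activation volumes from the chosen layer. The function name `content_cost`, the `(n_H, n_W, n_C)` array layout, and the `1 / (4 * n_H * n_W * n_C)` normalization are assumptions made for illustration; the random arrays stand in for real VGG-19 activations.

```python
import numpy as np

def content_cost(a_C, a_G):
    """Squared-error content cost between two activation volumes.

    a_C, a_G: arrays of shape (n_H, n_W, n_C) holding the chosen layer's
    activations for the content image and the generated image.
    """
    if a_C.shape != a_G.shape:
        raise ValueError("activations must have the same shape")
    n_H, n_W, n_C = a_C.shape
    # Assumed normalization constant; any positive scaling gives the same minimizer.
    return np.sum((a_C - a_G) ** 2) / (4.0 * n_H * n_W * n_C)

# Random stand-ins for layer activations (not real VGG-19 outputs).
rng = np.random.default_rng(0)
a_C = rng.normal(size=(4, 4, 3))
a_G = rng.normal(size=(4, 4, 3))

print(content_cost(a_C, a_C))  # identical activations give zero cost
print(content_cost(a_C, a_G))  # differing activations give a positive cost
```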
The style of an image is represented by the Gram matrix, which captures the correlations between different filter responses. This matrix helps in understanding the texture and patterns of the style image.
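To make the Gram matrix concrete, here is a small NumPy sketch. It assumes the activations of one layer are stored as an array of shape `(n_H, n_W, n_C)`; the function name `gram_matrix` is illustrative. Entry `(i, j)` of the result is the dot product between the flattened responses of filters `i` and `j`.

```python
import numpy as np

def gram_matrix(a):
    """Gram matrix of an activation volume of shape (n_H, n_W, n_C).

    Returns an (n_C, n_C) matrix of dot products between filter responses.
    """
    n_H, n_W, n_C = a.shape
    # One row per spatial position, one column per filter.
    A = a.reshape(n_H * n_W, n_C)
    return A.T @ A

# Small deterministic example in place of real layer activations.
a = np.arange(12, dtype=float).reshape(2, 2, 3)
G = gram_matrix(a)
print(G.shape)  # (3, 3)
```

Because the spatial positions are summed over, the Gram matrix keeps information about which filters fire together but not where in the image they fire.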
The style cost function minimizes the difference between the Gram matrix of the style image S and the Gram matrix of the generated image G. This ensures that G captures the stylistic patterns of S.
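The style cost for a single layer can be sketched as follows. The names `gram_matrix` and `layer_style_cost`, the `(n_H, n_W, n_C)` layout, and the normalization `1 / (4 * n_C^2 * (n_H * n_W)^2)` are assumptions for this illustration. The example also shuffles the spatial positions of the style activations to show that the cost ignores where features occur.

```python
import numpy as np

def gram_matrix(a):
    """Gram matrix (n_C x n_C) of an activation volume (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a.shape
    A = a.reshape(n_H * n_W, n_C)
    return A.T @ A

def layer_style_cost(a_S, a_G):
    """Squared difference between the Gram matrices of two activation volumes."""
    n_H, n_W, n_C = a_S.shape
    diff = gram_matrix(a_S) - gram_matrix(a_G)
    # Assumed normalization by the number of filters and spatial positions.
    return np.sum(diff ** 2) / (4.0 * n_C**2 * (n_H * n_W) ** 2)

# Random stand-ins for layer activations (not real VGG-19 outputs).
rng = np.random.default_rng(0)
a_S = rng.normal(size=(4, 4, 3))
a_G = rng.normal(size=(4, 4, 3))

# Shuffle the spatial positions of a_S: the layout changes, the Gram matrix does not.
flat = a_S.reshape(-1, a_S.shape[-1])
shuffled = flat[rng.permutation(flat.shape[0])].reshape(a_S.shape)

print(layer_style_cost(a_S, shuffled))  # close to zero up to rounding
print(layer_style_cost(a_S, a_G))       # positive for unrelated activations
```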
The total cost function is a weighted sum of the content and style costs, with weights alpha and beta, respectively. The combined cost guides the optimization process to generate an image that balances both content and style.
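In code, the weighted combination is a one-liner. The function name `total_cost` and the default values of `alpha` and `beta` below are illustrative assumptions, not values taken from this project.

```python
def total_cost(J_content, J_style, alpha=10.0, beta=40.0):
    """Weighted sum of the content and style costs.

    alpha and beta are hyperparameters; the defaults here are placeholders.
    """
    return alpha * J_content + beta * J_style

# Raising beta relative to alpha puts more emphasis on matching the style.
print(total_cost(1.0, 2.0))                        # 90.0
print(total_cost(1.0, 2.0, alpha=1.0, beta=0.0))   # 1.0
```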
We start by loading the content image, which is the base image whose content we want to preserve in the generated image.
Next, we load the style image, which provides the artistic style that we want to apply to the content image.
The generated image G is initialized randomly. During the optimization process, this image will be adjusted to minimize the total cost function.
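A random initialization might look like the sketch below. The helper name `init_generated_image`, the batch shape `(1, 400, 400, 3)`, and the pixel range `[0, 1]` are assumptions for illustration; the actual notebook may use a different size or scaling.

```python
import numpy as np

def init_generated_image(shape, seed=None):
    """Return an image-shaped array of uniform random pixels in [0, 1]."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=shape).astype(np.float32)

# Assumed layout: (batch, height, width, channels).
G = init_generated_image((1, 400, 400, 3), seed=0)
print(G.shape, G.dtype)
```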
The pre-trained VGG-19 model is loaded. This model will be used to compute the feature representations needed for the content and style costs.
The content cost is computed based on the difference in feature representations between the content image C and the generated image G in the chosen layer.
The style cost is computed by comparing the Gram matrices of the style image S and the generated image G across several layers.
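One way to combine several layers is a weighted sum of per-layer style costs, sketched below. The layer names, activation shapes, and equal weights are hypothetical placeholders, and the helper functions are illustrative rather than this project's actual code.

```python
import numpy as np

def gram_matrix(a):
    """Gram matrix (n_C x n_C) of an activation volume (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a.shape
    A = a.reshape(n_H * n_W, n_C)
    return A.T @ A

def layer_style_cost(a_S, a_G):
    """Normalized squared difference between two Gram matrices."""
    n_H, n_W, n_C = a_S.shape
    diff = gram_matrix(a_S) - gram_matrix(a_G)
    return np.sum(diff ** 2) / (4.0 * n_C**2 * (n_H * n_W) ** 2)

def style_cost(style_acts, generated_acts, layer_weights):
    """Weighted sum of per-layer style costs; all arguments are dicts keyed by layer name."""
    return sum(
        weight * layer_style_cost(style_acts[name], generated_acts[name])
        for name, weight in layer_weights.items()
    )

# Hypothetical layer names and shapes, with equal weights, for illustration only.
shapes = {"layer_a": (8, 8, 4), "layer_b": (4, 4, 8), "layer_c": (2, 2, 16)}
layer_weights = {name: 1.0 / len(shapes) for name in shapes}

# Random stand-ins for the activations of S and G at each layer.
rng = np.random.default_rng(0)
style_acts = {name: rng.normal(size=shape) for name, shape in shapes.items()}
generated_acts = {name: rng.normal(size=shape) for name, shape in shapes.items()}

print(style_cost(style_acts, style_acts, layer_weights))      # same activations give zero
print(style_cost(style_acts, generated_acts, layer_weights))  # positive otherwise
```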
The optimization process involves iteratively adjusting the pixels of the generated image G to minimize the total cost function. This is done using a gradient descent algorithm, where the gradients of the cost function with respect to the image pixels are used to update G.
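The update loop can be illustrated with a toy NumPy example. To keep it self-contained, it makes a strong simplifying assumption: the "layer activations" are the pixels themselves, so the gradient can be written by hand. In a real run the activations come from VGG-19 and the gradient is obtained by automatic differentiation. The function name, image size, learning rate, and weights are all illustrative.

```python
import numpy as np

def gram(A):
    # A has one row per pixel position and one column per channel.
    return A.T @ A

def total_cost_and_grad(G, C, S, alpha, beta):
    """Total cost and its gradient with respect to the pixels of G.

    Toy assumption: activations are the pixels, flattened to (n_H * n_W, n_C).
    """
    m, n_C = G.shape
    diff = G - C
    D = gram(G) - gram(S)
    J_content = np.sum(diff ** 2) / (4.0 * m * n_C)
    J_style = np.sum(D ** 2) / (4.0 * n_C**2 * m**2)
    # Hand-derived gradients of the two terms with respect to G.
    grad_content = diff / (2.0 * m * n_C)
    grad_style = (G @ D) / (n_C**2 * m**2)
    J = alpha * J_content + beta * J_style
    grad = alpha * grad_content + beta * grad_style
    return J, grad

rng = np.random.default_rng(0)
m, n_C = 8 * 8, 3                    # a tiny 8x8 RGB "image", flattened
C = rng.uniform(size=(m, n_C))       # stand-in content image
S = rng.uniform(size=(m, n_C))       # stand-in style image
G = rng.uniform(size=(m, n_C))       # randomly initialized generated image

learning_rate = 0.5                  # illustrative step size
costs = []
for step in range(200):
    J, grad = total_cost_and_grad(G, C, S, alpha=10.0, beta=40.0)
    costs.append(J)
    G = G - learning_rate * grad     # plain gradient descent on the pixels

print(f"cost: {costs[0]:.4f} -> {costs[-1]:.4f}")
```

Plain gradient descent is used here to match the description above; in practice an adaptive optimizer is a common substitute.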
You can test the NST algorithm with your own content and style images by replacing the default images in the code. This allows for customization and exploration of different artistic effects.
To use the neural style transfer implementation, follow these steps:
- Place your content image and style image in the `images/` directory.
- Update the paths to these images in the notebook.
- Run the notebook cells sequentially to generate the styled image.
- The generated image will be saved in the `output/` directory.
Here are some examples of generated images using different content and style combinations:
- The beautiful ruins of the ancient city of Persepolis (Iran) with the style of Van Gogh (The Starry Night).
- The tomb of Cyrus the Great in Pasargadae with the style of a Ceramic Kashi from Ispahan.
- A scientific study of a turbulent fluid with the style of an abstract blue fluid painting.