Convolutional Autoencoders and image denoising.
In this post, I will extend the general idea of autoencoders to develop a convolutional autoencoder. This post is a continuation of this article, in which I explained the main idea behind autoencoders and their architecture. I also showed how to build this type of neural network architecture using TensorFlow and Keras.
Naturally, we can use different types of layers when designing an autoencoder. Since we are working with the Fashion-MNIST dataset, it is convenient to use convolutional neural networks to extract a better representation from the images. The convolutional architecture is very straightforward; let's see what it looks like.
In this case, the layers are convolutional. The encoder is in charge of reducing the dimensions of the input image as well as extracting the features that form the latent representation of the input. We reduce the image dimensions with MaxPooling operations, so at each layer the height and width become smaller. On the other hand, there is a decoder, which is in charge of creating an output that resembles the input from the latent representation.
The decoder uses a different type of layer to increase the dimensions of the latent representation; this operation is the opposite of pooling. We can do this with an upsampling layer, which increases the dimensions of the data by repeating the rows and columns of a matrix. These layers do not perform any learning, so each upsampling layer must be followed by a convolutional layer that learns the details behind this repeated representation. The small snippet below illustrates what upsampling does to a feature map.
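As a quick sketch of this idea, here is a tiny example (the values and sizes are just for illustration) of Keras' UpSampling2D repeating rows and columns of a 2x2 feature map:

```python
import tensorflow as tf

# A 2x2 feature map upsampled to 4x4: each value is simply repeated
# along the rows and columns; no parameters are learned.
x = tf.constant([[1., 2.],
                 [3., 4.]])
x = tf.reshape(x, (1, 2, 2, 1))                  # (batch, height, width, channels)
y = tf.keras.layers.UpSampling2D(size=(2, 2))(x)
print(tf.squeeze(y))
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]
```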
A simple autoencoder.
Let’s build an autoencoder with basic functionality, that is, reconstructing the input from the latent representation. First, let’s see how the encoder is built.
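A minimal sketch of the encoder is shown below; the function name, filter counts, and kernel sizes are my own choices, not necessarily the exact ones from the original notebook:

```python
import tensorflow as tf

def encoder(inputs):
    # Two blocks of Conv2D + MaxPooling2D to extract features and
    # progressively shrink the spatial dimensions of the image.
    conv_1 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
    max_pool_1 = tf.keras.layers.MaxPooling2D((2, 2))(conv_1)           # 28x28 -> 14x14
    conv_2 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same')(max_pool_1)
    max_pool_2 = tf.keras.layers.MaxPooling2D((2, 2))(conv_2)           # 14x14 -> 7x7
    return max_pool_2
```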
The function shown above is quite simple, and just creates the encoder side of our autoencoder by interleaving convolutional layers with max-pooling layers. The next part is the bottleneck.
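Here is one possible way to write it (a single convolutional layer; the 256-filter choice is an assumption on my part):

```python
def bottleneck(inputs):
    # A single convolutional layer holding the latent representation
    # that the decoder will reconstruct the image from.
    bottleneck_layer = tf.keras.layers.Conv2D(256, (3, 3), activation='relu', padding='same')(inputs)
    return bottleneck_layer
```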
The bottleneck is really straightforward: it is just a layer that connects the encoder to the decoder. Let’s see how we can build the latter.
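A sketch of the decoder, mirroring the encoder above (again, the filter counts and activation of the last layer are my assumptions):

```python
def decoder(inputs):
    # Conv2D + UpSampling2D blocks that gradually restore the spatial dimensions.
    conv_1 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same')(inputs)
    up_1 = tf.keras.layers.UpSampling2D((2, 2))(conv_1)                 # 7x7 -> 14x14
    conv_2 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(up_1)
    up_2 = tf.keras.layers.UpSampling2D((2, 2))(conv_2)                 # 14x14 -> 28x28
    # Last layer with a single filter so the output matches the (28, 28, 1) input shape.
    output = tf.keras.layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up_2)
    return output
```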
The decoder uses upsampling layers to gradually increase the dimensions of the data. Finally, we have a last convolutional layer with a single filter, so the output has a shape of (28, 28, 1), the same as the input data. The code to build the entire autoencoder is shown below.
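This sketch wires the three pieces together with the Keras functional API; the loss and optimizer are common choices for this setup, not necessarily the ones used originally:

```python
def convolutional_autoencoder():
    # Fashion-MNIST images are 28x28 grayscale, hence the (28, 28, 1) input shape.
    inputs = tf.keras.layers.Input(shape=(28, 28, 1))
    encoded = encoder(inputs)
    latent = bottleneck(encoded)
    decoded = decoder(latent)
    return tf.keras.Model(inputs=inputs, outputs=decoded)

autoencoder = convolutional_autoencoder()
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.summary()
```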
The process is very simple. As shown in the first post, to train the model we just have to use the input data as the target data, so the model will learn how to reconstruct the images from the latent representation. The code to implement the training process is shown below.
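A minimal training sketch, reusing the `autoencoder` model defined above (the number of epochs and batch size are arbitrary choices for illustration):

```python
# Load Fashion-MNIST and normalize pixel values to [0, 1].
(x_train, _), (x_test, _) = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
x_train = x_train[..., tf.newaxis]   # (60000, 28, 28, 1)
x_test = x_test[..., tf.newaxis]     # (10000, 28, 28, 1)

# The input images are also the targets, so the model learns to reconstruct them.
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=128,
                validation_data=(x_test, x_test))
```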
After the training process, we observed the following results.
The images at the top are the original images, while the ones below are the images reconstructed from the latent representation of the autoencoder. Not bad for such a simple architecture.
Image denoising.
As mentioned, the autoencoder learns how to reconstruct the input from the latent representation, which is a reduced representation of the input data. We achieve this by using the input itself as the target value, so the model builds at the output layer an image that resembles the input. During training, the model minimizes the loss function by matching the output with the target image.
We can use this fact to train a model to denoise images by using a clean version of the noisy input as the target image. We can use the same model as before; we just have to add noise to the images and use them as inputs. Instead of using the same images as the targets, we use the original clean images, so the model minimizes the loss function by mapping the noisy images to the originals, and by doing so it learns to denoise them. A sketch of this training setup is shown below.
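One way to set this up, reusing the model and data from the previous sketches (the `noise_factor` value is an assumption and can be tuned):

```python
import numpy as np

# Add Gaussian noise to the images and clip to keep pixel values in [0, 1].
noise_factor = 0.5  # assumed value, adjust as needed
x_train_noisy = np.clip(x_train + noise_factor * np.random.normal(size=x_train.shape), 0.0, 1.0)
x_test_noisy = np.clip(x_test + noise_factor * np.random.normal(size=x_test.shape), 0.0, 1.0)

# Noisy images as inputs, clean originals as targets.
denoising_autoencoder = convolutional_autoencoder()
denoising_autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
denoising_autoencoder.fit(x_train_noisy, x_train,
                          epochs=10,
                          batch_size=128,
                          validation_data=(x_test_noisy, x_test))
```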
After the training process, we can see that our autoencoder does an excellent job of denoising images. It reduces the noise while keeping some of the details of the images. The dataset is really simple, but it is enough to show how autoencoders can be used to denoise images.
These articles are related to some courses I am taking on Coursera. The course is Generative Deep Learning with TensorFlow, which is part of the specialization TensorFlow: Advanced Techniques. If you want to keep in contact with me and see more of this type of content, I invite you to follow me on Medium and check my profile on LinkedIn. You can also subscribe to my blog to receive updates every time I create new content.