A friendly introduction to autoencoders.

Manuel Gil
5 min read · Mar 21, 2022


In this article I am going to delve into the possible applications and architectures of autoencoders, and how to implement them using TensorFlow and Keras.

Autoencoders… The definition.

Autoencoders play a fundamental role in unsupervised learning and in deep learning architectures for transfer learning and other tasks. Broadly, they are neural networks capable of learning dense representations of input data without supervision. Autoencoders can be used for dimensionality reduction, content generation, or even image denoising.

Autoencoder Architecture.

The main characteristic of the autoencoder architecture is that the output is designed to resemble the input data. Below you can see a simple autoencoder architecture.

In the figure shown above, we can see the neural network architecture of a basic autoencoder. During training, the autoencoder has to learn how to reproduce the input data; for this reason, the output dimension must be the same as the input dimension. The middle part of the network is called the bottleneck, where the encoder is forced to learn a compressed representation of the input data. Finally, the decoder tries to reconstruct the input data from the compressed representation obtained at the bottleneck.

If an autoencoder simply learned to copy the input data perfectly, it would not be especially useful. Instead, autoencoders are designed so that they cannot copy perfectly. Usually, they are restricted in ways that allow them to copy only approximately, and to copy only inputs that resemble the training data, for example, by using a bottleneck. This forces the model to prioritize which aspects of the input should be copied.

Dimensionality Reduction

Autoencoders can help us to reduce the dimensionality of data, and perhaps help with the visualization of important aspects of it. Let’s see a simple example of this. We have the following data that describes a three-dimensional pattern.

Three dimensional data.
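The article shows only a plot of this data, not the code that produces it. As a stand-in so you can follow along, here is a hypothetical synthetic three-dimensional pattern; the variable name X_train and the exact shape of the curve are my own assumptions, not the article's original data.

```python
import numpy as np

# Hypothetical stand-in data: a noisy three-dimensional curved pattern
np.random.seed(42)
n = 1000
angles = np.random.rand(n) * 3 * np.pi / 2 - 0.5
X_train = np.empty((n, 3))
X_train[:, 0] = np.cos(angles) + np.random.randn(n) * 0.1
X_train[:, 1] = np.sin(angles) * 0.7 + np.random.randn(n) * 0.1
X_train[:, 2] = X_train[:, 0] * 0.1 + X_train[:, 1] * 0.3 + np.random.randn(n) * 0.1
```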

Let’s use an autoencoder to reduce the dimensionality of this data. Using the TensorFlow functional API, we can create both the encoder and the decoder as follows.
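The original code block is not reproduced here; the following is a minimal sketch of what the encoder and decoder could look like, assuming a two-dimensional bottleneck as described later in the article.

```python
import tensorflow as tf
from tensorflow import keras

# Encoder: compresses the 3-D input down to a 2-D bottleneck (functional API)
encoder_input = keras.Input(shape=(3,))
codings = keras.layers.Dense(2)(encoder_input)
encoder = keras.Model(inputs=encoder_input, outputs=codings, name="encoder")

# Decoder: tries to reconstruct the 3-D data from the 2-D codings (functional API)
decoder_input = keras.Input(shape=(2,))
outputs = keras.layers.Dense(3)(decoder_input)
decoder = keras.Model(inputs=decoder_input, outputs=outputs, name="decoder")

# Stack encoder and decoder into a single autoencoder model
autoencoder = keras.Sequential([encoder, decoder])
```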

The encoder receives the three-dimensional input data and compresses it into two dimensions; this is our bottleneck. The decoder takes the compressed representation of the data and tries to reproduce the input data from it. Finally, we create the autoencoder using the Sequential model, simply stacking the encoder and decoder into one single model.

Since we want to copy the input data, we will use the mean squared error as the loss function.
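A minimal compile-and-train step could look like the following; the optimizer and number of epochs are assumptions on my part, and X_train refers to the three-dimensional samples from the earlier sketch.

```python
# The targets are the inputs themselves: the autoencoder is trained to
# reproduce its own input, measured with mean squared error.
autoencoder.compile(loss="mse", optimizer="adam")
history = autoencoder.fit(X_train, X_train, epochs=200, verbose=0)
```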

In this case, the labels are the input data themselves, which is why autoencoders are considered an unsupervised learning algorithm. Let’s see how the compressed representation of the input data looks after training. Using the encoder part of the network, we obtain the reduced representation.
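In code, something like this could produce the reduced representation; the plotting calls are my own addition.

```python
import matplotlib.pyplot as plt

# Project the 3-D data onto the 2-D bottleneck using only the encoder
codings = encoder.predict(X_train)

plt.scatter(codings[:, 0], codings[:, 1], s=4)
plt.xlabel("code dimension 1")
plt.ylabel("code dimension 2")
plt.show()
```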

The compressed representation is able to show the main characteristics of the input data. In this scenario, we can see how the behavior of the input data is reflected in a two-dimensional plane. Now, what happens if we want to reconstruct the input data from this compressed representation? To do this, let’s use the decoder to obtain the reconstructed data.
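Using the variable names from the earlier sketches, this is a single call to the decoder:

```python
# Map the 2-D codings back to three dimensions with the decoder
reconstructions = decoder.predict(codings)
```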

Reconstructed data.

In the image shown above, we can see that our decoder is able to turn the two-dimensional codings back into a three-dimensional representation. As might be expected, the reconstructed data is not perfect: because of the bottleneck structure, some information is lost.

Using the MNIST dataset

Let’s use more complex data and implement an autoencoder for the MNIST dataset. The code to create the autoencoder is shown below.
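The article’s original code is not reproduced here; the following is a minimal sketch of a simple fully connected autoencoder for MNIST. The 32-unit bottleneck and the specific layers are assumptions on my part.

```python
import tensorflow as tf
from tensorflow import keras

# Load MNIST and scale pixel values to [0, 1]
(X_train, _), (X_test, _) = keras.datasets.mnist.load_data()
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0

# A simple fully connected autoencoder with a 32-unit bottleneck
mnist_encoder = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(32, activation="relu"),
])
mnist_decoder = keras.Sequential([
    keras.layers.Dense(28 * 28, activation="sigmoid", input_shape=(32,)),
    keras.layers.Reshape((28, 28)),
])
mnist_autoencoder = keras.Sequential([mnist_encoder, mnist_decoder])
```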

The architecture is really simple. In supervised problems we have labels that we want the model output to match; in this scenario, the labels are the pixels of each image. For this reason, binary cross-entropy is a good candidate for the loss function, since the output values lie between 0 and 1: 1 represents white, 0 represents black, and everything in between is a shade of gray.
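Compiling and training the sketch above for the 10 epochs mentioned below could look like this; the optimizer choice is an assumption.

```python
# Pixel values in [0, 1] are the targets, so binary cross-entropy
# compares each reconstructed pixel against the original one.
mnist_autoencoder.compile(loss="binary_crossentropy", optimizer="adam")
mnist_autoencoder.fit(X_train, X_train, epochs=10,
                      validation_data=(X_test, X_test))
```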

Autoencoder with MNIST dataset

We can see how good the model is at reconstructing the data: with just 10 epochs the model learned how to reproduce the input. However, this is a very simple dataset; for more complex data the results would probably not be as good.

Deep Autoencoders.

The Fashion MNIST dataset contains images of clothing; let’s train an autoencoder with it. The code to create the autoencoder is shown below.
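Again, the original code is not reproduced here; the following is a sketch of a deep (stacked) autoencoder for Fashion MNIST. The layer sizes, the 30-unit bottleneck, and the number of epochs are assumptions on my part.

```python
import tensorflow as tf
from tensorflow import keras

# Load Fashion MNIST and scale pixel values to [0, 1]
(X_train, _), (X_test, _) = keras.datasets.fashion_mnist.load_data()
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0

# Deep (stacked) encoder: several Dense layers down to a 30-unit bottleneck
deep_encoder = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(30, activation="relu"),
])
# Deep decoder: mirrors the encoder back up to 28x28 images
deep_decoder = keras.Sequential([
    keras.layers.Dense(100, activation="relu", input_shape=(30,)),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape((28, 28)),
])
deep_autoencoder = keras.Sequential([deep_encoder, deep_decoder])

deep_autoencoder.compile(loss="binary_crossentropy", optimizer="adam")
deep_autoencoder.fit(X_train, X_train, epochs=10,
                     validation_data=(X_test, X_test))
```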

The architecture results in a deep autoencoder, since we are using more layers in both the encoder and the decoder. After training this model, we obtain the following results.

The reconstructed images resemble the original data; however, there is a noticeable loss of information in the finer details of the images. We can improve on this model using convolutional autoencoders, which preserve spatial detail and reduce the loss of information.

I will explore other autoencoder structures and applications and write about them, so if you want to keep in contact with me and see more of this type of content, I invite you to follow me on Medium and check my profile on LinkedIn. You can also subscribe to my blog and receive updates every time I create new content.



Written by Manuel Gil

I am an engineer and I am passionate about computer science. I have two years of experience working as a data scientist, using data to solve industrial problems.
