Understanding Transfer Learning and Fine-Tuning

Abstract

The two main topics of our project were transfer learning and fine-tuning. Our aim was to understand what they mean and how to use them.

Transfer learning is a research problem in machine learning that focuses on storing the knowledge gained by a model trained on one problem and applying it to a new model on a different problem.

Fine-tuning means taking the weights of a pre-trained model and using them to initialize a new model, which is then trained on a new problem (the new problem must still be of the same type). We use it to compensate for a small dataset and to reduce the time the model needs to train.

Introduction

Humans learn not only on their own but also by sharing and transferring knowledge, and the same idea applies in machine learning. The goal is to transfer the knowledge of an already trained model to another one. Consider the following example:

Suppose we first trained a classification model to correctly label images of dogs and cats. When faced with a new problem, such as labelling images of horses and chickens, we do not need to build a new model from scratch: we can reuse the first model with a few additions to handle the new problem.

In our case, we had to classify the CIFAR-10 dataset, which contains 10 classes of images (airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck), using transfer learning.
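To make the data side concrete, here is a minimal sketch of how CIFAR-10 images could be prepared for a pre-trained base. It is an assumption on our part, not the exact code of the project: the helper name prepare_images and the 75x75 target size (the smallest input InceptionResNetV2 accepts in Keras) are hypothetical choices. CIFAR-10 images are 32x32, so they have to be enlarged first.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Class names in the order of the integer labels (0 to 9) used by Keras.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]


def prepare_images(images, size=(75, 75)):
    """Resize uint8 images and scale the pixels to the [-1, 1] range.

    `size` is an assumed value: 75x75 is the minimum input size
    accepted by InceptionResNetV2.
    """
    # Bilinear resize; the result is a float32 tensor.
    resized = tf.image.resize(images, size).numpy()
    # InceptionResNetV2 expects pixel values scaled to [-1, 1].
    return keras.applications.inception_resnet_v2.preprocess_input(resized)
```

The dataset itself can be downloaded with keras.datasets.cifar10.load_data(), which returns the training and test splits as NumPy arrays. Resizing all 50,000 training images at once takes several gigabytes of memory, so in practice the helper would be applied batch by batch.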

Materials and Methods

Since we had to work on the CIFAR-10 dataset, we relied on documentation to understand transfer learning and fine-tuning. We chose the InceptionResNetV2 application from Keras and added the following layers on top of it:

  • Flatten layer
  • Dense layer (using the ReLU activation function)
  • Dropout
  • Dense layer (using the ReLU activation function)
  • Dropout
  • Output layer (using the softmax activation function)
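The layer stack above could be sketched in Keras as follows. This is a hedged illustration rather than our exact model: the function name build_model, the layer widths (256 and 128), the dropout rate (0.3) and the 75x75 input size are all assumed values that are not given in this article.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(input_shape=(75, 75, 3), num_classes=10, weights="imagenet"):
    """Put the listed layers on top of a frozen InceptionResNetV2 base."""
    base = keras.applications.InceptionResNetV2(
        include_top=False,   # drop the original ImageNet classifier
        weights=weights,     # pre-trained weights (None gives random init)
        input_shape=input_shape,
    )
    base.trainable = False   # freeze the pre-trained weights

    inputs = keras.Input(shape=input_shape)
    # training=False keeps the BatchNormalization layers in inference mode.
    x = base(inputs, training=False)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)   # assumed width
    x = layers.Dropout(0.3)(x)                    # assumed rate
    x = layers.Dense(128, activation="relu")(x)   # assumed width
    x = layers.Dropout(0.3)(x)                    # assumed rate
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs), base
```

Returning the base alongside the full model makes it easy to unfreeze it later. With the base frozen, only the three Dense layers have trainable weights.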

We also used early stopping and chose the Adam optimizer for gradient descent. Training ran for 4 epochs with the base model (InceptionResNetV2) frozen, followed by one final epoch with the base model unfrozen and a low learning rate (1e-5).
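The two training phases described above can be sketched like this. It is a minimal sketch under stated assumptions: the function name train_two_phases, the early stopping settings (monitoring val_loss with a patience of 2), the batch size and the sparse categorical cross-entropy loss (which expects integer labels) are hypothetical; only the 4 + 1 epochs, the Adam optimizer and the 1e-5 learning rate come from our setup.

```python
from tensorflow import keras


def _early_stopping():
    # Assumed settings: stop when the validation loss stops improving.
    return keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=2, restore_best_weights=True
    )


def train_two_phases(model, base, x_train, y_train, x_val, y_val,
                     frozen_epochs=4, unfrozen_epochs=1,
                     fine_tune_lr=1e-5, batch_size=32):
    """Train the new layers first, then fine-tune the whole model."""
    # Phase 1: the base is frozen, only the layers on top are trained.
    base.trainable = False
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    frozen_history = model.fit(
        x_train, y_train, validation_data=(x_val, y_val),
        epochs=frozen_epochs, batch_size=batch_size,
        callbacks=[_early_stopping()],
    )

    # Phase 2: unfreeze the base and recompile so the change takes effect.
    # The low learning rate avoids destroying the pre-trained weights.
    base.trainable = True
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=fine_tune_lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    fine_tune_history = model.fit(
        x_train, y_train, validation_data=(x_val, y_val),
        epochs=unfrozen_epochs, batch_size=batch_size,
        callbacks=[_early_stopping()],
    )
    return frozen_history, fine_tune_history
```

Recompiling after changing the trainable flag matters: Keras only picks up the new set of trainable weights when compile is called again.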

Results

With the InceptionResNetV2 base frozen, the 4 epochs brought the accuracy up to 84 %, with a loss value of 0.9. In the final epoch, with the base model unfrozen, the loss value decreased to 0.2 and the accuracy reached 94 %.

Discussion

Our results show that training the layers we put on top benefits greatly from the pre-trained application, which makes us more efficient because we do not have to build a complete model from scratch. To obtain good accuracy we had to tune the hyperparameters of our model, such as the batch size, the number of epochs and the learning rate, without compromising the training duration.

The goal was to increase the accuracy without taking too much time, and we obtained an interesting result even though InceptionResNetV2 is not the fastest application we could have used.

Acknowledgments

Thanks to Adrien Millot, with whom I’ve worked on this project. We worked together on how to use the chosen application and how fine-tuning works.
