Project Overview:
Wherever we are, whether in an urban environment or the country, we usually have lots of dog friends around in our life. But how much do you know about them? What kind of breed are they? (e.g. Can you tell the difference between the Field Spaniel (left), Boykin Spaniel (right) below? ) Different dog breeds are probably hard to identify for many people. In this article, we will apply deep learning skills and let computer vision help us with this question.
Specifically, I’ll demonstrate how to apply keras and tensor flow to build a Convolutional Neural Network (CNN) from scratch. In addition to training and testing the network, I’ll also explain the concept of transfer learning. A successful model will have at least 80% accuracy on the test set. Now let’s get started!
Step 1: Import datasets
For the convenience of processing, I’ll load all files into memory.
Then I’ll normalize the data by dividing every pixel by 255 and format the output to a vector processable by keras, which is also known as a tensor.
Step 2: Let’s create a CNN from scratch
Here is a simple CNN I created using keras with a tensor flow backend. It’s an 8-layer sequential neural network utilizing 3 convolutional layers with pooling layers after each. The final later is a fully connected layer with 133 nodes, each represents a breed I’m trying to predict. The dense layer I use is a softmax activation function because it will give an output range from 0 to 1, and also it makes the sum of all nodes in the output to be 1.
The final step is to compile the model for training.
Step 3: Apply the CNN to classify dog breeds
Before I train, validate and test a model, I will create a ModelCheckpoint object, which will be a container storing all the model weights for the use later.
This model is only used for demonstration, you can see the training accuracy is pretty low.
Step 4: Using transfer learning
How can we improve the performance of the model in a quick and dirty way? Let’s take advantage of transfer learning. We can leverage an existing CNN which has been pretrained to recognize features from general data sources, and add extra layers to make it easily adapt to our use case.
To transfer learning, I will need to freeze the weights and kernels of the convolution layers, and only train the layers I added. This will be the most efficient way. Udacity has already fed all their supplied training images through a few built-in CNNs, and provided the output bottleneck features, I can simply select one and apply directly to the model.
As the code shows below, I designed a fully connected network to take the bottleneck features and output 133 nodes for each breed. The final layer is also a softmax activation function.
Step 5: Write and test our algorithm
The accuracy is 80.3% on the test set, which means our model is eighty percent correct when making a prediction!
Step 6: Refinement and next steps
Although the transferred learning model is working great, there is still something I could keep working on to refine the recognition, for example, augmentation. For the next steps of this study in the future, I could apply random rotations and zooms to the training images to artificially increase the training data size but maintaining the data integrity. This could improve both the model accuracy and also help prevent overfitting.
Final Output
The final output of this project will be a function that could tell the dog breed from an input image. The image could be either human or dog, I will apply two detectors accordingly.
Here are the test images I selected. Feel free to select yours.
If you have been following me in this blog post, how about trying to apply an image of yourself? You will see which breed of dog that looks most similar to you!