CNN Binary Classifier
Task: Classify the species (cat or dog) from a given image.
Data: 10,000 images of cats and dogs, separated into
- 4000 training images of dogs under dataset/training_set/dogs
- 4000 training images of cats under dataset/training_set/cats
- 1000 test images of dogs under dataset/test_set/dogs
- 1000 test images of cats under dataset/test_set/cats
We start by importing the class for the network model itself, Sequential, and the already familiar fully connected layer Dense. The additional layer classes Conv2D (convolution layer), MaxPooling2D (max pooling layer) and Flatten (flattening layer) are required to build our CNN.
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
Similar to the MLP, we start creating the network by instantiating a Sequential object, which represents the network.
# Initialising the CNN
classifier = Sequential()
Next we add a convolutional layer with 32 filters (feature detectors) of size 3x3.
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
Instead of input_dim, a convolutional layer requires an input shape, representing the image dimensions. In this example we want the first layer to take 64x64 pixel images as input. The third parameter describes the number of color channels, which is 3 in most cases (red, green, blue). The images in our data differ in size, so we will rescale them to 64x64 pixels later.
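As a side note, the spatial size of the resulting feature maps can be computed by hand. The following is a small sketch assuming Keras' defaults for Conv2D, 'valid' padding and stride 1:

```python
# Output width/height of a convolution with 'valid' padding (Keras default):
# out = (in_size - kernel_size) // stride + 1
def conv_output_size(in_size, kernel_size, stride=1):
    return (in_size - kernel_size) // stride + 1

# A 3x3 kernel over a 64x64 input leaves a 62x62 feature map;
# with 32 filters the layer's output shape is (62, 62, 32).
print(conv_output_size(64, 3))  # 62
```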
It is common practice (but not necessary) to add a max pooling layer afterwards.
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
Most of the time the pool size is set to 2x2, which also sets the strides to 2x2 by default.
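To illustrate what the pooling layer does, here is 2x2 max pooling with stride 2 written out in plain Python (a sketch of the operation, not how Keras implements it internally):

```python
def max_pool_2x2(grid):
    """2x2 max pooling with stride 2: keep the maximum of each 2x2 block."""
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]) - 1, 2)]
        for r in range(0, len(grid) - 1, 2)
    ]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
print(max_pool_2x2(feature_map))  # [[4, 2], [2, 8]]
```

Each 4x4 feature map shrinks to 2x2 while the strongest activation of each block survives, which makes the representation smaller and somewhat translation invariant.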
To improve the performance of the network we add an additional pair of convolution and max pooling layers with identical parameters.
# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
The previous convolution and pooling steps are performed to extract features, while the classification itself is performed by a traditional MLP appended at the end of the network. To feed it the extracted features, the two-dimensional pooled maps are flattened into a single vector.
# Step 3 - Flattening
classifier.add(Flatten())
The flatten layer is usually inserted only once, between the CNN-specific layers and the final fully connected network that performs the classification.
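Assuming Keras' defaults ('valid' padding, stride 1 convolutions, 2x2 pooling with stride 2), the shapes through our network can be traced with a small plain-Python sketch, which also gives the size of the flattened vector:

```python
def conv(size, kernel=3):    # 'valid' padding, stride 1
    return size - kernel + 1

def pool(size, window=2):    # 2x2 max pooling, stride 2
    return size // window

size = 64
size = pool(conv(size))   # first Conv2D + MaxPooling2D: 64 -> 62 -> 31
size = pool(conv(size))   # second pair:                 31 -> 29 -> 14

flattened = size * size * 32   # 32 feature maps of 14x14
print(flattened)  # 6272
```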
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))
Here we use a fully connected layer of 128 nodes and a single output node, since we have a binary output.
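The sigmoid activation squashes the output node's value into (0, 1), which is read as the probability of one class. A plain-Python sketch of this final step (the 0.5 threshold is the usual convention, not something fixed by Keras):

```python
import math

def sigmoid(z):
    """Squash any real-valued activation into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# The resulting probability is thresholded at 0.5 to decide the class.
for z in (-2.0, 0.0, 2.0):
    p = sigmoid(z)
    label = 1 if p > 0.5 else 0
    print(f"z={z:+.1f}  p={p:.3f}  class={label}")
```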
At the end the network is compiled using the Adam optimizer and binary cross-entropy as the loss function.
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
Instead of compressed byte-wise image data (e.g. JPEG), neural networks require matrices of floating-point numbers. Keras provides a class to generate these matrices from input images.
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
Four parameters are used in this example.
- rescale - scales the byte values (0 to 255) to floating-point numbers between 0 and 1 by dividing them by 255
- shear_range - allows the generator to randomly apply shearing transformations with a shear intensity of up to 0.2
- zoom_range - allows the generator to randomly zoom into the image by up to 20%
- horizontal_flip - allows the generator to randomly flip the image horizontally
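Two of these operations are simple enough to write out by hand. The following sketch shows rescaling and a horizontal flip on a single pixel row (a toy illustration, not how ImageDataGenerator is implemented):

```python
# One row of 8-bit pixel values, as it would come out of a decoded image.
pixel_row = [0, 128, 255]

# rescale = 1./255 maps each byte into the interval [0, 1] ...
rescaled = [v / 255 for v in pixel_row]

# ... and a horizontal flip simply reverses each row.
flipped = pixel_row[::-1]

print(rescaled)
print(flipped)   # [255, 128, 0]
```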
As we can see, the ImageDataGenerator class allows us to perform distortion operations like zooming and shearing. Applied to the training set, these augmentations help the network generalize and thus improve its accuracy. For the test set we stick to the given images and only rescale the data to floats. The generators are then applied to the images in our training and test directories.
training_set = train_datagen.flow_from_directory('dataset/training_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
As already mentioned, the data consists of images of different sizes. Again four parameters are used:
- the directory of the images, containing one subdirectory per class; the subdirectory names define the class labels
- target_size - each image is rescaled to 64x64 pixels
- batch_size - 32 images are loaded at once and submitted to the NN
- class_mode - we are dealing with binary data - cats or dogs
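The batch size also determines how many steps make up one epoch: the number of images divided by the batch size, rounded up when it does not divide evenly. A quick sketch with our dataset sizes:

```python
import math

training_images = 8000   # 4000 cats + 4000 dogs
test_images = 2000       # 1000 cats + 1000 dogs
batch_size = 32

steps_per_epoch = math.ceil(training_images / batch_size)
validation_steps = math.ceil(test_images / batch_size)
print(steps_per_epoch, validation_steps)  # 250 63
```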
A detailed description and examples of the ImageDataGenerator can be found at https://keras.io/preprocessing/image/
The fitting process
classifier.fit_generator(training_set,
steps_per_epoch = 8000 / 32,
epochs = 25,
validation_data = test_set,
validation_steps = 2000 / 32)
results in a test accuracy of about 80%.