Convolutional Neural Network (CNN)

An adaptation of the Convolutional Neural Network (CNN) tutorial for Habana Gaudi AI processors.

This tutorial demonstrates training a simple Convolutional Neural Network (CNN) to classify CIFAR images. Because this tutorial uses the Keras Sequential API, creating and training your model will take just a few lines of code.

Import TensorFlow

import tensorflow as tf

from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

Enable Habana

Let’s enable a single Gaudi device by loading the Habana module:

import habana_frameworks.tensorflow as htf
htf.load_habana_module()
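
If the same script also has to run on a machine without the Habana software stack, the import can be guarded. The following is a minimal sketch and not part of the original tutorial: the helper name `try_enable_habana` is hypothetical, and it assumes that a missing package surfaces as an `ImportError`. Other failures, such as a driver problem, are deliberately not caught.

```python
def try_enable_habana():
    """Load the Habana module if the package is installed.

    Returns True when the module was loaded, and False when the
    habana_frameworks package is not available.
    """
    try:
        import habana_frameworks.tensorflow as htf
    except ImportError:
        # Package not installed: TensorFlow keeps using its default devices.
        return False
    htf.load_habana_module()
    return True


habana_enabled = try_enable_habana()
print("Habana module loaded:", habana_enabled)
```

Returning a flag instead of raising lets the rest of the script decide whether to continue without Gaudi or stop early.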

Download and prepare the CIFAR10 dataset

The CIFAR10 dataset contains 60,000 color images in 10 classes, with 6,000 images in each class. The dataset is divided into 50,000 training images and 10,000 testing images. The classes are mutually exclusive and there is no overlap between them.

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
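
The dataset figures above can be checked with a little arithmetic, which also shows what normalization does to memory use. This is a rough sketch in plain Python. It assumes the images are loaded as 8-bit integers and that dividing by `255.0` produces 64-bit floats, which is the usual NumPy behavior.

```python
# Dataset sizes as stated in the text.
NUM_CLASSES = 10
IMAGES_PER_CLASS = 6_000
TRAIN_IMAGES = 50_000
TEST_IMAGES = 10_000

total_images = NUM_CLASSES * IMAGES_PER_CLASS
assert total_images == TRAIN_IMAGES + TEST_IMAGES  # 60,000 images in total

# Each CIFAR image is 32 x 32 pixels with 3 color channels.
values_per_image = 32 * 32 * 3

# Assumption: raw pixels are stored as 8-bit integers (1 byte per value).
raw_train_bytes = TRAIN_IMAGES * values_per_image * 1

# Assumption: after dividing by 255.0 each value is a 64-bit float (8 bytes).
scaled_train_bytes = TRAIN_IMAGES * values_per_image * 8

print("values per image:", values_per_image)
print("raw training set bytes:", raw_train_bytes)
print("scaled training set bytes:", scaled_train_bytes)
```

If memory is tight, casting the arrays to 32-bit floats before dividing is a common way to halve the scaled size.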

Verify the data

To verify that the dataset looks correct, let’s plot the first 25 images from the training set and display the class name below each image:

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    # The CIFAR labels happen to be arrays, 
    # which is why you need the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()
[Figure: 5 x 5 grid of the first 25 training images, each labeled with its class name]
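
The extra index in the plotting loop is needed because the labels are stored as one single-element row per image, not as a flat list of integers. The sketch below mimics that layout with plain Python lists. The label values are hypothetical and chosen only for illustration.

```python
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Hypothetical labels, laid out like train_labels: one 1-element row per image.
example_labels = [[6], [9], [4]]

for row in example_labels:
    # row is a 1-element sequence, so row[0] extracts the integer class index.
    class_index = row[0]
    print(class_index, class_names[class_index])

# The same lookup as a list comprehension.
decoded = [class_names[row[0]] for row in example_labels]
```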

Create the convolutional base

The 6 lines of code below define the convolutional base using a common pattern: a stack of Conv2D and MaxPooling2D layers.

As input, a CNN takes tensors of shape (image_height, image_width, color_channels), ignoring the batch size. If you are new to these dimensions, color_channels refers to (R,G,B). In this example, you will configure your CNN to process inputs of shape (32, 32, 3), which is the format of CIFAR images. You can do this by passing the argument input_shape to your first layer.

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

Let’s display the architecture of your model so far:

model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928     
=================================================================
Total params: 56,320
Trainable params: 56,320
Non-trainable params: 0
_________________________________________________________________

Above, you can see that the output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). The width and height dimensions tend to shrink as you go deeper in the network. The number of output channels for each Conv2D layer is controlled by the first argument (e.g., 32 or 64). Typically, as the width and height shrink, you can afford (computationally) to add more output channels in each Conv2D layer.
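
The shrinking shapes in the summary follow from simple arithmetic. The sketch below reproduces them in plain Python, assuming the Keras defaults used above: no padding and stride 1 for `Conv2D`, and a stride equal to the pool size for `MaxPooling2D`. The helper names are illustrative and not part of any library.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution with no padding ('valid')."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (stride == pool size)."""
    return size // pool


# (layer kind, kernel or pool size, output channels), mirroring the base above.
layers_spec = [
    ("conv", 3, 32),
    ("pool", 2, None),
    ("conv", 3, 64),
    ("pool", 2, None),
    ("conv", 3, 64),
]

size, channels = 32, 3  # CIFAR input: 32 x 32 pixels, 3 color channels
shapes = []
for kind, k, out_channels in layers_spec:
    if kind == "conv":
        size = conv_output_size(size, k)
        channels = out_channels  # a convolution sets the channel count
    else:
        size = pool_output_size(size, k)  # pooling leaves channels unchanged
    shapes.append((size, size, channels))

for shape in shapes:
    print(shape)
```

Each 3 x 3 convolution trims two pixels from the width and height, and each 2 x 2 pooling step halves them, rounding down.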

Add Dense layers on top

To complete the model, you will feed the last output tensor from the convolutional base (of shape (4, 4, 64)) into one or more Dense layers to perform classification. Dense layers take vectors as input (which are 1D), while the current output is a 3D tensor. First, you will flatten (or unroll) the 3D output to 1D, then add one or more Dense layers on top. CIFAR has 10 output classes, so you use a final Dense layer with 10 outputs.

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

Here’s the complete architecture of your model:

model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0         
_________________________________________________________________
dense (Dense)                (None, 64)                65600     
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0

The network summary shows that (4, 4, 64) outputs were flattened into vectors of shape (1024) before going through two Dense layers.
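
The parameter counts in the summary can be reproduced by hand. This sketch assumes every layer has a bias term, which is the Keras default. The two helper functions are illustrative only.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


flattened = 4 * 4 * 64  # length of the vector produced by Flatten

param_counts = {
    "conv2d": conv2d_params(3, 3, 3, 32),
    "conv2d_1": conv2d_params(3, 3, 32, 64),
    "conv2d_2": conv2d_params(3, 3, 64, 64),
    "dense": dense_params(flattened, 64),
    "dense_1": dense_params(64, 10),
}

for name, count in param_counts.items():
    print(f"{name}: {count}")
print("total:", sum(param_counts.values()))
```

The first Dense layer accounts for more than half of all parameters, because it connects every one of the 1024 flattened values to each of its 64 units.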

Compile and train the model

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10, batch_size=50,
                    validation_data=(test_images, test_labels))
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 16s 318us/sample - loss: 0.5715 - accuracy: 0.7990 - val_loss: 0.8745 - val_accuracy: 0.7117
Epoch 2/10
50000/50000 [==============================] - 15s 290us/sample - loss: 0.5383 - accuracy: 0.8099 - val_loss: 0.8526 - val_accuracy: 0.7247
Epoch 3/10
50000/50000 [==============================] - 15s 292us/sample - loss: 0.5081 - accuracy: 0.8206 - val_loss: 0.8999 - val_accuracy: 0.7182
Epoch 4/10
50000/50000 [==============================] - 15s 292us/sample - loss: 0.4786 - accuracy: 0.8318 - val_loss: 0.8891 - val_accuracy: 0.7176
Epoch 5/10
50000/50000 [==============================] - 15s 294us/sample - loss: 0.4509 - accuracy: 0.8405 - val_loss: 0.8847 - val_accuracy: 0.7192
Epoch 6/10
50000/50000 [==============================] - 15s 293us/sample - loss: 0.4275 - accuracy: 0.8490 - val_loss: 0.9814 - val_accuracy: 0.7006
Epoch 7/10
50000/50000 [==============================] - 15s 293us/sample - loss: 0.4028 - accuracy: 0.8552 - val_loss: 0.9727 - val_accuracy: 0.7160
Epoch 8/10
50000/50000 [==============================] - 14s 290us/sample - loss: 0.3778 - accuracy: 0.8651 - val_loss: 1.0074 - val_accuracy: 0.7230
Epoch 9/10
50000/50000 [==============================] - 15s 297us/sample - loss: 0.3561 - accuracy: 0.8723 - val_loss: 1.0690 - val_accuracy: 0.7115
Epoch 10/10
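
The loss in the `compile` call is created with `from_logits=True` because the final `Dense(10)` layer has no activation, so the model outputs raw scores (logits), not probabilities. The sketch below shows, for a single sample, the calculation this kind of loss performs. It is a plain-Python illustration with hypothetical logits, not the TensorFlow implementation.

```python
import math


def sparse_categorical_crossentropy_from_logits(logits, label):
    """Cross-entropy for one sample, given raw scores and an integer label."""
    # Subtract the maximum for numerical stability; the result is unchanged.
    highest = max(logits)
    shifted = [z - highest for z in logits]
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    # Equivalent to -log(softmax(logits)[label]).
    return log_sum_exp - shifted[label]


# Hypothetical logits for one image over the 10 classes.
uniform_logits = [0.0] * 10        # no preference for any class
confident_logits = [0.0] * 10
confident_logits[3] = 5.0          # strong preference for class 3

uniform_loss = sparse_categorical_crossentropy_from_logits(uniform_logits, 3)
confident_loss = sparse_categorical_crossentropy_from_logits(confident_logits, 3)

print("loss with uniform logits:", uniform_loss)
print("loss with confident, correct logits:", confident_loss)
```

With ten equally scored classes the loss equals ln(10), which is the value to expect from a classifier that has not learned anything yet. "Sparse" refers to the labels being integer class indices, not one-hot vectors.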

Evaluate the model

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)

[Figure: training and validation accuracy plotted against epoch]

print(test_acc)
0.7136

Your simple CNN has achieved a test accuracy of over 70%. Not bad for a few lines of code!
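
The training log also suggests that the model starts to overfit: training accuracy keeps rising while validation loss drifts upward. A quick way to see this, as a sketch, is to compare the logged numbers directly. The values below are copied from the log for epochs 1 to 9, since the epoch 10 line is not shown.

```python
# Values copied from the training log (epochs 1-9).
train_acc = [0.7990, 0.8099, 0.8206, 0.8318, 0.8405, 0.8490, 0.8552, 0.8651, 0.8723]
val_acc = [0.7117, 0.7247, 0.7182, 0.7176, 0.7192, 0.7006, 0.7160, 0.7230, 0.7115]
val_loss = [0.8745, 0.8526, 0.8999, 0.8891, 0.8847, 0.9814, 0.9727, 1.0074, 1.0690]

# Gap between training and validation accuracy for each epoch.
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]

# Epochs are numbered from 1 in the log.
best_epoch = val_loss.index(min(val_loss)) + 1

for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: train - val accuracy = {gap:.4f}")
print("lowest validation loss at epoch", best_epoch)
```

A widening gap like this is a common reason to stop training earlier or to add regularization, although the tutorial itself does not pursue either.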

Copyright (c) 2021 Habana Labs, Ltd. an Intel Company.
Copyright 2019 The TensorFlow Authors.
All rights reserved.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
