An adaptation of the TensorFlow Image classification tutorial using Habana Gaudi AI processors.
This tutorial shows how to classify images of flowers. It creates an image classifier using a keras.Sequential model, and loads data using preprocessing.image_dataset_from_directory. You will gain practical experience with the following concepts:
- Efficiently loading a dataset off disk.
- Identifying overfitting and applying techniques to mitigate it, including data augmentation and Dropout.
This tutorial follows a basic machine learning workflow:
- Examine and understand data
- Build an input pipeline
- Build the model
- Train the model
- Test the model
- Improve the model and repeat the process
Import TensorFlow and other libraries
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
Enable Habana
Let’s enable a single Gaudi device by loading the Habana module:
from habana_frameworks.tensorflow import load_habana_module
load_habana_module()  # registers the HPU device with TensorFlow
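The device listing in the output below was presumably produced by enumerating the local devices, for example (an assumption; the exact snippet is not shown in the source):

from tensorflow.python.client import device_lib
device_lib.list_local_devices()  # should report both /device:CPU:0 and /device:HPU:0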
2021-08-13 18:28:29.810652: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 9974214766501581564,
 name: "/device:HPU:0" device_type: "HPU" memory_limit: 268435456 locality { } incarnation: 2541556219435803882]
Download and explore the dataset
This tutorial uses a dataset of about 3,700 photos of flowers. The dataset contains 5 sub-directories, one per class:
flower_photos/
  daisy/
  dandelion/
  roses/
  sunflowers/
  tulips/
import pathlib
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)
Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz
228818944/228813984 [==============================] - 2s 0us/step
After downloading, you should now have a copy of the dataset available. There are 3,670 total images:
image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)
3670
Here are some roses:
roses = list(data_dir.glob('roses/*'))
PIL.Image.open(str(roses[0]))

PIL.Image.open(str(roses[1]))

And some tulips:
tulips = list(data_dir.glob('tulips/*'))
PIL.Image.open(str(tulips[0]))

PIL.Image.open(str(tulips[1]))

Load using keras.preprocessing
Let’s load these images off disk using the helpful image_dataset_from_directory utility. This will take you from a directory of images on disk to a tf.data.Dataset in just a couple of lines of code. If you like, you can also write your own data loading code from scratch by visiting the load images tutorial.
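To give a flavor of that from-scratch approach, here is a rough sketch of a manual tf.data pipeline (an illustration only; the load images tutorial covers this properly, and the 180x180 target size matches the parameters defined below):

# Hypothetical manual pipeline: list the files, derive each label from the
# parent directory name, then decode and resize every image.
list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*.jpg'), shuffle=True)

def process_path(path):
  label = tf.strings.split(path, os.path.sep)[-2]  # class name as a string
  img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
  img = tf.image.resize(img, [180, 180])
  return img, label

manual_ds = list_ds.map(process_path, num_parallel_calls=tf.data.AUTOTUNE)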
Create a dataset
Define some parameters for the loader:
batch_size = 32
img_height = 180
img_width = 180
It’s good practice to use a validation split when developing your model. Let’s use 80% of the images for training, and 20% for validation.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
Found 3670 files belonging to 5 classes.
Using 734 files for validation.
You can find the class names in the class_names attribute on these datasets. These correspond to the directory names in alphabetical order.
class_names = train_ds.class_names
print(class_names)
['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
Here are the first 9 images from the training dataset.
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

You will train a model using these datasets by passing them to model.fit in a moment. If you like, you can also manually iterate over the dataset and retrieve batches of images:
for image_batch, labels_batch in train_ds:
  print(image_batch.shape)
  print(labels_batch.shape)
  break
(32, 180, 180, 3)
(32,)
The image_batch is a tensor of the shape (32, 180, 180, 3). This is a batch of 32 images of shape 180x180x3 (the last dimension refers to the RGB color channels). The labels_batch is a tensor of the shape (32,); these are the corresponding labels for the 32 images.

You can call .numpy() on the image_batch and labels_batch tensors to convert them to a numpy.ndarray.
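For example, converting the batch retrieved above (the shapes match the printed output):

images_np = image_batch.numpy()   # numpy.ndarray, shape (32, 180, 180, 3)
labels_np = labels_batch.numpy()  # numpy.ndarray, shape (32,)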
Configure the dataset for performance
Let’s make sure to use buffered prefetching so you can yield data from disk without I/O becoming blocking. These are two important methods you should use when loading data:
- Dataset.cache() keeps the images in memory after they’re loaded off disk during the first epoch. This ensures the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache (a sketch follows the code below).
- Dataset.prefetch() overlaps data preprocessing and model execution while training.
Interested readers can learn more about both methods, as well as how to cache data to disk in the data performance guide.
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
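If the dataset were too large to fit in memory, the same cache() call can write to disk instead; a minimal sketch, where the cache file path is hypothetical:

# Passing a filename makes cache() spill to disk rather than RAM.
# The cache file persists across runs, so delete it if the input data changes.
# train_ds = train_ds.cache("/tmp/flowers.cache").shuffle(1000).prefetch(buffer_size=AUTOTUNE)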
Standardize the data
The RGB channel values are in the [0, 255] range. This is not ideal for a neural network; in general you should seek to make your input values small. Here, you will standardize values to be in the [0, 1] range by using a Rescaling layer.
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
Note: The Keras Preprocessing utilities and layers introduced in this section are currently experimental and may change.
There are two ways to use this layer. You can apply it to the dataset by calling map:
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixels values are now in `[0,1]`.
print(np.min(first_image), np.max(first_image))
0.0 0.9961222
Or, you can include the layer inside your model definition, which can simplify deployment. Let’s use the second approach here.
Note: you previously resized images using the image_size argument of image_dataset_from_directory. If you want to include the resizing logic in your model as well, you can use the Resizing layer.
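For instance, a minimal sketch that keeps both steps inside the model, using the same experimental namespace as the rest of this tutorial:

resize_and_rescale = Sequential([
  layers.experimental.preprocessing.Resizing(img_height, img_width),
  layers.experimental.preprocessing.Rescaling(1./255)
])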
Create the model
The model consists of three convolution blocks with a max pooling layer in each of them. On top there’s a fully connected layer with 128 units that is activated by a relu activation function. This model has not been tuned for high accuracy; the goal of this tutorial is to show a standard approach.
num_classes = 5
model = Sequential([
  layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
Compile the model
For this tutorial, choose the optimizers.Adam optimizer and the losses.SparseCategoricalCrossentropy loss function. To view training and validation accuracy for each training epoch, pass the metrics argument.
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
Model summary
View all the layers of the network using the model’s summary method:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
rescaling_1 (Rescaling)      (None, 180, 180, 3)       0
_________________________________________________________________
conv2d (Conv2D)              (None, 180, 180, 16)      448
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 90, 90, 16)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 90, 90, 32)        4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 45, 45, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 45, 45, 64)        18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 22, 22, 64)        0
_________________________________________________________________
flatten (Flatten)            (None, 30976)             0
_________________________________________________________________
dense (Dense)                (None, 128)               3965056
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645
=================================================================
Total params: 3,989,285
Trainable params: 3,989,285
Non-trainable params: 0
_________________________________________________________________
Train the model
epochs = 10
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
Epoch 1/10
92/92 [==============================] - 15s 69ms/step - loss: 1.1651 - accuracy: 0.5003 - val_loss: 0.9909 - val_accuracy: 0.6240
Epoch 2/10
92/92 [==============================] - 1s 14ms/step - loss: 0.9057 - accuracy: 0.6478 - val_loss: 0.9009 - val_accuracy: 0.6471
Epoch 3/10
92/92 [==============================] - 1s 14ms/step - loss: 0.7292 - accuracy: 0.7190 - val_loss: 0.8492 - val_accuracy: 0.6567
Epoch 4/10
92/92 [==============================] - 1s 14ms/step - loss: 0.5396 - accuracy: 0.7977 - val_loss: 0.9430 - val_accuracy: 0.6567
Epoch 5/10
92/92 [==============================] - 1s 14ms/step - loss: 0.3282 - accuracy: 0.8825 - val_loss: 0.9100 - val_accuracy: 0.6689
Epoch 6/10
92/92 [==============================] - 1s 14ms/step - loss: 0.1878 - accuracy: 0.9370 - val_loss: 1.2716 - val_accuracy: 0.6608
Epoch 7/10
92/92 [==============================] - 1s 14ms/step - loss: 0.1299 - accuracy: 0.9601 - val_loss: 1.4604 - val_accuracy: 0.6621
Epoch 8/10
92/92 [==============================] - 1s 14ms/step - loss: 0.1036 - accuracy: 0.9690 - val_loss: 1.2901 - val_accuracy: 0.6757
Epoch 9/10
92/92 [==============================] - 1s 14ms/step - loss: 0.0378 - accuracy: 0.9898 - val_loss: 1.5615 - val_accuracy: 0.6853
Epoch 10/10
92/92 [==============================] - 1s 14ms/step - loss: 0.0093 - accuracy: 0.9990 - val_loss: 1.7193 - val_accuracy: 0.6703
Visualize training results
Create plots of loss and accuracy on the training and validation sets.
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

As you can see from the plots, training accuracy and validation accuracy are off by a large margin, and the model has achieved only around 67% accuracy on the validation set.
Let’s look at what went wrong and try to increase the overall performance of the model.
Overfitting
In the plots above, the training accuracy is increasing roughly linearly over time, whereas validation accuracy stalls in the 60–70% range during training. Also, the gap between training and validation accuracy is noticeable, a sign of overfitting.
When there are a small number of training examples, the model sometimes learns from noise or unwanted details in the training examples, to an extent that it negatively impacts the performance of the model on new examples. This phenomenon is known as overfitting. It means that the model will have a difficult time generalizing on a new dataset.
There are multiple ways to fight overfitting in the training process. In this tutorial, you’ll use data augmentation and add Dropout to your model.
Data augmentation
Overfitting generally occurs when there are a small number of training examples. Data augmentation takes the approach of generating additional training data from your existing examples by augmenting them using random transformations that yield believable-looking images. This helps expose the model to more aspects of the data and generalize better.
You will implement data augmentation using the layers from tf.keras.layers.experimental.preprocessing. These can be included inside your model like other layers, and they run on Gaudi.
data_augmentation = keras.Sequential(
  [
    layers.experimental.preprocessing.RandomFlip("horizontal",
                                                 input_shape=(img_height,
                                                              img_width,
                                                              3)),
    layers.experimental.preprocessing.RandomRotation(0.1),
    layers.experimental.preprocessing.RandomZoom(0.1),
  ]
)
Let’s visualize what a few augmented examples look like by applying data augmentation to the same image several times:
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8"))
    plt.axis("off")

You will use data augmentation to train a model in a moment.
Dropout
Another technique to reduce overfitting is to introduce Dropout to the network, a form of regularization.
When you apply Dropout to a layer, it randomly drops out (by setting the activation to zero) a number of output units from the layer during the training process. Dropout takes a fractional number as its input value, such as 0.1, 0.2, or 0.4. This means randomly dropping out 10%, 20%, or 40% of the output units of the applied layer.
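As a quick illustration of this behavior (a standalone sketch, not part of the tutorial’s model):

drop = layers.Dropout(0.2)
x = tf.ones((1, 10))
print(drop(x, training=True).numpy())  # roughly 20% of entries zeroed; survivors scaled by 1/0.8
print(drop(x).numpy())                 # unchanged: Dropout is a no-op at inference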
Let’s create a new neural network using layers.Dropout, then train it using augmented images.
model = Sequential([
  data_augmentation,
  layers.experimental.preprocessing.Rescaling(1./255),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
Compile and train the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
sequential_1 (Sequential)    (None, 180, 180, 3)       0
_________________________________________________________________
rescaling_2 (Rescaling)      (None, 180, 180, 3)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 180, 180, 16)      448
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 90, 90, 16)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 90, 90, 32)        4640
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 45, 45, 32)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 45, 45, 64)        18496
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 22, 22, 64)        0
_________________________________________________________________
dropout (Dropout)            (None, 22, 22, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 30976)             0
_________________________________________________________________
dense_2 (Dense)              (None, 128)               3965056
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 645
=================================================================
Total params: 3,989,285
Trainable params: 3,989,285
Non-trainable params: 0
_________________________________________________________________
epochs = 15
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
Epoch 1/15
92/92 [==============================] - 10s 72ms/step - loss: 1.2387 - accuracy: 0.4806 - val_loss: 1.0788 - val_accuracy: 0.5559
Epoch 2/15
92/92 [==============================] - 4s 40ms/step - loss: 0.9652 - accuracy: 0.6264 - val_loss: 1.0192 - val_accuracy: 0.5981
Epoch 3/15
92/92 [==============================] - 4s 40ms/step - loss: 0.8547 - accuracy: 0.6812 - val_loss: 0.8912 - val_accuracy: 0.6730
Epoch 4/15
92/92 [==============================] - 4s 40ms/step - loss: 0.7661 - accuracy: 0.7115 - val_loss: 0.9404 - val_accuracy: 0.6213
Epoch 5/15
92/92 [==============================] - 4s 39ms/step - loss: 0.6779 - accuracy: 0.7446 - val_loss: 0.9571 - val_accuracy: 0.6580
Epoch 6/15
92/92 [==============================] - 4s 40ms/step - loss: 0.6225 - accuracy: 0.7640 - val_loss: 0.8378 - val_accuracy: 0.6785
Epoch 7/15
92/92 [==============================] - 4s 39ms/step - loss: 0.5640 - accuracy: 0.7810 - val_loss: 0.8682 - val_accuracy: 0.6894
Epoch 8/15
92/92 [==============================] - 4s 40ms/step - loss: 0.4831 - accuracy: 0.8222 - val_loss: 0.9229 - val_accuracy: 0.6921
Epoch 9/15
92/92 [==============================] - 4s 39ms/step - loss: 0.4432 - accuracy: 0.8375 - val_loss: 0.9157 - val_accuracy: 0.6975
Epoch 10/15
92/92 [==============================] - 4s 39ms/step - loss: 0.3756 - accuracy: 0.8583 - val_loss: 1.0624 - val_accuracy: 0.6989
Epoch 11/15
92/92 [==============================] - 4s 39ms/step - loss: 0.3368 - accuracy: 0.8726 - val_loss: 1.1544 - val_accuracy: 0.6526
Epoch 12/15
92/92 [==============================] - 4s 39ms/step - loss: 0.2948 - accuracy: 0.8890 - val_loss: 1.1491 - val_accuracy: 0.6907
Epoch 13/15
92/92 [==============================] - 4s 39ms/step - loss: 0.2350 - accuracy: 0.9162 - val_loss: 1.1246 - val_accuracy: 0.6798
Epoch 14/15
92/92 [==============================] - 4s 39ms/step - loss: 0.2297 - accuracy: 0.9193 - val_loss: 1.1273 - val_accuracy: 0.6717
Epoch 15/15
92/92 [==============================] - 4s 39ms/step - loss: 0.1761 - accuracy: 0.9390 - val_loss: 1.1768 - val_accuracy: 0.6921
Visualize training results
After applying data augmentation and Dropout, there is less overfitting than before, and training and validation accuracy are more closely aligned.
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

Predict on new data
Finally, let’s use our model to classify an image that wasn’t included in the training or validation sets.
Note: Data augmentation and Dropout layers are inactive at inference time.
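A quick way to see this note in action (a minimal sketch; the random tensor is just a stand-in for an image batch):

# The preprocessing layers only transform inputs when training=True is passed;
# model.predict() runs them in inference mode, where they are identity ops.
sample = tf.random.uniform((1, img_height, img_width, 3))
out_infer = data_augmentation(sample)                 # training=False by default
out_train = data_augmentation(sample, training=True)  # random flip/rotation/zoom applied
print(tf.reduce_max(tf.abs(out_infer - sample)).numpy())  # expected: 0.0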
sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)
img = keras.preprocessing.image.load_img(
  sunflower_path, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch
predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])
print(
  "This image most likely belongs to {} with a {:.2f} percent confidence."
  .format(class_names[np.argmax(score)], 100 * np.max(score))
)
Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg
122880/117948 [===============================] - 0s 0us/step
This image most likely belongs to sunflowers with a 100.00 percent confidence.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.