Custom Input Pipelines With Data Augmentation for A.I. Image Classification Using Keras and Imgaug
If you want to skip the reading, you can check out my Colab notebook on this topic.
Hello again! If you are interested in deep learning and its trending technologies, you have most likely already had at least a glimpse of TensorFlow, an open-source library that enables easy implementation of neural networks and other machine learning models for your artificial intelligence applications.
One of the most popular tasks in deep learning is image classification: given a dataset of labeled images, train a convolutional neural network to classify images the model has not seen. Everything sounds cool, but at least for me it was a bumpy ride at first. In courses about deep learning with TensorFlow, you come across the theory along with a few toy examples like MNIST, where you use something like tf.data.Dataset.from_tensor_slices() to feed your model.
However, real-world datasets most likely won't fit entirely in memory, so you want to load batches of data efficiently while keeping complete control over pre-processing and augmentation. I present two approaches to load and transform image datasets. The first is very straightforward but limited in capabilities; the second is completely customizable but requires more code.
Basic image loader using the Keras preprocessing module
This one appears in the TensorFlow tutorials, but we are going to have our own use case. Let's say you are a covid-19 AI researcher and would like to perform image classification on chest x-ray images that have three kinds of labels: normal, pneumonia and covid-19. One approach is to use tf.keras.preprocessing.image_dataset_from_directory().
Then your images directory should be organized as:
covid_chest_xrays/
    normal/
    pneumonia/
    COVID-19/
With little code we can create an image loader:
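A minimal sketch of such a loader, wrapped in a function so the data directory is a parameter. The split ratio, seed and image size below are illustrative choices, and newer TensorFlow versions expose this helper as tf.keras.utils.image_dataset_from_directory:

```python
import tensorflow as tf

def make_datasets(data_dir, img_size=(224, 224), batch_size=32):
    """Build training/validation datasets from a class-per-folder tree."""
    common = dict(
        validation_split=0.2,  # illustrative 80/20 split
        seed=42,               # same seed so the two subsets don't overlap
        image_size=img_size,
        batch_size=batch_size,
    )
    # Labels (normal, pneumonia, COVID-19) are inferred from folder names.
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common)
    return train_ds, val_ds
```

Calling make_datasets("covid_chest_xrays") yields two tf.data.Dataset objects ready to be passed to model.fit().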
Now assume that our dataset is short on training samples, which makes the model susceptible to overfitting. This is a problem because our model will not be able to generalize to future unseen examples. Moreover, we would like to create an image augmentation pipeline, so as to artificially increase our number of training examples:
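One way to do this with the Keras preprocessing layers is to stack a few random transformations; the specific layers and factors below are my own illustrative picks, not prescribed ones:

```python
import tensorflow as tf

# Random transformations are only active when called with training=True.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.05),  # mild rotations; x-rays are mostly upright
    tf.keras.layers.RandomZoom(0.1),
])
```

It can be mapped over the dataset, e.g. train_ds.map(lambda x, y: (data_augmentation(x, training=True), y)), or inserted as the first layers of the model itself.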
Now let's see a sample of our images with the help of matplotlib:
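A small helper along these lines (the 3×3 grid size is an arbitrary choice):

```python
import matplotlib.pyplot as plt

def show_batch(dataset, class_names, n=9):
    """Plot the first n images of one batch with their class names."""
    plt.figure(figsize=(8, 8))
    for images, labels in dataset.take(1):
        for i in range(min(n, images.shape[0])):
            plt.subplot(3, 3, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))
            plt.title(class_names[int(labels[i])])
            plt.axis("off")
    plt.show()
```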
Which gives:
Assuming that we have a Keras model somewhere in our script which we want to train, it would be as easy as:
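For instance, with a hypothetical small CNN (the architecture below is just a placeholder; any Keras classifier with a matching input shape works):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Rescaling(1.0 / 255),            # pixel values to [0, 1]
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),  # our three classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_ds, validation_data=val_ds, epochs=10)
```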
Custom Input Loader using Keras Sequence and image augmentation with imgaug
The earlier approach was simple and it gets the job done. However, our data augmentation options are quite limited with Keras preprocessing, and we do not have much control over how our batches are loaded. In addition, let's say that our covid-19 class is short on samples: our dataset is unbalanced! We could perform upsampling of covid-19 images at training time to balance each batch (I got this idea from the COVID-Net creators). How can we do all this? Let us review it in parts.
We can define a tf.keras.utils.Sequence. It is great for loading batches of data to feed our network, and we can define any operation we want. From the documentation, a sequence must implement the methods __getitem__ and __len__, and optionally on_epoch_end for modifications between epochs. We are also implementing __next__, to iterate over our generator. We start with the class initialization:
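A sketch of that initialization; the attribute and parameter names here are my own choices, and I assume filenames comes as a list of (path, label) pairs:

```python
import tensorflow as tf

class ImageDataset(tf.keras.utils.Sequence):
    """Custom batch loader; the remaining Sequence methods come next."""

    def __init__(self, filenames, batch_size=32, augmentation=None,
                 balance_covid=False, image_shape=(224, 224, 3)):
        self.batch_size = batch_size
        self.augmentation = augmentation
        self.balance_covid = balance_covid
        self.image_shape = image_shape

        # One list per class.
        normal = [f for f in filenames if f[1] == "normal"]
        pneumonia = [f for f in filenames if f[1] == "pneumonia"]
        covid = [f for f in filenames if f[1] == "covid-19"]

        if balance_covid:
            # Keep covid-19 apart so each batch can upsample it.
            self.majority = normal + pneumonia
            self.covid = covid
        else:
            self.majority = normal + pneumonia + covid
            self.covid = []
```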
We declare our ImageDataset class, a child of the Keras Sequence, and define its parameters: a list of filenames of the dataset, the batch size, an augmentation function, whether we want batch balancing or not, the image shape and so forth. Then we create three lists, one for each class. If we want to perform upsampling of covid-19 images, we merge normal and pneumonia and leave covid-19 separate. Otherwise we merge everything.
Now we move on to defining the function that tells how many batches this generator provides to complete one epoch:
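Continuing the sketch (with the class line repeated for context), one reasonable implementation is:

```python
import math
import tensorflow as tf

class ImageDataset(tf.keras.utils.Sequence):  # continued from above
    ...

    def __len__(self):
        # Batches per epoch: ceil so a final partial batch still counts.
        n = len(self.majority) + len(self.covid)
        return math.ceil(n / self.batch_size)
```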
Simple enough: just divide the total number of samples by the batch size, rounding up so the last partial batch is not dropped. Next, we define how we load the images from disk:
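A possible __getitem__, again with illustrative choices: it slices the majority list for the current batch and, when balancing is on, swaps in randomly drawn covid-19 samples for roughly a third of the slots:

```python
import random
import numpy as np
import tensorflow as tf

class ImageDataset(tf.keras.utils.Sequence):  # continued from above
    ...

    def __getitem__(self, index):
        # Take the next slice of (path, label) pairs for this batch.
        batch = self.majority[index * self.batch_size:
                              (index + 1) * self.batch_size]
        if self.balance_covid and self.covid:
            # Upsample covid-19: replace ~1/3 of the batch with random
            # covid samples (idea borrowed from the COVID-Net creators).
            k = max(1, len(batch) // 3)
            batch = batch[:-k] + random.choices(self.covid, k=k)

        images, labels = [], []
        for path, label in batch:
            img = tf.io.read_file(path)
            img = tf.image.decode_image(img, channels=3,
                                        expand_animations=False)
            img = tf.image.resize(img, self.image_shape[:2])
            images.append(img.numpy().astype("uint8"))
            labels.append(label)

        images = np.stack(images)
        if self.augmentation is not None:
            # imgaug-style callable: takes and returns a batch of images.
            images = self.augmentation(images=images)
        return images, np.asarray(labels)
```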
This code snippet is diligently commented, so hopefully the idea of how batches are loaded is straightforward. Finally, we implement the __next__ method, which calls __getitem__ with the proper index as argument to load images, and on_epoch_end to shuffle the dataset.
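One way those two pieces can look (the _index counter is my own bookkeeping choice):

```python
import random
import tensorflow as tf

class ImageDataset(tf.keras.utils.Sequence):  # continued from above
    ...

    def on_epoch_end(self):
        # Reshuffle both pools so batches differ between epochs.
        random.shuffle(self.majority)
        random.shuffle(self.covid)

    def __next__(self):
        # Serve batches in order; wrap around and reshuffle at epoch end.
        if not hasattr(self, "_index"):
            self._index = 0
        batch = self[self._index]
        self._index += 1
        if self._index >= len(self):
            self._index = 0
            self.on_epoch_end()
        return batch
```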
Now there is one more thing we should take into account, which is data augmentation. There is a dedicated Python library for image augmentation called imgaug. It is well documented and provides many examples to get started. One great thing about it is that it is not restricted to augmentation for image classification: we can also augment images with bounding box, keypoint and polygon annotations. So, with that said, let's define our augmentation engine:
This pipeline consists of sequential transformations, but their application is not deterministic: some functions are applied only 50% of the time because of the sometimes lambda. This seq object is going to be the argument of the augmentation parameter of ImageDataset.
Now that we have everything set, let's instantiate an image generator object:
Now we can iterate over it to extract batches of data, one at a time. It can also be passed as an argument to model.fit(), just like the training snippet in the first method. Let's retrieve a sample batch and plot the resulting images along with their labels:
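A small helper to do that, assuming the generator returns (images, labels) batches as built above (the 3×3 grid is arbitrary):

```python
import matplotlib.pyplot as plt

def plot_generator_batch(generator, n=9):
    """Pull the first batch from the generator and plot it with labels."""
    images, labels = generator[0]
    plt.figure(figsize=(8, 8))
    for i in range(min(n, len(images))):
        plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].astype("uint8"))
        plt.title(str(labels[i]))
        plt.axis("off")
    plt.show()
```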
This gives:
And that is it for the second approach. Again, you can check this code on my Colab notebook.
Thank you for taking the time to read this article. I really hope it can be of use to someone working on a deep learning project or just wanting to learn something new. I am starting to take the time to put these articles together as a token of appreciation to the community for its efforts in producing high-quality material that helps others advance their knowledge and careers.
I will bring more articles about creating convolutional neural networks that use the image generators explained here, and about how to measure model performance.
See you!