Vegetable Classification and Recognition

Introduction

This article explores the use of convolutional neural networks (CNNs) for classifying vegetable images. Also, since it is my second deep learning project, I took every opportunity to strengthen my basics.

The dataset consists of images of 15 vegetables (Bean, Bitter_Gourd, Bottle_Gourd, Brinjal, Broccoli, Cabbage, Capsicum, Carrot, Cauliflower, Cucumber, Papaya, Potato, Pumpkin, Radish, Tomato), resized to 224x224 pixels and arranged in three folders: train, test and dev.

Each class has an equal proportion of images, making it a balanced dataset.
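
A quick way to check the balance claim is to count the files in each class folder. The snippet below is only a minimal sketch: it assumes one sub-folder per vegetable inside a split folder, and the helper names count_images_per_class and is_balanced are my own, not part of any library.

```python
import os
from collections import Counter

def count_images_per_class(root_dir):
    """Return a Counter mapping each class-folder name to its number of files."""
    counts = Counter()
    for class_name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(os.listdir(class_dir))
    return counts

def is_balanced(counts):
    """True when every class folder holds the same number of files."""
    return len(set(counts.values())) <= 1
```

Calling count_images_per_class on the train folder and passing the result to is_balanced would confirm (or contradict) the balance for that split.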

Traditional Machine Learning

First, I used PIL (Python Imaging Library), the de facto image processing package for Python. I then wrote a function, load_images, that iterates over the class folders and loads the images from the given directory.

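import os

import numpy as np
from PIL import Image

def load_images(imagepath):
  """Load every 224x224 RGB image found under imagepath/<class_name>/."""
  image_data = []          # images as (224, 224, 3) arrays
  image_data_flatten = []  # the same images flattened to 1-D vectors
  labels = []              # class name (folder name) of each image
  for class_name in os.listdir(imagepath):
    class_dir = os.path.join(imagepath, class_name)
    if not os.path.isdir(class_dir):
      continue
    for filename in os.listdir(class_dir):
      # Open each file once and skip anything that is not 224x224 RGB
      data = np.asarray(Image.open(os.path.join(class_dir, filename)))
      if data.shape != (224, 224, 3):
        continue
      image_data.append(data)
      image_data_flatten.append(data.reshape(-1))
      labels.append(class_name)
  return image_data, image_data_flatten, labels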
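
The lists returned by load_images still have to become a feature matrix and numeric labels before a classical model can use them. Here is a minimal sketch of that step, assuming NumPy only; the helper name to_arrays is hypothetical. Each flattened 224x224x3 image becomes a row of 224 * 224 * 3 = 150528 features.

```python
import numpy as np

def to_arrays(image_data_flatten, labels):
    """Stack flattened images into X and map string labels to integer ids."""
    X = np.stack(image_data_flatten).astype(np.float32)
    # np.unique sorts the class names; y holds each sample's index into them
    class_names, y = np.unique(labels, return_inverse=True)
    return X, y, class_names
```

X and y can then be passed to any estimator that expects a 2-D feature matrix and a 1-D label vector, and class_names maps the integer predictions back to vegetable names.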

Next, I modeled both Logistic Regression and Random Forest. Random Forest gave a good lift in test accuracy, but it overfits the training data (100% train vs. 79% test).

Model	Train Accuracy	Test/Validation Accuracy
Logistic Regression	77%	62%
Random Forest	100%	79%

Below is the detailed classification matrix for both models. (Both models identify Carrot better than the other classes.)
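
For readers who want to rebuild such a matrix themselves, here is a minimal NumPy sketch. It assumes integer class ids for the true and predicted labels; the function names are my own, and this is not necessarily how the matrix in this article was produced.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_recall(cm):
    """Fraction of each true class that was predicted correctly."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)
```

A class with high recall (Carrot, in this case) shows up as a row whose count sits almost entirely on the diagonal.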

Deep Learning / Neural Networks

Had I run all my experiments in the same notebook and in one sitting, I would have reused the images loaded earlier. However, as the learning stretched over multiple days, I came across image_dataset_from_directory, which makes loading images in Keras far easier.

Below are the simple calls that load the images directly from the directories.

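import tensorflow as tf

# train_dir, test_dir and val_dir are the paths to the three dataset folders
# (train, test and dev). Class labels are inferred from the sub-folder names.
train_ds = tf.keras.utils.image_dataset_from_directory(
  train_dir,
  seed=123,
  image_size=(224, 224),
  batch_size=32)

test_ds = tf.keras.utils.image_dataset_from_directory(
  test_dir,
  seed=123,
  image_size=(224, 224),
  batch_size=32)

val_ds = tf.keras.utils.image_dataset_from_directory(
  val_dir,
  seed=123,
  image_size=(224, 224),
  batch_size=32)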

One & Two Layered CONV2D Model

To start with, I explored single-layer and two-layer Conv2D models.

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(15, activation='softmax')
])

model_layered = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (5,5), activation='relu', input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation='relu'),
    tf.keras.layers.Dense(15, activation='softmax')
])

Model	Epochs	Train Accuracy	Test/Validation Accuracy
One-Layered	10	97%	63%
Two-Layered	10	90%	52%

Both of these models definitely overfit the training data, which suggests exploring better models.
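
One thing that may help explain the overfitting is the sheer size of the first Dense layer after Flatten. The arithmetic below is a sketch in plain Python that counts the parameters of the two models above, using the standard formulas for 'valid' convolutions (stride 1) and 2x2 max pooling; the helper names are my own.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out

# One-layer model: 224 -> conv 3x3 -> 222 -> pool -> 111
s = conv_out(224, 3) // 2
flat_one = s * s * 32                      # 394272 values after Flatten
one_layer_total = (conv_params(3, 3, 32)
                   + dense_params(flat_one, 512)
                   + dense_params(512, 15))

# Two-layer model: 224 -> conv 5x5 -> 220 -> pool -> 110 -> conv 3x3 -> 108 -> pool -> 54
s = conv_out(224, 5) // 2
s = conv_out(s, 3) // 2
flat_two = s * s * 64                      # 186624 values after Flatten
two_layer_total = (conv_params(5, 3, 32)
                   + conv_params(3, 32, 64)
                   + dense_params(flat_two, 1024)
                   + dense_params(1024, 15))

print(one_layer_total, two_layer_total)    # 201876367 191140303
```

Both models end up with roughly 200 million parameters, almost all of them in the first Dense layer, which is a lot of capacity for memorising the training images.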

Rescaling

Rescaling the pixel values from the [0, 255] range to [0, 1] helps to create a better model.

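tfl = tf.keras.layers  # shorthand alias used in the remaining snippets

model_rescale = tf.keras.Sequential([
    # Scale pixel values from [0, 255] to [0, 1] before the first convolution
    tfl.Rescaling(1./255, input_shape=(224, 224, 3)),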
    tfl.Conv2D(32,7,padding='same',activation='relu'),
    tfl.MaxPooling2D(),
    tfl.Conv2D(64,5,padding='same',activation='relu'),
    tfl.MaxPooling2D(),
    tfl.Conv2D(128,3,padding='same',activation='relu'),
    tfl.MaxPooling2D(),
    tfl.Flatten(),
    tfl.Dense(1024,activation='relu'),
    tfl.Dense(128,activation='relu'),
    tfl.Dense(15,activation='softmax')
])

Model	Epochs	Train Accuracy	Test/Validation Accuracy
Re-scaling	10	98%	93%
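
What the Rescaling(1./255) layer computes can be sketched in a few lines of NumPy. This is only an illustration of the arithmetic, not the Keras implementation, and the function name rescale is my own.

```python
import numpy as np

def rescale(images, scale=1.0 / 255):
    """Multiply pixel values by scale, mapping [0, 255] to [0, 1]."""
    return images.astype(np.float32) * scale

pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = rescale(pixels)
```

Here scaled holds values between 0 and 1, which keeps the inputs to the first convolution in a small, consistent range.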

We can see that by rescaling the data and stacking more Conv2D layers, we got a far better model than with the traditional ML methods. However, it still overfits the training data slightly (98% train vs. 93% test).

Data-Augmentation

A common way to reduce overfitting is data augmentation: randomly flipping, rotating or zooming the images. The commands are fairly simple with Keras.

# Named inputs/outputs to avoid shadowing Python's built-in input()
inputs = tf.keras.Input(shape=(224, 224, 3))
# Augmentation layers: active during training only
x = tfl.RandomFlip("horizontal")(inputs)
x = tfl.RandomRotation(0.2)(x)
x = tfl.RandomZoom(0.2)(x)
x = tfl.Rescaling(1./255)(x)
x = tfl.Conv2D(32,7,padding='same',activation='relu')(x)
x = tfl.MaxPooling2D()(x)
x = tfl.Conv2D(64,7,padding='same',activation='relu')(x)
x = tfl.MaxPooling2D()(x)
x = tfl.Conv2D(128,7,padding='same',activation='relu')(x)
x = tfl.MaxPooling2D()(x)
x = tfl.Flatten()(x)
x = tfl.Dense(1024,activation='relu')(x)
x = tfl.Dense(512,activation='relu')(x)
outputs = tfl.Dense(15,activation = "softmax")(x)

model_dataaug = tf.keras.Model(inputs,outputs)
model_dataaug.summary()

Model	Epochs	Train Accuracy	Test/Validation Accuracy
Data Augmentation	10	95%	95%
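
To make the augmentation settings more concrete, here is a small NumPy sketch of a random horizontal flip, plus the arithmetic behind RandomRotation(0.2). Keras documents that factor as a fraction of a full turn (2*pi), so 0.2 means rotations of up to 72 degrees in either direction. The helper functions are my own illustration, not the Keras implementation.

```python
import numpy as np

def random_horizontal_flip(image, rng, p=0.5):
    """Mirror an (H, W, C) image left-to-right with probability p."""
    if rng.random() < p:
        return image[:, ::-1, :]
    return image

def rotation_range_degrees(factor):
    """Maximum rotation, in degrees, for a factor given as a fraction of a full turn."""
    return factor * 360.0

rng = np.random.default_rng(0)
image = np.arange(12).reshape(2, 2, 3)
flipped = random_horizontal_flip(image, rng, p=1.0)  # p=1.0 always flips
max_angle = rotation_range_degrees(0.2)
```

Because a fresh random transform is drawn for every batch, the network rarely sees exactly the same image twice, which makes memorising the training set harder.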

Wow!!! We got a very decent and balanced model, with 95% accuracy on both train and test, using data augmentation. Now, let's explore transfer learning.

Transfer Learning with Custom Layers

Earlier, transfer learning with MobileNetV2 without custom layers reached an accuracy of 97% on the test and dev sets. So, I added two custom Dense layers with 1024 and 512 neurons respectively.

Here is the network structure of the transfer learning model.

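inputs = tf.keras.Input(shape=(224, 224, 3))
# Note: Keras' pretrained MobileNetV2 expects inputs scaled to [-1, 1]
# (see mobilenet_v2.preprocess_input); the [0, 1] scaling is kept as in the experiment
x = tfl.Rescaling(1./255)(inputs)
x = tfl.RandomFlip("horizontal")(x)
x = tfl.RandomRotation(0.2)(x)
x = tfl.RandomZoom(0.2)(x)
# mobilenet_model is the pretrained MobileNetV2 base (its creation is not shown here);
# training=False keeps its batch-norm layers in inference mode
x = mobilenet_model(x, training=False)
x = tfl.Dropout(0.2)(x)
x = tfl.Flatten()(x)
# Two custom Dense layers, each followed by dropout
x = tfl.Dense(1024, activation='relu')(x)
x = tfl.Dropout(0.2)(x)
x = tfl.Dense(512, activation='relu')(x)
x = tfl.Dropout(0.2)(x)
outputs = tfl.Dense(15)(x)
model = tf.keras.Model(inputs, outputs)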
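
Unlike the earlier models, the final Dense(15) layer here has no softmax activation, so the model outputs raw scores (logits) rather than probabilities. Assuming that is intentional, the loss would typically be created with from_logits=True, and probabilities can be recovered afterwards with a softmax. Below is a minimal NumPy sketch of that last step; the logits are made-up numbers for illustration, not model outputs.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

logits = np.array([[2.0, 0.5, -1.0]])   # hypothetical scores for 3 classes
probs = softmax(logits)
predicted_class = int(np.argmax(probs, axis=-1)[0])
```

The predicted class is the same whether argmax is taken on the logits or on the probabilities; the softmax only matters when the confidence values themselves are needed.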

And WOW!!! This model gave 100% accuracy. Woohoo!!!!

Model	Epochs	Train Accuracy	Test/Validation Accuracy
Transfer_Learning	15	100%	100%

Below is the detailed classification matrix for this model on the test data.

Conclusion

It is wonderful to learn that CNN models work far better for images than the traditional models tried here. On top of that, transfer learning with custom layers proved to be a quick way to reach even higher accuracy. I am so happy to get 100% accuracy on my first exploration of an image dataset.