Facial expression recognition is a software technology in which a computer reads biometric data from a face and detects the emotion it expresses. The emotion detection process involves locating the faces in an image and then classifying the emotion of each detected face.
This is important because systems can then detect human emotions and adapt their responses and behavior accordingly, making interactions more natural. Emotion detection has many applications in the field of Computer Vision:
Automobile Industry:- Analyzing the driver's face to detect whether they are tired or sleepy, and notifying them to take a break.
Hospitals:- Helping doctors gauge how much pain a patient is feeling.
Online Interviews:- Employee morale can be gauged by recording and analyzing interactions on the job. As an HR tool, it can help not only in devising recruiting strategies but also in designing HR policies.
Testing video games:- Video games are designed to target a specific audience. While users play-test a game, their facial emotions are recorded and analyzed, which lets designers detect at which points different emotions are experienced.
It is widely supported by the scientific community that there are 7 basic or universal emotions: angry, disgust, fear, happy, sad, surprise, and neutral. These are also the 7 classes of the FER dataset that our model will learn to predict.
Importing the Libraries
import cv2
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from google.colab.patches import cv2_imshow
import zipfile
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten, BatchNormalization
Loading the Dataset
You can download the dataset fer_images.zip from the link provided. After downloading, upload it to your Google Drive so that you can access it from your Colab notebook.
You need to mount your Google Drive in the Colab notebook you are working on. It can be done easily by typing the command drive.mount('/content/drive').
After the drive is successfully mounted, copy the path of the zip file stored in your drive and extract it using zip_object.extractall('/').
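As a sketch, the whole mount-and-extract step might look like the following (the zip path is an assumption; adjust it to wherever you uploaded fer_images.zip in your Drive):

from google.colab import drive
import zipfile

# mount Google Drive into the Colab filesystem
drive.mount('/content/drive')

# hypothetical path to the uploaded zip file
path = '/content/drive/MyDrive/fer_images.zip'
zip_object = zipfile.ZipFile(file=path, mode='r')
zip_object.extractall('/')
zip_object.close()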
Generate Train and Test Dataset
We can create the train and test sets for the model by loading the data from their respective directories, using a dedicated generator for each of them. The first step is to create an image generator.
# We have previously imported ImageDataGenerator, so you don't have to import it again
#from tensorflow.keras.preprocessing.image import ImageDataGenerator
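As a sketch, the generators can be built as shown below. The directory names fer_images/train and fer_images/validation are assumptions based on how the zip extracts; adjust them to match your folder structure. Note shuffle=False on the test set, which keeps test_dataset.classes aligned with the order of the predictions we make later.

# generator for the training set, with light augmentation
train_generator = ImageDataGenerator(rescale=1./255,
                                     rotation_range=10,
                                     zoom_range=0.2,
                                     horizontal_flip=True)
train_dataset = train_generator.flow_from_directory('fer_images/train',
                                                    target_size=(48, 48),
                                                    batch_size=16,
                                                    class_mode='categorical',
                                                    shuffle=True)

# generator for the test set, rescaling only
test_generator = ImageDataGenerator(rescale=1./255)
test_dataset = test_generator.flow_from_directory('fer_images/validation',
                                                  target_size=(48, 48),
                                                  batch_size=1,
                                                  class_mode='categorical',
                                                  shuffle=False)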
Now that the train and test sets are properly loaded, we can proceed with building the neural network.
Building and Training the Convolutional Neural Network
Before building the neural network, you need to understand how a neural network works, which layers are involved, and a few other terms such as feature detector, feature map, and max-pooling. You can get a quick glimpse of these terms from our previous tutorials on Basics of Neural Networks and CNN for Image Classification.
To provide a quick recap, the flow through a convolutional block is shown below:
feature maps + padding (with parameter 'valid' or 'same') ---> max-pooling
max-pooling ---> flattening
flattening ---> input layer
For expression recognition, we need to add a few more convolutional layers, since the network must extract only the most important features from the image. This reduces the number of neurons reaching the input layer of the dense part, which in turn keeps the network's complexity manageable as hidden layers are added.
To achieve this, we start by stacking repeated convolutional layers with an increasing number of feature maps in each layer. The kernel size is set to (3, 3) with the relu activation function for each layer.
For every layer we build, we apply batch normalization. BatchNormalization() applies a transformation that keeps the mean of the layer's outputs close to 0 and their standard deviation close to 1.
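Putting these pieces together, a first convolutional block might look like the sketch below. The filter count of 32 and the (48, 48, 3) input shape are choices made for this sketch, not requirements of the dataset.

network = Sequential()

# two convolutional layers with 32 feature maps each, batch-normalized,
# followed by a max-pooling layer
network.add(Conv2D(32, (3, 3), activation='relu', padding='same',
                   input_shape=(48, 48, 3)))
network.add(BatchNormalization())
network.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
network.add(BatchNormalization())
network.add(MaxPooling2D(pool_size=(2, 2)))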
As we keep adding convolutional layers, we apply the Dropout() function to avoid overfitting in the network. In our example, we remove 20% of the neurons, which reduces the network's complexity.
# dropout layer
network.add(Dropout(0.2))
We can keep adding convolutional blocks to achieve a better accuracy rate. Once we have the final max-pooled feature maps, we apply the flattening layer. To add a hidden layer we use network.add(Dense(...)); applying batch normalization and dropout works the same way for each hidden layer we add.
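A sketch of this classifier head, assuming 256 units for the hidden layer (an arbitrary choice):

# flatten the last max-pooled feature maps into a vector
network.add(Flatten())

# hidden layer with the same batch normalization + dropout pattern
network.add(Dense(units=256, activation='relu'))
network.add(BatchNormalization())
network.add(Dropout(0.2))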
Finally, when adding the output layer, we need to use the softmax function. Softmax returns, for each image, a list of scores indicating how closely the image matches each class; the maximum of that list determines which class the image belongs to. For this purpose, we define 7 units in the output layer.
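In code, the output layer looks like this:

# 7 units, one per expression class, with softmax scores
network.add(Dense(units=7, activation='softmax'))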
Once our neural network is completely built, we can compile it and train the model. One complete pass of the model over the training data is one epoch, so if the model runs 5 times from start to end, that is 5 epochs. Here we run our model for 70 epochs to reach the required accuracy rate.
Based on the training results we can change the number of epochs. Even with this simple architecture, training for 70 epochs takes quite a bit of time.
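A sketch of compiling and training the network. The Adam optimizer is an assumption; categorical cross-entropy matches the one-hot labels produced by the generators.

network.compile(optimizer='Adam',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
network.fit(train_dataset, epochs=70)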
Once the network is trained, it can be saved and loaded again later to use it on new data. To save the network, you can check out our previous tutorial on saving and loading neural networks, where we perform the save and load operations at the end of the tutorial.
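As a minimal sketch (the filename here is hypothetical):

# save the trained network to disk and load it back later
network.save('expression_model.h5')  # hypothetical filename
loaded_network = tf.keras.models.load_model('expression_model.h5')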
Next we predict on the test_dataset to determine the class scores of each image, and then check the classification report between the test_dataset labels and the predictions.
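network.predict returns one row of softmax scores per image, and np.argmax picks the index of the winning class in each row:

predictions = network.predict(test_dataset)   # one row of 7 scores per image
predictions = np.argmax(predictions, axis=1)  # index of the highest score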
from sklearn.metrics import accuracy_score
accuracy_score(test_dataset.classes, predictions)
# indicates that our model predicts the expressions of about 57% of the test images correctly
0.5778768459180831
To visualize the classification, we can draw a heatmap of the confusion matrix. A confusion matrix describes the performance of the model on the given test data; its heatmap shows, for each true class, how many images were predicted as each class.
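A sketch using sklearn's confusion_matrix together with the seaborn heatmap:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(test_dataset.classes, predictions)
sns.heatmap(cm, annot=True, fmt='d')  # counts per (true class, predicted class)
plt.show()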
To get a clear report of how the network has classified the images into the different classes, we use a classification report. It's very interesting to go through the values:
Support is the number of test images that actually belong to a particular class.
Recall is the fraction of the images of a class that the model correctly identified as belonging to that class.
Precision is the fraction of the images assigned to a class by the model that actually belong to it.
from sklearn.metrics import classification_report
print(classification_report(test_dataset.classes, predictions))