Emotion detection with Python, OpenCV and Scikit Learn | Mediapipe | Landmarks classification

Video Statistics and Information

Captions
In today's tutorial we are going to build an emotion classifier: we are going to classify a person's face into one of three emotions, happy, sad and surprised. Let me show you how it works. This is my webcam: this is happy, this is sad, and this is surprised. Now let's get started, and let me show you how to build this emotion classifier.

First, the data we will be using. We're going to use the same dataset we created in one of my previous tutorials, where I showed you how to create a synthetic emotion-recognition dataset. You can see we have four different categories: angry, happy, sad and surprised. Let me show you a few examples of each. In the happy category we have many pictures of many people who look very happy; all of them are smiling. In the sad category everyone looks very sad, and in the surprised category everyone looks very surprised. This is the data I am going to use in this tutorial, but you can obviously use any other dataset you want; as long as you have many pictures of people showing different facial expressions, any emotion-recognition dataset will work. If you want to follow along and use exactly the same data I am going to use, you have two options: you can generate the data yourself by following all the steps in that previous tutorial, or you can just download this dataset
which is going to be available on my Patreon for all of my Patreon supporters. So those are your two options if you want to use exactly the same dataset I'm using today.

Now let me show you the steps we are going to take to complete this process. The first step will be to clean the data: this is a great dataset we created with Stable Diffusion, but we are going to clean it to make it even better. Then we are going to prepare the data for training; the next step will be to train the model we will use for emotion recognition; and the final step will be to test the model, which I'm going to show you how to do using your webcam. Those are the four steps, so let's get started with the data cleaning.

The first thing we're going to do is remove one of the categories. I have been inspecting all the images and I noticed that the angry category looks very similar to the sad category; the images in these two categories look so alike that it would be very hard for the classifier to tell them apart. So we are just going to remove the angry category, which leaves us with three categories: happy, sad and surprised. The next thing we will do to clean this dataset is remove the images that are not what we need to train this model. Remember, when we created this synthetic dataset we were trying to generate frontal faces of people looking at the camera, and although most of the pictures are like that, there are many other
pictures which are not. For example, this person over here is not looking at the camera, so this is not the type of picture we are looking for; the same happens with this other picture, and there are many more examples of pictures which, although they look surprised, are not really frontal and are not looking at the camera. We are also going to remove pictures that don't show the expression we were trying to generate. For example, in the surprised category, this person doesn't really look surprised; he looks more neutral, or maybe angry. So we are going to remove all the faces that don't clearly show the emotion we are looking for: this one doesn't look surprised, this one doesn't either, and so on. In short, we remove every picture that is not frontal, is not looking at the camera, or does not portray the emotion we were looking for. I'm just going to go through all these pictures and remove many of them... Okay, that's going to be enough: now all the remaining pictures look very surprised, and all of them are frontal pictures of people looking at the camera, so this is going to be perfect. Let's do exactly the same for the sad category... Okay, now all the pictures look sad and they all look frontal, so
everything is okay. Now let's continue with the happy category... and that's pretty much all for the happy category as well: all the pictures look frontal and all of them look happy.

Next we are going to balance the dataset. You can see we now have 194 images in the happy category, 160 in the sad category and 96 in the surprised category. We want all of our categories to have the same number of images, so we are going to remove images from the happy and the sad categories until both have 96 items, like the surprised category. In the happy category the only thing I'm going to do is select exactly 96 pictures... 54... 78... 97... 96. Okay, then I'm going to invert my selection and delete all the other pictures. Now we have 96 items in the happy category, and I'm going to do the same with the sad category, something like this... 94... 96...
then Edit, Invert Selection, and I'm just going to delete all these pictures. Okay, now we have created a balanced dataset, which is exactly what we need to train this classifier.

Now it's time to go to PyCharm, because it's time to start coding. We move on to the second step in this process: preparing the data for training. Remember, we are going to train this model using scikit-learn, and we are going to train it on the face landmarks, so let me show you a function that is going to be very important in this tutorial. It's called get_face_landmarks: we input an image, and the output is all the face landmarks of that image. This is going to make things much easier; this is meant to be a quick tutorial, so we are just going to use this function to extract all the landmarks from our faces.

I'm going to create a new file, prepare_data.py, which is the one we will use to prepare the data. Remember to install this project's requirements, which are scikit-learn, mediapipe, opencv and numpy; please do so before you get started, otherwise nothing is going to work. Now, the first thing we're going to do is import the function we have over here: from utils import
get_face_landmarks. Then we import os, and we also import OpenCV (cv2). We are going to iterate over all the images in these directories and extract the landmarks of every face, and by doing so we will prepare the data for training. The data directory is this one over here, ./data; remember, this is prepare_data.py, the script we are currently working in, and this is the directory with all the data. Back in PyCharm: for each emotion in os.listdir(data_dir), and for each image_path in os.listdir(os.path.join(data_dir, emotion)), we create a new variable, image_path, which is os.path.join(data_dir, emotion, image_path); then the only thing we do is read the image, image = cv2.imread(image_path), call get_face_landmarks, and store the result in face_landmarks. So we're just using this function to extract all the landmarks.

Let me show you how it works. We input an image; the first thing we do is convert the image from BGR to RGB; then we create a face_mesh object using mediapipe, use that object to process the image, and extract all the landmarks from the result. You can see it's a very straightforward process, but using this helper saves us some time. Something very important: for every single landmark we are
extracting three values, the X, Y and Z coordinates. Something else which is very important: for every coordinate we subtract the minimum value of that coordinate, because this helps us normalize the data. It's a way of making sure we only care about the landmarks themselves and not where the face is located. You can see all of these faces are in different positions: one is in the upper part of the image, another is much more central, another is on the right. Subtracting the per-axis minimum removes that positional information, normalizes the data, and helps us train a much more robust model.

One more thing I can show you about this function is this part over here, which takes care of the drawing. We are not going to draw anything during data preparation, but we will use exactly the same function to draw the face landmarks later on, in the last part of this process, when it's time to test the model. To make sure everything is okay, I'm just going to execute this script... and everything seems to be fine. Now I'm going to print the length of the landmarks array and let's see what happens...
and you can see it prints 1404. That's how many values we have in this list, the one we are going to use to train the model, and it's a lot of values. We could do it with far fewer, because if you think about this problem, most of the information is in the landmarks of the lips and the eyes, so we don't really need every landmark of the face. If I'm not mistaken, mediapipe extracts 468 landmarks, and 468 landmarks times 3 coordinates is exactly 1404 values; we don't really need so many for this problem, but to keep things simple we're just going to use all of them.

Now let's make sure we are always working with the full set of values, because there could be situations in which we don't detect any face whatsoever, and the list would be empty, and there could potentially be situations in which we detect fewer landmarks; we don't know what's going to happen. So, to be safe, we add a check: only if we have 1404 values do we continue, and then we create a new list, output, and append face_landmarks to it. We also obviously need the label, otherwise we won't be able to train the model, so I change the loop to for emotion_idx, emotion in enumerate(...): we do exactly the same as before, but now we extract not only
the emotion name (happy, sad, surprised) but also an integer index for each emotion. So we append one more element to the landmarks list, the emotion index, and now the last value in each row is the emotion label; then we append the whole row to the output variable, which is the one we will use to train the model. Just to make sure it's an integer, I cast the label to int. Finally, the only thing I do is call np.savetxt, name the file data.txt, and pass in the output variable converted to a numpy array first; and I obviously need to import numpy as np, otherwise this is not going to work.

If I'm not mistaken, that's pretty much all, so now I'm just going to run this script to create all the data. One thing first: I'm going to quit my browser, because I have noticed that every time I execute this script it takes a lot of memory, a lot, so closing the browser makes the run much smoother. Now I'm just going to press play...
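Put together, the preparation script might look roughly like this. It's a sketch: the data directory name, the utils import path and the file layout are assumptions based on the walkthrough, and cv2 and the helper are imported lazily so the sketch loads without them:

```python
import os

import numpy as np


def prepare_data(data_dir='./data', out_file='data.txt'):
    # Each row written to out_file: 1404 landmark values + 1 integer label.
    import cv2                               # lazy: only needed to read images
    from utils import get_face_landmarks     # the tutorial's helper

    output = []
    # sorted() pins the label order (alphabetical: happy=0, sad=1, surprised=2)
    for emotion_idx, emotion in enumerate(sorted(os.listdir(data_dir))):
        for image_name in os.listdir(os.path.join(data_dir, emotion)):
            image = cv2.imread(os.path.join(data_dir, emotion, image_name))
            face_landmarks = get_face_landmarks(image)
            if len(face_landmarks) == 1404:  # skip images with no detected face
                output.append(face_landmarks + [int(emotion_idx)])
    np.savetxt(out_file, np.asarray(output))
```

The savetxt/loadtxt round trip preserves the row layout, so the training script can recover the label from the last column of each row.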
Okay, that's pretty much all, and if I go to my directory you can see a new file called data.txt, which is exactly the file we need to train the model. So now I'm going to create a new file, train_model.py, the one we will use to train this classifier; remember, we are going to use scikit-learn. We could write all the code ourselves, but this is a very good problem on which to use ChatGPT and just ask it for the code we need, so let me show you how to use ChatGPT for a problem like this. Programming is changing, it's evolving, so it's very important to get familiar with the tools that make our lives easier, and this is a very good situation in which to use ChatGPT. I'm going to ask it something like this: "Give me Python code to train a model using scikit-learn's random forest classifier. The data needs to be loaded from a txt file called data.txt. Each row is a different data point. The label is the last value." Let's see what happens... Okay, this is what we got, and since I have already prepared this tutorial I know what the code should look like: this is exactly what we need, so I'm just going to copy and paste it over here. You can see we are importing numpy and scikit-learn; the file is called data.txt; we are loading the file; the label is the last value, and all the other values are the features, so everything is okay; then we are splitting the data with train_test_split. Let me turn off the editor's inlay hints... okay. I'm going to do one more thing, which is stratifying the split
according to the labels, which are y. This is very important because we want to make sure the training set and the test set have the same proportion of each of our categories; I'll also obviously set shuffle=True. Then we create the random forest classifier without any parameters, just the defaults, and other than that everything is exactly what we need. Again, this is a very important tool: programming is changing, it's evolving, so you need to be familiar with ChatGPT to make your life easier. I'm just going to execute this script as it is; I'll close the browser again so we can run it without any issues... Actually, this training process is not really that memory-hungry, we're not consuming that many resources, so we didn't really need to close the browser, but it doesn't matter. You can see we're getting an accuracy of 77%. Let's do something else: we are going to import confusion_matrix and print the confusion matrix as well, confusion_matrix(y_test, y_pred), so we have even more information about what's going on...
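The training script described here might look like the sketch below. To keep it runnable on its own, a small synthetic dataset stands in for the real landmark file; in the actual project you would replace those lines with data = np.loadtxt('data.txt'), X = data[:, :-1], y = data[:, -1]:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Stand-in data: 3 classes with shifted means, 100 samples each.
# (In the tutorial each row is 1404 landmark values plus a label.)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12)) + np.repeat(np.arange(3), 100)[:, None]
y = np.repeat(np.arange(3), 100)

# stratify=y keeps the class proportions identical in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, shuffle=True, random_state=42)

model = RandomForestClassifier()      # default hyperparameters, as in the video
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f'accuracy: {acc:.2f}')
print(cm)
```

confusion_matrix orders its rows and columns by sorted label value, which is why sorting the emotion folders alphabetically during preparation makes the matrix easy to read.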
Okay, so these are the results, and you can see that we are training a very good classifier: for each of our categories we have an almost perfect result. This category over here has the best result, but in the two other categories we also get very good numbers. Now that I think about it, I'm not completely sure in what order this model lists the categories; I think it's alphabetical order, but just to make sure I'm going to prepare the data again, sorting the list of directories so the emotions are sorted alphabetically, and then train the model again... Okay, now we get very similar values, so this category is happy, this one is sad and this one is surprised. For all of our categories we have very good results, especially for the happy category, where the classification accuracy is highest, but the other two categories are also very good, and overall we have 77% accuracy, so this is a very good model.

That's pretty much all it takes to train this classifier; you can see how easy it was using ChatGPT. Let me explain very quickly what the script does: over here we load the data; here we split it into the features and the labels; here we call train_test_split to produce the training set and the test set; then we create the classifier, a random forest. And if you ask me why a random forest and not some other classifier: just because, there's no particular reason, it's simply a classifier I really like a
lot, because it performs very well; that's the only reason we're using it. Then we call the classifier's fit method, whose inputs are the training features and the training labels; then we call the predict method on the test data and compare the predictions against our ground truth, y_test; and we do the same over here with the confusion matrix. That's a very quick explanation of how this script works.

Now let's continue with the next step in this process: testing the model. This is where we use the webcam. I'm going to create a new file, test_model.py. We could use ChatGPT here as well, but it's not that much code, so I'm just going to write it myself. I import cv2 and create a capture object, cap = cv2.VideoCapture(2); I'm using my webcam with index 2, but in your case this depends on how many webcams are connected to your computer: if you have only one, use index 0, and if you have more than one, select the one you want to use for this project. Then ret, frame = cap.read(), and while ret we just read frames from the webcam one after the other and visualize them with cv2.imshow('frame', frame)...
and then we wait something like 25 milliseconds after showing each frame, with cv2.waitKey(25), which makes the visualization look like real time. At the end we call cap.release() and, if I'm not mistaken, cv2.destroyAllWindows().

Then we need the classifier, and there's something we didn't do earlier: save it. So back in the training script I import pickle and call pickle.dump with the model object, inside with open('./model', 'wb') as f; it's 'wb' because we are going to write the file as bytes. I train the model again... Okay, now we have saved the model; this is the model we have trained, and the only thing we need to do in test_model.py is load it: import pickle, then with open('./model', 'rb') as f, and model = pickle.load(f). Then we need to extract the landmarks, so from utils import get_face_landmarks, and we call get_face_landmarks(frame); we also need to set another argument, static_image_mode, to False, because now we are processing a video. We are not going to draw anything for now, so draw stays False; later on we will set it to True, but for now let's leave it like this. These are the face landmarks, and now we just call model.predict on them, and this is the output...
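The save/load round trip described above can be sketched like this (the './model' filename follows the video; any picklable object works the same way):

```python
import pickle


def save_model(model, path='./model'):
    # 'wb': pickle serializes to raw bytes, not text
    with open(path, 'wb') as f:
        pickle.dump(model, f)


def load_model(path='./model'):
    with open(path, 'rb') as f:
        return pickle.load(f)
```

Because pickle restores the full fitted object, the loaded classifier can call predict immediately, with no retraining.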
Let's print the output just to make sure everything is okay; we will also draw this value on the webcam frame, but let's take it one step at a time. I press play... Okay, we are getting predictions: right now we are predicting sad, which is index 1, and if I smile we predict 0, which is happy, so everything seems to be fine. Now we are going to draw the predicted emotion on the frame using cv2.putText. Its inputs are the frame; the text, for which I define a list, emotions, containing happy, sad and surprised; the position, something like x = 10 and, for the y position, frame.shape[0] - 1, because we want the text at the bottom of the frame; then the font, which I'm just going to copy and paste; the text size, which I set to 3; the color, green; and the thickness, which I set to 5. Oh, I made a mistake, this value actually goes over here... Okay, now everything should be fine. The text we draw is the emotion for the prediction, emotions[output[0]], and I'm going to cast that value to an integer, emotions[int(output[0])]. Now let's see if it works: sad... happy... and surprised...
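The test loop can be sketched as below. The drawing flag, static_image_mode and the utils helper follow the walkthrough, but the exact label strings, camera index and putText parameters are assumptions; cv2 and the helper are imported lazily, since the demo only runs with a webcam attached:

```python
EMOTIONS = ['HAPPY', 'SAD', 'SURPRISED']   # alphabetical, matching the label order


def label_for(prediction):
    # model.predict returns an array like [1.0]; map it back to a name
    return EMOTIONS[int(prediction[0])]


def run_webcam_demo(model, camera_index=0):
    import cv2                                 # lazy: only needed for the live demo
    from utils import get_face_landmarks       # the tutorial's helper

    cap = cv2.VideoCapture(camera_index)
    ret, frame = cap.read()
    while ret:
        face_landmarks = get_face_landmarks(frame, draw=True,
                                            static_image_mode=False)
        if len(face_landmarks) == 1404:        # only predict when a face was found
            text = label_for(model.predict([face_landmarks]))
            cv2.putText(frame, text, (10, frame.shape[0] - 1),
                        cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 255, 0), 5)
        cv2.imshow('frame', frame)
        cv2.waitKey(25)                        # ~25 ms per frame feels real-time
        ret, frame = cap.read()
    cap.release()
    cv2.destroyAllWindows()
```

Guarding on the 1404-value check mirrors the preparation script: frames where mediapipe finds no face simply pass through undrawn instead of crashing predict.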
So everything is working just fine. Now let's do one more thing, which is plotting all the landmarks, and this is very simple: the only thing we need to do is set draw=True. Let's see... and you can see that now we are also visualizing all the landmarks on my face. Let's try again: this is happy... sad... and surprised... And that's going to be all for this tutorial. My name is Felipe, I'm a computer vision engineer, and this is exactly how you can build an emotion classifier using Python, scikit-learn and mediapipe. This is going to be all for today, and see you in my next video.
Info
Channel: Computer vision engineer
Views: 1,669
Id: h0LoewzGzhc
Length: 34min 42sec (2082 seconds)
Published: Thu Feb 01 2024