Image Augmentation (Improve your Dataset) | with Imgaug, Opencv and Python

Video Statistics and Information

Captions

This happens: we started with just three simple images of three different dogs, and after the augmentation this is what we get. We run the code and we get the same dog rotated, with an affine transformation, with higher contrast, with some blur effects and all, for these dogs, and you can see...

Hi, welcome to this new video tutorial. My name is Sergio, and I help companies, students and freelancers to use and apply visual recognition in their projects. Today we are going to see image augmentation and how powerful this technique is for improving a custom object detector. If for some reason the detector is not performing well, with image augmentation we can dramatically improve it. In the first part of this video we're going to see how it works and what the concept behind image augmentation is. In the second part we will see how to implement image augmentation from scratch using the imgaug library for Python. So if you're ready, let's start.

To explain the idea and the concept behind image augmentation, I want to start with something really simple. I took the image of a dog. Actually I have a few dogs, just three, but let's suppose that we have a lot of dogs like the ones in these images, and we want to create a custom detector to detect dogs in an image. If we give the deep learning model only dogs which are frontal and horizontal, like this, it will be able to detect only this kind of dog. To give you an idea: if, after the training, you give the detector this kind of dog, it will not perform well just because the dog is rotated. It will not be able to understand that this is a dog that is simply rotated. Also, if you give it an image which has slightly different colors, it might have some problems detecting the dog. So image augmentation means applying a lot of filters, which work in different ways: they perform a rotation, they zoom in on each image, they cut
some part of the image, they change the brightness of the image, they change the color of the image, and so on. There are so many methods and filters that we can apply, but the most commonly used are just five or six, so we're going to see only the most common ones, and I will also give you some references on how to find other methods.

Now that you know the concept behind this, let's start writing the code and see how to apply image augmentation. Before writing the code, we need a library, which is imgaug. So first of all, if you don't have it, you need to install the library. We open the terminal: on Windows we type cmd, and on Ubuntu Linux or on the Mac you need to open the terminal, similar to what I'm doing on Windows. Then we install the Python library with a simple command: pip install imgaug, and then we press Enter. This library requires a few dependencies, like OpenCV, NumPy and a few other libraries. If they are not installed, don't worry: they will be installed automatically with just this simple command. Just know that if you don't have them, the installation might take a while. I have everything already installed, which is why in my case all the requirements are already satisfied.

Once we have the library, we're going to import it: import imgaug.augmenters, because that's what we're going to use right now, as iaa. This is just to make the name shorter, as it's too long to repeat the same name over and over, and this is the standard way it's used. I'm not going to use this library right away; I'm just going to run this to make sure that the library is installed correctly. If you run this and you don't get any error, it means that the library is installed correctly. If you get an error, you need to find out why the library was not installed. Now, the second thing that I'm
going to do is to import the images that I have right here in the folder. As I told you, I have three dogs. Of course, I'm using the dogs; you can use multiple images from your dataset, or if you don't have any images, you can download all the files that I'm using here from the link in the blog post below.

Now I will start by loading one image, and then we will move on from that: img = cv2.imread. I forgot to import cv2, so of course we need to import OpenCV: import cv2. OpenCV is the library for image processing and manipulation, and we need it just to load the image. With cv2.imread, what do we want to read? We want to read the image which is on this path: images, and then dog.jpg. To make sure that we're loading the image correctly, let's show it on the screen: cv2.imshow, "Image", and what do we want to show? We want to show img. Now we need a waitKey to keep the image on hold: cv2.waitKey(0). Let's run this, and here I have the first image of the dog.

Now, considering that we have multiple images in one folder, we want to load multiple images, because our dataset can never be just one image; we need to have a lot of images. So instead of loading the second one this way, img2 and then the second image, since we cannot put one line per image, we're going to take all the images that are in that specific folder. How do we do that? I'm going to import glob. This library is going to give me the path of all the images. We can say images_path = glob.glob, and we want to find all the images that are in the folder images, so we say all the files ending in .jpg, that is, all the images in JPG format. If I print images_path, we should get all the images that are in that specific folder, so you need to put your own path. I have three images, so I'm going to get just three paths, one per image. So let's call this step one: load dataset.
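The "load dataset" step can be sketched roughly as follows. This is a minimal sketch, not the exact code typed in the video: the helper names `list_image_paths` and `load_images` are made up for illustration, and the folder name `images` and the `.jpg` extension are taken from the narration. It relies on the fact that `cv2.imread` returns `None` for a file it cannot read, and skips such files.

```python
import glob
import os


def list_image_paths(folder="images", extension="jpg"):
    """Return the sorted paths of all files in `folder` with the given extension."""
    pattern = os.path.join(folder, f"*.{extension}")
    return sorted(glob.glob(pattern))


def load_images(paths):
    """Read every path with OpenCV, skipping files that cannot be decoded."""
    import cv2  # imported here so that listing paths works even without OpenCV

    images = []
    for path in paths:
        img = cv2.imread(path)  # cv2.imread returns None if the file is unreadable
        if img is None:
            print(f"Skipping unreadable file: {path}")
            continue
        images.append(img)
    return images
```

A typical call would be `images = load_images(list_image_paths("images"))`, which gives one array per image in a plain Python list.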
Once we have this, and it's still the first step, let's loop through them: for img_path in images_path, so that we can load them one by one. Instead of the single path, we're going to put img_path, the one from the loop. Let me show what we get right now: we have the first image, the second and the third, and that's all, because I have only three images. Once we have these images, let's put them all into one array, so that later we can load them with the imgaug library. [Music]

How do we do that? We need to create an empty array, images, so at the beginning it's empty. Each time we load an image, we put the image inside this array: images.append(img). So we take all the paths of the images, we load them one by one right here, we put them inside images one by one, and then we can move further: we have all the images in one array.

Now let's move to the image augmentation. I know that it took me a while to get to the point, but I couldn't skip this part, because you need to know how to properly load the images before performing the augmentation; otherwise you can get lost if you have never done anything like this. That's why I took some time for these lines. So, step two: image augmentation. Image augmentation starts this way: augmentation = iaa.Sequential, and right here we need to put all the image augmentation methods that we want to apply. I'm going to skip this step for now.

Let's show the images, so that we can see in real time what is really happening there. So we go on to step three: show images. I'm going to loop through the images again so that I can show them, for now as they are, and then we will see the difference when we apply the augmentation. How do we do that? They are stored inside images, so if I do cv2.imshow, here we can put the name of the window. It doesn't matter what we call it; let's say "Image". And then what do we want to show? Let's show the first image from images. And now a waitKey, to keep the image on hold: cv2.waitKey. We always need this in OpenCV when we show something. If I run this, that's the image that was stored in images. If I show the one with index one, we get the second image; if I show index two, we get the third image. Of course we can loop through them, so we can do for img in images, then show img: one, two and three.

Okay. Now the idea is to apply different augmentation methods here, in this space, so that we can see them in real time: we apply the method, we run the code, and we see what the method is doing to the specific image. To do that, let's start with a method that does the augmentation. Let's just rotate the image: iaa.Rotate, and with the rotation method we can give a range, which goes from minus 30 to 30, which is the default. Let's see what this is doing. If I run this, you still don't see anything, because we are just loading the images but not applying the method. Let me show why: we load the images, we define an augmentation here, and then we're just displaying the images, but we're not applying the augmentation to them. That's why we don't see it. So let's now apply the augmentation. We can do that right here: augmented_images = augmentation, and where do we want to apply this augmentation? We want to apply it to the images, so images=images. And now, instead of displaying the original images, let's display the augmented images and run this. As you see, each image that we see right here is rotated. And how does the rotation happen? Well, it's
pretty much random: it picks from the range that we set, which in this case was from minus 30 to 30 degrees. This was just to test; let's try a different example. I'm going to use the five most common augmentation methods.

The first one is the flip. Flipping the images means flipping them either horizontally, like a mirror effect, where you flip everything this way, or vertically, this way. So how does this work? iaa.Flip... we have the horizontal flip, which is Fliplr, and let's apply this horizontal flip like this. If I go to the original image, let me find the original image... okay, here I've got the original image. Now you can see that this image has the flip effect, so it's a mirror image, and the same goes for the other images that I have. You see, here is the dog with the horizontal flip, and also the other dog. What is this value? This value is the probability with which the flip is applied to the images. Generally we wouldn't always do this flip; commonly it is set to 0.5, so half of the time. Otherwise we would always see the image flipped to one side, and that doesn't make sense: we need it sometimes normal, sometimes mirrored. So if I run this, we might see that now, for example, it's flipped, but sometimes it's just the original as it was before. We can do the same with the vertical flip: iaa.Flipud, this is the vertical one, and we can also use 0.5 for this one. Let's make sure that we add a comma after each method. Okay, let's run this again. As you see, in this case we have the vertical flip as well. Not always, so I'm just running this again. Okay, in this case we have the flip. All these methods are applied together, so it can happen that we have the horizontal and the vertical flip at the same time.

Then we have the affine transformation, so everything that regards the affine transform. Here we have different methods at once: there is the
translation of the image on the x-axis and on the y-axis, we have the rotation, which is included in this method, and something else which we will see. So let's make number two, affine: iaa.Affine. Let's start with the translation on the axes, using translate_percent. How does this work? We need to pass a dictionary, and then we define x for the x-axis and choose the translation on the x-axis. I'm going to put some random values, and later we will adjust them and explain what this does. Let's start from minus 0.5 to 0.5. If we run this, you see that we have the translation of the image, so it's moving to one side, like this, by anything from minus 0.5 to 0.5. It's random, from minus half of the image to plus half of the image. The values commonly used are from minus 0.2 to 0.2, so from 20 percent of the image on one side to at most 20 percent of the image on the other side. Here it's minus 0.2 on the left side, and when it's on the right side it's 0.2, like this one. The same goes for the y-axis, so we also add y, with the same range from minus 0.2 to 0.2, like this.

With Affine, we now use the translation as a percentage, from minus 0.2 to 0.2, so 20 percent of the image, but we also have the translation in pixels. In case you have some specific object with a definite number of pixels, you might use this different method; most commonly it is not used with pixels.

Now rotate, so let's also apply the rotation: rotate. We need to define some degrees, let's say from minus 30 to 30 degrees, and let's run this. Here we have, of course, a random rotation, so that the same object can be detected by the deep learning model no matter what the rotation is, like this. Of course, we don't need to make a full 360 degree rotation, because we have the 30 degree rotation but we also have the flips: horizontal flip,
vertical flip. So if we combine the flips with the rotation, we already cover many more angles than minus 30 to 30. Then there is another one. Let's check among the ones that we have right here: we also have the scaling, so scale. Let's make it from 0.5 to 1.5. Here the image either shrinks, so zoom out, or we have some zoom in. This is really useful to train the neural network to adapt to different sizes of the same object, because even if you have some standard size in your dataset, this gives a huge variety of sizes. And this is enough for the affine transform.

Let's go now to the third really common method, which is Multiply. Multiply has one specific purpose: it multiplies the channels, which means it's going to make the image either brighter or darker. iaa.Multiply, and here too we need to set some limits; by default we have from 0.8 to 1.2. Let's run this. Of course, let's always put a comma between each method, otherwise we get an error. It's making the image either brighter or darker. You probably won't see that right here, because there isn't much of a difference, so let's increase the values just so it's clear what is happening, and then we will put them back to standard: let's say a much wider range, up to 1.7. You can see how bright this image is now, and also this one. It's probably too bright, and that's why we will put back the standard values later, but I wanted to show how this works. And now it's really dark. They are all either dark or bright. So, the standard: from 0.8 to 1.2.

Now the third method, no, the fourth method: linear contrast. LinearContrast has one goal: to change the contrast of the image. iaa.Multiply, no, iaa.LinearContrast. For this one too we go with the standard, which is from 0.6 to 1.4. Let's put a comma before that one.
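To make the arithmetic behind the brightness and contrast steps concrete, here is a small NumPy sketch. It is an illustration under assumptions, not imgaug's actual implementation: the function names are hypothetical, values are rounded and clipped to the 0..255 range of 8-bit images, and the contrast formula `127 + alpha * (v - 127)` is the one given in the imgaug documentation for linear contrast.

```python
import numpy as np


def multiply_brightness(image, factor):
    """Scale every channel value by `factor`, then round and clip to 0..255."""
    scaled = image.astype(np.float64) * factor
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


def linear_contrast(image, alpha):
    """Push values away from mid-gray 127 (alpha > 1) or pull them toward it (alpha < 1)."""
    adjusted = 127 + alpha * (image.astype(np.float64) - 127)
    return np.clip(np.rint(adjusted), 0, 255).astype(np.uint8)


# One pixel with three channel values, as a stand-in for a real image.
pixel = np.array([[[50, 127, 200]]], dtype=np.uint8)

brighter = multiply_brightness(pixel, 1.2)   # factor above 1 brightens
darker = multiply_brightness(pixel, 0.8)     # factor below 1 darkens
more_contrast = linear_contrast(pixel, 1.4)  # dark gets darker, bright gets brighter
less_contrast = linear_contrast(pixel, 0.6)  # everything moves toward mid-gray
```

With a range such as 0.8 to 1.2, a factor is drawn at random for each image, so some copies come out slightly darker and others slightly brighter than the original.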
And let's run this: this one is changing the contrast.

Now let's go with the last one, which is the Gaussian blur. The objects might have some blur. For example, when you are working with a camera and you take a picture, the object in the foreground is in focus while the rest is blurred. So if your dataset also contains some blurring, it will improve the detection of objects that are out of focus, which will be detected even when blurred. Let's first of all add a comma after the previous one, and then iaa.GaussianBlur, and here too we go with the standard values, from 0.0 to 3.0.

And there is an extra one, the last one, which is not an augmentation method itself but makes the blur happen only sometimes. So we can add a comment, let's say "perform methods below only sometimes": iaa.Sometimes. And how often do we want to apply them? We can say half of the time, so 0.5. The more you increase this value, the more often these methods are performed; the lower it is, the less often they are performed. So the blur is applied only half of the time. Let's run this.

Now, one last thing, to give you an idea of how the deep neural network will see this when training: I'm going to put everything here, in the show step, in a loop, so while True, and we also need to put the augmented images inside the loop. The idea is that now, when I run this, we will see these images non-stop: after the three images are displayed, we run a new augmentation and we display the three new augmented images, again and again. So this happens: we started with just three simple images of three different dogs, and after the augmentation this is what we get. We run the code and we get the same dog rotated, with an affine transformation, with higher contrast, with some blur effects and all, for these dogs, and you
can see that there is no real limit to how much the dataset can grow using this method. So if you're experiencing a problem with your deep learning custom detector, this is one way to improve it. Of course, this was just one of the methods to improve the custom detector, and there is much more to it when you want to build a project, especially a professional one. If you're interested in more advanced material, or if you are a beginner and want to learn properly how to create a custom object detector from scratch, I recommend checking out my courses on pysource.com. This is all for this tutorial. See you in the next one.

Info

Channel: Pysource

Views: 2,927

Rating: 5 out of 5

Id: kM5jEGsRa4I

Length: 28min 11sec (1691 seconds)

Published: Tue Jun 08 2021