How person re-identification (RE-ID) works with Computer Vision | Opencv with Python

Captions
If we want to make sure, for example, that this person with the green T-shirt is still the same person in the next frame, how can we do person re-identification?

Hi, welcome to this new video. I'm Sergio, a computer vision consultant, developer and instructor at Pysource. We build computer vision solutions to help companies improve their process efficiency, reliability and scalability. Today I'm going to discuss person re-identification in retail stores and airports, because it is one of the most common requests I get lately: is it possible to track unique people inside stores, airports and other crowded places? Another question that I get a lot: can we build a solution that works on either a Raspberry Pi or an Nvidia Jetson Nano to do this? We're going to discuss everything in this video.

Let's start with the benefits of such solutions. Then we can get into some technical aspects and show a simple prototype of how the solution works. There are benefits for both sides: for the operator and for the clients or passengers.

Let's take the example of an airport. For the passengers, the benefit is a shorter waiting time because of better queue management. Another thing that I saw in some airports is that you get some predictability for your journey, for example the expected waiting time. I remember I was at Milan airport a few months ago and there was a huge queue before the security check. I was expecting it would take one hour or even more, because there were so many people, but the screen showed a waiting time of 15 to 20 minutes. That can be done with computer vision, because you can estimate how many people are there, considering the security checks that are open and the average waiting time. The information was very accurate, so this is a good benefit that passengers can get. Another thing is that with better queue management the passengers also get more safety, because you
can avoid overcrowded places. That's the client side.

How about the operator? For the operator there are also huge benefits. Whether it's an airport or a retail store, you can design a better layout thanks to the information that you get from such a solution (we'll get more into that information later in the video). You get real-time information, so you know how to allocate resources, whether it's staff or security, for example. And you get increased sales and revenue because of the better layout that you can build thanks to all that information.

Let's now get into some technical aspects of this solution. Based on the questions I get, I see that there is a lot of misunderstanding about what you can achieve. I also want to emphasize that person detection and person re-identification are two completely different things, and there is a huge gap between them in terms of what you can get and in terms of complexity.

I'm now going to show you some samples. I have a video from a retail store, and I'm going to run person detection on it. You might have seen this in a lot of AI and computer vision videos on YouTube or anywhere on the internet: with person detection we detect each person and surround them with a bounding box. What you see right now does not cover all of them. It's possible to improve the accuracy, but that's not the goal here. On most of the people, at least those that are well visible in this video, we see a bounding box, because we are detecting the person. This is person detection.

Person detection works frame by frame. It means that we process the video image by image, and image after image we draw a bounding box on each person. What does this mean? It means that we don't uniquely identify each single person, so on each frame we don't know if the person that we saw
before is the same person or a new person. Having a unique identification is a crucial step for such a solution. If you want, for example, to count how many people are entering a store, you can't rely on single detections, because you need to know who is crossing. If I let this video play, we want to make sure that, for example, this person with the green T-shirt in this frame is still the same person in the next frame. That was person detection.

A second step to improve on person detection is person tracking. Let's pay very close attention to the two people on the left side: the guy with the green T-shirt and the guy with the black T-shirt. Now we see that they have IDs associated: 31 for the guy with the green T-shirt and 33 for the guy with the black T-shirt. Let's follow them carefully while they are moving. The guy with the green T-shirt is 31. For a moment we lose track of him, then he reappears and still keeps ID 31, because the object tracking was following the motion and was able to do that despite the occlusion. So object tracking is a good approach. It's not a very basic algorithm; there is some sophisticated algorithm behind object tracking, and to some extent it works.

The other person now has 38: we lost the ID of that person. And if we go further, we see that now the ID of the guy with the green T-shirt is 42 and the one with the black T-shirt is 44, so we lost the ID of that person too. Let me pause this. What's the problem? If you want to track and identify a specific person, this cannot work, because now for only two people we got five different IDs. If we wanted to check information like the estimated waiting time of the people in the store, we would have five IDs for two people, so it's totally unreliable.

How can this problem be solved? The way to solve it is to correctly identify the person. The normal
approach used, for example, at security checks like passport control to identify a specific person is facial recognition. Facial recognition is not possible with such low-resolution images, especially from CCTV cameras: even if the camera has a higher resolution, the face itself always covers a very small number of pixels, and we need at least 100 by 100 pixels on the face to get reliable accuracy for unique facial recognition. So it's not possible in this case, and we need to find another way of doing person re-identification.

How can we do person re-identification, then? There are a few algorithms available, and I want to show the concept behind them and also a live implementation on this specific video.

Now we will do person re-identification. What does it do? When we detect the people, we extract the image of each person and store it. Then, when we get new IDs, we compare the IDs that we have. Let me give you an example. In the code that I'm using right now, I made a real-time extraction of the IDs that we get. For example, we have ID 42 and ID 44, and for each frame where they appear I extract the images. If I go to ID 44 over here, you see that for each frame where we detect 44 we have the person. If I click any of them and zoom in, we have this person. Let me go to ID 42: for 42 we have the other person, with the green T-shirt. If we go to any other ID, let's say ID 21, we have only one picture. I want to find something where we have more pictures, let's say 53. OK, for 53 we have another person.

With all these folders, what we want to do is compare them and see which IDs are the same. For example, we know that ID 42, this person with the green T-shirt, and ID 31 are the same. What
the algorithm is going to do is compare all of them and re-identify the same ID. So ideally an algorithm that's working well will give ID 31 again to the same person.

The specific algorithm that I'm using is OSNet. I'm going to leave some links down below so that you can read more about the research behind this algorithm. It is made specifically for person re-identification. The idea is that we can't get facial recognition of the person, but we can get the features of the person: that the person has a T-shirt of a certain color, or clothes of a certain color or type. With good camera placement we can also capture the same person from different angles, so it will be precise also when we see the person again from a different angle. If you take a front picture of the person and then try to re-identify the person from the back, of course that person will look different and it will not be reliable, so it's good to take all of this information into account.

I have implemented person re-identification in this code. In this case it's not in real time; it takes a while to process. Now I will press R to re-identify. What the code is doing is taking all the images for each single ID and processing them, and we now need to get back the ID for 42. You see, in this case it's working well: we get again the same IDs, 31 and 33, for the people that were originally detected with those IDs.

Of course, this is a very basic sample on this video. These solutions are very complex, because they take a lot of computational power. I'm using a very powerful graphics card for this operation, and it still takes many seconds, because it has to process a lot of frames for each single person to correctly make the re-identification. Another aspect to take into account is that
for this solution to be reliable, it needs to be trained on each single scenario. You can't just place a camera, apply a person re-identification algorithm, and expect it to work right away. You need to properly study the camera placement and do a training with a lot of data from that specific location; then you can have reliable tracking and identification of the person.

Let's now address hardware. Can you build such a solution on a Jetson Nano or a Raspberry Pi? I want to address this question because I get asked this a lot, not only for this specific project but for different projects, and what I notice is that either projects are underestimated in complexity or the Jetson Nano is overestimated in capacity. You cannot do this with the Jetson Nano, because it requires a lot of computational power. The Jetson Nano is a very nice small device that you can use for simple, small prototypes, but with its very limited capacity you cannot run such complex solutions on it.

There are some things you can do with the Jetson Nano. You can do object detection, you can also do object tracking, and it can be reliable. I can give you an example. Let me draw: let's say that you place a camera with a Jetson Nano right here, and the Jetson Nano looks only at people coming down these stairs. You see, there are stairs there. With the Jetson Nano you select a limited area, and from the top you can count, even with good accuracy, how many people are passing. Why? Because you have a very limited area and you are tracking people from the top, so there is no problem with overlapping, because you will be tracking the heads. So you can get very good precision on that. But that's the most you can get with the Jetson Nano. You can't have an infrastructure where you do person re-identification and so on. You might achieve that with
the Raspberry Pi. I don't generally work with the Raspberry Pi, but it's very limited and usually not suitable for such projects, especially when you want something in real time, because you need a lot of computational power.

To work with such a solution you need a more complex infrastructure, with a server to process all the data. Especially if there are multiple cameras in a big store and you want to do re-identification across multiple cameras, you need a proper infrastructure to handle this very big amount of data. Do you need to send it to the cloud? Not necessarily. You don't need to send it to the cloud, but you need an in-house server powerful enough to process all of this. I know that mostly you don't want to send this to the cloud, for all the privacy and data protection reasons, but then you need the infrastructure to handle that.

Let's now quickly discuss the KPIs that you can get from such a solution. For a retail store you can get KPIs like dwell time, and of course you can get this only once you have person re-identification, so not with a Jetson Nano and so on. You can get dwell time, and you can get a heat map. From the view that you see right here, for example, it's possible to create a two-dimensional map where, even though we have this view, we see the people from the top, and you see dots instead of people. So if we have ID 33, something like this, and ID 59 and ID 53, let's suppose that this is the map from the top: here is ID 59, for example, here is ID 53, and you can follow these IDs. So you can create the heat map, you can follow each ID, and you can get the flow: where people spend the most time, where people spend the least time, and so on. You can also get the total number of people that enter the shop and the total number of people that leave the shop. There are so many things that you can get with this.

This is all for this video.
Please let me know down in the comments if there is anything else you want to discuss about this specific subject or similar computer vision solutions. This is all for this video. See you in the next one.
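
The re-identification step described in the video (compare the stored crops of every tracker ID and give the same person their first ID back) and the dwell-time KPI that depends on it can be sketched roughly as follows. This is not the code shown in the video. It is a minimal illustration that assumes each stored crop has already been turned into an appearance feature vector by some re-ID network. The 3-number vectors, the frame ranges, the function names `merge_track_ids` and `dwell_time_seconds`, and the 0.8 similarity threshold are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Average several feature vectors of one track into a single vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def merge_track_ids(features_by_id, threshold=0.8):
    """Map each tracker ID to the earliest ID whose appearance is similar.

    The threshold is a hypothetical value; a real system would tune it
    on footage from the actual camera location.
    """
    gallery = {}  # canonical ID -> mean feature vector of its first track
    id_map = {}   # tracker ID -> canonical ID
    for track_id in sorted(features_by_id):
        query = mean_vector(features_by_id[track_id])
        best_id, best_sim = None, threshold
        for known_id, known_vec in gallery.items():
            sim = cosine_similarity(query, known_vec)
            if sim >= best_sim:
                best_id, best_sim = known_id, sim
        if best_id is None:
            gallery[track_id] = query  # unseen appearance: new person
            id_map[track_id] = track_id
        else:
            id_map[track_id] = best_id
    return id_map


def dwell_time_seconds(frames_by_id, id_map, fps):
    """Seconds between the first and last frame in which a person is seen."""
    span = {}
    for track_id, frames in frames_by_id.items():
        person = id_map[track_id]
        first, last = min(frames), max(frames)
        if person in span:
            first = min(first, span[person][0])
            last = max(last, span[person][1])
        span[person] = (first, last)
    return {p: (last - first + 1) / fps for p, (first, last) in span.items()}


# Toy stand-ins for re-ID embeddings: five tracker IDs, two real people.
features_by_id = {
    31: [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],  # green T-shirt
    33: [[0.1, 0.1, 0.9]],                   # black T-shirt
    38: [[0.0, 0.2, 0.8]],
    42: [[0.85, 0.15, 0.05]],
    44: [[0.1, 0.0, 1.0]],
}
# Hypothetical frame numbers in which each tracker ID was visible.
frames_by_id = {
    31: range(0, 50),
    33: range(0, 40),
    38: range(45, 60),
    42: range(60, 100),
    44: range(70, 100),
}

id_map = merge_track_ids(features_by_id)
dwell = dwell_time_seconds(frames_by_id, id_map, fps=10)
print(id_map)
print(dwell)
```

With the toy data, the five tracker IDs collapse back to the two original ones, so the dwell time is computed per person and not per fragment of a track. In practice the comparison is much heavier than this, since every crop has to go through the network first, which is where the computational cost mentioned in the video comes from.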
Info
Channel: Pysource
Views: 3,861
Id: SMRLT-jbwgo
Length: 17min 46sec (1066 seconds)
Published: Tue Apr 09 2024