Object detection using TensorFlow Lite C API on Android

Captions
Hello everybody, we continue our series about cross-platform object detection using TensorFlow Lite. In the previous videos we converted an object detection model from TensorFlow to TensorFlow Lite, and we wrote our object detector in C++ using the TensorFlow Lite C API. In this video we are going to develop the Android application. The video will focus on integrating TensorFlow Lite with Android, not on Android development in general.

Let's start with setting up the environment. The first thing we need is the TensorFlow Lite library for Android, libtensorflowlite_c, which you can get from my tflite-dist repository under Releases. This is the same file we used for Windows; I just uploaded a new build for version 2.4.1 with a better folder structure for Android, so if you already downloaded this file, please download it again. I extracted it to C:\tools. Under tflite-dist/libs there is an android folder, and under it a folder per CPU architecture, each containing the TensorFlow Lite C library. In this project we are also going to use OpenCV, so we need OpenCV for Android as well: from the opencv.org releases page download the Android SDK and extract it somewhere; I extracted it to C:\tools too. Under the OpenCV Android SDK we have the samples and the sdk folders. These two downloads are required.

We are going to reference these folders from our application, so we define two environment variables: OPENCV_ANDROID, which points to the OpenCV Android SDK folder, and TFLITE_DIST, which points to the tflite-dist folder. This is important. After you define the environment variables, go ahead and open Android Studio (I said Visual Studio; I meant Android Studio, obviously) and start a new project.
This is going to be a C++-enabled project. If you are not familiar with native code inside an Android application, you can watch the video I have on this channel where I used OpenCV in Android and explained quite a lot about the native side. We'll name this project TFLite and put it under our project: under platforms, create an android folder. Next we use the default toolchain, and that's it; now give the project some time to initialize, and I'll be back when it's finished.

Okay, it finished, and we have our project: the main activity, and under cpp the CMakeLists.txt, which is the configuration file for building the C++ code, and native-lib.cpp, which is the main C++ file for our library.

Next, let's go to the build.gradle of our app. Here I want to change the minimum SDK version to 24; OpenCV will not work with SDK 16. I don't know the exact minimum SDK version required by OpenCV, but I know 24 is okay. The next thing we want to configure in this file is to include our model inside our application. If we open Show in Explorer and go to our root folder, under models we have our TensorFlow Lite model and the labels file, and we want to embed these two files inside our APK. The way to do it is to define a sourceSet referencing the models folder (we are in android/app, so it's two levels up). Another very important thing is to set the flag telling the build not to compress .tflite files when embedding them into the APK, but to embed them as-is. Now we can sync our project, and we can see that under assets we have the model and the labels, so that's good.
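The build.gradle changes just described might look roughly like this; the relative path to the models folder is from my layout and is an assumption, so adjust it to where your model actually lives:

```groovy
android {
    defaultConfig {
        minSdkVersion 24   // OpenCV will not work with very old SDK levels
    }
    // Embed the model and labels from the repo's models folder as APK assets
    // (path is an assumption: two levels up from android/app in this setup)
    sourceSets {
        main {
            assets.srcDirs += ['../../models']
        }
    }
    // Do not compress .tflite files when packing them into the APK
    aaptOptions {
        noCompress 'tflite'
    }
}
```

Without the noCompress flag, the asset would be deflated inside the APK and could not be memory-mapped or read as-is from native code.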
Now let's go ahead and configure our C++ project. The first thing we want to do is include OpenCV, and this is the way we do it. First we set the flag saying that we want to statically link OpenCV. Then we define the OpenCV_DIR variable using our OPENCV_ANDROID environment variable, pointing into sdk/native/jni. If we look under sdk/native/jni we have all the CMake files for OpenCV, and these files define all the headers and libraries we need in the project; that is what find_package will actually load.

We also need to do somewhat the same for TensorFlow Lite, except that we are not going to statically link it; we are going to bring it as a shared library into our APK. We define the variable TENSORFLOW_LITE_DIR, which references our environment variable (and if I remember correctly the environment variable was TFLITE_DIST, not DIR). Then we declare the lib_tensorflowlite_c library and give it some properties, including its location, and note that in the location we reference the CPU architecture, so we are going to embed all four libraries for all four CPU architectures. That is the declaration of the OpenCV and TensorFlow Lite libraries.

Next we define our own code. We are going to build a library called native-lib, and these are the sources used to build it. The first file is obviously native-lib.cpp, but this is also the place where we want to include our object detector code, so we fix the indentation and reference the object detection folder and ObjectDetector.cpp; our object detector gets compiled as well. The next thing we want to declare is the additional include directories.
For the include directories: by referencing the entire object detection folder we get the ObjectDetector header file, and there is a TENSORFLOW_LITE_DIR entry referencing our environment variable, from which we just take the include folder; this gives us the TensorFlow Lite headers.

A library is predefined for us here: the log library, if you would like to do logging from C++. What we add now is another library, the android library, which we need in order to read embedded assets from C++ code; from within C++ we are going to read our model file, and we need this android library to extract files from the embedded assets.

The last thing is defining which libraries we want to link our module with. The log lib is already here, but we also add the TensorFlow Lite library that we declared above, all the OpenCV libraries (the OpenCV_LIBS variable is defined for us when we do find_package(OpenCV): find_package loads the CMake files of OpenCV, and inside those files this variable is defined), and the android library that we found here. That is our make file, and if we did everything correctly we should be able to build our project, so let's start a build and I'll pause here.

Okay, the build finished successfully, but here under the cpp node we don't yet see the include files and the object detector file that we referenced. What we can do is go to File and Sync Project with Gradle Files, and hopefully this will update everything. Yes, it's updating: we can see all the OpenCV headers, the TensorFlow headers, our object detector header, and also the object detector code.
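Putting the pieces above together, the CMakeLists.txt might look roughly like this; the relative source and include paths are from my folder layout and are assumptions, so adapt them to yours:

```cmake
cmake_minimum_required(VERSION 3.10)

# Statically link OpenCV into our shared library
set(OpenCV_STATIC ON)
set(OpenCV_DIR $ENV{OPENCV_ANDROID}/sdk/native/jni)
find_package(OpenCV REQUIRED)

# TensorFlow Lite C API, shipped as a prebuilt shared library per ABI
set(TENSORFLOW_LITE_DIR $ENV{TFLITE_DIST})
add_library(lib_tensorflowlite_c SHARED IMPORTED)
set_target_properties(lib_tensorflowlite_c PROPERTIES IMPORTED_LOCATION
    ${TENSORFLOW_LITE_DIR}/libs/android/${ANDROID_ABI}/libtensorflowlite_c.so)

# Our module: the JNI bridge plus the cross-platform detector
# (path to ObjectDetector.cpp is an assumption from my repo layout)
add_library(native-lib SHARED
    native-lib.cpp
    ../../../../object_detection/ObjectDetector.cpp)

include_directories(
    ../../../../object_detection
    ${TENSORFLOW_LITE_DIR}/include)

find_library(log-lib log)
find_library(android-lib android)   # needed to read embedded assets from C++

target_link_libraries(native-lib
    lib_tensorflowlite_c
    ${OpenCV_LIBS}
    ${android-lib}
    ${log-lib})
```

Because the IMPORTED_LOCATION uses ${ANDROID_ABI}, Gradle builds the module once per configured ABI and packs the matching libtensorflowlite_c.so into the APK for each architecture.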
It is really convenient to have these files here in the file explorer. Now let's go to our C++ file, add some OpenCV code, and try to build it, and let's also try to build our detector and see what happens. I'll copy some code here; watch this code. First we include some headers: the Android logger; the asset manager, which we will use to extract the model from the embedded assets; the OpenCV core and image processing headers (imgcodecs, I believe, we don't need); and obviously our object detector. Then there is this function to rotate an OpenCV image: we are going to get images from the camera, the camera will also provide us with the image orientation, and we are going to use this orientation to rotate the image so it will be upright. That is this rotateMat function. Now that we have all this code, let's try to build the project, and if everything is good we know that we configured the project correctly both in terms of OpenCV and in terms of TensorFlow, because our object detector is actually getting compiled here and it uses TensorFlow. It finished and it was successful, so it seems our setup is okay.

What I'm going to do now is bring in the code of an Android application that has an activity that opens the camera and processes the camera images, and when that's done we will continue here implementing the interface between our object detector and the Android application. I'll see you soon.

Okay, I'm back, and this is our application, running on a real device, not an emulator, obviously, because we need the camera. I'm using the scrcpy tool to mirror the Android screen; you can check it out on its GitHub repository, it's a really cool tool. So this is our main activity, and we have a single object detection button which opens the object detection activity.
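The rotation rotateMat applies with OpenCV can be sketched without OpenCV at all; this toy version rotates a row-major single-channel buffer the same way (the real function works on cv::Mat, so everything here is an illustrative stand-in):

```cpp
#include <cassert>
#include <vector>

// Rotate a row-major w*h single-channel image clockwise by 0/90/180/270
// degrees. For 90/270 the output dimensions swap: width becomes h.
static std::vector<unsigned char> rotateImage(const std::vector<unsigned char>& src,
                                              int w, int h, int degrees) {
    std::vector<unsigned char> dst(src.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            unsigned char p = src[y * w + x];
            switch (degrees) {
                case 90:  dst[x * h + (h - 1 - y)] = p; break;       // new width is h
                case 180: dst[(h - 1 - y) * w + (w - 1 - x)] = p; break;
                case 270: dst[(w - 1 - x) * h + y] = p; break;       // new width is h
                default:  dst[y * w + x] = p; break;                 // 0: plain copy
            }
        }
    }
    return dst;
}
```

In the app the degrees value comes straight from CameraX's reported image rotation, so the frame ends up upright before detection.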
When we open this activity we need to allow the camera, and then we just have the camera feed, and that's it. Now, this is running on a very old Samsung device, so the performance is not going to be the best; hopefully it will survive. I think it's a J4 or J5 device. Next we want to run object detection on the camera feed; this is our goal, so let's get to it. We don't need this running anymore, so we'll stop it.

So this is the application: there is the main activity and the object detection activity; you can take your time and examine the code, I'm not going to go over the main activity here. In our object detection activity, most of the code that handles the camera is already here. I'm using CameraX; you can read the CameraX documentation if you want to see how to set it up yourself, or you can just read the code here.

In terms of the layout of the object detection screen, we have the CameraX PreviewView, which is where we see the camera image, and I added a SurfaceView with the same dimensions as the camera view, so they sit one on top of the other (or should, at least). On this SurfaceView we are going to draw the rectangles of the detections. So: the PreviewView for the camera, and the SurfaceView where we draw our detections.

I also have the labels here: this loadLabels method reads the labels text file from the embedded assets line by line and adds each line to this label map structure. That's it; later in the code, when we get, say, class five, we go to the map, check what class five is, and write it on the output.
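The loadLabels idea (one description per line, line number as class id) can be sketched like this; in the app it is Kotlin reading an asset stream, so the stream here is a stand-in, and whether ids start at 0 or at some offset is an assumption depending on the label file:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Read labels line by line; line N becomes the description for class id N.
std::unordered_map<int, std::string> loadLabels(std::istream& in) {
    std::unordered_map<int, std::string> labels;
    std::string line;
    int classId = 0;
    while (std::getline(in, line))
        labels[classId++] = line;
    return labels;
}
```

Later, a detection that reports class id 2 would be rendered with labels[2] as its caption.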
Here is where we get the permissions result (the onCreate we don't need to go into anymore; it handles permissions), and this is starting the camera feed. There is a lot of code here, but honestly I just took it from the CameraX examples, I didn't do anything special. We define a preview, and we also define an ImageAnalysis; the image analysis is where we are going to get the frames from the camera, so this part is important. This is our analyzer, using the camera executor, and we are going to see below that we implement the analyzer interface function that handles the camera feed. The rest is just opening the camera; again, this code mainly comes from the CameraX examples.

This is the analyze method; we override it, it comes from the Analyzer interface that we implement, and through it we get an image from the camera. Now, when we get the image (we'll get to the detector in a minute), all this code basically takes the buffer that we got from the camera and converts it to a YUV image. We are not converting it to RGB here; we convert it to YUV, and then, as you will see, in OpenCV we'll take this YUV and convert it to RGB. Also important is the rotation: remember that in our C++ native-lib we have the rotate function. This is the rotation we get from the camera; the image is not necessarily in the upright position, so we will take this rotation and rotate the image in OpenCV so the image ends up upright.

Before we go into the C++ code, I want to explain shortly (where's my paint?) how we are going to do the interface between Kotlin and our C++ code.
On one side is our Kotlin code, specifically our object detection activity, and on the other side is our native code, native-lib.cpp. Our object detector, if you remember from the Windows application, is a class, so the first thing we need to do is create an instance of this class in C++. The object detector has the method that we need, the detect method, and we should call this detect method from Kotlin, from where we process our images. How exactly are we going to do it? We cannot pass a class reference from C++ to Kotlin, but what we can do is create the instance of our ObjectDetector in C++ and send its address to Kotlin. So we are just passing the address of the object that lives inside native-lib.cpp, and then from Kotlin, when we want to call detect, we pass the object detector address as the first parameter. We will declare this detect method in the C++ code (it won't actually be void, but for simplicity say it's detect), the first parameter is going to be the object detector address, and then we get some more parameters. Inside, we will take this object detector address and do a sort of cast to ObjectDetector, because we know this address represents an ObjectDetector instance. So we are not going to call our object detector instance directly from Kotlin, because that's not possible; we are going to call a method in C++, and in C++ we have access to the instance of our detector, and we have the address that we can send back and forth between Kotlin and the native code. That is what we are going to do.
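The address-passing pattern just described can be shown as plain C++; here int64_t stands in for JNI's jlong, and the detector is a dummy stand-in for the real ObjectDetector, so the names and the toy detect body are assumptions for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the real ObjectDetector: Kotlin never sees this type,
// only an opaque 64-bit address pointing at a heap instance of it.
struct ObjectDetector {
    int detect(int frame) { return frame * 2; }  // dummy work
};

// What initDetector does: heap-allocate and hand the address to Kotlin.
int64_t initDetector() {
    return reinterpret_cast<int64_t>(new ObjectDetector());
}

// What detect does: cast the address back and call the real method.
int detect(int64_t detectorAddr, int frame) {
    auto* detector = reinterpret_cast<ObjectDetector*>(detectorAddr);
    return detector->detect(frame);
}

// What destroyDetector does: reclaim the instance, e.g. when the
// activity closes, so the detector releases its resources.
void destroyDetector(int64_t detectorAddr) {
    if (detectorAddr != 0)
        delete reinterpret_cast<ObjectDetector*>(detectorAddr);
}
```

The native side owns the object's lifetime; Kotlin only shuttles the opaque handle between calls.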
So let's begin. We are in our native-lib, and we want to define our methods. Here we define methods that are exposed to Kotlin, and the way to do it is this funny syntax: extern "C", and then the name of the function must exactly follow the package and class that is going to use it; we are going to use it from our object detection activity. The first function we want to create is one that creates an instance of the detector, so we'll call it initDetector. Its return value is not a string; it's going to be the address of the detector, which is a long. Now the parameters: for all JNI functions that we export to Java, the first two parameters are the JNIEnv pointer and the jobject this; these are always the first two. Inside this method we are also going to use our asset manager; we will get some reference to it passed from Java. It's not a real reference; I honestly don't know exactly what it is. It's interesting, because we are passing an object from Java to C++.

In order to create our ObjectDetector, we need to pass the constructor our model, and this model buffer we need to read from the embedded asset. That is what we are going to do here; just copy the code and then we'll go over it. Here we define the buffer into which we want to read the model, and the size of the model. This is our jobject that we pass from Java, and we check that it's not null. Then we create an instance of the asset manager on the C++ side, from the Java object, and then we open our object detection model file.
We check that we got something, then we take the size of the asset, allocate the memory, read the asset, and close the asset; now our model is inside this buffer. Now that we have the model buffer, we can create an instance of our ObjectDetector, which gets the model buffer and the size of the buffer, and you can see that we convert the resulting pointer to jlong: here we just take the address of this instance, and that is the result we return. The result of our function is the address of the object, not the object itself. And then I free the model buffer, because the ObjectDetector constructor duplicates the model buffer and the ObjectDetector is responsible for its own copy. That is how we create an instance of our detector.

Just like we created an instance of our detector, we also need a way to destroy it; when we close our activity, for example, it's important to destroy the detector so it releases all the resources it occupies. So again, the first two parameters are the env and the this, and then we want to destroy this instance, so we have to get the address of the object; this method is actually going to be void. All we do is delete the object if we got anything: we check that the pointer is not zero, take this pointer and cast it (we know that it's a pointer to ObjectDetector), and just delete it. That's it: we initialize the detector, and at some point we will destroy it.
Now comes the last part: we want to run detection. Again I'm copying the function definition, just because of the long JNI name; let's call it detect. This is our detect method, the first two parameters are again the env and this, and then we get quite a lot more parameters. What do we expect? Obviously a reference to our object detector; a jbyteArray, the byte array of the image we want to run detection on; the width and the height of the image, which we need in order to create an OpenCV image from this byte array; and the rotation of the image, or rather the rotation required to make the image upright. Our method is going to return the detection results, and we are just going to return them as a float array: we are going to encode all the detection results into an array, because, again, I don't know if or how we can return, for example, a C++ struct, or an array of structs, to Kotlin, but a float array I know is possible, so this is what we are doing.

The first thing we want to do inside this function is prepare the image to run the detection on, and this is what we do here. We take the jbyteArray and get a pointer to its bytes; then we create an OpenCV image that represents the YUV image format and initialize it using the bytes that we got; then we declare another OpenCV image, which is going to be our RGBA image, and this is the conversion, converting the YUV image into the RGB one we call frame.
The conversion is YUV to BGRA NV21; NV21 is the format of the image we get from CameraX. Finally we call rotateMat and apply the rotation if needed, so now hopefully frame is an RGB image in the correct orientation, and we can release the byte array, since we don't need it anymore.

Next we are going to run the detection. To run the detection we obviously need the detector, and this is where we use the detector address we got: we take the detector address, and since we know it references an ObjectDetector, we just cast it to an ObjectDetector pointer, and we can run the detection on the detector and get a detect result. Let's look at DetectResult: it's an array of detection results, a reference to DetectResult structs. We are going to get several, actually five, because that is DETECT_NUM. Each detection is a struct with a label (if you remember, the labels are just numbers), the score, and the rectangle position, minimum and maximum on each axis. That is our detection result.

So here we got an array of these detection results, and what we need to do now is encode all of them into a float array, and that is what we are going to do next. First we declare the array, computing its length: each detection has one, two, three, four, five, six values, so it's six multiplied by the number of detections we expect, and we add one more cell to the results array; here we allocate the array. The reason we added one more cell is that the first cell in the
array is going to hold the number of detections inside the array, so when we decode this array in Kotlin we first read the first value and know how many detections to expect inside the array. Then we just loop over all the detections, calculate the current position in the array, and put the values in a fixed order; it's important to remember this order, because when we decode the array in Kotlin we will want to read the values back in the same order. So here we are just filling our result array with all the values. The last thing: we need to take our jfloat pointer (this is just a pointer to jfloat) and turn it into a jfloatArray, and this is how we do it: output is the result of our function, which is a jfloatArray, and we copy jres, the array with all the values, into our output array, and return the output.

That's it; this is all the code we have on the C++ side. To recap: first we initialize a detector using the model that we read from the embedded asset; note that we are using our ObjectDetector exactly as we developed it in the previous video, we didn't change anything in the detector, it is the very same detector that runs on Windows. Then we have a method to destroy the detector, for example when we close the object detection activity (I don't think I currently do it). And this detect we are going to run on every frame we get from the camera; again we just use our detector, we didn't change anything in it to run this detection on Android. Then we take the results, and because we cannot pass this struct to Kotlin, we need to encode the result somehow.
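The encoding scheme (count in cell 0, then six values per detection) can be sketched like this; the struct fields follow the video's DetectResult, and the exact value order shown (score, then class id, then the box corners) follows the order read back later in drawDetection:

```cpp
#include <vector>

// Mirrors the DetectResult struct from the previous video: a numeric
// label, a confidence score, and the rectangle min/max on each axis.
struct DetectResult {
    int label;
    float score;
    float xmin, ymin, xmax, ymax;
};

// Encode detections the way the JNI detect() fills its jfloatArray:
// cell 0 holds the detection count, then 6 values per detection, so
// Kotlin can read cell 0 first and decode the rest at fixed offsets.
std::vector<float> encodeDetections(const DetectResult* res, int detectNum) {
    std::vector<float> out(1 + detectNum * 6);
    out[0] = (float)detectNum;
    for (int i = 0; i < detectNum; ++i) {
        int pos = 1 + i * 6;
        out[pos + 0] = res[i].score;
        out[pos + 1] = (float)res[i].label;
        out[pos + 2] = res[i].xmin;
        out[pos + 3] = res[i].ymin;
        out[pos + 4] = res[i].xmax;
        out[pos + 5] = res[i].ymax;
    }
    return out;
}
```

In the real JNI function the vector's contents would be copied into the jfloatArray with SetFloatArrayRegion before returning.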
We just encode it into a float array and return the result. So that's the C++ side; now let's see how we use this code from Kotlin, back in our object detection activity. CameraX is going to call this analyze method every time it has a new image ready for processing, and this function is synchronous, there is nothing asynchronous here: as long as this function is running, CameraX is not going to provide us with new images, which is good. There is a safeguard here: in order to convert the image to YUV we need to make sure we have three planes, the YUV buffers; if not, you can maybe log here or something, it shouldn't happen.

Now we need to initialize our detector. detectorAddr is a variable that we define here in our class and initialize to zero; the first time we run, if the detector address is not set, we create our detector by calling initDetector, passing the asset manager. Okay, this is really important: the fact that we defined all these methods in native-lib obviously doesn't mean we can use them as-is in Kotlin; we need declarations of the methods, marked as external functions. So this is the declaration of initDetector, which gets the asset manager as a parameter; note that the first two JNI parameters, the env and the jobject, we don't declare in Kotlin, they are passed in by the system (I believe), only the rest of the parameters that are specific to our function. So we declare initDetector here with return value Long, and the same for destroyDetector and for the detect method: the Long, which is the reference to our detector, then the byte array which is the image, the width, the height, and the rotation.
The result is a FloatArray. This is only the declaration of the methods; during runtime Android is actually going to search for the implementation of these methods, and in order for it to find them we need to load our library. All our code is compiled into the native-lib library, and we load this library in the initialization of our activity class: System.loadLibrary("native-lib"). If you don't load the library, then at runtime you are going to get an exception when you call any one of these external methods; if you do get an exception telling you that initDetector was not found, know that you didn't load the library.

So, we were at initDetector: the first time only, we create an instance of the detector, passing the embedded assets. Then we get the image rotation, get the YUV buffer, create an NV21 image (this is how we create the image), and call our detector. (And here, okay, I had commented out the code and didn't see it; my mistake.) So we call detect, and again detect is the method we declared here as external, which calls our detect method in native-lib. We pass as the first parameter the detector, then nv21, which is our image buffer, the image width, the image height, and the rotation, and we get the result.

Now, this result is a float array of all the detected objects, and we are going to draw these detections on the screen, on the SurfaceView. I'm not going to get into the drawing code, but what I do here: remember that in the first cell of the result array we put the number of detections, so the zero index holds the number of detections, and that is the loop that we
are doing here: for each detection we call this drawDetection method, passing the entire array and the index of the detection we want to draw. Of course, if you want, you can take this result array and first decode it into some structure that represents a detection; you can do whatever you want with it. Now let's look at drawDetection. It's just drawing on a canvas, and I'm not going to get into that; I just want to show you how we take the values from the array. Based on the index of the detection we want to draw, we calculate the starting position of its values in the array, and this is how we do it: each detection has six values, plus one because the first value in the array is not related to the detections. Then we read the values, and, very important, in the same order we encoded the detections: it was score, label (here I call it classId), xmin, ymin, xmax, ymax, and this is how we read it back. Then we just draw the rectangle on the canvas and also print the label. Our label map is the labels file we read when we loaded the labels; let's look at this file: each class id has the description of the class. So this is what we do here: from this label map we get the description of the class according to the class id, and we just draw it on the canvas. That's it; I'm going to build and run the application and let's see how it looks.

Okay, the code compiled and runs successfully, but on the old Samsung device it was really, really slow. The detection worked, but it was a really bad experience, so I didn't want to show it, and I switched to my Pixel device.
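Decoding mirrors the encoding: skip the count cell (+1), then jump six values per earlier detection. A sketch of the offset math drawDetection uses (shown in C++ here for consistency, though in the app this lives in Kotlin; struct and function names are illustrative):

```cpp
#include <vector>

// One decoded detection, in the order the C++ side encoded it:
// score, class id, then the box corners.
struct Detection {
    float score;
    int classId;
    float xmin, ymin, xmax, ymax;
};

// Starting position of detection `index` inside the flat result array:
// 1 for the count cell, plus 6 values for every earlier detection.
Detection decodeDetection(const std::vector<float>& arr, int index) {
    int pos = 1 + index * 6;
    return Detection{arr[pos], (int)arr[pos + 1],
                     arr[pos + 2], arr[pos + 3], arr[pos + 4], arr[pos + 5]};
}
```

The caller first reads arr[0] to learn how many detections are present, then decodes indices 0 through arr[0] - 1.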
Going into object detection, you can see that it's detecting this apple, and in the parentheses you can see the score. Let's move to this mouse: it does see the mouse, and part of the keyboard; here is the keyboard, sort of. Okay, let's try this monitor: it says TV, which makes sense, and keyboard. Let's see: cell phone, and the other cell phone, and the mouse. So it's working okay, not the best, but fine. Oh, one thing: here I do filter the results by score; if it's below 60 I don't care, I don't render the result. Let's do 40 and see if it changes anything, if we get more results; maybe we will get some false detections. Where were we? Here, object detection. The app looks good: mouse, keyboard. So it's working better; the keyboard just had a low score before. TV (lots of TVs), and here are the cell phones, and even the mouse there at the top of the image, and even the keyboard, nice. So this is not bad; it was just a matter of the score threshold.

Okay, so this concludes our third video about running object detection on Android. Again a very long video, I'm sorry for this; I hope you will find it useful. Our next video is going to be on iOS, so bye-bye for now.
Info
Channel: The Coding Notebook
Views: 3,741
Rating: 4.8709679 out of 5
Keywords: TFLite, TensorFlowLite, OpenCV, Android
Id: axsE34RzbrI
Length: 54min 16sec (3256 seconds)
Published: Sat Apr 10 2021