TensorFlow Lite for Edge Devices - Tutorial

Captions
TensorFlow Lite allows you to do machine learning on small devices. Bhavesh is an experienced instructor, and he will teach you all about TensorFlow Lite in this course.

Hello, everyone. In this tutorial, you will learn the basics of TensorFlow Lite, and how TensorFlow Lite can help you create really efficient models that you can deploy on edge devices. So without wasting any further time, let's kick-start the tutorial.

Let us kick-start today's discussion about TF Lite with a small story. I have a friend whose name is John. John really likes traveling to different places. One of his favorite applications is Google Lens. Whenever he visits a new country, he takes out his cell phone, takes a photograph of the monument in front of him, and Google essentially tells him which monument it is. So now, out of sheer fascination with the tool, John goes forward and creates his own neural network for detecting landmarks. He goes through a rigorous process of collecting data, labeling data, and cleaning data, and finally he creates a machine learning model that can tell John which monument it is. So everything looks good; he is able to reach a very high accuracy score as well. Now the only challenge that he has is where he should deploy the model he has created.

Technically, he has two options. The option that he explores first is cloud computing. He takes his trained model and deploys it on the cloud. He exposes the API that he has created, and he creates an Android application that queries the API and fetches the result once he has passed in an image. Now, one of the major challenges that he sees with this solution is network latency. The images captured by cell phones today generally range anywhere between 3 and 10 MB, so transporting such huge files takes up a lot of time in the entire process of making a prediction. The second piece that adds complexity to the entire solution is the cost. The neural network that John has created requires two resources: a storage resource where he can save the model weights, and a compute resource for making an inference. The overall project becomes a costly affair for John.

What does he do next? He creates an Android application, and he finds a way to make an inference on Android using this huge model that he has created. So John now faces a new issue: the model is really huge, and the cell phone is not capable of storing and processing such a big model. So now John is really confused. He has tried out different techniques to make this work, but nothing is solving the issue. Well, it is here that TF Lite comes into the picture.

Before we jump in and discuss TF Lite, I want to throw some light on what edge devices are. Edge devices are your normal cell phones that you use. So if you're planning to create some amazing TensorFlow-based applications, then one of the main platforms that you can utilize is your cell phone, be it an Android cell phone or an iOS-powered cell phone; anything works. The other pieces of hardware that you can classify as edge devices are microcontrollers. There are some amazing applications that have been built using very small compute power, and all of it is thanks to microcontrollers.
Given how wearable devices have recently increased their computation power by using more, or faster, CPUs, you can also put wearable devices such as smartwatches into the edge computing bracket. Now let me go forward and give you a formal definition of edge computing. Edge computing is basically the practice of moving compute and storage resources closer to the location at which they are needed. That is where your edge devices come into the picture.

Now, if you recall, John had two options to deploy the model. The first option was to deploy the model on the cloud. Many of you may have the impression that a machine learning model running on a server with a large GPU is much better than running it on the device itself. Well, the truth is that edge devices have become an important platform for machine learning. Why is it gaining so much popularity that you would run your entire machine learning models on edge devices? Well, let me share the details one at a time.

The first and foremost reason why edge computing as a whole is gaining popularity is latency. Use cases that require real-time speed definitely require models to run on the device. For example, you might be able to reduce the inference latency of ResNet-50 from 30 milliseconds to 20 milliseconds, but the network latency can go up to seconds. So it also depends on where your model is deployed and where you are calling the API from. If you take all of these factors into account, then deploying a model on a server would not be the best possible option when you want inferences in near real time, or exactly real time.

The second reason why creating machine learning models on edge devices is important is network connectivity. Going back to our earlier example, where John wanted to create his own version of Google Lens for detecting landmarks: if he happens to visit a country where there is little to no connectivity, it is here that a model sitting on the device would be much better than one deployed on a server, because of the additional dependency on the network that comes into the picture.

The third reason why it is very important to create machine learning models that run on edge devices is user privacy. Putting your machine learning models on edge devices is also appealing when you are handling sensitive user data. Machine learning on the cloud means that your systems might have to send user data over networks, making it susceptible to being intercepted. Cloud computing also means storing data from many users in the same place or location, which means a data breach can affect many people at once. So it becomes really important to create machine learning models that can run on edge devices and preserve user privacy.

Let me now go forward and show you some examples of on-device machine learning use cases. The first one that I'm showing you right now is the feature to try out various cosmetics using AR on YouTube. The entire computation that you're seeing here is happening on the device itself. Now, the second example that I want to show you is something that you might already be aware of, which is Google Translate.
Google Translate has a feature that allows you to capture text with your phone camera and translate it in real time, without any internet connection. All of this is possible using edge computing and, more specifically, TF Lite.

Now that you've seen the amazing applications that you can create at your end as well, you might be wondering about the second alternative that John tried initially, that is, to create an entire huge TensorFlow model and then run the inferences on the device. Why did that fail? Well, to answer that question, here are some of the challenges that you might face when you create Keras or TensorFlow models and deploy them directly onto edge devices. Edge devices, not only mobile phones but your microcontrollers as well, have limited compute power and limited memory. Battery consumption is also a factor that you have to account for, as well as the application size. If I consider a simple microcontroller, the processing power isn't enough that you can run inferences on a 3 or 4 GB model. And if you consider the storage capacity of the majority of edge devices, then ideally you wouldn't have a lot of storage that you can utilize for just one model. So these are the challenges that John faced when he took the second approach as well.

So what is the solution? Well, the solution is TensorFlow Lite. TensorFlow Lite is a production-ready, cross-platform framework for deploying machine learning models on mobile devices and embedded systems. TensorFlow Lite at this point in time supports Android, iOS, and any IoT device that can run Linux. So essentially, if you have any of these hardware devices handy with you, then you can quickly create a TensorFlow model, convert it to an equivalent TF Lite model, and start using the amazing TF Lite model.

Now, you might wonder, what exactly is the workflow to create a TF Lite model? Well, let's look at that as well. The workflow is fairly simple. You start by creating a TensorFlow/Keras model. In the entire process, you would have to collect data, clean data, and preprocess data, then create models and iterate over multiple models, and based on the metric that you're chasing (if you are chasing a higher accuracy score, then you would choose the model that gives you the best possible accuracy). And that's about it: you have your TensorFlow model ready. Now, from that TensorFlow model, you convert it to a TensorFlow Lite model, so there is a format change that happens. I'll talk more about it as we go along. Once you've converted your model from TensorFlow to TensorFlow Lite, you go forward, deploy the entire TF Lite model, and run your inferences on the edge device. When I say inferences, I mean predictions.

Okay, let me now explain this using a block diagram. This essentially is the workflow that I've just mentioned: you start off with the high-level Keras APIs and create a model. Once you have the model ready, you can use the TF Lite converter, and the converter basically takes your saved TensorFlow-format file and converts it into a FlatBuffer file. I'll give you an idea of what I mean by a FlatBuffer file, so let's move forward. TensorFlow Lite represents your model in the FlatBuffer format.
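To make the three-step workflow concrete, here is a minimal, self-contained sketch in Python (TensorFlow 2.x assumed). The tiny Dense model is just a stand-in for a real trained Keras model, and "model.tflite" is an illustrative file name, not something from the video:

```python
import tensorflow as tf
from tensorflow import keras

# Step 1: a TensorFlow/Keras model (normally trained on your own data).
model = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])

# Step 2: convert it to the TF Lite FlatBuffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # a bytes object holding the FlatBuffer

# Step 3: write the FlatBuffer to disk, ready to deploy to an edge device.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```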
Now, the FlatBuffer format is an efficient cross-platform serialization library for C++, C#, Go, Java, Kotlin, JavaScript, Python, and so on. It was originally created at Google for game development and other performance-critical applications, but then Google realized that you can use the FlatBuffer format for deploying models on edge devices.

Now you might have an obvious question: why not use our old, tried-and-tested protocol buffers, and why shift to something as new as FlatBuffers? Just to give you context, all the Keras models and all the TensorFlow models that you create are essentially in the protocol buffer format. Protocol buffers are very similar to the FlatBuffer format that Google has created. The major difference is that FlatBuffers do not need a parsing or unpacking step to a secondary representation before you can access the data, and the generated code is also larger in the case of protocol buffers. That is why, while using TF Lite, we make use of the FlatBuffer format and not the protocol buffer format. So now let's move forward.

So far, we've looked at various aspects of edge computing, and we've looked at how deploying models on edge devices compares to deploying them on the cloud. We've also looked at what TensorFlow Lite is, and what it can support at this point in time. Now is where things get interesting, when I show you through code how TF Lite can actually compress your model size without compromising on the accuracy piece. So now let's go forward and witness the magic of TF Lite.

Now that we've understood the basics of TensorFlow Lite and edge computing, let me show you the power of TensorFlow Lite using Python. For this example, I'm using Google Colab. For those of you who don't know, Google Colab is an online environment where you can write Python code and create machine learning and deep learning models; you can also make use of the GPUs that Google gives you access to for a good amount of time. So this is the interface that I'll be using. I'll be attaching the link to the GitHub repository in the description section of the video; feel free to access the code from there. Inside the GitHub repository, I'll also give you a link that can open a Google Colab notebook directly.

With all the groundwork done, let me now go forward and show you the magic of TensorFlow Lite. The process that I'll follow in this particular tutorial is this: I'll create a deep learning model using TensorFlow/Keras, and I will scale it down to a TF Lite equivalent model. Once that is accomplished, I will show you the size difference between the original model and the compressed model, and I will show you techniques for how you can keep compressing the model even further without having to compromise on the accuracy piece.

With that, let me create an instance on Google Colab by pressing Connect. So currently, Google is allocating some space for my computations. If you're planning to replicate everything that I show in today's video on your local machine, then you will require some set of installations as well. Given that I'm working with Google Colab, all the dependencies that I require for this example are already met. So now let me go through the different things that are required for this entire tutorial.
First things first, I require the os module, to essentially read my files. Next up, I'll import NumPy as np; I will require this particular library for mathematical operations. I will also require the h5py library. The h5py library is a Pythonic interface to the HDF5 binary data format; so technically, whatever models I create in Keras, I will save in the .h5 format. Next up, I require Matplotlib, which is again used for visualization. I require TensorFlow, and I'll import keras from TensorFlow, along with some of the layers that we'll require when we come to the deep learning model creation aspect. If I want to calculate how well my model is performing in terms of accuracy score, the sklearn.metrics module will give me the accuracy_score functionality. And from the sys library, I'll also require the function getsizeof. So these are some of the things that I require in order to create a TF Lite model.

So now, let me go forward and run the cell. When I run the cell, Python essentially runs that piece of code. I don't see any error; that means all of our imports are in place. Let me go forward and show you the TensorFlow version that I'm using for this particular tutorial. I'm currently using TensorFlow 2.6.0. If there are changes that creep up with respect to the API, feel free to refer to the TensorFlow documentation.

There are two functions that I have created. The first function's name is get_file_size. Essentially, I'm passing in the file location, and using the os library, specifically the getsize function, I'm able to get the size of a particular file that I pass in, in bytes. So let me now go forward and run this cell. In the previous function, get_file_size, the value returned would be in bytes. So now, rather than comprehending file sizes in bytes, I've created a helper function called convert_bytes, which takes in the input size in bytes and converts it to either KB or MB. I've not included GBs, because this is more of an explainer video where I intend to create a smaller model, or rather a proof of concept, rather than a huge model. That is why I have restricted my units to KBs or MBs. So let me go forward and run the cell.

Now, for this particular example, I'll be basically using a very famous dataset in deep learning, the Keras Fashion MNIST dataset, so let me unhide the cell. The Fashion MNIST dataset contains 70,000 grayscale images which belong to 10 different categories: T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot. So these are the different categories that are part of this dataset. The entire activity is a supervised learning task: I have a set of images, and every image has a label associated with it, and I'm trying to train a deep learning model. So now let me go forward.

You don't have to worry about the download piece. If you have installed TensorFlow correctly, then you essentially just have to call keras.datasets.fashion_mnist and save the entire dataset into a variable called fashion_mnist. So that is the first step that I've done here.
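Before moving on to the data, here is a hedged reconstruction of the imports and the two helper functions just described. The helper names match the ones mentioned in the walkthrough (get_file_size, convert_bytes), but their exact bodies are my own approximation:

```python
import os
import sys

import numpy as np
import h5py                      # pythonic interface to the HDF5 format
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from sklearn.metrics import accuracy_score

print(tf.__version__)            # 2.6.0 in the video

def get_file_size(file_path):
    """Return the size of a file in bytes."""
    return os.path.getsize(file_path)

def convert_bytes(size, unit=None):
    """Print a byte count in KB or MB, as described in the video."""
    if unit == "KB":
        print(f"File size: {size / 1024:.2f} KB")
    elif unit == "MB":
        print(f"File size: {size / (1024 * 1024):.2f} MB")
    else:
        print(f"File size: {size} bytes")
```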
Once you've done that, the next thing you have to do is split your dataset into training and testing. The way you can achieve that is by calling a function called load_data. So this is what you have in terms of the function: once you call load_data on fashion_mnist, the variable that you just created, you are able to split your data into train images, train labels, test images, and test labels. That's how simple it is. So let me go forward and run the cell. We've downloaded the data files, and we've split our data into training and testing. I've also created a list variable called class_names, which contains all the names of the classes that are part of this entire activity. So let me go forward and run the cell.

If you recollect, we had 70,000 samples in our dataset, and we've already split that dataset into training and testing. So let me go forward and show you how many images are part of the training dataset. Let me run this cell. The shape of my training dataset is (60000, 28, 28): I have 60,000 images, each image has a size of 28 x 28, so 28 rows and 28 columns represent each image, and I have 60,000 such images. Given this is a supervised learning task, I would also require 60,000 labels, so let me check whether the total number of labels in my training dataset is 60,000. Let me run this cell. As you can clearly see, I have 60,000 labels as well. Now, given that I've already mentioned there are 10 unique classes, let me verify that as well. As you can clearly see, I have class numbers ranging from 0 to 9, and the mapping is what I've created here, contained in the variable class_names.

Let's now go forward and explore the testing dataset as well, so let me quickly unhide this and show you the total number of images in the testing dataset. That comes out to be 10,000: 60,000 for training and 10,000 for testing, and each image is again of size 28 x 28. Similarly, if I look at test_labels, I'll have 10,000 samples.

What I intend to show you next is a sample image, so let me show that to you. This is a sample image that is part of this dataset; this is clearly an ankle boot. The size of the image is 28 x 28, so 28 rows and 28 columns is what you see here.

Before we go forward and train a neural network, a good practice is to scale the intensity values of the images, which range between 0 and 255, down to between 0 and 1. That is what I have done in this piece of code, so let me run this. Now my train images will have values ranging from 0 to 1, and not 0 to 255.

So far, we've downloaded the dataset, we've split our dataset into training and testing, and we've done some preprocessing as well. Now is the time to create a simple neural network that can classify the images into one of the 10 categories. In this piece of code, I'm calling the Sequential class from the Keras library and passing in the first layer as a Flatten layer. Now, if you recollect, the images were 28 x 28; if I have to pass them through a dense layer, then I have to basically flatten them first. I am not creating a convolutional neural network, given that the dataset is fairly simple; I will stick to a normal deep neural network.
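Before going on to the layers, here is a sketch of the data steps covered so far, continuing from the imports above. The class_names list follows the standard Fashion MNIST labels:

```python
# Load Fashion MNIST and split it into training and testing sets.
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(train_images.shape)   # (60000, 28, 28)
print(train_labels.shape)   # (60000,)
print(test_images.shape)    # (10000, 28, 28)

# Scale pixel intensities from the 0-255 range down to 0-1.
train_images = train_images / 255.0
test_images = test_images / 255.0
```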
So the first layer that I add is a Flatten layer, where I pass the input shape, which is (28, 28). The second layer is a Dense layer, and the activation supplied to this dense layer is relu. The final layer is again a dense layer with 10 units, given that I have 10 different classes to classify between. So that is what I have here. Let me quickly create an instance of the model, so let me run this.

Before we go forward and compile the model, I'll show you the structure of the model as well, so I'll say model.summary(). This essentially is the model summary: for the given architecture, we have close to 100k trainable parameters.

Now the next step is to compile the model. I'm passing in the optimizer, and I am passing in the sparse categorical cross-entropy loss. Given that our classes are mutually exclusive, I'm using the sparse categorical cross-entropy loss rather than the normal categorical cross-entropy loss. And the metric that I'm chasing is accuracy; I want to create a model that is fairly accurate. So let me now go forward and run the cell.

So I've created an instance of the model, and I've also compiled the model; now is the time to pass in the training images as well as the training labels to train all the trainable parameters. So let me now call the model.fit function, where I pass in the train images and train labels, and I run the entire exercise for 10 epochs. So let me run the cell. With every epoch, you can see that the accuracy is increasing. So we've successfully trained our model, and we've reached a training accuracy score of around 91%, which is reasonably good given that I've trained the model for only 10 epochs. So let's go forward. And remember one thing: the objective of the video is not to train the most accurate classifier, but to show you the power of TF Lite; that is why I've stopped at 10 epochs.

Now the next thing that I do is create a variable called keras_model_name. This is something that will be used as the reference, or baseline, model that I evaluate later against the TF Lite models. The value of this particular variable is tf_model_fashion_mnist.h5. Let me quickly run the cell. Now let me go forward, call the model.save function, and pass in the filename that I just created. So let me run the cell. As soon as you run the cell, a file is created in your Google Colab session, or in your local directory, which is essentially your saved model file. Let me show that as well: this is our saved model file that has been created. So let me quickly hide the cell again.

Now I'll go forward, and I'll show you the size of this particular file that we've created. Let me call the two functions that I've created, convert_bytes and get_file_size; I pass in the same file name, and I want the file size in MB. So let me run the cell. Currently, I have a model that occupies 1.2 MB. So I'll go forward and create a variable called keras_model_size, and save the byte-equivalent size into this particular variable. So let me run the cell.

We know for a fact that the model is performing really well on the training dataset.
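A sketch of the model-building steps just described, continuing from the previous snippets. The 128-unit hidden layer and the 'adam' optimizer are assumptions on my part, consistent with the roughly 100k trainable parameters mentioned; the video may differ in detail:

```python
# Flatten -> Dense(relu) -> Dense(10): a plain deep neural network.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),   # 28 x 28 -> 784
    keras.layers.Dense(128, activation='relu'),   # hidden layer (assumed width)
    keras.layers.Dense(10),                       # one unit per class
])
model.summary()                                   # ~100k trainable parameters

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=10)

# Save in the HDF5 (.h5) format and check the file size in MB.
KERAS_MODEL_NAME = "tf_model_fashion_mnist.h5"
model.save(KERAS_MODEL_NAME)
keras_model_size = get_file_size(KERAS_MODEL_NAME)
convert_bytes(keras_model_size, "MB")             # ~1.2 MB in the video
```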
But the essential litmus test is to check how well the model performs on unseen data, that is, my testing dataset. So let me go forward and evaluate how well our model performs on the testing dataset. I call the function model.evaluate, I pass in the test images and test labels, and I save the results into two variables called test_loss and test_accuracy. So let me quickly run the cell. As you can clearly see, the loss is at a fairly small value, around 0.37, and I've reached a testing accuracy score of around 88%.

So we've completed the first part; now it's time to move on to the next part, that is, creating a TF Lite equivalent of the same model. So let's go forward. I start the activity by creating a variable called tf_lite_model_file_name, and I pass in an equivalent name for this particular TF Lite model, which in our case is tf_lite_model.tflite. So let me quickly run the cell.

Now, the process of converting a TensorFlow model or a Keras model into a TF Lite model essentially requires just a couple of steps, and this is what I'll highlight right now. The first step is to call tf.lite.TFLiteConverter.from_keras_model and pass in the model that I created. If you remember, the name of the model variable was simply model, so that is what I'm passing in here in the first line. Once I've created an instance of the TF Lite converter from the Keras model, I save the entire piece into a variable called tf_lite_converter. And finally, what I do next is call the convert function; once the conversion happens, I want the result to be saved into a variable called tflite_model. So let me quickly run the cell.

If you look at the output, it says that assets are written into a particular temporary file. From that temporary file, I basically have to retrieve the model weights and save them into a TF Lite equivalent file. That is what I've done using this piece of code: I've created the first variable, which is the TF Lite model name, passing in the name that I created in the first line of this particular section. I open the file name with write access, and I write this particular temporary file into the file that I've created. So that's how simple it is; let me quickly run this. There is a particular output that is displayed: this tells me the total number of bytes that have been written to this particular file.

Now let me go forward and show you the exact size of this TF Lite model in kilobytes. So let me run this. The overall file size is close to 400 kilobytes. So we started off with a model which was occupying around 1.2 MB, and after just running a couple of lines of code, we have brought the file size down to around 400 KB. Now let me go forward and save this file size into a variable called tf_lite_file_size. This is something that will make a lot of sense as we go forward, so let me quickly run this cell.

Now, we've already converted a model from Keras to TF Lite, but one thing that we've not validated so far is how good the model is in terms of performance. Is it actually good on unseen data, or has it dropped in terms of the accuracy score?
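The evaluation and conversion steps just walked through might look roughly like this, continuing from the snippets above; the file name matches the one used in the walkthrough:

```python
# Evaluate on unseen data first.
test_loss, test_accuracy = model.evaluate(test_images, test_labels)

# Convert the Keras model to a TF Lite FlatBuffer and write it to disk.
TF_LITE_MODEL_FILE_NAME = "tf_lite_model.tflite"

tf_lite_converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = tf_lite_converter.convert()

with open(TF_LITE_MODEL_FILE_NAME, "wb") as f:
    f.write(tflite_model)   # write() returns the number of bytes written

tf_lite_file_size = get_file_size(TF_LITE_MODEL_FILE_NAME)
convert_bytes(tf_lite_file_size, "KB")   # ~400 KB in the video
```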
So that is what I want to check next: after compressing the model using TF Lite, are we losing out on accuracy or not? In this section, I'll go over how you can validate how well your TF Lite model is performing. So now, let me quickly unhide the cell.

Now, don't get scared by looking at this piece of code; I'll help you understand what I'm trying to achieve. Loading a TensorFlow or Keras model into a TensorFlow session is fairly easy. But here, what we have is a TF Lite model. If you go back to the discussion that we had, TF Lite models are essentially FlatBuffer-format files and not your usual protocol buffer files. So in order for us to actually make an inference from TF Lite files in our TensorFlow or Python session, we require something called an interpreter. It is here that we will be making use of TensorFlow's Interpreter to load the TF Lite file and then make inferences, or predictions.

So let me now take you through each and every line of code. In the first line, I create an instance of the Interpreter class, and I pass in the TF Lite model name that we've just created. If you look at this particular section, you will also have a .tflite file; this is what I'm passing in here. Now, once we've created the interpreter object, the interpreter object holds details about the model: it will have details about the input it expects and the type of values it expects, and in terms of the output, it will tell you what the output shape should be, as well as the type of values it will predict. All of those details come from the interpreter object that we created by passing in the .tflite file; all the details are captured in that .tflite file, which is what is read by the interpreter object, and that is what we are trying to accumulate in input_details and output_details. Once we have the input and output details, I am also interested in the shape of the input that is expected and the type of the inputs. So let me quickly run this cell to make more sense of it.

If you look closely, the input shape is (1, 28, 28), and the input type that it expects is NumPy float32. The output shape is (1, 10), that is, it basically has one row and 10 columns, and the output type is again NumPy float32. This is essentially what the .tflite file contains. Now, if you look at this particular 1, it denotes that TF Lite is expecting one input at a time. But I want to check how well it performs for 10,000 inputs; that is where I'll have to reshape the input shape to a particular value, which is what I'll be achieving in this piece of code. Just to reiterate, the input shape is (1, 28, 28), so ideally I would have to pass in just one image sample, and I would get a corresponding output for it. But in my use case, I want to validate how well the TF Lite model performs on the testing dataset that I have, which contains 10,000 images. If this idea is clear to you, let's go forward.
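A sketch of the interpreter setup described above, continuing from the earlier snippets:

```python
# Load the FlatBuffer through the TF Lite Interpreter and inspect
# the expected input/output signatures.
interpreter = tf.lite.Interpreter(model_path=TF_LITE_MODEL_FILE_NAME)

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print("Input Shape:", input_details[0]['shape'])    # [ 1 28 28]
print("Input Type:", input_details[0]['dtype'])     # numpy.float32
print("Output Shape:", output_details[0]['shape'])  # [ 1 10]
print("Output Type:", output_details[0]['dtype'])   # numpy.float32
```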
So now I want to validate how well my TF Lite model performs on my testing dataset. I call the resize_tensor_input function, I pass in the index value that I want to resize, and I pass in how I have to resize it. Currently, I have 10,000 samples, so that is what I have entered here: (10000, 28, 28). A similar resize operation is what I'm doing on the output side, so you can see here (10000, 10), from the initial (1, 10). So that is what I've done here. Now, once the resize operation has happened, I call allocate_tensors to actually change the entire structure of the interpreter, which it read from the TF Lite file. And now, when I print the input details and output details, I should be able to see that the TF Lite input and output shapes have changed. So let me quickly run this piece of code. As you can clearly see, the input shape has changed from (1, 28, 28) to (10000, 28, 28). This will essentially help me validate how well my TF Lite model is performing.

The other thing that I want to highlight right now is that test_images.dtype is float64, whereas the input type that the model expects is numpy.float32. So now the only other change that I have to make in order to validate my TF Lite model is to create a new array called test_images_numpy, pass in the original array, and change the dtype. I could do it in the same array as well, but I'm choosing to create two different arrays. So let me quickly run the cell. Now, if I show you the dtype of test_images_numpy, it will be numpy float32.

Now that we have the entire interpreter object set up correctly for our set of inputs, that is, the testing dataset, all I have to do right now is first call the set_tensor function, passing in the test_images_numpy array that I just created, and then call the invoke function. What the invoke function essentially does is pass in the inputs and compute the output. And once the output is ready, you call the get_tensor function, which will have the output ready for you, and save it into a variable called tf_lite_model_predictions. So let me quickly run the cell.

Now, the prediction results' shape that you see here is 10,000 rows and 10 columns, and every column essentially contains a probability score. So what I have to do next is pick out the index between 0 and 9 that has the maximum probability, which is what I've done using the function np.argmax. This will help me get class numbers directly, that is, 0 to 9, rather than having 10 different columns with probability scores. Now let me calculate the accuracy score, and let me print it out for you. The testing accuracy of the TF Lite model is exactly the same as what you see with the normal Keras model.

Now, how much space have we saved in this entire process? Let me calculate the ratio between the TF Lite file size and the Keras model file size. Overall, the TF Lite model occupies close to 32% of the file size that my normal Keras model occupies, but the unique thing is that I'm not losing out on any accuracy. This is the power of TF Lite.
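The validation steps might look like the following sketch, continuing from the earlier snippets. One deviation to note: I resize only the input tensor here, since allocate_tensors() propagates the new batch size to the output shape on its own, whereas the video resizes the output explicitly as well:

```python
# Batch the whole test set through the interpreter in one invoke().
interpreter.resize_tensor_input(input_details[0]['index'], (10000, 28, 28))
interpreter.allocate_tensors()

# The interpreter expects float32, but the scaled images are float64.
test_images_numpy = np.array(test_images, dtype=np.float32)

interpreter.set_tensor(input_details[0]['index'], test_images_numpy)
interpreter.invoke()
tf_lite_model_predictions = interpreter.get_tensor(output_details[0]['index'])

# Each row holds 10 scores; argmax picks the predicted class per image.
prediction_classes = np.argmax(tf_lite_model_predictions, axis=1)
print("Test accuracy:", accuracy_score(test_labels, prediction_classes))

# Space saved relative to the original Keras model (~0.32 in the video).
print("Size ratio:", tf_lite_file_size / keras_model_size)
```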
I've been able to compress my entire model from 1.2 MB to around 400 KB, and I haven't compromised a bit on the accuracy piece. Isn't this amazing? Well, if you think the story has ended here, hang on for a second; there is more to go. So far, what we've done is take a TensorFlow model and, without any optimization, convert it into an equivalent TF Lite model. Now I'll show you how you can compress your model even further, without losing out on accuracy as such.

So now let me introduce you to a new concept called quantization. What exactly is this term that I've just mentioned? For a given weight value that can be represented in float32 or float64 format, wouldn't it be great if we could bring down the size of those particular values and see very little change in accuracy? Well, this essentially is the concept of quantization: I'm reducing the total number of bits for every weight value, so that the overall size of the entire array reduces. Just to be more clear, if I have a neural network something like this, where this particular weight value is 5.31345, this particular weight value is 3.8958, and you have the other weight values as well: what if I could change these representations, which occupy so many bits, to something like this? There will be a small hit to the accuracy, but overall, I'll be able to compress my model even further. How and when? Whatever questions you have in mind, just wait for some time.

If this entire idea is clear to you, let us go back to the coding section, and I'll show you how you can compress your TF Lite model even further. By default, in the previous example, where we took a Keras model and converted it to a TF Lite model, every weight value is essentially in float32 format. Wouldn't it be great if I compressed it from float32 to float16? This is the activity that I'll be performing next. So I create a variable named tf_lite_model_float_16_file_name, and I give it a .tflite name which essentially represents that the weights inside it will be float16. So that is what I have here; let me quickly run the cell.

If you look at the previous section, in terms of how we created a TF Lite model, the first line of code is something that is pretty much familiar to you: you pass in your Keras model, you call the TF Lite converter's from_keras_model, and you save it into a variable. Even the last piece of code is also something that you've already looked at. What you haven't seen so far is the optimization. When you create an instance of the TF Lite converter, there is a flag called optimizations; so you have the optimizations flag here, and I set it to tf.lite.Optimize.DEFAULT, meaning I want the default optimizations to take place. I'll speak more about the optimizations in the next section, so hold on to that thought as well. Now, there is one more flag here, called target_spec, with supported_types; here is where I set every weight value from float32 to float16. So that is what I'm doing here; let me quickly run this piece of code.
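The float16 conversion just described might look like this, continuing from the snippets above; the file name is an assumption in the spirit of the video's naming:

```python
# Float16 quantization: DEFAULT optimizations plus a float16 target type.
TF_LITE_MODEL_FLOAT_16_FILE_NAME = "tf_lite_float_16_model.tflite"

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_fp16_model = converter.convert()

with open(TF_LITE_MODEL_FLOAT_16_FILE_NAME, "wb") as f:
    f.write(tflite_fp16_model)

tf_lite_float_16_file_size = get_file_size(TF_LITE_MODEL_FLOAT_16_FILE_NAME)
convert_bytes(tf_lite_float_16_file_size, "KB")   # ~200 KB in the video
```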
So now I have a TF Lite model wherein every weight value would be in the float16 format. I follow the same process again, wherein I'm fetching data from the temporary file and saving it into a TF Lite model. So let me run this. I don't know if you've guessed it already or not: this essentially is the file size in bytes for the newly converted TF Lite model. So if I now show you the size of this newly converted TF Lite model, my size has drastically reduced from 400 kilobytes to 200 kilobytes. The only thing that I changed here was the individual representation of every weight value; that's about it. Isn't this amazing? I'm able to save so much memory just by changing a few values here and there. I again save the file size into a variable, called tf_lite_float_16_file_size. So let me run this.

Now, if I compare it to the original Keras model, this particular model occupies 16% of the size that the original model occupied. And if I compare it to the previously created version, then I can see almost 50% compression, achieved by changing the weights from float32 to float16. So this is the power of optimizations in TF Lite. If you think this is it, wait for the next section, wherein I compress the model even further. I'm not showing you the accuracy piece right now; you might be wondering, why isn't he showing us the accuracy, has accuracy taken a toss? Well, the answer is no. I'll show you the accuracy of the even more compressed model, so that will give you a fair sense of how much compression is changing the overall accuracy values as well. Okay.

So now we have reached the final section, wherein I will compress the model even further. Okay, so here I've created a variable called tf_lite_size_quant_model_file_name, and here I want to save the .tflite file with this particular name. So I'll quickly run the cell. In the previous example, I changed every weight value from float32 to float16. Rather than me deciding what is good for my model, I would rather let TF Lite decide that for me. So if you've been following along so far, then this piece of code is something that we've already covered, and this piece of code is also something that we've already covered. This is what is unique: here, I set the optimizations flag, and I just mention "optimize for size". There are different values that you can go through in the documentation, so based on your needs, you can optimize for size or use the other optimizations that are also available in TF Lite. I'm not specifying what data type I want; I just want the most optimized version, wherein the size occupied by this particular TF Lite model is the most compressed. Okay, so I'll quickly run the cell.

So I catch the file that is saved into a temporary variable, and I save it into this particular file that I created. And, as you've guessed by now, this is the new file size in bytes. If I go to the kilobytes section, my file occupies around 100 KB. So if you remember, we started from 1.2 MB, and we have brought down the file size of a deep neural network to around 100 kilobytes. Isn't this amazing?
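The size-optimized conversion might look like this sketch, continuing from above; again, the file name is illustrative rather than taken from the video:

```python
# Let TF Lite pick the representation by optimizing purely for size.
TF_LITE_SIZE_QUANT_MODEL_FILE_NAME = "tf_lite_quant_model.tflite"

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()

with open(TF_LITE_SIZE_QUANT_MODEL_FILE_NAME, "wb") as f:
    f.write(tflite_quant_model)

quant_size = get_file_size(TF_LITE_SIZE_QUANT_MODEL_FILE_NAME)
convert_bytes(quant_size, "KB")                           # ~100 KB in the video
print("vs Keras model:", quant_size / keras_model_size)   # ~0.08
```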
Just to give you some numbers again: if I compare this particular file size with my original file size, then my current file is almost 8% the size of my original file, that is, the Keras model that I created. If I compare it to the previous model as well, I'm basically able to achieve 50% compression, all because of optimizing for size. This essentially is the power of TF Lite.

Now, I'm really happy with the compression: I have a 1.2 MB file that I've compressed down to around 100 KB. But is the accuracy still the same? Well, we'll again follow the same process, wherein I load the newly quantized model into the interpreter object, get the details of those objects, again reshape the values, and pass the entire testing dataset through the interpreter object. So I'll quickly run the cell. As you can clearly see, (1, 28, 28) is the input shape that the interpreter object expects, and I have the output as (1, 10). The input and output values that are expected are NumPy float32, so that is all good as well.

Coming to the final section, I follow the same process again; no change in the process. I have a testing dataset of 10,000 images, which is what I pass in here. I allocate the tensors, I get the details, and I'll show the details to you as well. So let me quickly run this: (10000, 28, 28) from (1, 28, 28). We've resized the tensor input values so that we can validate our testing dataset. You don't essentially need this step again, but I've kind of copied it from the initial part, so I'm running it again. I pass in the values again, and now I calculate the accuracy score.

Now is the litmus test for this highly compressed TF Lite model, so let me quickly run the cell. The accuracy of a TF Lite model that occupies almost 8% of the size of the original Keras model is equivalent to that of the original Keras model. If I go up (I'm recording this video in one go, so if I go up, where did this value go?), here the value was at 87.66%, and here it is at 87.59%. So this is what you can achieve using TF Lite.

I started off with a very simple neural network; the entire model occupied around 1.2 MB. You might argue that 1.2 MB is kind of small, but the problem statement was fairly simple. If you have a really complex example, where you have to classify images into 1,000 or 10,000 categories, the model size will eventually increase. So the objective then becomes: can we compress the model size? And the answer is yes. TF Lite will help you compress the model size without having to compromise a bit on the accuracy front.

If you've reached this point, then I'm assuming you've seen the entire video, or you've randomly reached this point; whichever way you've reached here, I hope you enjoyed today's video. I keep creating such amazing videos on data science, machine learning, and Python, so feel free to check out my channel from the description section of the video as well. Thank you so much for watching this video.
Info
Channel: freeCodeCamp.org
Views: 75,247
Id: OJnaBhCixng
Length: 59min 20sec (3560 seconds)
Published: Tue Oct 19 2021