TensorFlow Lite allows you to do machine learning on small devices. Bhavesh is an experienced instructor, and he will teach you all about TensorFlow Lite in this course. Hello, everyone. In this tutorial, you will learn the basics of TensorFlow Lite, and how TensorFlow Lite can help you create really efficient models that you can deploy on edge devices. So without wasting any further time, let's kick-start the tutorial. Let us kick-start today's discussion about TF Lite with a small story. I have a friend whose name is John. John really likes traveling to different places. One of his favorite applications is Google Lens. Whenever he visits a new country, he takes out his cell phone and takes a photograph of the monument that is in front of him, and Google essentially tells him which monument it is. So now, just out of sheer fascination with the tool, John goes forward and creates his own neural network for detecting landmarks. He goes through a rigorous process of collecting data, labeling data, and cleaning data. And finally he creates a machine learning model that can tell John which monument it is. So everything looks good; he is able to
reach a very high accuracy score as well. Now the only challenge that he has is where he should deploy the model that he has created. So technically, he has two options. The option that he explores first is cloud computing. He takes his trained model, and he deploys it on the cloud. He exposes the API that he has created. And essentially, he creates an Android application that queries the API and fetches the result once he has passed
in an image. Now one of the major challenges that he saw when he created this solution is the network latency. The images that are captured by cell phones today generally range anywhere between 3 and 10 MB, so transporting such huge files takes up a lot of time in the entire process of making a prediction. The second piece that adds complexity to the entire solution is the cost. The neural network that John has created requires two resources: a storage resource wherein he can save the model weights, and a compute resource for making an inference. The overall project becomes a costly affair for John. What does he do next? He creates an Android application, and he finds a way to make an inference on Android using this huge model that he has created. So John now faces a new issue. The issue is that the model is really huge, and the cell phone is not capable enough of storing and processing such a big model. So now John is really confused. He has tried out different techniques to make this work, but nothing is solving the issue. Well, it is here that TF Lite comes into the picture. Before we jump in and discuss TF Lite,
I wanted to throw some light on what edge devices are. Edge devices are your normal cell phones that you use. So if you're planning to create some amazing TensorFlow-based applications, then essentially one of the main platforms that you can utilize is your cell phone, be it an Android cell phone or an iOS-powered cell phone; anything works. The other pieces of hardware that you can classify as edge devices are microcontrollers. There are some amazing applications that have been built using very little compute power, and all of it is thanks to microcontrollers. Given how, recently, the wearable devices that you wear have increased their computation power by using more or faster CPUs, you can also put wearable devices such as your smartwatches into the edge computing bracket. Now let me go forward and give you a formal definition of edge computing. Edge computing is basically the practice of moving compute and storage resources closer to the location at which they are needed. That is where your edge devices come into the picture. Now, if you recall, John had two options to deploy the model. The first option was to deploy the model on the cloud. Now many of you would have the impression that a machine learning model running on a server using a large GPU is much better as compared to running it on the device itself. Well, the truth is that edge devices have become an important platform for machine learning. Why is edge computing gaining so much popularity that you would run your entire machine learning models on edge devices?
Well, let me share the details one at a time. The first and foremost reason why edge computing as a whole is gaining popularity is latency. Use cases that require real-time speed definitely require models to run on the device. For example, you might be able to reduce the inference latency of ResNet-50 from 30 milliseconds to 20 milliseconds, but the network latency can go up to seconds. So essentially, it also depends on where your model is deployed and where you are hitting the API from. If you take into account all of these factors, then deploying a model on a server would not be the best possible option when you want inferences in near real time, or exactly real time as well. The second reason why creating machine learning models on edge devices is important is network connectivity.
So if you go back to our earlier example, wherein John wanted to create his own version of Google Lens for detecting landmarks: if he happens to visit a country where there is little to no connectivity, it is here that creating a model that sits on the device would be much better as compared to deploying it on a server, because of the additional dependency on the network that comes into the picture. The third reason why it's very important
for you to create machine learning models that run on edge devices is user privacy. Putting your machine learning models on edge devices is also appealing when you are handling sensitive user data. Machine learning on the cloud means that your systems might have to send user data over networks, making it susceptible to being intercepted. Cloud computing also means storing the data of many users in the same place, which means a data breach can affect many people at once. So it becomes really important to create machine learning models that can run on edge devices, which can preserve user privacy. Let me now go forward and show you some examples
of on-device machine learning use cases. The first one that I'm showing you right now is the feature to try out various cosmetics using AR on YouTube. The entire computation piece that you're seeing here is essentially happening on the device itself. Now the second example that I want to show you is something that you might already be aware of, which is Google Translate. Google Translate has a feature that allows you to capture text with your phone camera and translate it in real time without any internet connection. All of this is essentially possible using edge computing, and more specifically, TF Lite. Now that you've seen the amazing applications
that you can create at your end as well, you might be wondering, the second alternative
that john had taken initially, that is to create an entire huge TensorFlow model, and then run the
inferences from the device. Why did that fail? Well, to answer that question, here are
some of the challenges that you might face when you create chaos or TensorFlow models,
and deploy them directly onto edge devices. Edge devices, not only restricted to mobile
phones, but your microcontrollers as well have limited compute power. Limited memory. battery
consumption is also a factor that you have to account for, as well as the application size.
If I consider a simple microcontroller is well, the processing power isn't so much
that you can essentially run inferences on a three or four GB model. If you consider the
storage capacity of majority of the edge devices, well, then ideally, you wouldn't have a lot
of storage that you can utilize for just one model. So these are the challenges that john
faced when he took the second approach as well. So what is the solution? Well, the solution
is TensorFlow light. And so flow light is a production ready cross platform framework
for deploying machine learning models on mobile devices and embedded systems. And flow
light at this point of time supports Android, iOS, and any IoT device that can run
Linux. So essentially, if you have any of these hardware devices handy with you
Then you can quickly create a TensorFlow model, convert it to an equivalent TF light model,
and start using the amazing TF light model. Now, you might wonder, what exactly is
the workflow to create a TF Lite model? Well, let's look at that as well. The workflow is fairly simple: you start by creating a TensorFlow/Keras model. In the entire process, you would have to collect data, clean data, and pre-process data, then create models and iterate over multiple models; and based on the metric that you're chasing, if you are chasing a higher accuracy score, then you would choose the model that gives you the best possible accuracy. And that's about it; you have your TensorFlow model ready. Now from that TensorFlow model, you convert it to a TensorFlow Lite model. So there is a format change that happens; I'll talk more about it as we go along. Once you've converted your model from TensorFlow to TensorFlow Lite, then you go forward and deploy the entire TF Lite model and run your inferences on the edge device. When I say inferences, I mean predictions. Okay.
Let me now explain this using a block diagram. This essentially is the workflow that I've just mentioned: you start off with your high-level Keras APIs to create a model. Once you have the model ready, then you can essentially use the TF Lite converter. The converter basically takes your saved TensorFlow-format file and converts it into a FlatBuffer file. I'll give you an idea of what I mean by a FlatBuffer file, so let's move forward. TensorFlow Lite represents your model in the FlatBuffer format. Now, FlatBuffers is an efficient cross-platform serialization library for C++, C#, Go, Java, Kotlin, JavaScript, Python, and so on and so forth. It was originally created at Google for game development and other performance-critical applications. But slowly, Google realized that you can use the FlatBuffer format for deploying models on edge devices. Now you might have an obvious question: why not
use our old, tried-and-tested protocol buffers, and why shift to something as new as FlatBuffers? Just to give you context, all the Keras models that you create, and all the TensorFlow models that you create, are essentially in the protocol buffer format. Protocol buffers are essentially very similar to the FlatBuffer format that Google has created. The major difference is that FlatBuffers do not need a parsing or unpacking step to a secondary representation before you can access the data. And the code footprint is also larger in the case of protocol buffers. That is why, while using TF Lite, we make use of the FlatBuffer format, and not the protocol buffer format. So now let's move forward. So far, we've looked at various aspects
of edge computing; we've looked at how deploying models on edge devices is better as compared to deploying them on the cloud. We've also looked at what TensorFlow Lite is, and what all it can support at this point in time. Now is where things get interesting, when I show you through code how TF Lite can actually compress your model size without compromising on the accuracy piece. So now let's go forward and witness the magic of TF Lite. Now that we've understood the basics of TensorFlow Lite and edge computing, let me show you the power of
TensorFlow Lite using Python. For this example, I'm using Google Colab. For those of you who don't know, Google Colab is an online environment wherein you can write Python code, you can create machine learning and deep learning models, and you can also make use of the hardware that Google gives you access to for a good amount of time. So this is the interface that I'll be using. I'll be attaching the link to the GitHub repository in the description section of the video; feel free to access the code from there. Also, inside the GitHub repository, I'll give you a link that can open the Google Colab notebook directly. With all the groundwork done, let me now go forward and show
you the magic of TensorFlow Lite. The process that I'll follow in this particular tutorial is: I'll create a deep learning model using TensorFlow/Keras; I will scale it down to a TF Lite equivalent model; once that is accomplished, I will show you the size difference between the original model as well as the compressed model; and I will show you techniques for how you can keep compressing the model even further without having to compromise on the accuracy piece. With that, let me create an instance on Google Colab by pressing Connect. So currently, Google is allocating some space for my computations. If you're planning to replicate the entire thing that I show in today's video on your local machine, then you will require some set of installations as well. Given that I'm working with Google Colab, all the dependencies that I require for this example are already met. So now let me go through the different things
that are required for this entire tutorial. First things first, I require the os module to essentially read my files. Next up, I'll import NumPy as np; I will require this particular library for mathematical operations. I will also require the h5py library. The h5py library is a Pythonic interface to the HDF5 binary data format; technically, whatever models I create in Keras, I will basically save into the h5 format. Next up, I require matplotlib; this is again used for visualization. I require TensorFlow, and I'll import Keras from TensorFlow. There are also some layers that we will require when we come to the deep learning model creation aspect. If I want to calculate how well my model is performing in terms of accuracy score, this is where the sklearn.metrics module will give me the accuracy_score functionality. And from the sys library, I'll also require the function getsizeof. So these are some of the things that I require in order to create a TF Lite model. So now, let me go forward and run the cell. When I run the cell, what essentially happens is that Python runs that piece of code. So let me now go forward and run the cell. I don't see any error; that means all of our imports are in place. Let me go forward and show you the TensorFlow version that I'm using for this particular tutorial. I'm currently using TensorFlow 2.6.0. If there are changes that creep up with respect to the API, feel free to refer to the TensorFlow documentation.
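For reference, here is a sketch of the imports this section walks through, in one place. The exact layer imports are an assumption on my part, since they are only shown on screen; this assumes TensorFlow, h5py, matplotlib, and scikit-learn are installed.

```python
import os                        # reading file sizes on disk
from sys import getsizeof       # size of in-memory Python objects

import numpy as np              # mathematical operations
import h5py                     # Pythonic interface to the HDF5 binary format
import matplotlib.pyplot as plt  # visualization

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Flatten, Dense  # layers used later on
from sklearn.metrics import accuracy_score          # model evaluation

# Recorded against TensorFlow 2.6.0; later 2.x releases should also work.
print(tf.__version__)
```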
created the first function name is get underscore file underscore size.
Essentially, I'm passing in the file location using the OAS library and specifically the
function get size, I'm able to get the size of a particular file that I pass in, in byte
so let me now go forward and run this cell. In the previous function, that is
get underscore file underscore size, the value returned would be in bytes. So
now rather than comprehending the values of a file size in byte, I've created a helper
function called as convert underscore bytes, which essentially takes in the input size
in bytes and converts it in either KB MB. So this is something that I've
created. I've not included DBS because so this is more of an explainer video
wherein I intend to create a smaller model or rather a proof of concept rather than
like a huge model. So that is why I have restricted my units to cavies or MDS.
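As a sketch, the two helpers described above can look like this; the 1024-based units and the exact print format are my assumption, since only the function names and behavior are described.

```python
import os

def get_file_size(file_path):
    """Return the size of the file at file_path, in bytes."""
    return os.path.getsize(file_path)

def convert_bytes(size, unit=None):
    """Print a byte count in KB or MB (GB is deliberately left out,
    since the models in this tutorial stay small)."""
    if unit == "KB":
        print("File size:", round(size / 1024, 3), "Kilobytes")
    elif unit == "MB":
        print("File size:", round(size / (1024 * 1024), 3), "Megabytes")
    else:
        print("File size:", size, "bytes")
```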
So let me go forward and run the cell. Now for this particular example, I'll be basically using a very famous dataset in deep learning, the Keras Fashion MNIST dataset. So let me unhide the cell. The Fashion MNIST dataset contains 70,000 grayscale images, which belong to 10 different categories. The categories include T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot. So these are the different categories that are part of this dataset. The entire activity is a supervised learning task: I will have a set of images, every image will have a label associated with it, and I'm trying to train a deep learning model. So now let me go forward. You don't have to worry about the download
pieces. Well, if you have installed TensorFlow correctly, then essentially you just have to call keras.datasets.fashion_mnist and save the entire dataset into a variable called fashion_mnist. So that is the first step that I've done here. Once you've done that, then essentially what you have to do next is split your dataset into training and testing. The way you can achieve that is by calling a function called load_data. So this is what you have in terms of the function. Once you call this function from fashion_mnist, the variable that you just created, you will be able to split your data into training images, training labels, test images, and test labels. So that's how simple it is. So let me go forward and run the cell. We've downloaded the data files, and we've split our data into training and testing. I've also created a list variable called class_names, which contains all the names of the classes that are part of this entire activity. So let me go forward and run the cell. So if you recollect, we had
70,000 samples in our dataset, and we've already split that entire dataset into training and testing. So let me go forward and show you how many images are part of the training dataset. Let me run this cell. The shape of my training dataset is (60000, 28, 28), so I have 60,000 images, and each image has a size of 28 × 28. So 28 rows and 28 columns represent each image, and I have 60,000 such images. Given that this is a supervised learning task, I would also require 60,000 labels. So let me check if the total number of labels in my training dataset is 60,000. Let me run this cell. As you can clearly see, I have 60,000 labels as well. Now, given that I've already mentioned there are 10 unique classes, let me verify that as well. As you can clearly see, I have class numbers ranging from zero to nine, and the mapping is what I've created here, which is contained in the variable class_names. Let's now go forward and explore the testing dataset as well. So let me quickly unhide this and show you the total number of images in the testing dataset. That comes out to be 10,000: 60,000 for training and 10,000 for testing. Each image is again of the size 28 × 28. Similarly, if I look at test_labels, I'll have 10,000 samples. What I intend to show you next is a sample image. So let me show that to you. This is a sample image that is part of this dataset. This is clearly an ankle boot.
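The loading and exploration steps above can be sketched as follows; the class_names strings follow the standard Fashion MNIST label order.

```python
import matplotlib.pyplot as plt
from tensorflow import keras

# Download Fashion MNIST (happens once) and split it into train and test sets.
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# The ten categories, in label order (0 through 9).
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(train_images.shape)   # (60000, 28, 28)
print(len(train_labels))    # 60000
print(test_images.shape)    # (10000, 28, 28)

# Display one sample image with its label.
plt.imshow(train_images[0], cmap='gray')
plt.title(class_names[train_labels[0]])
plt.show()
```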
The size of the image is 28 × 28, so 28 rows and 28 columns is what you see here. Before we go forward and train a neural network, a good practice is to scale the intensity values of the images, which range between 0 and 255, down to between 0 and 1. So that is what I have done in this piece of code. Let me run this. Now my train images will have values ranging from 0 to 1, and not 0 to 255. So far, we've downloaded the dataset, we've
split our dataset into training and testing, and we've done some sort of pre-processing as well. Now is the time when we'll create a simple neural network that can classify the images into one of the 10 categories that are there. In this piece of code, I'm calling the Sequential class from the Keras library. I'm passing in the first layer as a Flatten layer. Now, if you recollect, the images were 28 × 28; if I have to pass them through a layer, then I have to basically flatten them first. I am not creating a convolutional neural network, given that the dataset is fairly simple; I will stick to a normal deep neural network. So the first layer that I add is a Flatten layer, wherein I pass the input shape, which is (28, 28). The second layer is a Dense layer, and the activation supplied to this Dense layer is relu. The final layer is again a Dense layer, given that I have 10 different classes to classify between. So that is what I have here. So let me quickly create an instance of the model, and run this. Before we go forward and compile the model, I'll show you the structure of the model as well; I'll say model.summary. So this essentially is the model summary: for the given architecture, we have close to 100k trainable parameters. Now the next step is to compile the model. I'm passing in the optimizer, and I am passing in the sparse categorical cross-entropy loss. Given that our classes are mutually exclusive, I'm using the sparse categorical cross-entropy loss as compared to the normal categorical cross-entropy loss. And the metric that I am chasing here is accuracy; I want to create a model that is fairly accurate. So let me now go forward and run the cell. So I've created an instance of the
model, and I've also compiled the model. Now is the time when I'll pass in the training images as well as the training labels to train the entire set of trainable parameters. So let me now call the model.fit function, wherein I'll be passing in the train images and train labels, and I run the entire exercise for 10 epochs. So let me run the cell. With every epoch, you can see that the accuracy is increasing. So we've successfully trained our model, and we've reached a training accuracy score of around 91%, which is something that's reasonably good given that I've trained the model for only 10 epochs. So let's go forward. And remember one thing: the objective of the video is not to train the most accurate classifier at this point in time, but to show you the power of TF Lite; that is why I've stopped at 10 epochs. Now the next thing that I do is I create
a variable called keras_model_name. This is something that will be used as reference; or rather, this will be the baseline model that I evaluate later on against the TF Lite models as well. The value of this particular variable is tf_model_fashion_mnist.h5. So let me quickly run the cell. Now let me go forward, call the model.save function, and pass in the filename that I just created. So let me run the cell. As soon as you run the cell, you will have a file created in your Google Colab session, or in your local directory, which is essentially your saved model file. So let me show that as well. This is our saved model file that has been created. So let me quickly unhide the cell again. Now I'll go forward, and I'll show you the size of this particular file that we've created. Let me call the two functions that I've created, that is, convert_bytes and get_file_size; I pass in the same file name, and I want the file size to be in MB. So let me run the cell.
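Putting the model-building steps together, here is a self-contained sketch. The 128-unit hidden layer is an assumption that matches the "close to 100k trainable parameters" figure, and I train for a single epoch here just to keep the sketch quick, whereas the video runs 10.

```python
import os
import tensorflow as tf
from tensorflow import keras

(train_images, train_labels), _ = keras.datasets.fashion_mnist.load_data()
train_images = train_images / 255.0        # scale intensities to [0, 1]

model = keras.Sequential([
    keras.Input(shape=(28, 28)),           # the video passes input_shape to Flatten
    keras.layers.Flatten(),                # 28 x 28 -> 784 values
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)                 # one output per class
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()                            # ~101k trainable parameters

model.fit(train_images, train_labels, epochs=1)   # the video trains for 10 epochs

# Save to HDF5 and check the size on disk.
keras_model_name = 'tf_model_fashion_mnist.h5'
model.save(keras_model_name)
keras_model_size = os.path.getsize(keras_model_name)   # in bytes
print(round(keras_model_size / (1024 * 1024), 3), 'MB')
```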
So currently, I have a model that occupies 1.2 MB. I'll go forward and create a variable called keras_model_size, and save the byte-equivalent size into this particular variable. So let me run the cell. We know for a fact that the model is performing
really well on the training dataset. But the essential litmus test is to check how well the model is performing on unseen data, that is, my testing dataset. So let me go forward and evaluate how well our model performs on the testing dataset. I call the function model.evaluate, I pass in the test images and test labels, and I save the results into two variables called test_loss and test_accuracy. So let me quickly run the cell. As you can clearly see, the loss is at a fairly small value, which is around 0.37, and I've reached a testing accuracy score of around 88%. So we've completed the first part; now it's time to move on to the next part, that is, creating a TF Lite equivalent of the same model. So let's go forward. I start the activity by creating a variable
called tf_lite_model_file_name, and I pass in an equivalent name for this particular TF Lite model, which in our case is tf_lite_model.tflite. So let me quickly run the cell. Now the process of converting a TensorFlow model or a Keras model into a TF Lite model essentially requires just a couple of steps, so this is what I'll highlight right now. The first step is to call tf.lite.TFLiteConverter.from_keras_model, and I pass in the model that I created. If you remember, the name of the model variable was simply model, so that is what I'm passing in here in the first line. Once I've created an instance of the TF Lite converter from the Keras model, I save the entire piece into a variable called tf_lite_converter. And finally, what I do next is call the convert function. Once the conversion happens, I want the result to be saved into a variable called tflite_model. So let me quickly run the cell. If you look at the output, it says that assets are written into a particular temporary file. So from that temporary file, I basically have to retrieve the model weights and save them into a TF Lite equivalent file; that is what I've done using this piece of code. I've created the first variable, which is tflite_model_name, and I'm passing in the name that I created in the first line of this particular section. I open the file name with write access, and I write this particular temporary file into the file that I've created. So that's how simple it is. So let me quickly run this. So there is a particular output that is displayed.
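The two conversion steps, plus writing the FlatBuffer to disk, can be sketched like this; I use a small untrained stand-in model so the snippet is self-contained, whereas in the notebook you would pass the trained model.

```python
import os
import tensorflow as tf
from tensorflow import keras

# Stand-in for the trained Fashion MNIST classifier.
model = keras.Sequential([
    keras.Input(shape=(28, 28)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)
])

# Step 1: create a converter from the in-memory Keras model.
tf_lite_converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Step 2: convert; the result is the model serialized as a FlatBuffer.
tflite_model = tf_lite_converter.convert()

# Write the FlatBuffer to disk; write() returns the number of bytes
# written, which is the output the video points at.
tflite_model_name = 'tf_lite_model.tflite'
with open(tflite_model_name, 'wb') as f:
    print(f.write(tflite_model))

print(round(os.path.getsize(tflite_model_name) / 1024, 3), 'KB')
```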
This tells me the total number of bytes that have been written to this particular file. Now let me go forward and show you the exact size of this TF Lite model in kilobytes. So let me run this. The overall file size is close to 400 kilobytes. So we started off with a model which was occupying around 1.2 MB, and after just running a couple of lines of code, we have brought the file size down to around 400 KB. Now, let me go forward and save this file size into a variable called tflite_file_size. This is something that will make a lot of sense once we go forward. So let me quickly run this cell. We've now converted a model from Keras to TF Lite. But one thing that we've not validated yet is how good the model is in terms of performance. Is it actually good on unseen data? Or has it dropped in terms of the accuracy score? That is what I want to check next: after compressing the model using TF Lite, are we losing out on accuracy or not? In this section, I'll go over how you can validate the results in terms of how well your TF Lite model is performing. So now, let me quickly unhide the cell. Now, don't get scared by looking at this piece of
code; I'll help you understand what I'm trying to achieve. Now, loading a TensorFlow or a Keras model into a TensorFlow session is fairly easy. But here, what we have done is create a TF Lite model. If you go back to the discussion that we had, TF Lite models are essentially FlatBuffer-format files, and not your usual protocol buffer files. So in order for us to actually make an inference out of TF Lite files in our TensorFlow or Python session, we require something called an interpreter. It is here that we will basically be making use of TensorFlow's interpreter to load the TF Lite file, and then make inferences, or predictions. So let me now take you through each and every line of code. In the first line, I create an instance of the Interpreter class, and I pass in the TF Lite model name that we've just created. So if you look at this particular section, you will also have a .tflite file; this is what I'm passing in here. Now, once we've created the interpreter object, the interpreter object holds details about the model: it will have details about the input that it expects, that is, the shape and the type of values, and in terms of the output, it will tell you what the output shape should be, as well as the type of values it will predict. All of that is what this particular interpreter will have details about. The details it is fetching come from the TF Lite file that we passed in, which is what is read by this interpreter object, and that is what we are trying to accumulate in input_details and output_details. Once we have the input and output details, I am also interested in the shape of the input that is expected, as well as the type of the inputs. So let me quickly run this cell to make more sense of it. If you look closely, the input shape is (1, 28, 28).
The input type that it expects is numpy.float32; the output shape is (1, 10), that is, one row and 10 columns; and the output type is again numpy.float32. So this is essentially what the TF Lite file contains. Now, if you look at this particular 1, it denotes that TF Lite is expecting one input at a time. But I want to check how well it performs for 10,000 inputs. That is where I'll have to resize the input shape to a particular value, which is what I'll be achieving in this piece of code. Just to reiterate, the input shape is (1, 28, 28). So ideally, I would have to pass in just one image sample, and I would get a corresponding output for it. But in my use case, I want to validate how well the TF Lite model performs on the testing dataset that I have, which contains 10,000 images. So if this idea is clear to you, let's go forward. So now I want to validate how well my TF Lite
model is performing on my testing dataset. So I call the resize_tensor_input function: I pass in the index value that I want to resize, and I pass in how I have to resize it. Currently, I have 10,000 samples, so that is what I have entered here, that is, (10000, 28, 28). A similar resize operation is what I'm doing on the output side, so you can see here (10000, 10), from the initial (1, 10).
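Putting the interpreter steps together, here is a self-contained sketch. It converts a small untrained stand-in model and feeds it random float32 data; in the notebook you would load the saved .tflite file and pass test_images cast to float32 instead. One deliberate simplification: I only resize the input here and let allocate_tensors propagate the new batch size to the output.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Stand-in model plus conversion; in the notebook, the trained classifier.
model = keras.Sequential([
    keras.Input(shape=(28, 28)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Load the FlatBuffer into the TF Lite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details[0]['shape'], input_details[0]['dtype'])    # batch of 1, 28 x 28
print(output_details[0]['shape'], output_details[0]['dtype'])  # 1 row, 10 columns

# Resize the input from a batch of 1 to the full test batch, then re-allocate.
batch = 10000
interpreter.resize_tensor_input(input_details[0]['index'], (batch, 28, 28))
interpreter.allocate_tensors()

# The interpreter expects float32, so cast the batch first.
test_images_numpy = np.random.rand(batch, 28, 28).astype(np.float32)  # stand-in data
interpreter.set_tensor(input_details[0]['index'], test_images_numpy)
interpreter.invoke()
tf_lite_model_predictions = interpreter.get_tensor(output_details[0]['index'])
print(tf_lite_model_predictions.shape)   # (10000, 10)

# Pick the highest-scoring class per row; in the notebook, compare these
# against test_labels with sklearn's accuracy_score.
predicted_labels = np.argmax(tf_lite_model_predictions, axis=1)
```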
So that is what I've done here. Now, once the resize operation has happened, I call allocate_tensors to actually change the entire structure of the interpreter, which it had read from the TF Lite file. And now, when I print the input details and output details, I should be able to see that the TF Lite input and output shapes have changed. So let me quickly run this piece of code. As you can clearly see, the input shape has changed from (1, 28, 28) to (10000, 28, 28). So this essentially will help me validate how well my TF Lite model is performing. The other thing that I want to highlight right now is that test_images.dtype is float64, whereas the input type that the model expects is numpy.float32. So now the only other change that I have to make in order to validate my TF Lite model is to create a new array called test_images_numpy, pass in the original array, and change the dtype. I could do it in the same array as well, but I'm choosing to create two different arrays. So let me quickly run the cell. Now, if I show you the dtype of test_images_numpy, it will be numpy.float32. Now that we have the entire interpreter object set
up correctly for our set of inputs, that is the testing data set, all I have to do right now is first call the set_tensor function, passing in the test_images_numpy array that I just created, and then call the invoke function. What the invoke function would essentially do is take the inputs and compute the output. And once the output is ready, you call the get_tensor function, which will fetch the output for you, and I save it into a variable called tf_lite_model_predictions. So let me quickly run the cell. Now the output that you see here, which is
the prediction results shape, is 10,000 rows and 10 columns. So every column would essentially contain a probability score. So what I have to do next is pick out the index between zero and nine that has the maximum probability, which is what I've done using the function np.argmax. This will give me the class labels directly, that is zero to nine, rather than 10 different columns with probability scores. Now let me calculate the accuracy score, and let me print it out for you. So the testing accuracy of the TF Lite model is
exactly the same as what you see when you compare it with your normal Keras model. Now, how much space have you saved in this entire process? So let me calculate a ratio between the TF Lite file size and the Keras model file size. So overall, the TF Lite model occupies close to 32% of the file size that my normal Keras model occupies. But the uniqueness is that I'm not losing out on any accuracy. So this is the power of TF Lite. I've been able to compress my entire model from 1.2 MB to around 400 KB, and I haven't compromised a bit on the accuracy piece. Well, isn't this amazing? Well, if you think the story has ended here,
hang on for a second, there is more to go. So far, what we've done is I've taken a TensorFlow model, and without any optimization, I've converted that into an equivalent TF Lite model. Now I'll show you how you can compress your model even further, without losing out on accuracy as such. So now let me introduce you to a new concept called quantization. So what exactly is this term that I've
just mentioned, that is quantization? For a given weight value that is represented in float32 or float64 format, wouldn't it be great if we could bring down the size of those particular values and see very little change in accuracy? Well, this essentially is the concept of quantization: I'm reducing the total number of bits for every weight value, so that the overall size of the entire array reduces. Just to be more clear, if I have a neural network something like this, where this particular weight value is 5.31345, this particular weight value is 3.8958, and you have the other weight values as well, what if I could change these representations that occupy so many bits to something like this? There will be a small hit in the accuracy.
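To make the bit counting concrete, here is a tiny NumPy sketch; the two weight values are the ones from the diagram, and the arrays are purely illustrative:

```python
import numpy as np

# Two illustrative weight values stored at full (double) precision.
weights = np.array([5.31345, 3.8958], dtype=np.float64)

# Quantize the representation down to half precision.
weights_f16 = weights.astype(np.float16)

# 8 bytes per value drops to 2 bytes per value: a 4x reduction in storage,
# at the cost of a small rounding error in each weight.
print(weights.nbytes, weights_f16.nbytes)  # 16 4
```

The rounding error per weight is tiny (for values near 5, float16 is accurate to roughly three decimal places), which is why the accuracy hit is usually small.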
But overall, I'll be able to compress my model even further. How? When? Whatever questions you have in mind, just wait for some time. If this entire idea is clear to you, let us go back to the coding section, and I'll show you how you can compress your TF Lite model even
further. So by default, in the previous example, wherein we took a Keras model and converted it to a TF Lite model, every weight value is essentially in float32 format. Wouldn't it be great if I compressed it from float32 to float16? This is the activity that I'll be performing next. So I create a variable with the name tf_lite_model_float_16_file_name, and I basically give it a .tflite file name which essentially represents that the weights inside it would be float16. So that is what I have here. So let me quickly run the cell. If you look at the previous section in
terms of how we created a TF Lite model, the first line of code is something that is pretty much familiar to you: you pass in your Keras model, you call tf.lite.TFLiteConverter.from_keras_model, and you save it into a variable. Even the last piece of code is also something that you've already looked at. What you haven't seen so far is the optimization. So when you create an instance of the TF Lite converter, there is a flag called optimizations. So you have the optimizations flag here. I set it to tf.lite.Optimize.DEFAULT, so I want the default optimizations to take place. There is one other thing that I do here; I'll speak more about the optimizations in the next section, so hold on to that thought as well. Now here there is one more flag, target_spec.supported_types. Here is where I set every weight value from float32 to float16. So that is what I'm doing here.
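What this looks like in code can be sketched as follows — with a tiny stand-in model, since the tutorial's trained Keras model isn't reproduced here:

```python
import tensorflow as tf

# Stand-in for the trained Keras model from earlier in the tutorial.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Post-training float16 quantization: enable the default optimizations
# and restrict the supported weight types to float16.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tf_lite_model_float_16 = converter.convert()
```

Writing the resulting bytes to a .tflite file then follows the same pattern as the unoptimized model.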
So let me quickly run this piece of code. So now I have a TF Lite model wherein every weight value would be in float16 format. I follow the same process again, wherein I'm fetching the data from the temporary file and saving it into a TF Lite model file. So let me run this. I don't know if you've guessed it
already or not: this essentially is the file size in bytes for the newly converted TF Lite model. So if I now show you the size of this newly converted TF Lite model, then my size has drastically reduced from 400 kilobytes to 200 kilobytes. The only thing that I changed here was the individual representation of every weight value; that's about it. Isn't this amazing? I'm able to save so much memory just by changing how a few values are represented. So again, I save the file size into a variable called tf_lite_float_16_file_size. So let me run this. Now if I compare it to the original Keras
model, this particular model occupies 16% of the size that the original model occupied. And if I compare it to the previously created version, then I can see almost 50% compression that I'm able to achieve by changing the weights from float32 to float16. So this is the power of optimizations in TF Lite. If you think this is it, wait for the next section, wherein I compress the model even further. I'm not showing you the accuracy piece right now. You might be wondering, why isn't he showing us the accuracy? Has accuracy taken a toss? Well, the answer is no. I'll show you the accuracy of an even more compressed model. So that will give you a fair sense of how much the compression is changing the overall accuracy values as well. Okay. So now we have reached the final section, wherein
I will compress the model even further. Okay, so here I've created a variable for the file name of the size-optimized TF Lite model. And here I want to save the TF Lite file with this particular name. So I'll quickly run the cell. In the previous example, I
changed every weight value from float32 to float16. Rather than deciding myself what is good for the model, I would rather let TF Lite decide that for me. So if you've been following along so far, then this piece of code is something that we've already covered. This piece of code is also something that we've already covered. This is something that is unique. So here, I set the optimizations flag. And here
I just mention OPTIMIZE_FOR_SIZE. There are different values that you can go through in the documentation, so based on your needs, you can optimize for size or use the other optimizations that are available in TF Lite. So I'm not specifying what data type I want; I just want the most optimized version, wherein the size occupied by this particular TF Lite model is the most compressed version.
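A sketch of this conversion, again with a stand-in model in place of the tutorial's trained one. Note that in recent TensorFlow releases the documentation marks OPTIMIZE_FOR_SIZE as a deprecated alias of Optimize.DEFAULT, so on current versions the two flags behave the same:

```python
import tensorflow as tf

# Stand-in for the trained Keras model from earlier in the tutorial.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Let TF Lite pick the representation: no target_spec dtype is set,
# only the size-oriented optimization flag.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tf_lite_size_optimized_model = converter.convert()
```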
Okay, so I'll quickly run the cell. So I fetch the file that was saved into a temporary location and save it into this particular variable that I created. And as you've probably guessed by now, this is the new file size in bytes. If I convert it to kilobytes, then my file occupies around 100 KB. So if you remember, we started from 1.2 MB, and we have brought the file size of a deep neural network down to around 100 kilobytes. Isn't this amazing? Just to give you some numbers again, if I compare this particular file size with my original file size, then my current file is almost 8% the size of my original file, that is, the Keras model that I created. If I compare it to the previous model as well, I'm basically able to achieve 50% compression, all because of optimizing for size. This essentially is the power of TF Lite. Now I'm really happy with the compression; I
have a 1.2 MB file that I've compressed down to around 100 KB. But is the accuracy still the same? Well, we'll again follow the same process, wherein I load the newly quantized model into the interpreter object, get the details of those objects, and again reshape the values and pass the entire testing data set through the interpreter object. So I'll quickly run the cell. So as you can clearly see, (1, 28, 28) is the input shape that the interpreter object expects, and I have the output as (1, 10). The input and output dtypes that are expected are np.float32. So that is all good as well. Coming to the final section, I follow the
same process. Again, no change in the process. I have a testing data set of 10,000 images, which is what I pass in here. I allocate the tensors, I get the details, and I'll show the details to you as well. So let me quickly run this: (10000, 28, 28), from (1, 28, 28). So we've resized the tensor input values so that we can validate against our testing data set. You essentially don't need this step again, but I've kind of copied it from the initial part, so I'm running it again. I pass in the values again. Now I calculate the accuracy score. Now is the litmus test for this highly compressed
TF Lite model. So let me quickly run the cell. So the accuracy of a TF Lite model that occupies almost 8% of the size of the original Keras model is equivalent to that of the original Keras model. If I go up — I'm recording this video in one go, so if I go up, where did this value go? — here the value was at 87.66%, and here it is at 87.59%. So this is what you can achieve using TF Lite. I started off with a very simple neural network; the entire model occupied around 1.2 MB. You
might argue that 1.2 MB is kind of small, but the problem statement was fairly simple. If you have a really complex example, wherein you have to classify images into 1,000 or 10,000 categories, the model size would eventually increase. So the objective then becomes: can we compress the model size? And the answer is yes, TF Lite will help you compress the model size without having to compromise a bit on the accuracy front. If you've reached this point, then I'm
assuming you've seen the entire video, or maybe you randomly landed at this point; whichever way you've reached here, I hope you enjoyed today's video. I keep creating such videos on data science, machine learning, and Python, so feel free to check out my channel, linked in the description section of the video, as well. Thank you so much for watching this video.