TinyML: Getting Started with TensorFlow Lite for Microcontrollers | Digi-Key Electronics

Video Statistics and Information

Captions
TensorFlow is a free and open-source library that makes implementing various machine learning algorithms much easier. There are lots of tutorials out there on how to get started with it, but I wanted to show a niche application for it: running it on a microcontroller. TensorFlow Lite is a subset of the TensorFlow repository intended to run on embedded devices, and there's a small piece of that subset that's meant for microcontrollers.

Since microcontrollers are usually very limited in resources, you generally don't want to train a machine learning model on them. So a basic workflow is to train a model using full TensorFlow on your computer or some other server. You'll then convert that model into a TensorFlow Lite FlatBuffer file. This .tflite file works with regular TensorFlow Lite, but you'll need to convert it to an array in your language of choice for microcontrollers. For us, we'll create a C array and then transfer that over to our microcontroller so it gets loaded into memory. With the TensorFlow Lite for Microcontrollers library, we can now perform inference using our model on our microcontroller. Inference is the process of taking new data that has not been seen by the model before and feeding it to the model; the model then infers some information about that data. For example, if we feed it a photo, a model trained to look for cats would tell us whether it thinks there's a cat in the photo or not.

I'm not going to get into training a neural network in this episode, as that takes a lot of time and there are plenty of other great videos out there that show how to do that. For this episode, I'm going to start with a pre-trained model that I created in a previous episode. Please check out the Intro to TinyML Episode 1 video, which I'll make sure is linked in the description. In it, I show you how to use TensorFlow and Keras in Google Colab to create a three-layer neural network that predicts the output of the sine function. It's a terrible way to create a sine wave on a microcontroller, but it's a great demonstration of how to train a very simple neural network for our purposes. All credit goes to Pete Warden and the TensorFlow team for putting this example together.

Once we've trained the model, we want to test it by feeding it a few numbers. I'll specifically be looking at an input of 2. If the model runs correctly on our microcontroller, we should see an output of 0.9059, which should be close to sin(2), which is about 0.9093. From there, we save the Keras model as a .h5 file, although it won't be very useful for us; we need to convert it to a TensorFlow Lite file first by calling the TFLiteConverter.from_keras_model() function. Note that we optimized the model for size, but we're still doing everything with floating-point values. There is a way to quantize the model's inputs, outputs, weights, and bias terms to 8 bits to save even more space, but I'm not going to get into that here. Finally, I wrote a quick function to convert the .tflite file into a C byte array, which I'll save in a .h file.

If you look at the file explorer pane in Colab, you should see the model files that we created; I recommend downloading them all. You can use a program like Netron to examine the Keras and TFLite model files. This can help you understand the input and output formats as well as how the layers are connected. Our generated .h file should have the TFLite model converted into a C array, which I'll give the global variable name sine_model. We just need to include this header file in our microcontroller code to import the model.
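For reference, here is a minimal sketch of what that generated header might look like. The names sine_model and sine_model_len follow the naming used in this example but are assumptions, and the byte values shown are placeholders rather than real model contents:

```cpp
// sine_model.h (sketch only, not the actual generated file)
// The real array holds the complete .tflite FlatBuffer, typically a few kilobytes.
#ifndef SINE_MODEL_H
#define SINE_MODEL_H

const unsigned char sine_model[] = {
    0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33,  // FlatBuffer header; "TFL3" identifier
    /* ... remaining model bytes ... */
};
const unsigned int sine_model_len = sizeof(sine_model);

#endif  // SINE_MODEL_H
```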
Now that we have our model, we need to prepare TensorFlow Lite as a library. What I found is that TensorFlow really doesn't want you to use it like a library: they want you to modify automatically generated source files and then compile everything within their file structure. While this might work in some instances, I honestly find it very hard to use when you already have an embedded build system set up. For this example, I'm going to show you how to import TensorFlow Lite into STM32CubeIDE, and that's only because I'm most familiar with STM32 stuff right now. I'm using an STM32L432KC Nucleo board for this, as it's small but runs an Arm Cortex-M4 processor. The steps I'm about to show you should work for any build system, including Make: you'll need to run a Makefile inside of TensorFlow, which will generate a template file structure, then copy those files over to your project and include the necessary header and source files in your build system.

To generate the TensorFlow Lite file structure, you need to use Linux or macOS; I have not gotten this part to work on Windows yet. If you're on Windows, feel free to dual-boot or use a live USB distro of Linux. I'm just going to SSH into my Raspberry Pi, since I always seem to have one lying around. You'll need to have make, Python 3, pip3, git, and zip installed; use whatever package manager you want to install them if you don't. From there, use git to clone the entire TensorFlow repository. I'll use a depth of 1, which means I only grab the last revision, saving some download time. Next, I'll go into the base tensorflow directory and run the make command. I'll point it to the Makefile located in the tensorflow/lite/micro/tools/make directory and give it the portable_optimized tag. I don't know if it's necessary for this demo, but this tag seems to generate a different implementation of a couple of source files optimized for microcontrollers, so I'll leave it in. We give make the generate_non_kernel_projects target, which leaves out a few template projects that we don't need. In this instance, make doesn't build or compile anything; it simply generates a bunch of template project directories for us to use, which can take a few minutes.

While make is running, let's create our microcontroller project. I'll be using STM32CubeIDE, but as I mentioned, you can use almost anything. Just note that the steps for including source files and linking object files might be different for your development environment, but the process is essentially the same. I'm going to start a new project for my Nucleo-L432KC board. Note that we must use C++ as the target language here, which means choosing to use a C++ compiler, as most of TensorFlow is written in C++. I'm going to enable one of my timers to continually tick every microsecond, which we can use to measure how fast inference is running for our model. Since I want this to run as quickly as possible on my microcontroller, I'm going to use the high-speed internal oscillator and kick the main clock up to its maximum of 80 MHz. Let's save to have CubeMX generate code. In this IDE, I need to rename main.c to main.cpp so that the build system treats it as a C++ file.

At this point, the make process on Linux should be done. To start, I'm going to copy the sine model header file into my project directory. Then I'll navigate into the filesystem of my Raspberry Pi, which I've mapped to my Windows machine using SSHFS.
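As a rough illustration of the microsecond timer, here is a small sketch using the STM32 HAL. The choice of TIM16 and the helper names are assumptions, not part of the original project; the prescaler is assumed to be set in CubeMX so the counter ticks at 1 MHz:

```cpp
#include "stm32l4xx_hal.h"

extern TIM_HandleTypeDef htim16;   // timer handle generated by CubeMX (assumed name)

// Start the free-running counter once at boot. Assumes the prescaler was set in
// CubeMX to (80 - 1) so the 80 MHz clock is divided down to a 1 MHz count rate.
// A 16-bit counter then wraps roughly every 65 ms, which is plenty for timing
// sub-millisecond inference.
void StartMicrosTimer(void) {
  HAL_TIM_Base_Start(&htim16);
}

// Read the current counter value; with a 1 MHz tick this is a microsecond timestamp.
uint32_t Micros(void) {
  return __HAL_TIM_GET_COUNTER(&htim16);
}
```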
Navigate into the tensorflow repository, then into tensorflow/lite/micro/tools/make/gen, and then into a folder whose name starts with "linux" (this folder's name will change depending on the specific computer and operating system on which you called make). In prj, you should see a whole set of project templates. I recommend ignoring the test ones for now, but you can look at any of these to get an idea of how you should structure TensorFlow projects. Note that TensorFlow wants you to change the main program in one of these templates to get your project to work, but we're not going to do that. Go into hello_world and you should see a number of build systems that the hello_world template was generated for. Once again, we're going to ignore this, as I want to use TensorFlow Lite as a library in my build system of choice. Make should be neutral enough, so let's go into that directory and copy the tensorflow and third_party directories; these will act as our TensorFlow Lite library. Navigate back to your microcontroller project directory, create a folder named tensorflow_lite, and paste the tensorflow and third_party directories in there. The third_party folder contains some open-source tools needed by TensorFlow that should have been automatically downloaded when you ran make.

In the tensorflow folder, go into lite/micro/examples and then hello_world. This folder holds the main application that we're supposed to modify; feel free to look in the main and main_functions files to see how they want you to make a TensorFlow Lite for Microcontrollers application. While you could use this template and make it work with your specific microcontroller, we're not going to do that, so delete the examples folder and everything in it.

Back in our IDE, we want to refresh the Project Explorer so that it sees our newly added files. This includes the tensorflow_lite folder we added as well as the model file. Now we need to tell the build system to include these header and source files. If you're working in Eclipse or one of its derivatives, go into Project Properties; under C/C++ General, go to Paths and Symbols and open the Includes tab. Click Add and select the tensorflow_lite folder from your workspace; you'll want to add it to all languages and configurations. Repeat the process for the third_party flatbuffers/include, gemmlowp, and ruy folders. Because of how the includes are structured in the TensorFlow source code, you must add these third_party directories just like this. Check to make sure that the included directories show up for the C and C++ languages as well as in the Release configuration. Go to the Source Location tab and add the tensorflow_lite directory to both Debug and Release configurations. Click Apply, and click Yes if asked whether you want to rebuild the index. Note that if you're working in Make, you'll want to ensure that your header and source search paths include these directories.

There's one file that we need to edit in the TensorFlow library that will allow us to print out debug information. Go into tensorflow_lite/tensorflow/lite/micro and open debug_log.cc. Comment out the C standard input/output include line as well as the DebugLog function. On a separate line, rewrite the DebugLog function, but include the weak symbol attribute and leave the function body blank. By using the weak symbol, we can override this function in our main program. We need to keep extern "C", as this is a C function which will be called inside of a C++ program.
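After those edits, debug_log.cc might look roughly like this. The commented-out lines reflect the stock reference implementation, so the exact original contents may differ slightly between library snapshots:

```cpp
// tensorflow_lite/tensorflow/lite/micro/debug_log.cc (sketch of the edit)
#include "tensorflow/lite/micro/debug_log.h"

// #include <cstdio>                                                    // commented out
// extern "C" void DebugLog(const char* s) { fprintf(stderr, "%s", s); } // original body

// Weak, empty implementation: main.cpp can define its own DebugLog() that overrides
// this one at link time and sends the string wherever it likes (e.g. a UART).
extern "C" __attribute__((weak)) void DebugLog(const char* s) {}
```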
Back in main, implement the DebugLog function however you see fit. I'm going to simply print out the string argument over my UART connection, which is connected to a USB-to-serial converter on my Nucleo board. Since the function does not provide me with an array length, I'm going to use the strlen function to figure that out and pray that the string is null-terminated; we'll need to include the C string library for that to work.

I'm going to give only a broad overview of this code for the sake of time, but I'll add a link to it in the description so you can dig into it if you wish. First, we need to add our includes; these point to just the TensorFlow Lite functions that we'll need in our program. Notice that I also include the model file that we created at the beginning of this video. Next, we want to define some global variables. We'll put them in an anonymous namespace so they're only accessible in this file. These are pointers to our error reporter, model, and input and output buffers. The important thing to note here is the tensor arena size: you'll have to play a bit of a guessing game here. I'll start with two kilobytes of space for the arena, which is just a chunk of memory that TensorFlow Lite needs to perform its calculations. If your AllocateTensors call fails later, you'll likely need to increase this size. In main, I'll add a few variables that our program needs, such as a buffer for printing out strings over UART, return status codes, and a timestamp in microseconds.

Just before the super loop, we'll add some initialization code. First, we start our timer so that we can measure how long inference takes. Next, we create an error reporter, which is needed by the inference engine to help us troubleshoot problems; I'll write a message out over the error reporter to make sure it shows up in our serial terminal. Next, we read in our model array. Note that the variable name here should match the name of the array in the sine model header file. We also check to make sure that the TensorFlow Lite operations schema version in the model matches the version of the TensorFlow Lite schema we're using here. Then we want to create an operation resolver, where we only include the specific operations required by our model; if you want to see the TensorFlow Lite operations that are supported, check out the micro_ops.h file. Since our model only uses fully connected (dense) layers, we need to register the fully connected operator with the op resolver. We create an interpreter object and pass in pointers to our model, op resolver, arena buffer, and error reporter. We then call the AllocateTensors function, which configures the arena buffer we created; if this fails in your code, it likely means that you didn't create a large enough arena, so you'll need to go back and adjust the tensor arena size variable. Finally, we assign a couple of pointers to our input and output buffers to make accessing them easier.

At this point, we're ready to start making inferences, but I like to check the dimensions of the input tensor first, so we'll print out the number of elements in the input tensor buffer just to be sure; in this case it should be equal to one element, a floating-point number. In the while loop, I'll put the floating-point number 2.0 into the input tensor buffer. Note that I'm using a for loop here to demonstrate how you might fill up the input buffer with more than one element, but for this particular model there should only be one element. We'll grab the current microseconds, and then we call the Invoke function to run inference. This is a blocking function, so your processor might hang for a while at this point.
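Putting those pieces together, here is a condensed sketch of the setup and inference loop described here, including the output printing and 500 ms delay covered next. The HAL handle names (htim16, huart2), the sine_model array name, and the CubeMX init details are assumptions from this example, and the resolver API shown matches the mid-2020 hello_world style (newer library snapshots use a templated MicroMutableOpResolver<N> with AddFullyConnected()):

```cpp
// main.cpp (condensed sketch, not the author's exact code)
#include <cstdio>                                          // snprintf
#include <cstring>                                         // strlen
#include "main.h"                                          // CubeMX-generated HAL declarations
#include "sine_model.h"                                    // model converted to a C array
#include "tensorflow/lite/micro/kernels/micro_ops.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/version.h"

extern TIM_HandleTypeDef  htim16;   // microsecond timer handle (assumed name)
extern UART_HandleTypeDef huart2;   // UART to the USB-to-serial converter (assumed name)

// Override the weak DebugLog() from debug_log.cc: send error-reporter text over UART.
extern "C" void DebugLog(const char* s) {
  HAL_UART_Transmit(&huart2, (uint8_t*)s, (uint16_t)strlen(s), 100);
}

namespace {  // anonymous namespace: these globals are only visible in this file
  tflite::ErrorReporter*    error_reporter = nullptr;
  const tflite::Model*      model          = nullptr;
  tflite::MicroInterpreter* interpreter    = nullptr;
  TfLiteTensor*             model_input    = nullptr;
  TfLiteTensor*             model_output   = nullptr;

  // Working memory for TFLite Micro; increase this if AllocateTensors() fails.
  constexpr int kTensorArenaSize = 2 * 1024;
  uint8_t tensor_arena[kTensorArenaSize];
}

int main(void) {
  // CubeMX-generated init (HAL_Init, clock config, MX_USART2_UART_Init,
  // MX_TIM16_Init, ...) is assumed to run here; omitted for brevity.
  HAL_TIM_Base_Start(&htim16);                             // start the microsecond timer

  static tflite::MicroErrorReporter micro_error_reporter;
  error_reporter = &micro_error_reporter;

  model = tflite::GetModel(sine_model);                    // name must match the header
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    error_reporter->Report("Model schema version mismatch");
  }

  // Register only the op this model needs: fully connected (dense) layers.
  static tflite::MicroMutableOpResolver op_resolver;
  op_resolver.AddBuiltin(tflite::BuiltinOperator_FULLY_CONNECTED,
                         tflite::ops::micro::Register_FULLY_CONNECTED());

  static tflite::MicroInterpreter static_interpreter(
      model, op_resolver, tensor_arena, kTensorArenaSize, error_reporter);
  interpreter = &static_interpreter;

  if (interpreter->AllocateTensors() != kTfLiteOk) {
    error_reporter->Report("AllocateTensors() failed");    // arena is probably too small
  }

  model_input  = interpreter->input(0);                    // one float in
  model_output = interpreter->output(0);                   // one float out

  char buf[64];
  while (1) {
    model_input->data.f[0] = 2.0f;                         // x = 2.0
    uint32_t start = __HAL_TIM_GET_COUNTER(&htim16);
    if (interpreter->Invoke() != kTfLiteOk) {              // blocking call: run inference
      error_reporter->Report("Invoke() failed");
    }
    uint32_t elapsed = __HAL_TIM_GET_COUNTER(&htim16) - start;
    float y = model_output->data.f[0];                     // should be close to sin(2)

    // Printing %f may require the -u _printf_float linker flag (see below).
    snprintf(buf, sizeof(buf), "Output: %f | Time: %lu us\r\n", y, (unsigned long)elapsed);
    HAL_UART_Transmit(&huart2, (uint8_t*)buf, (uint16_t)strlen(buf), 100);
    HAL_Delay(500);                                        // wait before the next inference
  }
}
```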
When it's done, I'm going to get the only element in the output tensor buffer; once again, with other neural networks this output buffer might have more than one element. Then I'll construct a string showing the output of the neural network and how much time has elapsed since our original timestamp, and I'll print this string out over the UART port. In some build systems, like mine here, you might need to add a special linker flag to print floating-point values with %f, so I'll go into Project Properties, C/C++ Build, Settings, Tool Settings tab, MCU G++ Linker, Miscellaneous, and under Other flags I'll add the -u _printf_float flag for both Debug and Release configurations. Finally, I'll add a delay of 500 milliseconds before repeating inference again.

When you build the project, you might get some warnings; I recommend reading them to see if there's something you need to fix. In my case, this warning looks to be something inside the TensorFlow Lite library that I probably won't use, so I'll leave it for now. With everything built, we can upload the program. In STM32CubeIDE, I do that by clicking Run and Debug. I'll accept the default debug configuration settings and open the Debug perspective. I run my program and connect a serial terminal to my Nucleo board. If everything works, I should get some data showing that the output of the neural network is 0.9059, which matches the original test from the beginning of this episode. Also, we can see that it takes about 368 microseconds to run inference with this simple three-layer neural network on our 80 MHz microcontroller.

This is great; everything seems to be working. There's just one last thing I want to try, as 368 microseconds seems to be a little bit too long. Let's stop our running program. I'm going to go into Project and set our active build configuration to Release, then rebuild the project. While that's going on, if you look in the tool options for the Debug configuration, you can see that there's a debug flag being set which is not present in the Release configuration. My understanding is that some of the code in the TensorFlow Lite files changes depending on this flag, which can speed things up if we remove it; it does mean that we'll be giving up some helpful debugging information, though. Let's create a new Run configuration and set the newly built Release .elf file as our application; this is what will get uploaded to the microcontroller. We'll set the build configuration to Release in case we need to build the project again. When we click Run, it looks like any new code will be built. This is an interesting point to stop and look at the output of the GNU size tool: from my understanding, you add the text and data fields together to get the flash usage, which is about 50,000 bytes, and you add data and bss together to get the predicted RAM usage, which is about 4,700 bytes in this case. When it's done, the program will be uploaded to the STM32, but it won't load the debugging perspective. Bring up the serial terminal, and you should see that the program is already running. Without that debug flag, it looks like the neural network runs over three times faster; it's now taking only about 104 microseconds to run inference.

This seems to be pretty quick to run a neural network, but the real question is: can we do anything useful with such a small neural network? That will have to be a discussion for another time. With that, please subscribe if you'd like to see more videos like this one, and happy hacking.
Info
Channel: Digi-Key
Views: 16,220
Keywords: DigiKey, machine learning, ai, edge ai, neural network, TinyML, IoT, STM32, STM32CubeIDE, X-CUBE-AI, TensorFlow
Id: gDFWCxrJruQ
Length: 18min 35sec (1115 seconds)
Published: Mon Jul 06 2020