TinyML: Using Machine Learning on Microcontrollers to Recognize Speech (Shawn Hymel)

Captions
i'm admitting everyone uh joining in so i'll just give it a few more moments hello all right well uh welcome everyone to remoticon 2020 my name is bruce dominguez i'm one of your room moderators here and uh depending on your time zone uh good morning good afternoon and good evening today uh sean himel will be hosting a discussion on tiny machine learning and uh if you have any questions or comments please post them in the chat to the room and uh you know it's a community so definitely be involved uh respond to each other and with that uh sean is an electrical and embedded engineer freelance content creator and he and harris kenny host a podcast the hello blink show where they discuss various aspects various aspects of starting a business from sales to hiring and sean has created his own company scourrisa that helps companies create compelling technical content in electronics and embedded systems and sean is an advocate for enriching education uh through stem and he believes that the best marketing comes from teaching uh he can be found giving talks uh running workshops and swing dancing in his free time and with that i'd like to introduce you to sean himal awesome thank you bruce and welcome everyone i know that we still have people kind of trickling into their waiting room here um as bruce mentioned uh i have been doing a bunch of machine learning recently mostly in the embedded world i used to work for spark phone electronics um if anybody ever watched that channel i wear my bow tie and i'm doing it today so nice recognition there i got to keep the brand going and one of the things i've been doing for the last year or so is learning machine learning um i'm going to take off my glasses you'll have to forgive me as my as my room gets a little toasty here my glasses are going to fog up so normally i do actually wear glasses they're not fake but um since i'm nearsighted i'm going to take them off so i can look at the computer so we're going to be doing machine learning and machine learning specifically for embedded systems and i've asked people to pick up some hardware if you want to follow along if not i believe these are going to be recorded so that you can come back to them later and um i will continue to produce content out there on the internet about how to you know say get some type of machine learning thing running on an arduino or running on an stm32 um i've asked people to grab an stm32 l476rg nuclio board um mostly because that is what i'm familiar with uh i have recently learned how to get this running on an arduino board that's a little easier but when i made this when i made this workshop i'm going with what i know so let's jump in because i know that um there's a lot to cover here um let's click share screen share hopefully everybody can see that and it looks like bruce just dropped in the hackaday project link so if anybody wants to take a look at that and i'm going to go ahead and share this all right hopefully everyone can see this if you have not done so already please solder the headers to your microphone breakout board do that while we're talking please install stm32 cube ide just google search that term download it install it um i think you're gonna have to sign up for a account on stms or st's site so heads up on that except all the defaults it's a pretty large download so please do that if you have not done so already you will also need some kind of serial terminal program like i'm on windows so putty it is um forget what the big one on mac is but you've got like 
serial or um or or a couple of others on um on linux uh the other thing is you will need to create a gmail account if you don't have it already and you will need to create an edge impulse account we will be using um not gmail necessarily but we're using google colab which will let us run a script that i wrote to help us curate the data and we're going to be sending that curated data up to edge impulse to help us do the training i chose edge impulse because it's a tool it's a graphical tool and honestly if i just gave you python scripts to run the training it would be doing the same thing anyway that you're just running through this script that i've provided to you the advantage of edge impulse is that they manage all of the packaging up in libraries for us so that it runs i'm gonna do my best to make this not a canned demo so that when you get this library i'll show you what you need to use in that library for your embedded system um so yeah the worksheet is on this github link and there we go the worksheet is on this github link um please go there we're gonna be working out of that please feel free to work ahead i really don't mind in fact i can't see a lot of you right now so um if you if you're like you know chomping at the bit go for it start working ahead um i will say if you run into problems um wait until i get to that point where i'm also working through it and then i'll try to catch up in the chat however please use the chat please ask us questions in chat um and if you see somebody else asking questions please try to help out if you've run across this problem i know on twitter a few people running into problems uploading code um if you use the demo program that we're going to go through in my github repo then it should work all the settings are there so that it uploads to the nucleo with no problem i have released these slides as creative commons so if you want to pick and choose some of them and talk about tiny machine learning in your own company or with your own community please do so i really don't mind all i ask is for a little bit of credit so what is embedded machine learning um something i have learned fairly recently so the class was called tinyml and what's the difference between tinyml and embedded machine learning well what i've learned is that tinyaml is developing as this community you can go to tinyml.org and sign up for their talks they do them like once a week or once every other week there's a forum where people are discussing running machine learning algorithms on an embedded system so things like single board computers and more specifically microcontrollers so tinyml is the community embedded machine learning is kind of considered the general broad concept of let's run machine learning on embedded systems so why do we care to do this right we've got these giant computers and servers out there that already do machine learning for us in fact they do it really well um you you know any number of amazon and google services um netflix recommending to you what movies you should watch that's machine learning so why do we care to run it on an embedded system well let's take an example here you are probably most familiar uh many of you probably have one of these devices sitting in your home right now and that is the echo the amazon echo or google home or whatever or even your smartphone for like hey siri that kind of thing most of the time when this thing is interacting with you or you're interacting with it it is streaming audio data to a server so we'll go with the 
echo here that's streaming audio data to an amazon server however it's not always listening it's not always streaming data it's listening in tiny little chunks up front for a specific keyword and for the echo initially by default that's alexa that is the keyword that listening for that keyword happens on the device itself it's a microcontroller that's doing embedded machine learning tiny ml when it hears that keyword then it opens up a stream to the amazon servers and just starts piping anything it hears after that where more complicated and more complicated machine learning algorithms take place that do things like natural language processing in order to determine yes i'm so sorry to everyone that just heard that word and all of their echo devices started beeping at them i apologize i will refrain from using that word as much as possible in this day and age it's almost like a curse word if you're streaming stuff so once you once it opens up that socket to um a server an amazon server out there that's where all the natural language processing comes from to determine what you're trying to ask it to do that's an example of embedded machine learning other kind of things you'll see are things like video object detection on a very very low level like like a person detection this is not trying to do like you know um self-driving cars we're trying to identify every object and roads and what what have you this is more like oh i just want to see if a person's in the frame and in fact i've got a security camera that can do exactly that and it's a microcontroller doing this um something else that i've seen is anomaly detection you take sensors like accelerometer accelerometers and you put them on like i don't know a um a servo or a motor or something that's going to give you some type of vibration data and you watch that over time and then you say okay is there an anomaly is this thing about to break that can be very very helpful for things like um yeah washing machine washing machine is another good one is this about to break um things like um like satellites right where things you can't get to you need to know beforehand so that you can take actions to prevent it prevent the break from happening in the future you can be a little predictive with this which is where machine learning comes in and that's why we want to run this on embedded systems because it allows us to do these kinds of things so uh if you didn't get the link already here is the github link to my keyword spotting that is going to be our worksheet start working through that um i will work through it with you um as we go through this but i will also stop and explain things um bruce linked it so please click on that um thank you for that i will i will continue to talk about concepts as people are working through this when we get to a stopping point that's a good place to for me to blab some more and answer some questions and there'll be like specific slides for that kind of thing uh once again i mentioned that this is i'm going to do my best to not make it a canned demo because that's annoying i want to give people the tools they can they can use to create their own say keyword spotting system and introduce them into the world of embedded machine learning data collection the first thing we need to do is data collection so we're going to head to this script i'm going to exit out of these slides here is the worksheet and we're going to scroll down this worksheet feel free to read this at some point there are two ways to do data collection 
for this script that this data this curation script that i made the first is to do it locally um you'd have to install a whole bunch of python and packages and stuff to make that happen so we're not going to do that i created a collab script for you so if you could just click on this opening collab button in fact let me make that a new tab so we've got this collab script if you've never run collab before it is basically a jupiter notebook that's all it is it's running it's jupiter notebook running on a google server somewhere this is we have access basically to a full linux box here um obviously they restrict you and what you can do but the idea is we can actually like upload files look at the file system and run python scripts and it's giving us a jupyter notebook interface to do it to run a cell so each of these oops each of these is considered a cell and they can run individually to do that we hold shift and press enter i'm sure there's a button somewhere to do this but this is how we've been doing it jupiter and jupiter notebook feel free to read through this however we are not going to be uploading our own keyword samples today this script does allow you to upload your own keyword samples something you can do is take your cell phone any recording device your computer and record yourself saying some keyword if you were at the bring a hack last night i showed one where i recorded myself saying trick or treat a whole bunch of times right that could be a keyword a keyword doesn't have to be one word it can be a phrase but in our case as long as it's less than a second that will act as a keyword for us um you need probably about 50 samples hundreds better thousands is even better than that and you want them to be a variety of pronunciations accents genders voices types anything you can think of the more variety you have the more robust model you're going to be able to create that it will recognize other people's voices in fact i've got a good story about this i went to my parents house to hand out candy to kids on tr on trick or treat night right halloween um because it's something fun to do and i like giving out candy to kids and i brought this pumpkin hack with me and because i only trained it on an adult male and an adult female voice it did not pick up the little kids saying it because they have very different voices so keep that in mind when you're creating this type of system for a demonstration purpose yeah your own voice is fine but remember if you're going to deploy this think about people might have different accents who are going to talk to this and that's going to be a problem if they can't interact with your device and in fact amazon used thousands and thousands and thousands of samples um to get that alexa keyword going all right so let's execute this next script this is all this is all text so not much is going to happen here's where the here's where the magic starts to happen let's shift enter on node.js uh-oh the notebook was not authored by google we're going to run anyway because it's part of my account and i'm going to tell co-lab to run it this is going to take just a moment we need to install the latest version of npm which is the javascript package managed or node package manager in order to run the edge impulse uploader tool uh it's a command line tool we'll install it and that's gonna allow us to just send our data to uh the edge impulse servers so the next thing we're gonna do here is we're gonna install these packages so we're gonna install sound file which lets 
us read and write WAV files, which the curation script needs, and we're also going to install the Edge Impulse command-line tool. Once again, that's what will let us send the curated data up to Edge Impulse. When I say "curate," I mean we're going to take a dataset from Google, pick the classes of words we want as keywords (one or two of those), mix in some background noise, create a new dataset that's a subset of the original with the background noise mixed in, and send all of that to Edge Impulse.

There was a question about Python versions: whatever is the default in Colab right now, which I believe is Python 3, is what you should be using; that's what's running in Colab. If you haven't already, and just for me because I forgot to check this, look at "Change runtime type." We're not doing any machine learning in this Colab notebook, so we don't particularly need a GPU runtime; we're just using this as a curation script, so the CPU runtime is fine.

Next are some settings that you can modify if you wish, but I recommend leaving them alone for this particular demonstration. Feel free to look through them. We're going to download the Google Speech Commands dataset, which is just a pre-made dataset for us. There is verbiage in here, if you scroll up, showing how you can add your own keywords to modify that dataset; you will want to use the Google Speech Commands dataset and then add your own on top of it, and I'll show you why in a minute. These settings also let you pick the number of output samples, your keyword volume, your background-noise volume, your sample time (we're going to be using one second), your sample rate of 16 kilohertz, and your bit depth: all the juicy good stuff we're going to need to train our machine learning model with. The reason we set them this way is the microphone: we've got the microphone attached to our Nucleo board, and that microphone samples at 16 kilohertz and actually gives us 24 bits, I believe, but we're going to truncate that to 16.
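(If you want a feel for what that curation step boils down to for a single clip, here is a rough Python sketch: load a keyword recording, pad or trim it to one second at 16 kHz, mix in a snippet of background noise at some volume, and write a 16-bit PCM WAV. The file paths and volume values are made up for illustration; the real notebook script loops over the whole dataset and handles many more details.)

```python
# Hypothetical sketch of what the curation script does to a single clip.
# Paths and volume settings are made up for illustration; the real script
# loops over the whole Google Speech Commands dataset.
import numpy as np
import librosa
import soundfile as sf

SAMPLE_RATE = 16000      # matches the mic on the Nucleo board
SAMPLE_TIME = 1.0        # seconds per sample
WORD_VOL = 1.0           # keyword volume (illustrative)
BG_VOL = 0.1             # background-noise volume (illustrative)

def curate_clip(keyword_path, noise_path, out_path):
    # Load and resample both clips to 16 kHz mono
    word, _ = librosa.load(keyword_path, sr=SAMPLE_RATE, mono=True)
    noise, _ = librosa.load(noise_path, sr=SAMPLE_RATE, mono=True)

    # Pad or truncate the keyword to exactly one second
    n = int(SAMPLE_RATE * SAMPLE_TIME)
    word = np.pad(word, (0, max(0, n - len(word))))[:n]

    # Grab a random one-second snippet of background noise
    start = np.random.randint(0, max(1, len(noise) - n))
    snippet = noise[start:start + n]

    # Mix at the configured volumes and keep the result in [-1, 1]
    mix = np.clip(WORD_VOL * word + BG_VOL * snippet, -1.0, 1.0)

    # Write out as 16-bit PCM, which is what we train and deploy with
    sf.write(out_path, mix, SAMPLE_RATE, subtype='PCM_16')

curate_clip('speech_commands/house/example.wav',
            '_background_noise_/example_noise.wav',
            'keywords_curated/house/example.wav')
```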
so with 16 kilohertz sampling at 16 bit um at 16 bits pcm knowing what our microphone is that's what we want to um have all of our samples be when we go to train it otherwise there's going to be a mismatch when we go to deploy our neural network all right let's run this which doesn't do a whole lot it just sets a bunch of variables and then we're going to say hey go ahead and download the google speech commands data set and this is going to take a minute or two um what we're doing here is we're downloading the google speech commands data set from this link here http download tensorflow.org and it's going to be downloaded not to our computer but to this server right here this is attached you know somewhere across the internet and let's go take a look here ah when we go to start running the curation we're going to talk about how this curation actually works so i'm going to give you an a preview here um because we're gonna need to make an account on edge impulse first so let's yep so we actually have to do that next so while this is downloading for everybody see i'm gonna jump in back and forth between my slides because my slides are for you also for me to remember what i need to do next so if everybody has the stm32 cube ide make sure you're installing that or have it installed we need to create an edge impulse account so go to edgeimpulse.com sign in create your account and we're going to click login here and we're going to create a new project i'm going to call it speech recognition or keyword spotted whatever you want to call it and i'm going to call mine i don't know one uh oh i already have one named one oh oh two good version control right there all right so we're going to go to speech recognition o2 we're going to click on that and then up at the top we want to find keys that little tab that's just on your dashboard and keys this is our api key so double click on that and we're going to copy that don't worry if you double click and you don't see the rest of it it does copy the whole thing we're going to go back to our script it looks like it is done so extracting done and actually one second so we just ran that all right so if you click on the folder icon on the left side of colab you actually get the file system in fact you can go up a folder and if you've ever used linux this should look very very familiar our essentially home directory where we're going to be working in is this content directory so i recommend taking a peek in there and these are the files so we just downloaded the speech commands data set and we extracted it or untard it um rather and what you see is all of these files which contains a whole bunch of wav files the first thing we need to do is extract the background noise out because my script that i wrote for the curation does not take kindly to background noise being in one of these input directories so that needs to be its own separate directory outside of that that's all this cell does so let's go ahead and run this cell and then we're going to refresh this refresh your file system and you'll see that it pulled the background noise category out google speech command data set gave us samples of the background noise great we can use that feel free to also add your ad your own um your air conditioning running like because mine's very loud you're in a crowded room take a minute of recording make it a wav file drop it in there and my script will automatically extract background pieces of that background noise to mix into your keywords later so let's go back into 
this real quick i'm going to open up the background noise or excuse me the back wow not the background noise the the backward keyword and i'm gonna say download so feel free to download and check it so we're pulling in this file from this from colab and i've got when amp not wanting to play nicely because it really whips something so i don't know if people can hear that because it's my system audio but feel free to play one of these and you should hear the sound being played somebody a somebody providing a sample of back or backward or whatever the keyword is and feel free to try any of these and it's just people saying this word and there's like 2 000 samples in each of these it's a pretty big data set um once again you can add your own if you wish if you want to say you know record yourself adding these and add it to it so um this allows you to download my personal custom data set that i've been working on um it's mostly a bunch of garbage in there of like you know trick or treat hadouken drakeris just me playing around and just adding to this it's somewhere on github so it pulls it in but we're not gonna do that today we're just gonna stick with the google speech commands data set this is where you want to change some stuff so we've got the custom data set path we're going to leave this empty um because we did not create our own custom keywords as part of this next where it says the ei api key we're going to go ahead and paste that right in that we copied from edge impulse now let's pick some words we want to look at the available words for us initially as a demo i said stop and go that's pretty simple but let's go with let's try house and let's go with zero no that's just two words here uh where's the api key very good question so you have to log into edge impulse uh create a project or or go into an existing project that you have you want to go to dashboard and you go to keys and then it should be right here so you can just double click that and copy it that's where you find your api key good question thanks for the reminder to go back and show where to get that from maximize window it's truncating um so it is truncating this i can't make this my window is maximized i'm i'm broadcasting at like um 1920 by 1080 so that everyone you know it's a 1080p stream um it does copy the whole key so even though it looks like it's truncating here when you paste it in you get the whole key um it's just a a weird thing um so anyway we've got house and zero um as our keywords for you pick houston pick one word pick two what i have found is that picking three or more um the machine learning model is not able to quite keep up when we start getting bad classifications beyond like three or four keywords um you can make it 10 and see if it works i can't promise the model is going to fit on your microcontroller or that it's going to be very accurate um the other thing that i found is that keywords work better when you have multiple syllables and you've got um various sounds like consonants and uh vowels all mixed in there which is why amazon went with alexa um you have okay google it's it's multiple syllables rather than just like hey um because that sounds like a lot of other things humans can say so what i have done is picked house which is a single syllable we'll see how that works and zero which is two syllables the more syllables you add the better accuracy you're gonna get for that keyword so pick one or two from the list over here um if you have time on your own if you're going through this feel 
free to try creating your own and create a custom keywords uh set that you add into this and then so we're going to run that one and then we are going to download the curation scripts which is just on that it's in that github repo that we're using as the worksheet so i'm going to go ahead and refresh this data set curation you can double click to take a look um i mit licensed this so feel free to use it and it's just pulling in wav files mixing them with background um yeah and then spitting them out in a data set that's gonna be useful for us so you should have gotten to this point and now we're ready to run the actual curation so we're going to shift enter to run this and this is going to take about 20 minutes um 15 to 20 minutes so i'll come back and continue to check on this as we're going through and i'm going to talk about kind of what's going on why we need to do data curation here uh when it says mixing unknown it is just putting random other non-keywords yes that is absolutely correct and i've got a graphic for that so while that's going so i wanted to get to that point of having people run this part run this script because it's going to take a while uh i'm going to go back to my slides here and talk about exactly what's going oops that is definitely the wrong button and a little y control f5 doesn't work for me but it does not all right so what we've got going on here with this script and that is a very good very good question oh no such file or directory oh let me make sure was i supposed to add a link higher up in the like yeah um so if you go up uh the custom data set path is blank correct custom yes that is correct custom data set path should be blank um i'm trying to remember where i put this uh there should be something uh here it is uh make sure you've run this cell um it in the google speech commands data set got it if you see background noise it should be pulled out of the google speech command data set it should be on the same level uh the folder should be on the same level um otherwise it's considered a keyword and that's not what we want got it thank you sure target word cheese is not found in the sub directory um that is because you need to pick from the available keywords in the google speech commands data set so if you on the left here if you drop that down these are the available keywords you have yeah so um i don't have the time to show people how to create their own keywords but if you sit there and record yourself saying cheese like 50 times on your cell phone and then take something like audacity and create one second snippets out of that um you can then create your own keyword it's it's not that hard to do it's just time consuming and i didn't have time for this workshop but that in the github repo there is the walkthrough that shows you how to do that you need to go into audacity create these one second snippets and then you add it in as a custom keyword here and you can create your own custom keywords so yes you can definitely get cheese in there but for now for this workshop um please select one or two of the keywords available in the google speech commands data set uh missed the api key from edge so if you go to edge impulse um log in create a project go to dashboard go to keys and it should be right there um it's okay if it's truncated it will copy um it will absolutely copy the uh the cube the api key for you you're welcome all right so we've got this what's going on here this script is just a simple python script and it is going through each of the 
google speech commands data set it is pulling in each of the files or it's actually randomly going through and looking for um a number of files in there so like remember we said 1500 we want so we've got so say down right down is a category in the google speech commands data set it will randomly pull 1500 files from that and then put them in the down category on the output during that process it's also mixing in random snippets of background noise to help let us create a more robust model it does so for the up directory unknown is random samplings from all the other directories and noise is just random random snippets of background noise these are going to be the four categories that we will train a model to recognize the output does not recognize any of the google speech commands data set just the four categories that we've provided here um so in this example it's up down unknown in noise for the one that i'm training going to be training now is going to be house zero unknown and noise um and for you it's whatever keywords you you pick unknown and noise malfunctioning heap of scrap of the available data said yeah i know uh that would that would be quite amusing uh maybe long longer than a second but that would be pretty good i'm hoping somebody puts other like great quotes in there and uh train some models to recognize them um i really want to see like makers go nuts with this and like have like light up crazy stuff whenever somebody says like the keyword of the day um yeah that would be that would like pee wee's playhouse style right i want somebody to say a word and like they're like their house just like starts going crazy all right so uh data augmentation why do we do data augmentation um data augmentation is a extremely useful um technique we can do in machine learning that allows us to take a smaller data set and expand it out to a larger data set and it also helps us create a more robust model so for like image processing um what we can do is you know say we're trying to identify pictures of cats well let's say we start with 30 pictures of cats that's going to make for a pretty terrible model what we can do is we can balloon that into like a thousand pictures of cats by doing things like shifting the cat over rotating the cat a little bit it helps create a little more robust model to be perfectly honest 30 pictures is not going to nearly be enough because you're going to identify like one type of cat or your model is going to overfit that and just identify specifically those pictures and no general cats um the trick in machine learning is just you shove more data at it um not always that's not always the fix for things but usually it's like why isn't my model working and the first question is did you give it enough data so data augmentation helps us um helps us create more data from that and for our case what we're going to be doing is taking up and putting in background noise because we have a lot of starting data uh we have over 1500 and we're going to be creating 1500 samples for each category we can actually take a unique keyword or sample of a keyword and mix in random background noise and that just helps create a more robust model if we were to do real data augmentation we take each keyword mix each keyword with a random sample of background data so each one becomes like eight or nine samples so we start with you know 2 000 and we end up with like 10 000 samples because we've each keyword now has different samples with different background that's even better we just don't 
have the time because people would be waiting forever to upload all of that data to edge impulse and the data curation script would take like three hours i did this so i actually know on my local machine it takes like three hours if you do this method to end up with like 20 000 samples let's check on the script how is it going all right so for whatever bizarre reason collab does not like to show my progress bars so once it's done mixing the progress bar will just appear at 100 the progress bars do work at on your local machine if you run it locally i just never bothered to fix it on co-lab so that's why you're seeing a weird progress bar all right so it's finished with background noise and it's currently mixing the house great we have time to do some more talking all right while that's going a little bit of history of artificial intelligence ah you hear it artificial intelligence ai ml machine learning deep learning all these terms are thrown around and a lot of times interchangeably and the machine learning community has kind of come up with this concentric model of what ai versus ml means so if anybody here is familiar and works with ml i apologize because you've probably already heard this but for anybody who's new to it um this could be an interesting take and hopefully give you some vocabulary when you're talking about ai or ml so what is artificial intelligence it was coined in 1956 by john mccarthy as part of this conference he wanted to bring people together to discuss can we make computers behave like humans um can we give them intelligence and it was it was very conceptual a very small group of researchers doing this kind of work later he came later he comes back um in in in an interview and gives a much better definition ai is the science and engineering of making intelligent machines especially the intelligent computer programs okay john what is intelligence intelligence is the computational part of the ability to achieve goals in the world so according to john mccarthy here anything that tries to achieve a goal is considered intelligent and if we have a computer do that it's considered artificial intelligence that is hugely broad um from what i could tell if i program my thermostat to say keep the room around 72 degrees fahrenheit and it does so with like a few if statements that's a i okay so ai can be very very broad so what about machine learning so this comes about not much later after john mccarthy a guy named arthur samuel coined the term um if you're familiar with his work he was famous for making a computer uh play checkers and that checkers experiment i think it took him years to finally get it working and he wrote a paper about it where he coined this term machine learning this has more to do with can a computer learn from experience and apply that experience to data or situations it's never seen before so that's why he chose checkers because checkers has a fairly strict rule set um and can you teach it to learn from prior experience to eventually beat you in checkers tom mitchell in his textbook in 97 gave a much more sciency like much more science-y specific definition a computer program is said to learn from experience e with respect to a class of class of tasks t and performance measure p if it's performance in tasks t as measured by p improves with experience vader basically d does a computer program get better at some task as it's learned from prior experience so that is machine learning and we can apply that to things like we can show it a bunch of pictures of 
cats and have it learn what a cat looks like finally we get to what is this deep learning so rina dector wrote this paper um somebody's gonna correct me here i think she was doing biology research um when she wrote this paper around in 1986 and she defined it as a class of machine learning algorithms that use multiple layers to progressively extract higher level features from the raw input funny enough wikipedia of all places gives the best definition that i have found for deep learning basically it's machine learning but more complex so like okay what are we talking about complex the idea is that we start to run multiple layers of learning algorithms where the first one says does things like extract features and says what am i interested in and that feeds it to the next layer where it may say like try to classify those features that have been fed to it from the first layer the reason that's machine learning is because we're not directly adjusting these parameters we're not directly feeding it these features we're saying go ahead and learn here's a picture you tell me what features you want to learn right this might be it picks out you know for for facial recognition we might have fed machine learning algorithms originally like oh look at where your eyes are or your nose are and then we you know and then we feed that to some type of classification to do facial recognition but this deep learning says you know what here's an image of a face learn it actually here's a thousand features or here's a thousand images of faces learn what those are and so um it goes through and it might take out like ears and your hairline you're like i never thought to do that so the machine's doing stuff that we don't know about it's outside of our control at this point or uh even sometimes understanding and that's kind of what makes it this deep learning we don't know they become these black boxes now there are ways to dig in and figure out what the machine's looking at with these like heat maps which is really cool but we're not telling it look at these features the machine's learning to do it so you end up with this concentric view of ai ml and deep learning where machine learning is a part of intel ai and deep learning is a part of machine learning we're using more complex models so i hope that gives people a broad idea of what we're talking about here are there any questions well we're going to go back and we are going to take a look at our script that's running excellent i'm still on zero so i've got some ways to go i think everybody in this class when i ran this last night it took about 20 minutes and i'm curious if i'm if i'm just talking quickly or if i am no i just talked quickly for that i have no voice uh i guess people are trying to ask questions yes i do i talk quickly i'm trying to burn through this because i know i only have two hours uh when i'm when i'm making a pre-recorded video i go a little more slowly and stuff um but for this whenever i give presentations like oh we gotta get to all these things so we're going through these here so hopefully everybody's on this is every anybody stuck i want to ask is everybody up to here if you're following along oh we've got some some comments rolling in from anybody interested in other data sets uh here is a list actually let's go explore that that looks really cool good everybody's everybody who wants to follow along is following along and up to this point where they are um doing the curation uh sean will be helping the first time users of stm ide kind 
of i'm gonna show you what you need to do in order to run this demo um that is unfortunately one of those things that is a little bit canned at this point um because i don't have time to go into all the features of stm ide um i've got videos out there that show you like what you need to do to like get to blinky and all of this speak faster yeah [Laughter] okay so everybody's saying that the uh curation script is going about the same so everybody's right about here with the curation um and it has once it's done with zero it's got one more because give us kind of a if we have a minute an overview of the tool chain that we're using like oh clicking a lot but i wanted to have a better understanding of how everything's linked together yeah very good question thank you so the tool chain that we're using here is um so we're running i wish i had a slide for this that's a good question i didn't think about that beforehand we have um jupiter notebook so on a google server we have a jupyter notebook that's basically running python right this is all this is is just a python script more or less the jupiter notebook just lets us run that in you know chunks at a time it's very good for learning if you haven't played with it um i really like it and so we've got we've got python running here what we're doing with this python is calling this is the silly part with the collab which is why i recommend if you have time run it locally um because we're running it in colab we're actually using collab here to run system calls more or less so like this um exclamation point python we're actually telling linux to run python and do pip to install use pip to install the sound file package um so sound files like one library that we're using to read and write wav files later we'll be using the edge impulse cli so right now the best way to think about this and this is um is as a linux box right i don't really have a command line into it so we're using collab to be a command line uh uh ml ops i must have missed something um yeah i'll get through this question first we've got a few about audio so that's the first part and we're using basically strict python in this curation set um let's see what does it install what does it need uh sh util librosa sound file numpy it's some basic like audio manipulation and numpy for doing matrix type operations um this python script all it's doing is just curation right we went through that we've got curation so the the pipeline from there is once we've curated this data set we're going to send all of these to edge impulse which which is this online tool that does the training for us and once we get to here i'll show you what's going on where we're going to do feature extraction to to generate uh mel frequency ceptual coefficients those get fed as features into a convolutional neural network that's going to be trained using keras on top of tensorflow once that's done we're going to download a like it's going to be converted the model itself will be converted to a tensorflow lite model we're gonna download that and include it in our um in our ide our um our good lord our entire set for um stm for the stm code um that's gonna be a library first just like a c plus plus library so i hope that helps does that kind of give you an idea of like the technology pipeline we're using here yeah that was great thank you cool all right i've got a few other questions here i got to blinky last night in stm30 and s1032 and it was a lot yeah stm32 is like i now know how to do this in arduino and i would have 
definitely done an arduino stm32 stm32 cube id is is is a lot it's it's a professional ide it's built on top of eclipse um i am not a big fan of eclipse but generally stm 32 cube ide works out of the box when we gather audio data with hardware should we use the device or mobile phone so that is a very good question um the method we're doing with this with the male frequency step steps coefficients actually works from just about any hardware recording device especially considering we've resampled it and we're going to be doing the discrete cosine transform which then gives us an idea of the overall shape of this this spectrogram is kind of what's going on which means that we can record it with any device um depending on your machine learning application so initially when i was trying this i was trying it without the mfcc's i was doing it with just the spectrogram when you do it that way the microphone definitely um plays a much bigger factor into um what your sound signature looks like and you should use just that microphone um so my recommendation is if you were recording data yourself i would always try to use the same sensor if you can there are mathematical things you can do to extract features that where it doesn't matter which sensor you're using um to a degree so i hope that answers that question i programmed a base program didn't do anything just look into the download verified cool um we're using collab as ml ops machine so uh yeah i guess that was the question we're not this is not the collab is not the ml ops in this case which is hilarious it's just a way so that people didn't have to install a whole bunch of python stuff on their own computer um yes a collab is a decent way to do tinkering with ml ops um however please note that uh collab disconnects you from the runtime at 90 minutes um if you're if you're not using the mouse or interacting with it in fact we may see that happen at some point um you could that in that case you just click in and get to it um if you've been using it for 12 hours it will completely shut you down and reset the whole uh file system linux server and everything for you so you cannot use colab to do um end-to-end ml ops especially if you're training like large networks that take a day to train um it's really meant for research collab is a great um interface for for research and training small models um but not for big models um and like not for like setting up a full pipeline it will just disconnect you uh can we do the edge impulse training locally so we don't have to give our data to somebody else um i do not believe you can at this point um because they're doing all the server all the all the training on their side um that would be a question for edge impulse um if if any of the edge impulse people are here um they can answer answer that one i know you can set up like programmatic there's an api where you can set up programmatic stuff so you can do um you can feed it data programmatically but i still think it has to go through their server um for running locally uh that's where i like i i do have model i do have training algorithms out there um on my github for training stuff locally but it's not doing the full mfcc thing where you get a better model from edge impulse which is why i'm using edge impulse and it just makes it easier you can see the graphical flow uh anybody got oh here we go ain't got blinky working with os x um i've seen some issues with os x whatever you're running i don't have a mac so i'm unable to verify what's going on with um 
with Mac stuff and STM32. Yeah, the debugger connection loss where it just shuts down: other people have run into this, so try using the program from my GitHub to see if that works. There have been other issues with just trying blinky where the debugger shuts down. "Can you log multiple channels at audio speeds that are synced?" I'm not quite sure; right now we're doing mono, 16 kilohertz mono, and if you've got two channels coming in, that's something else you would need to train the model with, so you would have to create the entire dataset from scratch, because now you have twice the data being fed to the machine learning model. Okay, some people are saying that if you're running into problems with the debugger not working in STM32CubeIDE, try restarting the IDE, restarting your computer, or unplugging the board and plugging it back in; all the good IT stuff seems to magically fix it sometimes. And... done, good timing.

All right, let's continue running this script. We've got the Edge Impulse tool installed. Before running the last cell, let's take a look at this first: on the left side you should see a keywords_curated folder. Go into it and you should see the four categories, or excuse me, the two keywords you picked, plus "unknown" (which is other keywords) and random background noise. Feel free to try downloading one of these; I'm going to try "house." Once again, I don't think my system audio is shared, but try playing a couple of these on your own machine: what you should hear is somebody saying a keyword from the original dataset with background noise mixed in.

Finally, run this last cell, and that is going to upload everything to Edge Impulse using the API key that we gave it. I'm going to sit here and let it start going. Yay, it's uploading everything. If I come over here into Edge Impulse and go to Data Acquisition, you'll start seeing some of the data roll in; it may not be in real time, you might have to refresh, but it should happen. In fact, I think it uploads the test data first. Yes, it does the test data first, which brings up a good point: why are we splitting the data into test and training sets? While that's uploading for everybody, here's what we're doing: we're splitting the dataset. We've got, say, 1,500 samples per category, and 20 percent of those go off to a test set that exists separately from the training set. The samples were randomized when they were chosen from the Google Speech Commands dataset, so when they were given file names like 0000, 0001, 0002, they had already been shuffled; that's why I didn't use a hash or anything, and why the naming doesn't particularly matter for the individual keywords. So we set aside 20 percent for testing, and the rest is going to be used for training our neural network. Why do we set aside data for testing? Because we need a way to verify that the model worked. That seems simple enough, but we absolutely do not want to test with any of the data that we used for training.
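(As a rough illustration of that 80/20 split, here is a small hypothetical sketch. The real split happens as part of the curation and upload step, and the Edge Impulse uploader takes care of sending files to the training or testing buckets; the paths and ratio below just mirror what was described, and the CLI invocation is only hinted at in a comment.)

```python
# Illustrative sketch of an 80/20 train/test split over curated WAV files.
# This is not the actual curation/upload code; paths and the 0.2 ratio
# simply mirror the settings described above.
import random
from pathlib import Path

TEST_RATIO = 0.2

def split_category(category_dir):
    files = sorted(Path(category_dir).glob('*.wav'))
    random.shuffle(files)                  # names were already randomized anyway
    n_test = int(len(files) * TEST_RATIO)
    return files[n_test:], files[:n_test]  # (train, test)

for category in ['house', 'zero', 'unknown', 'noise']:
    train, test = split_category(Path('keywords_curated') / category)
    print(f'{category}: {len(train)} training, {len(test)} test samples')
    # The edge-impulse-uploader CLI is then run on each list, once for the
    # training set and once for the testing set.
```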
The reason for that is that, usually when you start out, the model will overfit the data, which just means the model is very good at picking out the specific instances and samples you've given it, it knows how to identify those, and it's very bad at generalizing to data it's never seen before. So when you go to deploy it, if all you've ever looked at is training data, you'll wonder why it isn't working: well, you probably had the model overfit the data. The test set helps us determine whether the model has overfit, because it's data the model has never seen before, so it's very useful. I would say you almost always want a test set. Is it 20 percent, is it 10 percent? That can be up to you and your needs, but in the machine learning world, setting aside 20 percent for testing is pretty standard. Let's see how the upload is going; it's still rolling, so let's talk about features and what's going to be going on there.

The first step, once we've uploaded all of the data to Edge Impulse, is extracting features, and the features for us are going to be the mel-frequency cepstral coefficients (MFCCs), whatever the heck those are. I'm going to briefly run through this and then give you a link to a site to read on your own if you really want to scramble your brain with what these are. So here we go with mel-frequency cepstral coefficients. Here's an audio sample of somebody saying "zero" or something like that. We're going to take a window, a slice of this, and compute the fast Fourier transform, which gives us basically power versus frequency. We've got these bins: at 0 hertz there's this much power in the spectrum, at 4,000 hertz we've got this much power, and so on. It's no longer a time series; we're not working in the time domain anymore, we're looking at it in the frequency domain. I wish I had time to get into how Fourier transforms work, but one, I forgot most of my signals and systems class, and two, we don't have that kind of time. That is a fun class if you ever want to get into how Fourier transforms work and how to do them by hand.

Anyway, the next step is the mel-spaced filter bank. Mel spacing basically means taking a set of filters that are linearly spaced at lower frequencies and logarithmically spaced at higher frequencies, then combining the energy seen in each filter, adding it up to create an array of numbers. So this 0.002 comes from the first filter, where it just sums the energy it sees there; that's the first number. The second number comes from the second filter (you'll notice they overlap a little bit), which adds up all those power values to give an energy, and it continues on. The reason they're spaced on the mel scale is that mel spacing has to do with how humans perceive sound. We don't perceive sound linearly; in fact, humans don't actually perceive sound as a time sequence like this either. We don't process individual little samples over a series of time; our ears have hairs that resonate with various frequencies, so how we hear is actually closer to something like a
spectrogram: how those individual hairs vibrate has to do with the frequency they're basically tuned for, and that tuning happens to be more linear at lower frequencies and more logarithmic at higher frequencies. So this replicates how humans hear. Obviously, if you were doing this for another animal it would look a little different, but we're going for humans here, since we're using human speech.

Once we have this array, which remember comes from that little window slice, we then compute the logarithm (I think it's base 10 here) of each of these numbers, so now we've got an array that's just the logarithm of each of those energies. From there we compute the discrete cosine transform (DCT), which is basically like another Fourier transform, more or less, but using real values instead of complex values, if I remember how the DCT works. This gives us an idea of what the shape of the original Fourier transform, or the FFT rather, looked like, and by shape I mean what sorts of frequencies of variation you're going to see. So it's kind of like taking the FFT of the FFT, in a bizarre way; think about it like that. That's about where I gave up on really trying to understand what's going on with MFCCs, but my basic understanding is that the lower values give you the overall shape of your Fourier transform, and the higher values give you information about what some of those peaks look like.

Then we just move that window over the entire audio sample and drop in each of these arrays, the MFCCs, over the whole sequence: we take the first window, that becomes the first array; we slide the window over, compute the MFCCs, that becomes the second one; and so forth until we get to the end of the audio sample. Another way to think about this: that's how we're doing training, but when we get to computing this on the device, it's performed basically in real time. As the audio streams in, it's computing these MFCCs, and every time it computes a full section of MFCCs it sends that off to the neural network for inference and tries to classify it. So the MFCCs are just a two-dimensional array; that's all it is. We get this two-dimensional array, and it looks like an image, which means we can actually use neural networks that are better at, or meant for, image classification to classify audio, because these look like images. Here's an example of "stop" and here's an example of "zero": even though they happen at different times, you can kind of see that the bands of the MFCCs look a little different, and that's what we're training this neural network to pick up on.

Okay, so you got through all this and you're thinking, "what was he just talking about?" Don't think about it too hard. Like I said, my brain started to hurt after getting to that DCT step: okay, I kind of understand what the DCT does, but I don't understand why we're taking the DCT of this logarithmic stuff. There's a great article on Practical Cryptography if you want to read more about it. Just know that MFCCs as features are very popular in things like automatic speech recognition, which is what we're doing. All right, I hope that helps. Let's go check on our uploads.
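(Here is a rough, illustrative Python sketch of that MFCC pipeline: frame the audio, take the FFT, sum the power under a mel filter bank, take the log, then take the DCT. The frame sizes, filter counts, and number of coefficients are assumptions for demonstration and are not necessarily what Edge Impulse uses.)

```python
# Rough, illustrative MFCC computation following the steps described above.
# Frame sizes, filter counts, and coefficient counts are assumptions for
# demonstration; they are not necessarily what Edge Impulse uses.
import numpy as np
import librosa
import scipy.fft

SAMPLE_RATE = 16000
FRAME_LEN = 400        # 25 ms window
FRAME_STEP = 160       # 10 ms hop
N_MELS = 40            # mel filters
N_MFCC = 13            # coefficients kept per frame

def mfcc(audio):
    mel_fb = librosa.filters.mel(sr=SAMPLE_RATE, n_fft=FRAME_LEN, n_mels=N_MELS)
    frames = []
    for start in range(0, len(audio) - FRAME_LEN + 1, FRAME_STEP):
        window = audio[start:start + FRAME_LEN] * np.hamming(FRAME_LEN)
        power = np.abs(np.fft.rfft(window)) ** 2           # FFT -> power spectrum
        mel_energies = mel_fb @ power                       # sum power under each mel filter
        log_energies = np.log10(mel_energies + 1e-10)       # log of each filter energy
        coeffs = scipy.fft.dct(log_energies, norm='ortho')  # DCT -> cepstral coefficients
        frames.append(coeffs[:N_MFCC])
    return np.array(frames).T   # 2D array: coefficients x time, image-like

# One second of silence, just to show the 2D feature shape
print(mfcc(np.zeros(SAMPLE_RATE)).shape)
```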
They're still going; let's see, so many "noise" and "zero" samples, and I think it still needs to do "house." So this is still uploading, and we can stop here and see if anybody has questions. I'm going to have to scroll all the way down, so I apologize if I've missed stuff. "Are we image-classifying the spectrograms, or the numerical values of the arrays?" Very good question: we're basically classifying the spectrograms, but instead of spectrograms they're the MFCCs. I originally played with this by classifying spectrograms directly; it works, and it saves you from having to do that discrete cosine transform step, which in fact requires more computational power than the actual neural network, which you'll see in a minute, which is kind of crazy. So you can classify the spectrograms as images, and those images are just two-dimensional arrays, grayscale images; I know I showed you ones with color, but they're grayscale images.

"Is the input assumed to be exactly one second long?" Yes: the MFCCs are going to create something that looks like this, and in order to get exactly, I think, 16 MFCC frames, at least in this example (I don't remember what it's going to be in Edge Impulse), the input needs to be exactly one second long. "How do we figure out when to start the sample for classification?" If we're talking about deployment, it's doing this in real time. Let's say this window of audio is 100 milliseconds (I forget the exact number): every time a buffer fills up with exactly 100 milliseconds, it calculates the MFCCs for that chunk, so you get one of these arrays, then you fill up another buffer, and you just keep queueing these. As you get past one second, it starts dropping the oldest ones off, so you've got a moving buffer of MFCCs, and every time you add, say, six or seven new MFCC frames to the front of it, the whole buffer gets sent to the inference engine, which then performs the classification.

"Does the moving buffer overlap?" Yes, the moving buffer overlaps: every time you add six new frames and drop six old ones, the buffer still contains frames from the previous buffer we performed inference on, and we send that to the inference engine. On the Arduino, that happens about every 333 milliseconds; on the STM32 we can actually do it every 250 milliseconds. So yes, it's essentially a sliding window. The sliding-window analogy works better when you think about taking samples for training; when we're doing inference on the microcontroller, it's easier to think of it as constantly reading audio in and updating a buffer, kind of like a queue, and every time that queue gets to a certain point, or you've added some number of new chunks to it, that goes to the inference engine. But yeah, like a sliding window. And this upload is still going. All right, you're welcome. "Is the input assumed to be exactly one second long?" Yes, we answered that one already; sorry.
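(To make that buffering concrete, here is a small hypothetical sketch of the deployment-side loop. The frame counts and the classify() function are placeholders rather than the actual Edge Impulse library API; on the real STM32 or Arduino target, the generated C++ library handles this.)

```python
# Hypothetical sketch of the sliding MFCC buffer used at inference time.
# Frame counts and classify() are placeholders; the real device code uses
# the C++ library that Edge Impulse generates.
from collections import deque
import numpy as np

FRAMES_PER_SECOND = 16   # MFCC frames covering one second (illustrative)
FRAMES_PER_HOP = 4       # new frames added before each inference (~250 ms)

def classify(mfcc_window):
    """Placeholder for the neural-network inference call."""
    return {'house': 0.1, 'zero': 0.1, 'unknown': 0.7, 'noise': 0.1}

buffer = deque(maxlen=FRAMES_PER_SECOND)   # oldest frames fall off the back
new_frames = 0

def on_new_mfcc_frame(frame):
    """Call this every time ~100 ms of audio yields a new MFCC column."""
    global new_frames
    buffer.append(frame)
    new_frames += 1
    # Only run inference once the buffer spans a full second and enough
    # new frames have arrived since the last classification.
    if len(buffer) == FRAMES_PER_SECOND and new_frames >= FRAMES_PER_HOP:
        new_frames = 0
        scores = classify(np.stack(buffer, axis=1))
        print(max(scores, key=scores.get), scores)

# Feed in fake frames just to show the cadence
for _ in range(64):
    on_new_mfcc_frame(np.zeros(13))
```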
i just answered that one so we're image classifying yep i got to that one sorry i'm going backwards through the questions here to see if i can get to any because we're almost done with this coming from music background i would highly recommend julius smith's work on the dft uh awesome thank you for that link is the process is this process run on each sample or on a collection of samples where a sample refers to a single audio file thank you that is a good question um because i often get confused when we start talking about sampling audio where you're taking an individual value at like 16 kilohertz or a sample for machine learning sense in a sample when we start talking about machine learning is a one second audio clip um and we're we're going to be sending those samples to uh for training i hope that helps and answers that question that's the one class i failed you must be talking about signals and systems yeah that was i enjoyed it because i actually enjoyed doing i like had this weird thing where i enjoyed doing the fourier transform by hand um it's like oh i just performed these steps i was way better at that stuff than like the math class where you're like oh figure out the permutations of people sitting around a table like i was terrible at that math class but i could do the fourier transform by hand i just i always thought that was uh really weird and i enjoyed signals and systems it's just like over time i just forget some of the nitty gritty details of it uh another recommendation for zack starr and uh 3b1b i guess that's is that the three blue one brown or three brown one blue uh that that i always mix those two up that's a great youtube um they have really good stuff about like fourier transforms convolutional neural networks um their stuff's really good definitely recommend uh checking them out i failed that when it became an electrical and an industrial engineer instead of electrical yeah that is unfortunately singles and systems ends up kind of being the weed out class for a lot of people for me it was junior year uh somebody linked in the practical cryptography for mfcc's uh yes that is that is the one thank you for posting that in the chat um that article really helped me get an understanding of what's going on it's still um quite confusing um and thankfully there's a number of tools out there that that just kind of do them for you all right while this is still going let's start talking about i'm going to jump after i'm going to check on this after we look at training a neural network so the fun part is once we get to the actual training step it goes a lot faster than this um the joke in the in the joke in the machine learning world is that you will spend eighty percent of your time um manipulating and massaging data and twenty percent of your time on actual machine learning brack propagation yes uh more or less we are doing back propagation i don't remember the exact algorithm we're gonna be doing for training um but yeah it's essentially back propagation um edge impulse it's like click button magic things happen but i'll show you where you can manipulate uh keras code um if you want to create your own neural network um but we won't be going into like you know uh picking out your your learning you can pick out your learning rates um but picking out like um uh like measurement values and things like that so training a neural network so we're going to be using a convolutional neural network in edge impulse they recommend a one-dimensional one um which essentially just takes 
so we've got that image right we're thinking about the spectrogram or the collection of mfcc's as an image and essentially what that's going to do is it's going to take a filter and just slide it across and you'll have to forgive me i don't remember if it's time wise or if it's um over the values um i will have to look at that one later but it's going to slide this one-dimensional filter across it um that's the first layer then we get to max pooling all that does is just looks an individual in this filter it says okay what are my max values after this filter has happened and then just gives us those um i'm in nlp and interestingly enough we're using filters and other convolutions to parse tokens and recognize grammar oh interesting yeah i don't know much about nlp um so that's fascinating i'm not surprised that it's very similar of this idea of like filtering and convolutions and um like those things definitely overlap um and a lot of this convolution stuff came from um vision processing of filtering images um prior to machine learning and that and that and by by basically saying we're gonna make these filters work and have the machine just figure out what filters it needs what values in these filters so that's just the machine learning part of this um we're going to actually do that filter and pick out the max values twice so that kind of like takes this image and squishes it down to its raw values and the idea is we're trying to get our features from that image right we've already extracted the mfcc features and now we want to have this convolutional neural network say what are the features in this image that i care about and so we're going to take those we're going to like process this down to like the individual like pixels that it cares about and by individual i mean you know there's still going to be a few dozen pixels that it's going to care about and maybe 100. 
um it's going to flatten that all to a one-dimensional array and then it's going to send each of these to a node in what's called a dense neural network um it's just one layer in fact it's just four nodes that correspond to each of our classes and the softmax the output of each of those feeds into the softmax function where softmax gives us basically a set of probabilities that it thinks the neural network heard one of the classes and these probabilities after the softmax layer should add up to one so if you if if it thinks it heard the word up it comes through this neural network and it spits out up like 0.9 and then the rest of these like less than 0.9 but they should all add up to uh one and then all we need to do in our code is say which of these outputs is the greatest in order to determine what it thinks the neural network picked as the class so that's what's going on here um and yes we are going to be using essentially backprop um and we're going to be using a tool that you just click it and i'll show you where you can modify some of the hyper parameters if you wish if you're into machine learning um and give you some recommendations as to where you go to learn more about that um haha look at this haha what what is he talking about if this is all new to you um i highly highly recommend andrew eng's class on corsair i took that about a year ago and it was by far the best introduction to machine learning it takes i think a couple of months to get through where you're doing about five hours of work each week um and you start with like matrix algebra and by the end of it you have an understanding of what machine learning is and you basically design your own very simple neural network um in uh matlab is what you're doing um and that helped tremendously and then from there it's like a bunch of books and things you can just like how do i learn karaos how do i make bigger neural networks what's what's new in industry with this and um right we're playing with things right here that are like what like a dozen or less layers and you get to industry and you start looking like oh how are they doing object detection for vision and you're like oh my god that's like 300 layers how do people come up with this and you're like oh this is how people get phds so or this is how microsoft makes money by developing these uh neural networks all right this is still going cool i want to keep checking on this but um once we get there we're going to evaluate on the test set and i want to wait to get there to do that but in the meantime if everybody wants to go ahead and start putting this together make sure you've got your headers soldered on your microphone um connect it to your nucleo board we've got the left right clock so that's just that's just a clock signal that toggles back and forth that's how you get two channels out of these um that goes to a2 you've got your data out that goes to d4 your bit clock that's the really fast clock that goes to d3 and you've got power ground make sure it's three volts not five volts um select line that determines if it's left if that determines if this microphone is left or right channel um i think it defaults left this board has this line pulled down to ground so you don't need to put anything on this line um and i think it defaults to the left and um my code just handles that like the demo code that we're gonna use um just handles that for us are we done ah we're done good timing okay this took a little longer than when i tried this yesterday i think it's because 
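For readers who want to see that architecture written out, here is a minimal Keras sketch of the kind of network just described: two Conv1D + max-pooling stages, a flatten, and a four-class softmax, followed by an argmax to pick the winning class. The input shape (49 MFCC frames by 13 coefficients) and the filter counts are assumptions for illustration; Edge Impulse's default layer sizes may differ.

```python
import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(49, 13)),                        # the MFCC "image": time slices x coefficients
    layers.Conv1D(8, kernel_size=3, activation="relu"),  # slide a 1-D filter across the time axis
    layers.MaxPooling1D(pool_size=2),                    # keep only the strongest filter responses
    layers.Conv1D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),                                    # squash everything down to one long vector
    layers.Dense(4, activation="softmax"),               # one probability per class, summing to 1
])

probs = model.predict(np.random.rand(1, 49, 13))         # a dummy one-second sample
predicted = int(np.argmax(probs))                        # whichever class scored highest is the "heard" word
```

In firmware the same idea applies: the library hands back one score per class and the demo code just looks for the largest one.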
everybody's trying to upload data to edge impulse right now um where is well i had a chat i had a chat bar up so i was being able to watch some of that so we should have data all in edge impulse right now hopefully everybody's about up to here um check it make sure you've got uh am i in test data yeah here we go so so your training data you should have about 20 minutes of samples in each of your categories zero house noise unknown and if you go to test data you should have about you know 20 of your total data so that's about five minutes of each that's like 25 minutes of uh of each one of each class let's go to our impulse design this is the pipeline this is your ml ops basically from a graphical perspective this is where you can adjust your window size you know how much they hop we're going to keep everything at a second there's no really windowing going on here because it's just one to one second chunk of data click to add a processing block edge impulse recommends that we use mfcc's which we just talked about let's add that this is just feature extraction um it's going to this this block converts those audio files to the mfcc's add the learning block they recommend keras great or a basic neural network that's been developed in keras great and what i can't see because i've got that window over there is save impulse so make sure you click that that is hugely important and then you should see uh these blocks appear on your left side so underneath in impulse design you should see mfcc's feel free to play these um you can actually go through and view your different samples and you can play those uh and you can actually see what your uh coefficients look like what these images are gonna look like that are gonna be features what we need to do is actually convert all of them to features so click on generate features and then click generate features and once again we wait for a couple of minutes welcome to the world of machine learning um you know the joke the xkcd joke about uh the you know the programmers fighting on the chairs it's like get back to work oh we're compiling that kind of thing in the machine learning world it's oh i'm training or i'm you know extracting features like you sit around and you wait for this stuff to happen and what i've noticed a lot of times with machine learning um with the exception of diving really deep into the math of what's going on into each one it's often considered very black boxy um if you've ever done any like rf work um where you have these tools and you're like oh i want to you know design a dipole or whatever it is you're like i'm going to tweak it in this kind of way and you just run the simulation you come back in three hours and you're like oh well that didn't work and you're like bend it a little bit in the in the simulator and run it again and come back so that's kind of what happens in the machine learning world um you start with an idea maybe you look at a research paper you're like oh this model performed well for this task i'm gonna start with that model and then you start tweaking some of the um maybe you like add an extra layer or you uh you insert some nodes you're like you know what you know let's let's try a one-dimensional uh cnn instead of a two-dimensional cnn and see how that performs and you just kind of like test it and go and iterate and iterate and iterate you get to something that works best for your needs um because a lot of times what you'll find in machine learning is that it's not about it works 100 of the time um it's it's 
one of those you know 60 of the time it works 100 of the time right um you'll find that it misses data or or it triggers on the wrong thing so if anybody was um there last night for the bring a hack um my trick or treat thing definitely picks up chicken feet um that's that's mr jp over at adafruit gave me that that one it's like chicken feet oh it picks up that's a false positive so what that means is like i need to go collect a bunch of data if somebody's going to be commonly saying chicken feet around me then or around the device then i need to make sure it's trained to not respond to that so one one thing you might need to do is then collect a bunch of data that says people saying chicken feet and either that as a separate category or it's part of the unknowns now and a lot of the unknowns so that it's close enough but it knows that that is not the correct word and it starts to pick up features in that utterance that it knows is not uh trick or treat feel free to look at the feature explorer um i'm honestly not sure what's being combined here because each sample should be a full image um trying to see where that is and i can get you the exact number of i can't i don't i don't have a number number of coefficients so there should be 32 in each time slice number of filters in the filter bank yeah i'm not quite seeing it so it's like 13 by like 40 something i think um so actually when we consider dimensions going into the neural network this is actually like a thousand dimension um input which which makes it a little conceptually hard um another easy like another way to visualize this or another way to think about this is when you start looking at things like this you see your dots pointed out to like oh machine learning is all about finding the boundary and grouping these together like oh is this one class is the orange one class can we define a boundary a mathematical boundary that separates these classes for us um that's what's going on it's just that humans we stop being able to really comprehend stuff after about three dimensions um it becomes difficult we start talking like a thousand dimensions like it's really tough for humans to visualize that um so things like if you think accelerometer data right i'm gonna capture accelerometer data and like xyz and maybe uh those get clustered together and we can create a machine learning algorithm that you know identifies different types of positions or different types of um vibration and it's a little easier to deal with something like that but since we're dealing with audio we've got many more dimensions going into our neural network here so we're done with generating features let's go to neural network classifier um you know all this talk we spent like what like an hour almost an hour and a half talking about extracting features and training the neural network's gonna be like the easiest thing in this entire demo that's the funny part of this whole thing but i want to get to the deployment side of running it on our microcontroller keep all of the defaults i'm going to tell you that right now and click start training so while it's going i can kind of show you what's going on here number of training cycles um somebody mentioned the back propagation and you'll have to forgive me because i lost my chat window i was able to keep up with chat aha here it is uh okay go to impulse design after upload go to oh okay somebody's asked can we go over the impulse steps again mine just finished yeah let me let me briefly do that while this is training so 
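If the "13 by 40-something" feature size is hard to picture, the short sketch below computes MFCCs for one second of 16 kHz audio and prints the resulting array shape. The library (`python_speech_features`) and the window/step sizes are just one reasonable choice for illustration, not necessarily what Edge Impulse uses internally.

```python
import numpy as np
from python_speech_features import mfcc   # one of several libraries that compute MFCCs

sample_rate = 16000
signal = np.random.randn(sample_rate)     # stand-in for one second of audio

features = mfcc(signal,
                samplerate=sample_rate,
                winlen=0.025,   # 25 ms analysis window
                winstep=0.02,   # 20 ms hop between windows
                numcep=13)      # 13 cepstral coefficients per time slice

print(features.shape)   # roughly (49, 13): ~49 time slices x 13 coefficients
print(features.size)    # several hundred values per sample -- the high-dimensional input
```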
we go to impulse design um you want to add the blocks here you want to add it's going to recommend them for you because it knows it's audio data so add the audio mfcc one add the keras block the neural network um that's been done in keras and then you want to click uh save impulse over here that will give us new blocks on the left side click on mfccs or mfcc click generate features click generate features and wait for those features to be done generating so now we have a big array of mfcc's on the server that's what's going to be used to train the neural network not the actual audio data and then go to your neural network classifier and click the train button uh or start training that's where we are all right uh that was a good question um since as people are finishing up there but that way um mine can stay a little ahead here as i'm as i'm blabbing so number of training cycles so each epic is what we're calling it here um takes some of the training data um remember this collection of mfcc's it sends it through the neural network so that would be your forward pass it comes out with basically your probabilities of um which class it thinks it is and it's going to be all over the place when it first starts because all these all these parameters these numbers inside this model are randomized and then it's going to say hey how close were these probabilities to the actual label if we sent it up or or uh house or zero is that what was predicted and there's a loss um that's there's a loss function that happens that says um how it kind of measures the distance mathematically between how close it thinks it was to the original label and then the whole idea of this back propagation and this training is it tries to minimize that loss so it updates it goes backwards through the uh neural network and it updates these parameters based on this back propagation algorithm um to try to make that loss smaller or more or less the output predictions closer to the original label um if it thinks it's zero then it should be it should be like one point zero perce like probability of being zero and zero for all the rest and if it's not that it goes back through and updates these um update these parameters these numbers inside this neural network it's an iterative process um for a neural network like like this this can this is this can take a little bit of uh time for some larger networks this can take hours or days uh the github says we need to add data acquisition upload data in ei so if there's there's two there's two tracks in that if you're looking at the github uh write-up there's two tracks one track is you're doing everything locally where you're doing curation locally and then what you do is you go to edge impulse if you've done it all locally you need to go to edge impulse and there's a button here that says upload existing data and it will just it basically just gives you a computer interface that just uploads way all the way files from your uh computer if you did it locally if you're doing everything in colab it's using the command line tool to just automatically send stuff to edge impulse waiting for the job to be scheduled i'm i wonder if the edge impulse guys are sitting there going oh what's going on right now yeah i think we're ddosing ei [Laughter] i'm gonna have to chat with them and be like hey did you guys notice an uptick of all your uh your servers how did how did what went on [Laughter] oh no that's good all right so hopefully we are done i really hope we're done okay so it looks like i lost my 
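As a rough Python illustration of what each training cycle ("epoch") is doing — forward pass, loss calculation, back propagation, weight update — here is a standalone Keras sketch. The optimizer, learning rate, loss function, and the tiny stand-in model and random data are assumptions for illustration, not Edge Impulse's exact defaults.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Tiny stand-in model so the example runs on its own
model = tf.keras.Sequential([
    layers.Input(shape=(49, 13)),
    layers.Flatten(),
    layers.Dense(4, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),  # learning rate: a hyperparameter you can tune
    loss="categorical_crossentropy",   # measures how far the predicted probabilities are from the label
    metrics=["accuracy"],
)

x_train = np.random.rand(200, 49, 13)                                      # stand-in MFCC windows
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 4, 200), 4)   # stand-in one-hot labels

# Each epoch: forward pass -> compute loss -> backpropagate -> nudge the weights
history = model.fit(x_train, y_train,
                    validation_split=0.2,   # data held aside each epoch to watch for overfitting
                    epochs=100, batch_size=32)
```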
output here um and i don't want to retrain because i actually have everything going on here so in yours you should see something that says uh loss accuracy validation loss validation accuracy and um during the training sets during the training sets um they they pull out a small test set that get integrated back into the training set and they hold those aside for a second you know do one pass basically of your of your uh forward and back propagation and then they test the model with that validation set and then those get mixed and randomized um for the next epic basically it's my understanding how they're doing it for this validation set because normally when i've done it i've pulled aside a validation set to use uh separately uh do we leave the neural network settings as their defaults before we click start training yes please leave them as their defaults um give me just a second i'll go through a couple of those um but this is where it's like go learn a bunch about machine learning take andrew wang's class you'll understand more about what these these are usually called hyper parameters you'll understand more about what these hyper parameters do and you can feel more confident in manipulating these to serve your purposes um but for now just keep it as default so the what you should see if you scroll all the way up to your original first epic of it of that training pass you should actually see um that loss start to go down and you should see your accuracy go up both for the regular your training set as well as of the validation set um if your validation set starts to if your training set starts to split and become more accurate than your validation set it usually means your model's overfitting that means it's better predicting just things it sees in the training set and cannot generalize to things it has not seen before and we're going to test that with the testing data so for these um quick question about these this says how many steps does it do anything between 30 and 100 seems to work for this but you can actually manipulate this and do what's called early stopping you can change this to 30 if you see a lot of overfitting and have the model stop a little earlier to kind of like like like don't try to learn more um because then you're going to start over fitting to that data learning rate um there's you know pros and cons to higher and lower learning rate and minimum confidence rating doesn't really apply to us um it only applies to when it gives us testing here but we're going to get the raw probability scores out anyway when we go to the microcontroller do the data set duration of clips should they be the same yes your all of your the data you send to edge impulse i believe and the ei people can correct me if i'm wrong here but i believe they all need to be a second or at least what like they could be one second two seconds they all need to be the same um my script actually truncates anything over a second or it pads zeros to make it a second so if you're using my curation script it accounts for that just make sure where the utterance is isn't going to be truncated uh yeah start training if you've not done so um this output refreshed for me so what you should see is like a bunch of output here and [Music] and you should see a confusion matrix when it's done down here uh save for machine data i'm not quite sure what you mean by machine data do the same do the data set duration of clips should be the same um so they generally should otherwise you're gonna end up with different lengths 
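The "truncate or zero-pad to exactly one second" idea mentioned above is easy to sketch. This is not the actual curation script from the repo, just an illustration of the technique on a raw audio array:

```python
import numpy as np

def force_one_second(signal, sample_rate=16000):
    """Pad with zeros or truncate so every clip is exactly one second long."""
    target_len = sample_rate                   # one second of samples at this rate
    if len(signal) >= target_len:
        return signal[:target_len]             # crop anything past one second
    padding = np.zeros(target_len - len(signal), dtype=signal.dtype)
    return np.concatenate([signal, padding])   # zero-pad short clips

clip = np.random.randn(12000)                  # e.g. a 0.75-second clip at 16 kHz
print(force_one_second(clip).shape)            # (16000,)
```

Just make sure the utterance itself isn't the part that gets cropped away.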
of mfcc's machine sound data so like sounds from a machine or like when you record sorry i'm not quite following that question okay so sounds from a machine yeah i guess they should all be the same they should be the same duration um because otherwise like the you have to remember that your neural network expects exactly um a number of input uh um a number of inputs so like like that that image that is a like 13 by 40 whatever image it expects like exactly that array as an input if you don't feed it exactly that you start throwing a whole bunch of errors um okay so confusion matrix you have your uh let's see actual label unknown so you've got your actual labels on this side and your predicted labels on this side which you should see is the numbers in the diagonal should be the highest and that means it's doing a decent job of predicting and it's got an 80 accuracy i don't see any neural network classifier on my ei um it should be on the left side um i've done this before don't forget to click save impulse after you've added them over here and that should cause them to appear over here um yeah if you don't click save impulse they don't show up over here say 15 second clip to 15 clips of one second yes if you're if you're creating your own um samples they need to be 15 clips of one second each each of those is what we're feeding into edge impulse all right so you can do live you can do live classification uh no not live sorry that is if you actually have a microphone connected and it can read it we want to do model testing that's actually going to take our this is our test set the 20 we set aside so click the top to highlight all of those click classify selected you'll see what it's going to do here is it's going to take each of these samples that we set aside a while ago and it's going to send all of those through the neural network to classify us that for us um i don't see anything that says save impulse so if you go to impulse design and you should see audio mfcc so if i modify this here actually i can't modify because it's going to wreck everything past it so if you see this here like i don't see it it should be over here on the right side underneath output features uh what tool do i recommend cutting the clip audacity uh no worries audacity is definitely the the tool you should use to cut the clips um if you find a way to automate them great but um because you can't be quite sure where that utterance is like you can create something that says oh i see the utterance starting and then you can try to like center it or move it around and then you can come up with some automated tool to truncate exactly a second or crop out exactly a second for you um but because i was working with at most 100 samples i was just doing everything on audacity for the custom stuff all right testing should be done we can come back to here and it did not work for me let's try that again [Music] testing is at 65 awesome some other people's testing is rolled in uh what is mine at 73 oh mine was better than last night so this is a case of overfitting um most of the time you will find that it that it overfits um might actually like perform surprisingly well i got like 65 last night this is a general case because we train too long the model's not quite right um for whatever reason it overfit that data and now it's not as good as classifying unseen data there are a number of techniques you can do to combat this um one of them is to stop it early um if you want to i recommend going back and try training with 30 epics 
instead of 100 um the other thing is getting more data you can always get more data uh you can try um adding in different layers in the neural network so if we go to the neural network classifier um you can actually add new layers you can try uh you know do if you know what you're doing with neural networks you can try adding different layers you can try modifying um like your dropout rates you know maybe try 0.5 dropout rate right like dropout is great for preventing overfit over uh oh 15 minute one oh shoot i thought we were going to 6 15. we're going to do this faster all right let's go to deployment uh can you add keras early uh i don't think you can um so in deployment we're going to go to [Music] actually c plus library feel free to analyze if you want but i'm just going to say go ahead and build this will tell you um this will give you an idea of uh how much time it's going to take and what kind of processing you'll need note it's not giving you the mfcc calculation time so with that and i apologize i have to go a little more quickly here go back to the github repo download the code so we should have so once you've clicked build for edge impulse you've clicked build go to the github repo where the worksheet is and download that zip so you should have two and of course windows is showing me that so speech recognition um unzip both of these uh note that if you are on windows you will want um 7-zip you will probably run into windows running out of um space to print out its um file names or its file paths so yes i'm going to use 7-zip to extract this i'm going to use 7-zip to extract both of these uh intrinsics for single instruction multiple data i don't believe it does um because i don't believe these these are microcontrollers are set up for that so we you should have so this is our model that we downloaded from edge impulse um this is what the library looks like our model is actually in tf lite model uh uh you need to click the build button at the very bottom for edge impulse to get it to build and download for you and that just takes the model that we created we just trained this it basically compresses it into a tensorflow lite model um builds a library around it and downloads it for us um so what we need to do is go into the ei keyword spotting master or not go into there yet that's what we're going to bring in as our demo project so open up stm32 cube ide launch that oh end time is 6 15 minutes okay so just to confirm we've got uh like 25 minutes right awesome thank you i'm gonna like start burning really fast here okay so stm32 cube ide bring that up what we want to do is file import and we are going to import uh this is where actually i forget so this is why i wrote this all down so in this github go check out the embedded demos go to the l464761 there is a walkthrough this is where we are now is in this second walkthrough because it's gonna when i give more examples into this when i put more examples in this github repo i'm hoping to flesh out each one like how do you make this example work um this shows you your connections um and what looks like we need to do is file import existing projects into workspace this is the step that i always mess up so click on existing projects into workspace and this is where eclipse does it's i try to make everything a gui uh copy projects into workspace make sure that's all selected copy projects into workspace we want that uh yeah that looks good finish um okay so for anybody who missed it download my github repo the one where the worksheet 
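For anyone working in plain Keras (outside the Edge Impulse interface, where these knobs may not all be exposed), here is a hedged sketch of the two overfitting countermeasures just mentioned: dropout layers and early stopping. The layer sizes and patience value are arbitrary choices for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# 1) Dropout: randomly silence a fraction of activations during training so the
#    network cannot simply memorize the training set.
regularized = tf.keras.Sequential([
    layers.Input(shape=(49, 13)),
    layers.Conv1D(8, 3, activation="relu"),
    layers.Dropout(0.5),                     # try values around 0.25-0.5 and compare validation accuracy
    layers.Flatten(),
    layers.Dense(4, activation="softmax"),
])

# 2) Early stopping: quit once validation loss stops improving instead of
#    blindly running all 100 epochs.
stop_early = callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True)
# regularized.fit(..., epochs=100, callbacks=[stop_early])
```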
is download the whole thing um and you'll also want to download the um uh the edge impulse model that we created so if you go to deployment there's a build button at the bottom that's what downloads everything so you want to select c plus library is what we're using and then click build so back in stm32 so we've got the project so this is from my github repo um once again i have to go to the worksheet now because i don't remember these exact steps projects explorer so we need to delete the model parameters in tf lite directories so this demo comes with a pre-trained model and that pre-trained model exists in the tf model as well as some information about it in the parameters folders so in ei keyword spotting delete that one uh yeah you deleted that from the whole file system and delete tf model tf-like model there we go uh file import general file system so here is one of the weird eclipse things when you're working in eclipse you've got the projects over here so i've got a few projects going on make sure you click on that project when you're working in it or bring up something like core source main.cpp to make sure you're working in there otherwise if you accidentally click one of these and you go to import a file it's going to import it into another project so always make sure you've got the project you're working on selected or feel free to close these wherever the close is there close project feel free to do that and it makes it a little neater and it's a little harder to mess up some of these steps so click on that file import general file system so general file system next uh we want to select the model parameters directory uh how do we open the project in cube ide um so we're on this walkthrough right now in where the github repo where my original worksheet was if you go to embedded demos stm32q ide keyword spotting the nuclio l476 keyword spotting the walkthrough continues for this specific microcontroller and this gives you the exact steps you need to perform to get the stuff to import i hope that helps because i can't go back and like show you this step again so we need to import uh where was i the model parameters directory um from the model we downloaded because we're replacing the model that was in there uh that is the speech recognition o2 so the model parameters select that folder it should look something like this we gotta we gotta select them get a create top level directory yep that looks good finish okay so that should have dropped oh i messed that one up so i put it in here that's where i messed up so delete that the idea is putting my cpu uh i don't know sometimes it can do that if it's the first time you're doing if your first time you're running the ide it might hose your machine for a while we do not record data using the microphone how will the model work um it's all because of how we did it with the m because the mfcc's um the mfcc's are what the key here is it allows us to record with different microphones create a model and then it listens from whatever microphone we want so let's try this again file import my love of eclipse is showing uh download model parameters select folder uh here we go so into folder it needs to go into ei keyword spotting so if you've selected this it should go in here otherwise you can click browse make sure it goes into eli keyword spotting it's very important that um our model exists in this directory because of how i have the includes set up so there we go model parameters is now inside of there we need to do the same thing with the 
tf lite model directory so we actually want to if you select this it helps i'll try it from here file import file system so let's do the tflight model select those so you see it tries to put in the base project directory so i'm gonna go browse here's my project ei keyword spotting select create top level directory selected and finish okay so the three directories so we've now started with my my basically my template demo project we've replaced the model with the one that we trained from edge impulse all right project build configurations set active releases oh here we go project build configurations set active it wants to default to debug we actually need to go to release the reason for that is because there is a compiler flag that gets set if we use debug that makes everything run really slowly um with tensorflow lite so we want to make sure we're doing release here um if you want to debug stuff go for it i don't recommend using this program to debug tensorflow lite um because you'll overflow your audio buffer it's like it's very very narrow margins when it's trying to fill up an audio buffer and then run um your inference uh okay so now we want to new configuration and then build the project project build project and then this is going to take a little bit because not only does this have to compile what we have going on this has to compile um the model and all the uh edge impulse stuff as well while that's going on i can show you a little bit in maine so unfortunately if you've never done stm32 cube stuff uh with their hal and in this ide this is going to be a little confusing like i said next time i do this workshop i'm gonna do an arduino because i think most people um taking this workshop have had some hope hopefully experience with arduino it would be a lot easier um lesson learned i'm i would i would do this in arduino in the future um that being said i like stm32 i i can do a lot of stuff with it um on a low low level even though i'm using hal which is um like somewhere between like reading and writing registers in your arduino framework so what's the the big stuff that we need to care about is here's main here's your entry point and there's stuff here that sets up hal is stm32 stuff that's setting up all of your peripherals your clocks your timers uh even if you re yes that is yes kai you're correct if even if you use hal you still need to read the reference manual that is so very true um it just does some stuff that makes it a little easier but then eventually you get to the point where you're like wait hal doesn't do this exact thing i want then you have to go like to a lower level library or doing manual register reads and writes from there um ei printf um they have us implement ei printf if you scroll down you'll find that function i have ctrl f so here it is all it's really doing is calling this which just uses hal transmit so that's coming out over the uh second uart port um that allows some of the edge impulse stuff to uh spit out over uart that helps us debug that's all that's going on so back in loop sorry yeah loop right we're in arduino i'm already thinking arduino um that'll print out some stuff it's uh then edge impulse sets up all of our buffers and sets up the tensorflow lite stuff and then in here this is where the juicy bits happen um it this waits for this waits for the audio buffer that records exactly 250 milliseconds of audio data to fill up before returning control if you let that overflow it kind of breaks this whole process so if you put a bunch of code 
here that then prevents it from um that overflows that buffer because if you waited too long at 250 milliseconds um you'll break the whole process so that's why i have to be really careful here um it will do classification the magical thing is we take that raw signal that raw audio data and then we send it we call run classifier continuous with it and that performs the mfcc extraction from that and then does uh inference with the neural network that we trained uh we then print the output oh i thought i updated this from last night i thought i pushed this oh well this will still work i'm gonna have to go check that i thought i pushed it so it when you're putting your code here it should actually be outside of this print statement i'll make sure to update the github repo i thought i updated that last night anyway it should still work so we need to go to run run configurations run run configurations select application okay run run configurations you should see something like this and why are you not giving me a new configuration press the new configuration button oh that is awesome do we have an elf file releases yes we have an elf file why is this not working for me right now okay yeah if you if this is the first time you're using l476 um the only thing that helped was connected arduino program blinky yeah there's some weird stuff that happens with that debugger connection so somebody's getting stuff yay awesome um and funny enough this this debug connection thing is is not working for me right now i wonder if i open these projects will give me something this is this is more eclipse not doing what i want oh ah i need to click on oh there we go i need to click on the c plus plus application in order to create a new configuration of course why didn't i think of that so we create a new configuration here it should point to the elf file we just compiled which is what we're going to send to the board um make sure it's release that should be our configuration otherwise debugger doesn't really work and that should be it so we're going to click run this should it should be it should have been built uh you shouldn't see compile errors with this you'll see a bunch of warnings um if you can post the compile errors in the chat what you're seeing that might help we can probably help with some of this um hopefully at this point you didn't modify any code [Laughter] somebody got the audio buffer overrun so that can happen if you didn't if you're using a debug configuration instead of release um you'll probably see that if you've tried to train um if you try to classify more than two models or sorry two two keywords you'll probably see that um so try maybe going back and training just one um yeah if you're running into problems with this before trying to build um if you go so go back to this uh my github repo here if you go into embedded demos stm 32q ide the nucleo l476 this gives you the exact steps you need to perform yes yeah make sure you're not doing a debug build that will definitely it'll run but it will definitely overflow your uh buffer so once you've gotten this where it says debugger connection lost shutting down remember we did release here we're not doing debug we're specifically saying do not do uh debugging for us so it's okay that it does this you're not we're not doing blinky we're not going to press run here or anything like this um so you cannot do more than two my experience um this is what i i think i mentioned earlier that when you were picking keywords try three i don't i can't 
promise they're gonna work um because the model may become too big and it might take too long to process that you start over over running your buffer um this is one of those cases like oh you want to do three well you need a bigger processor so that's why i stick to one or two during this um so open up a serial connection because windows i need to go to device manager com ports i'm on com7 serial connect with your serial of choice use of 115 200 and you should see all of that being being spit out um this is your probability remember i mentioned earlier you have the probabilities of each class um noises noise your background noise unknown as it doesn't know what you're saying and then the two classes that you pick so i'm gonna try saying house and that should go up you saw that briefly flash to like 98 0. so you'll see that yes flashed here and that's because of the demo code that is because of the demo code that i totally thought i checked into github last night and did not check in still works um but what's going on here is if it sees classification label number three or index three it's going to print this which corresponds to whatever your thing was and i'll show you how to find those so if you go to model parameters model metadata here is where your here is where your uh classes are listed in this order so noises zero one two and three so if i wanna print zero i need to do three and actually in fact i recommend pulling out this and going above here um because this only this where it prints out the results only happens once every four classifications every second we want it to do it faster so what did i say three was i forget zero so i'm actually gonna change this to zero um this is your onboard led i'm gonna pull this out and drop this right here so let's have it flash the onboard led for house that's zero one two so i'm gonna change my index to two so anytime that value the output probability is above point five which is my threshold and i can i can adjust that right um i can make it eighty percent to be a little higher and then so every time it hears house and so every time it hears zero i'm gonna build project so every time it hits zero then it should print zero to the console every time it hears house it should then flash the onboard led so everybody can give that a shot um remember uh indexes zero and one correspond to noise and unknown so you really probably want to play with indexes two and three um and i'm going to project run run that should upload it to my board serial body weight is one one five two hundred um i don't know if it matters i think there's some autobot thing going on there uh but one one five two hundred should work um it's using that embed it's got that embed um it's got that embed programmer chip on the nuclio board um which i think auto bonds so but 115 200 works and i've absolutely lost my clock and i've got eight minutes left whoo so this is the end i'm gonna bring up this last side while oh here we go here we go i can give this demo all right here we go so we've got it going here i'm gonna say something house you can you probably can't see it but here we go house see it's lighting up the little lvd right here whenever i say it i'll make my screen bigger for that zero zero there it goes zero zero but doesn't like that one zero ah there we go so you can see those flying by i'm gonna stop sharing for a second here so that everyone can hopefully see me and i can bring this up and house zero which won't do anything house yeah there we go so we got to blinky in 
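The decision logic being described is simple enough to show in a few lines of Python: the classifier hands back one probability per class (in the order listed in the model metadata), and the demo just checks whether the chosen keyword's probability clears a threshold. The label order below is an assumption — check the model_parameters / model metadata file for your own project — and the handler is hypothetical.

```python
LABELS = ["noise", "unknown", "house", "zero"]   # assumed order; confirm against your model metadata
KEYWORD_INDEX = 2                                # index of the word we want to act on ("house" here)
THRESHOLD = 0.5                                  # raise toward ~0.8 for fewer false positives

def handle_result(probabilities):
    best = max(range(len(LABELS)), key=lambda i: probabilities[i])
    print(f"heard: {LABELS[best]} ({probabilities[best]:.2f})")
    if probabilities[KEYWORD_INDEX] > THRESHOLD:
        print("keyword detected -> this is where the demo toggles the onboard LED")

handle_result([0.02, 0.05, 0.91, 0.02])          # example frame where "house" wins
```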
the most roundabout hardest most obnoxious way possible let's share my screen again and i'm going to come back to this final slide where we have seven minutes you're triggering mine with zero fairly reliably awesome uh so people were asking about this sliding window i skipped a bunch of these slides because i thought we were like really limited on time but yes that's essentially what's going on is the sliding window and we're comparing the probabilities here so this was just what was going on with that sliding window and i can burn through this real quickly right and then once it sees that keyword it compares that probability and we do something with that so we've already run it here is the end slide so i have like two minutes to go through this feel free to modify that code and play around with it if you're comfortable with stm32 stuff i gave you a couple of examples for like how do you blink the onboard led um this was this is what it should have looked like um i once again i thought i committed and this is why i should always check my repo to make sure i did commit those changes um some resources to play around with here's the data sheet and the hal api if you want to take a look at those and um yeah there's the there's the hangout bit i think that is on my project link as well as twitter handle linkedin if you want to chat with me um i do try to uh answer tweets so that is all i have i'll leave this up for a bit are there any questions do you recommend any more powerful boards for edge inference um honestly i it depends on your application because that's always the answer right um for audio i wouldn't do anything less than like these uh arm cortex m4s um which i think is what we've had what we've got these are m4s or m4 pluses i don't remember but m4 i wouldn't go anything less than like an m4 um when you start getting into uh like more hardcore audio processing video or video processing or image processing i can move up to like the uh what are they the m7s i'd go a little more powerful so these are running at like 80-ish megahertz if i remember something that like 80-100 megahertz so that's kind of where you start with audio you can do less for that so i've gotten stuff you can do like some really basic anomaly detection in things like um like m0s like really really basic for like like accelerometer like you don't read a ton you do like basic fourier transform if you're not need to be really fast you can get it done with like an m0 i think you did push it i pulled it from github and got it that way maybe i opened the wrong one when i downloaded it so okay i'm glad that worked for people okay good i did and so that must have been i pulled the wrong one for my download and didn't clear my downloads last night uh with the teens what would the tc4 be able to do i forget the specs on that but if i remember that thing is a beast it was like 400 megahertz or something um you probably start doing video stuff on that very very basic kind of like the uh check out the openmv um uh tensorflow lite they've got tensorflow lite running on the openmv with like um micro python will you add arduino side on the github yes i'm that is that is like the next plan once i get um it's gonna be the arduino uh sense the ble sense but honestly if you if you like download the like arduino package for the ble sense from like everything we did up until that point where we clicked the c plus if you just click the arduino one and then go read how you import that into your um arduino that will totally work um for 
everything we just did um go read how it works for edge impulse to import a uh arduino library so it everything we did applies to arduino but yeah i should hopefully include that i managed to get tf light running on the oh that's awesome yeah the tt i haven't played with it i think i've got a team c4 but that thing is blazingly fast what if i don't have stm32 this should still work like this supports most arm the c plus plus libraries is supported on most arm processors it's just a question of like how well do you know the the whole tool chain in the build process like i happen to know stm32 decently well which is why i chose it that or arduino uh run configuration was debug yeah sometimes that messes up which is why i always check that so yeah make sure it's release yay people are getting it to work 600 megahertz for the teensy that's awesome can we use it uh yes once again the c c c plus plus library should compile it doesn't so it's not it's that um i can't remember the name of the processor inside of the uh the esp32 um but it's not an arm um the cc plus plus library should work um however note that when you download it from edge impulse it enables some of the arm specific cmsis stuff for doing dsp the extensive thank you 10 silica that's the one i couldn't remember that name um it's enabling the uh arm cm system for dsp to allow those to go faster and it's using the built-in hardware for dsp functions so if you as long as those are disabled that cc plus plus library you download from engine pulse should run on any it should compile for any microcontroller it's just a question of is it going to be fast enough without those hardware accelerators uh briefest explanation of how to retrain the model for better accuracy so um in edge impulse all you need to do go back to your assuming you've assuming you have the data you don't want to add more data um which you want to go back to uh assuming you have the data you don't need to add more data um and you've got all the features extraction just go back to your neural network classifier and just start playing with these parameters um so once again how do you get better accuracy uh you just play with stuff you tweak stuff you read a bunch of documents and say like oh i kind of have an understanding that maybe i need less neurons here but more layers sometimes that works better so really it's just starting to play with some of these maybe try a two-dimensional convolutional neural network and by the way before i forget if you're familiar with keras you can click this and you can go to the chaos mode so you can look at how they're building this model programmatically um they i don't believe they give you ways to oh you can use you can select the like uh learning rate loss functions and all of that here okay so yeah you can if you know keras you can adjust all of this programmatically here um but yeah feel free to and then once you've like changed this just click start training again it'll just train the model and give you an output so just you know write down what your accuracy was and then um keep in mind if you start using the model testing or the model testing your test data to update your um to like look at and be like oh and overfit and come back you start introducing bias into your data so there's there's like papers out there like how do you um prevent that but it's that's a good way to start like go to your model testing use the unseen data make sure it's working decently well come back and do your uh update your parameters just know that 
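One of the tweaks suggested above — trying a two-dimensional CNN instead of the one-dimensional default — looks like this in the Keras "expert" view. Treat it as a sketch: the layer sizes, dropout rate, and optimizer choice are guesses to experiment with, not a recommended configuration.

```python
from tensorflow.keras import layers, models

# Treat the MFCC window as a true 2-D image (frames x coefficients x 1 channel)
model_2d = models.Sequential([
    layers.Input(shape=(49, 13, 1)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.25),
    layers.Dense(4, activation="softmax"),
])
model_2d.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model_2d.summary()   # retrain, note the test accuracy, and compare against the 1-D version
```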
you're introducing bias into your model every time you're doing that uh reduced training cycles from 100 to 30 decreased from 50. sometimes it made it better for me and sometimes it makes it worse like i wish i knew exactly what was going on in the neural networks for that awesome yeah go have fun play around with it i hope this has given you everybody um a start for this kind of stuff it's a lot to take in if you really want to get down to the nitty-gritty and do some math um andrew ng's course highly recommend on coursera that's what i started with and then there's like a bunch of books out there that you can start going through um and then the the i would make sure you do that i would learn keras um once you're familiar with that and then once you're kind of comfortable training your own networks and like you know building them or at least pulling them off the internet and tweaking them then i would jump into like um uh pete warden and daniel situnayake's tinyml book because that jumps into how to use um uh the tensorflow lite library directly on a microcontroller and edge impulse basically takes that and wraps it up into a library for us that like works off the shelf which is why i chose that for this where we only had two hours uh well uh sorry i don't mean to interrupt but sean uh thank you for sharing with us uh this really enlightening and helpful presentation uh it was very technical but uh you presented it in a pretty uncomplicated way so uh and it was also very entertaining and uh thank you for making it a very down to earth and personable talk so i'd like to invite everyone to continue with the conversation for comments and questions at your hackaday.io page you can go use the public chat for uh more talks uh and um i posted the link in the conversation or in the chat sidebar uh and with that thank all of you for being a part of this amazing community and uh joining us for remoticon so i look forward to seeing uh more of you at other talks and uh yeah so appreciate it thank you again sean thank you thanks everybody all right
Info
Channel: HACKADAY
Views: 6,565
Id: IRa_SH-3MSI
Length: 120min 10sec (7210 seconds)
Published: Sat Nov 07 2020