Building a Local Smart Home Voice Assistant With ESPHome!

Captions
Home Assistant just dropped a big announcement last week as part of Year of the Voice, during their live stream for Chapter 4: a feature that has been the most requested since voice was announced earlier this year, and that is, of course, wake words. So today I want to show you how to get Home Assistant set up with voice, including using wake words, and then I'll show you how to build your own voice assistant using a microphone, a speaker, an ESP and, of course, some ESPHome magic.

Earlier this year, Home Assistant embarked on what they were calling Year of the Voice, where they focused a lot of their efforts on adding a voice assistant to Home Assistant, much in the way you can interact with Amazon or Google Home using your voice, except this time there was a focus on it working locally and also a heavy focus on privacy. We've seen the voice features develop over the course of this year: first we were able to input text to control a device in Home Assistant, then we were able to use things like analog phones to talk to Assist, then we got push-to-talk on ESPHome devices, and now this chapter allows us to use wake words. Using wake words like "OK G", or saying the name of Amazon's voice assistant, to trigger it is of course a really important part of using your voice, because it's not really practical to go over and press a button to talk to your house. That is why the wake word feature is such a big deal.

The first thing we need to do is set up the voice pipeline in Home Assistant, which is going to allow our microphone to stream data into our Home Assistant server, where it's going to be processed. To do that, head over to Settings and then Voice assistants, where we find the main page for configuring any of the voice assistant settings. In the Assist section you'll see that there is already a default pipeline set up, called Home Assistant. At the core of Home Assistant's voice setup is what's called a pipeline, and a pipeline is basically a flow of 
different things that happen as you interact with Home Assistant using your voice, and as it interacts back with you. These pipelines are cool because, unlike other voice assistants where everything is pre-canned and set up for you, you can change each stage of the process to configure it the way you like. And because you can create as many pipelines as you like, you can assign different pipelines to different devices. Now, don't worry if that all sounds a little bit confusing; it should hopefully become clearer as we go through it.

Click on the default pipeline, called Home Assistant, and you will see that there are four sections: the conversation agent, speech-to-text, text-to-speech and the new wake word section. If you try to click on the dropdown for any of these, they aren't really going to have any options available, because we haven't installed some of the additional components that we need, so we're going to do that now. Head back to Settings, then Add-ons, and head into the add-on store. There are three add-ons that we need to install.

Let's start with Piper, which handles the text-to-speech component, and hit Install. Once installed, you can go over to the Configuration tab and change the voice model, if you like, to one more suitable for your language. Then head back, make sure that Start on boot and Watchdog are enabled, and hit Start.

Then go back to the add-on store and this time install the Whisper add-on, which is going to be responsible for taking the speech from our microphone and translating that into text. Hit Install and make sure Watchdog and Start on boot are enabled. Then, in the Configuration tab, you might want to change the model, which determines the speed and accuracy of the translation. Depending on your hardware, you will need to select the appropriate one for you: if you're using a Raspberry Pi 4 or a similar level of hardware, you're going to want to stick with the tiny-int8 model; if you have a bit more 
powerful hardware, you could try the base or even the small models, and you can play around with these to see which one has the best performance and speed for you. Once the model is set, make sure to start the add-on and then head back to the add-on store.

Finally, we're going to install the new openWakeWord add-on which, as the name suggests, is going to identify wake words coming from our microphone. Select the options again, and then under Configuration we aren't going to change anything just yet, but do be aware of the threshold and trigger level options here for fine-tuning later to improve wake word detection. Then start the add-on, and next head over to Settings and then Devices & services so that we can configure those add-ons.

You might be wondering why we are installing an add-on for wake word on Home Assistant, when surely the wake word should be done on the microphone, right? Does that mean that only our Home Assistant server can do wake word and any other device can't? Well, the way wake word works (tongue twister) in Home Assistant with this new update, for now, is that any compatible device is going to stream its microphone to your server, and the Home Assistant server itself is going to process that audio and listen for the wake word. So, for example, if you have a microphone in your kitchen and one in your living room, both of those devices are going to stream their microphone to your server, where it's going to look at and process that audio for the wake word.

Now, you might be wondering: isn't that inefficient, and isn't that going to use a lot of bandwidth, having microphones streaming all the time? Bandwidth-wise, no, not really. They talked on the stream about how it uses about 32 kilobytes per second per device, which is very minimal. And of course their goal is to have wake word done on the end device; that's just a really challenging thing to do, as it turns out. So while they work on making that happen, this provides a nice stopgap.

You should immediately see that all three 
services are now showing up with the Wyoming integration, which is basically just a protocol that the services use to communicate with Home Assistant, similar to MQTT. Hit Configure on all three of the services to finish setting them up, and we can now configure our voice pipeline.

Back in the settings page for voice assistants, select the Home Assistant pipeline once again and set the speech-to-text dropdown to Whisper, selecting the language that you require, and in the text-to-speech dropdown select Piper, again selecting your preference. Finally, in the new wake word section, we're going to select openWakeWord from the dropdown, and then you will get to select which wake word you want to use to wake up your speaker, of which there are currently five with openWakeWord, all of which are in English for the time being.

Now, two things here. Firstly, it is possible to add more wake words to this list if you want to use the Porcupine wake word engine instead of openWakeWord, but as I understand it, it isn't completely open source, so I'm going to stick to openWakeWord for this guide. I'll leave a link down in the description, and that is definitely an option for you. Secondly, something I'm sure you are anxious to know: does this support custom wake words? The answer is yes, it does. You can train your own wake word models, which is really cool. It does require a bit more work, but there are some instructions to step you through everything if you want to go down that route. They did say that custom wake words won't be quite as good as the pre-trained models, just due to the amount of training data available to them, but they did mention opening a place on the forums to share wake words with each other, so that will be really cool, kind of like how blueprints work.

That is our pipeline now set up. Let's give it a test and see if it actually works. First, head to your dashboard and click the Assist button, and then we are simply going to type a command in 
to test if the conversation agent is working properly. If it is, you should get a response and your action should happen. Next we want to test the speech-to-text portion, so we're going to hit the microphone button and speak the command instead, and again it should give you a response. Note that the first time you speak a command it might take a little bit longer; as the system starts to cache responses it should get a little bit faster.

With both of these working, the last thing we need is a device to actually test our wake word on. Of course, many of the voice demos that have been shown over the course of the year have been using the M5Stack Echo, which is this really cool little device with a speaker and a microphone. But since the Home Assistant guide does a good job of showing you how to use that, if you want to use the M5Stack Echo, I thought instead I would show you how to make your own one with an ESP32.

The two essential components you will need for this are an ESP32 and an I2S microphone. For my ESP32 I am using a development board. This is one that we made that I like; you can find it on our shop if you want to use this one, or you can use pretty much any ESP32 development board you have lying around. You can also turn existing devices that are ESP32-based into a microphone too, like the EP1 or the EP Lite, or any other device that uses an ESP32 and has some spare GPIO.

For microphone choices, I am using the ICS-43434 microphone breakout board, which we also made. It is such a good quality little microphone and is the one that I would recommend if you can find someone selling it. The older version of this, which is more readily available on places like AliExpress or Amazon, is the INMP441, which you can also find on a breakout board. By the way, I'll have links to everything I used down below if you want to pick any of it up and follow along for yourself.

The optional things you will need, depending on whether you want a speaker to give you feedback or not, are an amp and a 
speaker. I hooked up a microphone only to the EP1 for voice commands and it doesn't use a speaker, but for some places it might be nice to have one so you can get feedback that your commands are working. For an amp, you will want something that is I2S once again. The MAX98357 is a popular breakout board that is really easy to find and can drive 2.5 W speakers if needed. For a speaker, you can use any two-wire speaker so long as it's appropriate for the amp: don't try driving a bazillion-watt speaker off this little 2.5 W amp and, vice versa, if you choose a smaller speaker, just be sure to limit the max volume so you don't blow the speaker out.

Because we're using I2S for this, we actually only need four GPIOs for both the microphone and speaker: the bit clock (or serial clock), the frame sync (or left/right clock), data out for the mic and data in for the speaker. Go ahead and wire up the components just like this. Remember that both speaker and microphone can share the same GPIO for the serial clock and the left/right clock, but data in and data out need to be on separate GPIOs. You can use pretty much any GPIO you want on the ESP32; just remember to edit any code to reflect that.

Finally, we're going to go into ESPHome and create an ESPHome config. If you don't have ESPHome and you haven't ever used it before, I'd recommend watching my beginner video on ESPHome first before doing this bit, as it could be a little bit confusing if you are new to ESPHome. So watch that video first and then come back and finish this one.

Create a new ESPHome config for your device, and then we are going to paste in this config that I've created for you, which you will find linked down below. Some things to keep note of here. Firstly, we need to use ESP-IDF for the new wake word stuff, so make sure that you have the framework set to ESP-IDF instead of the default Arduino. Next, further down, we have our I2S audio block: just make sure to set the GPIO pins here if you changed them, as well as the same in 
the microphone and speaker blocks. Finally, in the voice assistant section, you might want to play around with the noise suppression level, the auto gain and the volume multiplier settings. Those definitely made a big difference for me when it came to microphone performance and hearing what I was saying correctly.

Then hit Install and upload to your ESP32 using a USB cable. Once done, take note of your IP address from the logs and head back to Home Assistant's settings page and then Devices & services. Add a new integration, search for ESPHome and enter the IP address, or if it's auto-discovered you can add it that way.

Once added, head into the settings page for the ESP32 microphone and you will see that you have a couple of options for tweaking Assist. Firstly, you can manually set the pipeline on a per-device level. This is really handy, for example, if you want to have a different language for different devices, or even different wake words for different devices, since each device can have its own pipeline. Next, you have the stopped-talking detection level, and you also have a binary sensor for Assist which shows whether Assist is running or not, which is really useful for troubleshooting. Finally, there's a toggle switch for disabling wake word, and it could be a really good idea here, as was suggested on the live stream, to disable wake word whenever you are away from the room by using a motion or presence sensor. That will help save on some CPU power as well as a little bit of bandwidth too.

Finally, all that's left to do is say the wake word you selected in the pipeline, and if everything works you should see the Assist sensor in ESPHome change to active, and you can speak your command.

"Okay Nabu, turn on the office desk." "Turned on light." "Okay Nabu, turn off the office desk." "Turned off light."

Nice! We just got wake word running on our own custom ESP32 microphone. All that would be left to do is turn these wires into something a little bit more presentable, like Paul's example 
where he showed off an entire droid model with a rotating head during the presentation, which looks amazing. "Air, turn on living room lights." I think it'll be really cool to see the creative things that people come up with, and I look forward to seeing what you all have to show.

And that's about it for this video. I'm really glad to see some of the more advanced features like wake words coming into Home Assistant as part of Year of the Voice, and really the next big step for them to tackle at some point is to defeat the final boss and have wake word done directly on the device itself with some form of hardware, which I'm hoping we're going to see sometime soon. But this serves as a great solution and a stopgap for right now. Anyway, let me know what you think of the wake word stuff down in the comments. I know it was something that lots of you, including myself, have been waiting on for so long, and now you can finally do it, and we are one step closer to that sweet, sweet local and private voice assistant. Other than that, drop this video a like and get subscribed, and I will see you in the next video.
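The figure quoted in the video, about 32 kilobytes per second per streaming device, is what you would expect from raw, uncompressed audio at 16 kHz, 16-bit, mono. That audio format is an assumption here; the video only gives the resulting number. A minimal Python sketch of the arithmetic, with a hypothetical helper name:

```python
def stream_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Assumed audio format (not stated in the video): 16 kHz, 16-bit, mono.
per_device = stream_bytes_per_second(16_000, 16, 1)
print(per_device)  # 32000 bytes/s, i.e. about 32 kilobytes per second

# Two always-listening microphones, e.g. kitchen and living room.
total = 2 * per_device
print(total * 8 / 1_000_000)  # 0.512 megabits per second
```

Under that assumption, even a handful of microphones streaming all the time adds up to a small share of a typical home network's capacity.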
Info
Channel: Everything Smart Home
Views: 134,558
Keywords: home assistant, home automation, smart home, home assistant assist, home assistant voice, home assistant local voice, local voice assistant, home assistant wake word, home assistant voice control, local voice control, smart home voice assistant, voice assistant, wake word, esphome voice assistant, esphome voice, esphome voice control, home assistant setup
Id: zhlIaBG3Ldo
Length: 15min 35sec (935 seconds)
Published: Sat Oct 21 2023