Home Assistant ❤️ Voice - Tutorial 04 - Piper TTS

Video Statistics and Information

Captions
Good day and welcome to the next episode of my video series "Home Assistant Loves Voice". In the last episode we did some basic setup, and I showed you how to use Home Assistant's Assist in textual form. Now it is time to start adding voice processing components to Home Assistant, and in this video we will start with the TTS, or text-to-speech, speech synthesis component. By the end of this video we'll have an overview of which TTS services are available for Home Assistant and how to set up a locally running Piper TTS in Home Assistant, and we will integrate the Wyoming protocol, which ties all of the voice processing components together.

If you don't want to miss any upcoming videos in this series, please make sure to subscribe to this channel, hit the notification bell, give this video a thumbs up if you like it, and let me know in the comment box below what you think.

As this video series should be easy to follow and beginner friendly, I would like to start with an overview of how voice assistants work in general. The starting point is the detection of the wake word, also called hot word or activation word. Probably most of you know Amazon's famous wake word for their smart speaker product, but lots of open source communities also provide cool wake words ready to use, such as "Okay Nabu", "Hey Mycroft", or, for the Marvel fans, "Hey Jarvis". Most voice assistants provide a blinking LED ring light indicating that the wake word has been detected and that the next step, speech recognition, has been activated. Speech recognition, or speech-to-text, STT for short, transcribes the user's spoken request into its textual form, because business logic is mostly done in text form only. Once the action has been taken, such as switching a lamp on or off, there is the last step called TTS, or text-to-speech: the speech synthesis, which creates humanlike voice output for the user. There are more voice processing components when it comes to a multi-room setup, called a satellite setup, which brings a microphone and a speaker into multiple rooms, but this will be the topic of another episode of this video series.

To make it even more relatable, think of yourself, or me, as a voice assistant at a party. Music is playing, a bunch of people are around you, everyone is talking to everybody, and without any wake word it's probably hard to figure out a request. Imagine someone calls my personal wake word, so my name, and this will now activate my speech recognition process. I will try to figure out the spoken request and do STT, so not speech-to-text obviously, but speech-to-thoughts. Maybe the request was to ask for the current time. Now I will do the business logic: take a look at my smartphone or my watch and figure out the current time. Then comes the TTS process, so thoughts-to-speech, or text-to-speech in a voice assistant, which provides a spoken response. This should be all for the introduction, so let's go.

Let's start by taking a look at the available integrations for speech synthesis in Home Assistant. I'll put the link in the description box. Here you can see lots of services, mostly cloud or online based services, but there are some local services such as MaryTTS, Pico TTS, and Piper TTS, which we will install in this tutorial. Feel free to choose the TTS solution that fits your requirements.

Now let's go to our Home Assistant installation. You can see our default demo entities on the default dashboard. First, let's take a look at the integrations: hit Settings and go to Devices & Services. If you scroll down to the bottom of this page, you can see all installed and configured integrations, and so far there is no Wyoming protocol listed, which is required to bring all the voice processing components in Home Assistant together. Let's start by installing the TTS add-on. Go back to Settings in the navigation and go to Add-ons, click Add-on Store, and search for Piper, which is the official text-to-speech solution coming from the Rhasspy project. Select it and click Install. While this is installing, please check out the links provided here, which will bring you to the GitHub repository of Piper with lots of useful information, links to audio samples, and a list of available language models.

Once Piper is installed, open the Configuration tab. You can choose from multiple voices, starting with the language code, so en_US for US English voices, and there are different quality levels: low, medium, or high. This depends a little on the compute hardware you have available. Lower quality models require less compute performance, and high quality models put a higher load on the hardware, so depending on your use case you can choose between these quality levels. Let's go with the high model, stick with the default values for the rest, and hit Save.

Now look at the Documentation tab and scroll down a bit to the chapter called "Custom voices". You can not only choose between the various languages and TTS models that are delivered out of the box with Home Assistant's Piper installation, but you can also use your own voice inside Home Assistant. This is done by training your own artificial text-to-speech voice model. I've made a video tutorial on how to do this, which I will link in the description box below, and in this "Custom voices" documentation chapter you can see how to integrate your trained artificial voice into Home Assistant.

Now that the add-on is started, and before we use Piper TTS in a voice pipeline inside Home Assistant, we can do a first test of whether Piper TTS is working in general. Go to Media in the navigation sidebar and choose Text-to-speech. Here you can see the ready-to-use TTS solutions in Home Assistant, which are Cloud, Demo, and Google, as a cloud service obviously. But where is Piper TTS? It is not listed there, and this is because we do not have the Wyoming protocol integration installed yet, which we will do as the next step.

Go back to Settings, hit Devices & Services, hit Add Integration, and search for Wyoming. You can see the Wyoming protocol as the plain basis, and you can see Piper for TTS or Whisper for speech recognition. Let's use the first entry, Wyoming Protocol, and you can directly see that Piper has been discovered by Wyoming. Choose Piper, submit, and there it is: "Success! Created configuration for Piper". When we hit Finish, go back to the Media entry in the navigation, and choose Text-to-speech again, we can now see that Piper has been discovered using Wyoming and is ready to use for a first test. Let's choose Piper, choose our voice, and hit Say. This might take a little while the first time, because it has to download the model files.

Piper: "Hello Thorsten-Voice, you can play any text on any supported media player."

As you can see, or hear, speech synthesis using Piper TTS locally worked.

Piper: "Hello Thorsten-Voice, you can play any text on any supported media player."

As a small preview, let's see if we can choose Piper TTS when we configure a voice assistant, or voice processing pipeline, in Home Assistant. For this sneak preview, go back to Settings and scroll down to Voice assistants. We will take a closer look at this voice assistant configuration in detail once all the voice processing components have been installed and configured, but for this first preview let's open our Home Assistant default voice assistant. Go to Text-to-speech, and you can choose the method: None, so obviously no speech synthesis, the Google cloud-based text-to-speech service, Home Assistant Cloud, Demo, and Piper. This list is identical to the options shown in the Media section of the navigation. So we can now choose Piper TTS as our locally running voice synthesis in Home Assistant.

This should be all for this episode. Please make sure to subscribe to the channel and hit the notification bell so you don't miss upcoming updates, give this video a thumbs up if you liked it, and let me know in the comment box what you think. Thanks a lot for today, and if you like, we might see each other next time. Bye!
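
The four stages from the overview at the start of this episode (wake word detection, speech-to-text, business logic, text-to-speech) can be sketched as a small Python pipeline. This is only an illustration under loud assumptions: the class `VoicePipeline` and all `toy_*` functions are hypothetical names, "audio" is simply UTF-8 text in bytes, and nothing here calls Home Assistant, Piper, or the Wyoming protocol.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Hypothetical container that chains the four stages in order."""
    detect_wake_word: Callable[[bytes], bool]
    speech_to_text: Callable[[bytes], str]
    handle_intent: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def process(self, audio: bytes) -> Optional[bytes]:
        # Without a wake word the assistant stays silent.
        if not self.detect_wake_word(audio):
            return None
        request = self.speech_to_text(audio)    # STT: audio -> text
        response = self.handle_intent(request)  # business logic on text
        return self.text_to_speech(response)    # TTS: text -> audio


# Toy stand-ins (assumptions): real stages would work on audio samples.
WAKE_WORD = "hey jarvis"


def toy_wake_word(audio: bytes) -> bool:
    return audio.decode("utf-8").lower().startswith(WAKE_WORD)


def toy_stt(audio: bytes) -> str:
    # Drop the wake word and return the remaining request as text.
    text = audio.decode("utf-8")
    return text[len(WAKE_WORD):].strip(" ,")


def toy_intent(request: str) -> str:
    # Fixed answers keep the sketch deterministic.
    if "time" in request.lower():
        return "It is twelve o'clock."
    return "Sorry, I did not understand that."


def toy_tts(text: str) -> bytes:
    # A real TTS engine such as Piper would return synthesized audio here.
    return text.encode("utf-8")


pipeline = VoicePipeline(toy_wake_word, toy_stt, toy_intent, toy_tts)

if __name__ == "__main__":
    print(pipeline.process(b"Hey Jarvis, what time is it?"))
    print(pipeline.process(b"what time is it"))
```

Passing each stage in as a function mirrors the idea that the components are interchangeable: replacing `toy_tts` with a different speech synthesis backend would leave the rest of the pipeline unchanged.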
Info
Channel: Thorsten-Voice
Views: 1,249
Keywords: TTS, Text to Speech, Tutorial, ML, Machine Learning, AI, Smart Home, KI, Voice Assistant, Python, Tech, Open Source, Künstliche Stimme, Sprachassistent, Home Assistant, STT, Raspberry, piper tts python, home assistant voice control, home assistant piper tts, home assistant tts, home assistant text to speech deutsch
Id: jyFmQ2dhoSg
Length: 10min 12sec (612 seconds)
Published: Wed Mar 06 2024