Build A Python Speech Assistant App

Captions
[Music] Hey guys, this video is sponsored by Kite, which is a Python plugin for editors and IDEs that gives you intelligent snippets and an integrated documentation tool called Copilot that will let you know all about specific classes and methods and more. I'll actually be using the VS Code extension and Copilot in this tutorial. It's free and it's extremely helpful, so click on the link in the description below to find out more and download Kite.

Hey, what's going on guys. In this video we're gonna build a speech assistant application with Python. We're gonna use the SpeechRecognition library, and we're gonna use Google Text-to-Speech so that it can actually talk back to us, along with some other packages as well. Basically we're going to be able to give voice commands to do certain things: we'll be able to ask their name (I'm just going to call it Alexis), ask to search for something on Google, ask to find something on a map, and so on, and have it talk back to us.

All right, so let's get started. Here I have VS Code open with just an empty folder called Alexis speech assistant, and down here I have my terminal. The first thing I'm going to do is set up a virtual environment. I usually use pipenv, but it's been giving me some issues, so I'm just gonna use venv. So if I say python3 -m venv and then create a folder, I'll call it venv as well, you can see that that folder got created. In this bin folder there's an activate script, so we want to call that to activate our virtual environment: we'll just do source venv/bin/activate. Okay, so now that should be activated. We need to select an interpreter in VS Code, so if we do Command+Shift+P and search for "Python: Select Interpreter", see this right here, this venv? I'm going to choose that, and now we should be all set. If you want to use pipenv, or you don't want to use virtual
environments or whatever, that's fine too.

So now that we're set up, let's install a couple of dependencies that we're going to need. We're gonna use pip install, and it's gonna be SpeechRecognition, all one word. This is a library that we'll be using, and I have the documentation here: it's a library for performing speech recognition with support for several engines and APIs. The API we'll be using is the Google Speech Recognition API. There are some others as well, like Google Cloud Speech, Wit.ai, Microsoft Bing, and Sphinx, which works offline, but we'll be using the Google one. Then there's a dependency that we need called PyAudio. You only need this if you're actually using the microphone, which we are; you can also use audio files with this library, but since we're using the microphone we need to pip install PyAudio. There's some other stuff we're going to need later on, but that's it for now.

So let's just create a file, I'll call it main.py, and then we'll open up main.py. The first thing I'm going to do here is import the speech recognition library, so it's going to be import speech_recognition, and we're gonna say as sr, so whenever we use this we can just use sr. We want to initialize what's called a recognizer, so it's gonna be sr.Recognizer(), the Recognizer class. That's basically the bread and butter of this library; that's what's responsible for actually recognizing speech. Now we want to use the microphone. Like I said, you can use audio files, but we're going to use the microphone, so we're gonna say with sr.Microphone() as source. So the source is going to be our microphone. We're gonna first prompt the user, so we'll just say print "say something". To begin with, our first goal is just to get what we say into the microphone and print it in the console, and then
later on we'll implement Google Text-to-Speech so that Alexis can actually talk back to us, or whatever you want to call it; that's just what I'm calling it. Okay, so let's create a variable called audio, and we're going to set this to our recognizer object's listen method, and we want to pass in our source, which is our microphone. Then we want to create a variable for our voice data, so whatever we say, we want to put that into a variable. We can do that using any of these engines, but we're going to use the recognizer's recognize_google and pass in the audio variable. That should capture the voice data. Now let's just print our voice data. We'll save that. This is just asking if I want to install an auto formatter; we'll just say yes. Okay, now let's run this file. When I run it, it's gonna automatically listen, I'll say something, and it should print it. So let's run python main.py. "I love programming." And there we go, you can see that it printed down here in the console. All right, so that's kind of step one, just to get what we're saying into this variable.

Now I'm going to show you a couple of things with the Kite extension that I'm using, which comes with a really cool integrated documentation program called Copilot. If we want to look at Recognizer, notice how when I hover over it we get this Docs link. If I click on that, it'll open up Kite's Copilot and show us all the methods and stuff like that. You can see the listen method right here. It gives us all the possible arguments and tells us what it is: it records a single phrase from a source, an audio source, into an AudioData instance and returns it. It tells us how it's done and all that stuff. Then let's see what else we have here. We just looked at listen, so Microphone: if we want to read more about this we can hit Docs. This is a really handy extension. So it creates a new Microphone instance, which
represents a physical microphone on the computer, and it gives you all the different members. I just want to show you recognize_google right here. If we take a look at that in Copilot, this performs speech recognition on audio data, but what I want to show you is the exceptions that can be thrown. Right here, this UnknownValueError exception is thrown if it doesn't understand what you're saying; if you just make noises or something, that will get thrown, so we need to handle that. It will also raise a RequestError if something's wrong, like the service isn't working. So we want to wrap this in a try block.

All right, so right under audio, let's go ahead and open up a try block, and we'll just tab these two lines over. Then we want to have our exceptions, so we'll say except sr.UnknownValueError (we actually don't need these parentheses), and if that happens we'll just go ahead and print "Sorry, I did not get that". Then we'll have our RequestError exception, so except sr.RequestError, and if that happens, let's print "Sorry, my speech service is down". I'll save that, and it should do the same thing. We'll go ahead and run this again. "I love coding." There we go, so it still works.

Now I want to put this into a function. I don't want it to just be in the global scope here, because we're going to need to store the result in another variable. So let's define a function called record_audio, and then we just want to take all of this and put it in the function by tabbing it over. We're going to initialize voice_data right above the try, so just set our voice_data variable to an empty string, and then down at the bottom we're going to return it, on the same level as the except, so return voice_data. Okay, so now we have a function. Now we'll go down here, and I'm actually going to delete this print "say something", because I'm gonna put that down
here; I don't want it in the actual function. So let's print... actually, we're not gonna say "say something", we'll say "How can I help you?", because basically we're going to be giving voice commands. Then let's create our variable down here, voice_data, and set that to record_audio(), and we'll go ahead and print it out here, so print voice_data. I'm not going to leave this print here, I just want to make sure that this still works. So we'll run it. "What is your name?" There we go. It actually said it twice because I have the print right here, so I'm going to get rid of that, and I'm going to get rid of this print too.

Okay, so now we have this variable that has our voice data, and we want to have Alexis, or whatever you want to call it, respond. So I'm going to create a function called respond, and we're gonna pass in that voice data. We'll create that function up here: def respond, which takes in voice_data. We'll say if "what is your name" is in the voice data, then, since we don't have the actual speak functionality yet, we'll just print out "My name is Alexis" or whatever. I'm surprised my Alexa isn't going off when I say this. All right, so let's try this out. Clear this up. "What is your name?" And there we go, we get "My name is Alexis". Okay, cool.

So let's actually have Alexis tell us the time. Up here let's bring in, we'll say from time import ctime, and let's have another if statement here. We'll say if "what time is it" is in our voice data, then let's print out ctime(). So we'll try that. "What time is it?" Okay, so it actually prints out the date and the time. If you want to format it you can, or if you want to change it to "what is the date" you could do that, but I think that's fine for this.

The next thing that I want to do is be able to search for something. I want to say,
you know, "search for dogs" or whatever, and search Google. So we're actually gonna import the webbrowser package up here, which is a core package; we don't need to install it or anything. And I'm gonna say right here, if "search" in voice_data. Notice I didn't put a variable here or anything. What I want to happen is I say the word "search", then it asks me what I want to search for, and then I say what I want to search for. In order to do this, let's have a variable called search and set it to record_audio(), because we need it to know what we're saying back. Now, to this record_audio I'm gonna pass an optional parameter for it to actually ask a question, so we'll say "What do you want to search for?". Up here in record_audio we're gonna have an optional argument called ask. We want to set that initially to False because it's optional, and let's put this right above audio: we'll just say if ask (you can just do if ask), and then we want to print ask. Then whatever we say back, whatever we want to search for, is going to get put into this variable.

All right, so the next thing we'll do after we put that into the variable is create the URL that we want to use, which is going to be a Google search URL, so https://google.com/search, and we can do ?q=, so this is just the query that we want to search Google for, and we're going to concatenate onto that whatever the search term is. Then we can use webbrowser, so webbrowser.get() and then .open(), and just pass in that URL, and that should open it in the web browser. After that we'll go ahead and print out "Here is what I found for" and concatenate on the search term. Okay, I think that should work. Let's save it and try it out. I'm gonna run the program. "Search." "Dogs." And there we go, it opens up a browser, goes to
Google, the query equals dogs, and shows us dogs.

I want to do a similar thing for location using Google Maps. I'll just copy this because it's pretty similar. We'll call this one "find location", so if we say "find location", then we'll have a variable called location, and let's change the text here. What do I want this to say? Let's just say "What is the location?", so the ask will be "What is the location?". The URL is gonna be different, so let's get rid of this. It's gonna be google.nl/maps/ (what is the URL?) place/, and then we'll concatenate the location, and then we have to concatenate onto that a slash and an ampersand, so we want to use &amp;, ampersand, a-m-p, and then a semicolon. Down here, same thing, we're just going to open the URL, and then let's change this text to say "Here is the location of" and concatenate onto that the location variable. All right, so let's see if this works. "Find location." "Boston, Massachusetts." And there we go, it opens up Boston on a map. Cool.

Right now we run it, we say one thing, and it ends. I want it to continuously listen, and we can do that pretty easily. We're going to go down here and use the time package. We actually have to import it on its own, so import time, and we're gonna call the sleep method, which just waits however many seconds we want, so let's say time.sleep(1). Then right here we're gonna have a while loop: we'll say while 1 and tab both of these lines over, and this should cause it to just sit there and wait for us to talk. So let's clear this up and run the file. "What is your name?" "What time is it?" "Find location." "Boston, Massachusetts." Okay, so that worked well. Now, I can get out of this with Ctrl+C, but I want to be able to say "exit" and get out of it, so let's just do if "exit" in
voice_data, then we're just going to call exit(). We'll try this again. "What is your name?" "Exit." And there we go, so now we can exit on voice command.

Now, instead of just printing stuff, we want it to talk back to us. For that we're going to use Google Text-to-Speech (gTTS), which is a Python library and CLI tool to interface with the Google Translate text-to-speech API. We want to install this, and basically whatever we pass in as text, whatever Alexis says, it's going to create an audio file from it, and we can play that audio file. We're going to need an additional package called playsound, because if we don't use it, it's going to open up iTunes or whatever your default sound player is, and we don't want that; we want it to just say it right away. playsound has a dependency called AppKit, which is in a package called PyObjC, so there's a couple of things we need to install. Let's go ahead and pip install gTTS, that's the Google Text-to-Speech, then pip install playsound, and then pip install PyObjC, which has the AppKit that playsound depends on.

We'll go up here and import everything we need. Let's import playsound, and we're also going to import the os package, which is just a core Python package, because like I said, Google Text-to-Speech will create an audio file, and unless we remove that file in our code, they're just going to keep piling up. The os module has a remove method that we can remove the file with, so that's why I'm using it. We're also going to import random, because I want to randomly generate a file name for the audio file. And then from gtts we want to import gTTS, like that, so Google Text-to-Speech. All right, and then we're gonna have a function. Let's go right here and define a function, and we're
gonna call this alexis_speak, or whatever you want to call it. The first thing we want to do here is create our text-to-speech variable, so we're going to set that to gTTS, and this has two things you want to pass in. There's text, which is going to be... I'm sorry, this function has to take in an audio string, so text is going to be audio_string. And then the language, so lang is gonna be English, so "en", and you can use other languages if you want. Then we're gonna create a random number using random. Let's set this variable, r, to random.randint, and this is going to start at 1, and we want the upper bound to be really large, so let's do 10 million. Then we want to create the name of the audio file that's going to be created, so that's gonna be "audio-", and then we're going to concatenate r, which is that random number, but we want to turn it into a string, so we'll wrap it in str, and then we just want ".mp3" because it's gonna be an MP3 file. Okay, so that's our audio file. Now we need to take our text-to-speech variable, and there's a method called save, so we want to save that audio file. Now I want to play this right away, so we're gonna use playsound, which has a method called playsound, and here we want to pass in our audio file. I'm also gonna print what Alexis says as well as having her say it, so let's just do print and then whatever the audio string is. And then the last thing we want to do is remove the file, so os.remove, and we want to remove the audio file.

Okay, so we'll save this, and now basically everywhere we've been printing, we want to replace that with alexis_speak. So right here, the ask, we'll replace that with alexis_speak, and down here these prints become alexis_speak, and then this one here. It's gonna print too, since we have
this here, so when she speaks it'll also print it out in the console. And then all of these prints, I'm just gonna Command+D here, select all of those, and replace them with alexis_speak, and then down here this last one. Okay, so we should be all set. Let's try it out.

"How can I help you?" "What is your name?" "My name is Alexis." "What time is it?" "Saturday, December 21st, 14 hours, 37 minutes and 21 seconds, 2019." "Search." "What do you want to search for?" "Dogs." "Here's what I found for dogs." "Find location." "What is the location?" "Boston, Massachusetts." "Here's the location of Boston, Massachusetts." Cool. "Sorry, I did not get that." "Exit." There we go, so it worked perfectly.

Of course, you could add to this and do whatever you want with it. In fact, I'm gonna put the GitHub repo in the description, and if you want to add on to it, make a pull request, make this more advanced, or have it do some cool stuff, that's absolutely fine. I encourage it, and I encourage you to take what you've learned in terms of speech recognition and Google Text-to-Speech and build something of your own. So I hope you guys liked this little tutorial, and definitely check out Kite. Anything that I want to look up here, like if I want to look at webbrowser, I can look at the docs in Copilot and it will show me the description, what get does, what open does, and I can search for other things as well. So definitely check out the Kite extensions. It's free, and I have the link in the description, so be sure to check that out, especially if you're doing Python with VS Code or Atom or a number of other text editors and IDEs. All right, that's it guys, I will see you in the next video.
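
The captions describe the code without showing it, so here is a rough sketch of how the finished main.py might fit together. It is a reconstruction from the narration, not the author's actual file. A few things are my own assumptions and not from the video: the third-party imports sit inside the functions so the command logic can be tried without a microphone, respond takes hypothetical listen, speak and open_url parameters, it returns False on "exit" where the video calls exit(), the URLs are encoded with urllib.parse (the video concatenates the raw text and appends "/&amp;" to the Maps URL), and it calls webbrowser.open where the video uses webbrowser.get().open.

```python
import os
import random
import time
import webbrowser
from urllib.parse import quote, quote_plus


def alexis_speak(audio_string):
    """Say the text aloud with gTTS and playsound, and echo it to the console."""
    from gtts import gTTS            # third-party: pip install gTTS
    from playsound import playsound  # third-party: pip install playsound

    audio_file = "audio-" + str(random.randint(1, 10000000)) + ".mp3"
    gTTS(text=audio_string, lang="en").save(audio_file)
    try:
        playsound(audio_file)
        print(audio_string)
    finally:
        os.remove(audio_file)  # remove the MP3 so files do not pile up


def record_audio(ask=None, speak=alexis_speak):
    """Listen on the microphone and return the recognized text ('' on failure)."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        if ask:
            speak(ask)
        audio = recognizer.listen(source)
    voice_data = ""
    try:
        voice_data = recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        speak("Sorry, I did not get that")
    except sr.RequestError:
        speak("Sorry, my speech service is down")
    return voice_data


def respond(voice_data, listen=record_audio, speak=alexis_speak,
            open_url=webbrowser.open):
    """Handle one voice command. Returns False once the user says 'exit'."""
    if "what is your name" in voice_data:
        speak("My name is Alexis")
    if "what time is it" in voice_data:
        speak(time.ctime())
    if "search" in voice_data:
        search = listen("What do you want to search for?")
        open_url("https://google.com/search?q=" + quote_plus(search))
        speak("Here is what I found for " + search)
    if "find location" in voice_data:
        location = listen("What is the location?")
        open_url("https://google.nl/maps/place/" + quote(location))
        speak("Here is the location of " + location)
    return "exit" not in voice_data


def main():
    """Greet the user, then keep listening until they say 'exit'."""
    time.sleep(1)
    alexis_speak("How can I help you?")
    while respond(record_audio()):
        pass
```

Calling main() starts the listening loop. Because respond only touches the microphone, speaker and browser through its parameters, you can also try the command handling on its own, for example respond("search", listen=lambda ask: "dogs", speak=print, open_url=print).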
Info
Channel: Traversy Media
Views: 224,162
Id: x8xjj6cR9Nc
Length: 26min 47sec (1607 seconds)
Published: Mon Dec 23 2019