Text to Speech with AWS Polly in Unity!

Captions
Hello everyone, and welcome to this video. This is going to be one of the most requested features since I started working on the smart NPC pipeline videos with OpenAI, Unity, and Ready Player Me avatars. It is the last building block of the whole thing. We started by turning text into text with ChatGPT replies. Then we used the Whisper API that OpenAI brought along the way, so now we are able to turn audio into text as well. This is the final block: we get text and turn it into audio, so our NPCs can actually speak, just like we talk to them. The whole cycle becomes: we speak, turn that into text, send the text to OpenAI, get the reply, and turn the reply into audio as well.

When it comes to text to speech, there are quite a few services. The best known are of course IBM Watson, Microsoft Azure Cognitive Speech Services, if I have the name right, and Amazon Polly. There are some other, less popular services out there, and if you would like to use one of them, you definitely can. For this video I'm going to go with Amazon Polly because it has good .NET support, and Amazon Web Services is a service I already pay for and have access to. Azure and IBM are also paid services. Even though they give you a certain amount of free usage, you need to provide credit card information, a phone number, and so on, so there is a certain barrier before you can use them. I already had Amazon access and I was already using its .NET SDKs, so this is the reason I want to go with that. If you are using Azure or IBM services, you can definitely check out what they can do for you. In the end, using them in Unity all depends on their .NET support, and you can pretty much handle it yourself. But if you do not know where to begin, Amazon is a great place to start. Let's first create your Amazon account, and after that we will
program it, get the audio, and later put it onto an avatar.

The first thing we are going to do is create an Amazon Web Services account. If you already have an account, you can of course skip this stage. I will put all the URLs in the description of this video so you can follow along. I already have an account, so I won't be able to create one, but I will try to guide you with some screenshots.

When you visit aws.amazon.com, you will see a "Create an AWS Account" button. Click it, and you will be guided to provide your email address and start the process. Enter your email address and the account name you want to create. Then you will receive a verification code, which you enter on the next page. Once you have verified your email address, it is going to ask you for your phone number and address information, and here you can select whether you want to use the account for business or personal use. Right after that, you will be asked for your credit card information. This is so they can charge you once your free tier usage is over. But Amazon is really good with that: they notify you before the billing, so nothing sketchy like sneakily charging you happens. This is one of the reasons I really like them and trust them. You enter the information and move forward. I suppose they are going to take something like one dollar from you just to verify, and then send it back to you. This is a verification step to check that the credit card is actually usable. Right after that, you will do another confirmation, either with a voice call or a text message, so your identity is checked against the phone number. You enter the security code you get on the call or through the text message. After that, you can select the plan you would like to go for. You can go for the basic free package and start using Amazon Web Services right away.
After you create your account, you can log in as a root user. I'm just going to enter my password here, and we are in Amazon Web Services. I'm going to go directly to Polly, and let's see what we have here. You can already start testing Polly in this UI: you can type some text and listen to it, or download the audio. But this is not what we are going to deal with. What we are going to deal with is the SDK and its usage.

To be able to use the SDK, we will need to create a secret key and a public key so we can use them there. I'm going to go to my name at the top right and click on "Security credentials". Here you can see there is a list of access keys. I have to tell you, this is not the best way to do this. The best way is to create a user and assign these keys to that user. I'm going to create them for the root user, which means the keys can be used for everything, which is not a good way to do it. But for the sake of quickly doing this demo, we will go with that. Click to create a key. It already warns you with what I wanted to tell you, so you say "I understand" and create the key. It gives you the access key and the secret key, which you can show and copy. Please get these two keys and save them somewhere, because we will use them later. You will be able to see your access key later on, but you won't be able to see your secret key again. When you click "Show", this is the only time you see it. You click "Done", and this is pretty much it. You will be warned again, just in case.

All right, we have our keys and we have Amazon access. For you, it might take a little longer to create the account and get the confirmation from Amazon, but you can continue the video right after that. What we are going to do now is get the SDKs and the DLLs that we are going to use in Unity. For that, let's start with an empty Unity project.
Here we have our empty Unity project. Now, where are we going to get the Amazon Web Services SDK? It's not going to be through the Package Manager, because Unity has no direct support for Amazon Web Services packages like that. However, when you search, you will find the Amazon Web Services documentation for .NET, and under it there is a page called "Special considerations for Unity support". In short, we have to download this ZIP file and get the right packages for us, which are the .NET Standard ones (2.0 and 2.1 support is mentioned here), because our API compatibility level is 2.1 and we are using the Mono scripting backend. So let's go and download that file.

On top of that, there are three DLL files we will need. However, these already come with Unity, so just getting the first DLL, Microsoft.Bcl.AsyncInterfaces, is going to be sufficient for us. If you are using the IL2CPP scripting backend in Unity, you will need to create a file called link.xml and put it into the root, and it will look like this. In that case, we would need the Amazon Web Services SDK Core and Polly, so you would remove one of these lines and change this one to Polly, and that would be sufficient if you are using the IL2CPP backend. In my case, I will just go with Mono and be done with that.

So let's click on this one. It takes us to NuGet. Normally you can get any C# and Microsoft packages through the NuGet package manager. You can think of it like the website for Node packages. However, this doesn't work with Unity right away, which is the reason we needed to download the ZIP in the first place. In this case we do not have a ZIP file, but we will download the package directly. On the left side you will see "Download package", and when we go to the download location, what you will see is that this is a .nupkg file, a NuGet package. This is actually a ZIP file itself, so to extract things from it I will just
rename it to .zip. Let's save it, and voilà, I can see the content now. By the way, to be able to rename it, you need to enable showing file extensions in the file settings, so don't forget that. Otherwise the extension won't be visible to you. In this case I was able to change it. I'm going back into it, then into the lib folder, and I'm going to pick netstandard2.1. I do not use 2.0 because it gives some errors, since it's an older version. This one works flawlessly. What we need is the DLL file itself, so I'm going to move it up into the main folder. This is one of the DLLs I need.

Next I'm going to move into the AWS SDK folder, and here I'm going to find AWSSDK.Core and Polly. These two DLLs are what I need. There are quite a lot of things here, all the available Amazon SDKs. Core is here, so I'm going to drag the Core DLL into the new folder up there, and then go down to find Polly. Let's go down: Pipes, Proton, and Polly is right here. So I'm again picking Polly.dll and moving it up. You don't really need the other things at the moment. So Microsoft.Bcl.AsyncInterfaces, AWSSDK.Polly, and AWSSDK.Core: these are the three libraries that I need.

Let's go back into Unity. I'm going to create a folder here called Plugins, and I will place these three libraries right in there. Now we are ready to script with the Amazon SDK, so let's create a script that we are going to use to run it. I'm going to call it TextToSpeech, and let's open our script.

The first thing we are going to do is create the client so we can connect to Amazon Web Services. Here we are going to use the secret key and the public key that we created. Let me just disable Copilot first. All right. So the first thing I'm going to create is a variable called client, and its type is going to be AmazonPollyClient, so new AmazonPollyClient. This client takes two arguments: one is the Amazon Web
Services credentials, and the second one is the region. So I'm going to create another variable here, let's call it credentials, and this is going to be a BasicAWSCredentials. In this one you use your access key, which is the public key, and the secret key. I'm not going to type mine here just yet, so you can't see them, but this is where you put your keys. Again, this is not a good practice, because people can reverse engineer your build using a tool like dotPeek and see your keys. They can use them, and you will get a huge bill. Do not do this in your product. The best way, again, is to have a server side and just get the generated audio from there. It might be slower, but it is safer, unless Unity introduces some other way to handle this kind of situation.

All right, I have my credentials now, so I'm going to pass them in here. What I also need is the region, which for me is EU Central 1. That should be Frankfurt. This is pretty much it. Now we have our client set up.

The next thing is to create the request for synthesizing speech. I'm going to create a variable called request, and its type is going to be SynthesizeSpeechRequest. This is going to have some initial values in it. When we check the UI on the Amazon website, there are certain options: which engine you want to use, the output format, the voice ID, which is the name of the character you want to use, and the text itself. So first we have our text, let's say "Testing Amazon Polly in Unity". We want a natural-sounding person, so for the engine we are going to use Engine.Neural. This is the more human-like one compared to standard. I think it is a little more expensive per audio, but the result is much better, not like a robotic voice. Then we have the voice ID, which is the name of the character we are
going to use. There are many of them available, and you can see the names here. As far as I remember, some of them didn't really work for me, so I can't say whether all of them are available, but we can, for example, try Aria. I do not remember using it, but if it doesn't work, we can change it to another one. Our output format will be OutputFormat.Mp3. I think there are Ogg and PCM formats as well, but this is what we are going to use for now.

So we created our request as well. Now I'm going to make the request and wait for the response. We create a response object and call client.SynthesizeSpeechAsync, I think it is, and I'm going to pass my request in there. The thing is, this is an asynchronous method, so I can add await right here. However, I will need to change my method so that it can await: it's going to be private async void Start, so I can have awaitable tasks inside. This is pretty much it. We created our credentials, created our client with the credentials, created the request, made the request, and pretty much have our audio right here.

Unfortunately, things are not over yet. What we get from this, let's go to the response, is AudioStream. Our audio is in there, and it is of type System.IO.Stream. Unfortunately, Unity does not have a direct way to convert an audio stream into an AudioClip, so we will need to handle this manually. I looked at multiple ways. I tried to handle this directly in memory, but all I got was static noise and buzz, nothing proper. So here is how we handle it: we take the audio stream and first write it into an MP3 file, so we make that result a concrete file on our system. Then, using a UnityWebRequest, we load it from local storage into an AudioClip. It is not a direct way, going back and forth between different places, but this was the only way I was able to discover. If you have a better option, please
let me know in the comments. We can update this example and have a much shorter, much better way to handle it. But so far, this is how I was able to do it.

So what we do is write this audio stream into a file. How are we going to do that? Let's create a method for it. I'm going to say private void WriteIntoFile, and it takes a Stream, called stream. Here we are going to write our stream into a file stream, so let's say using var fileStream = new FileStream. This file stream takes the output file name and the file mode. We can just call the file audio.mp3, and it's going to be a create operation. We already have the using here. I think if we just save it as audio.mp3, it might go into the root folder, so maybe it is best to put it into the persistent data folder. So I'll say Application.persistentDataPath, and this is where our file will be written.

For that I need to create a data buffer, in which we are going to store every chunk that we read from the stream. Let's say a byte array called buffer, and the size can be, for example, four times 1024 bytes. It's up to you to decide. Since this is an audio file, larger chunks are better. Smaller chunks mean more iterations, but you also don't want to leave unused space if the data is small, so for shorter text I would have a smaller buffer size. Then we need the number of bytes read, so bytesRead starts from zero. While there are bytes to read, we say stream.Read into the buffer, starting from zero and going up to the buffer size, and we keep doing this while the result is larger than zero. Parentheses go like this. So while this whole thing is larger than zero, it's going to read, and what we do is fileStream.Write, writing the buffer from zero to bytesRead. So while there are bytes to read, we will keep writing them into
the file stream. Once the using block is done, we do not need to call Close or Dispose, as far as I know, because at the end of the using the file stream will be flushed, and we will be done with this. So I'll just copy the method name from here. We made our request and got the response, and the stream is in the response itself, so we pass in AudioStream. Now we have taken the audio stream from the response and written it into the persistent data path as audio.mp3, and we have our file on our system.

The next step is to load it using UnityWebRequest. I'm going to say using var www equals, and not new, because I'm going to use GetAudioClip. UnityWebRequestMultimedia is the class I'm going to use, and GetAudioClip is the method I call. I know the path, so I will make a call to the same path, because the audio is there. For the type of the audio, MP3 goes under MPEG here. Yes, this covers MP2 and MP3, and we pick this one because we opted for MP3. So we have created our web request.

Now I'm going to start my web request. I'm going to cache the asynchronous operation it returns in a variable, so I say SendWebRequest. The reason I'm doing this is that this is not a coroutine, it's an asynchronous Start method. So I'm not going to do a yield return www.SendWebRequest. Instead, I'm going to keep this asynchronous operation, and in a while loop I'm going to wait until it completes. So I check the operation's isDone, and while that is not true, I await Task.Yield. Task here comes from System.Threading.Tasks, and we call the Yield method. Once the operation is done, I can get my clip from the request. To do that, I use the DownloadHandlerAudioClip.GetContent method and pass my web request into it. So this is how it is. To use this clip, I will need to use an
AudioSource. We don't have one yet, but let's create a variable for it that we are going to assign from the Inspector: private AudioSource audioSource. Let's save it. Then audioSource.clip is the clip we just retrieved, and we call audioSource.Play. So when we run the scene, our audio will play.

Let's go to the Unity scene and create this AudioSource. I'm going to create a game object first, which I assign the script to, and then create our audio source, which we assign to the variable we just exposed. So we set it there, and we are going to play it. I expect it to say "Testing Amazon Polly in Unity". But before running the scene, one thing I forgot is setting the credentials. I'm going to move my screen to the other side and paste my credentials in there, where you can't see them. Right after that, I save and run the scene, and we should hear the audio.

"Testing Amazon Polly in Unity." Amazing, so it did work. All right, this is cool. Let's change the text to "This is a pretty cool tech", and let's change the character to, let's say, Ayanda. So let's try something different. I'm running the scene.

"This is a pretty cool tech." As you can see, you can get different accents and different-sounding voices. This can make your NPCs quite diverse: each of them can have its own voice, which you can assign. This is very cool.

This video lasted longer than I expected, so I'm going to cut it here. Thank you so much for listening. In the next episode, we can continue the smart NPC project and implement this in it, using the components that come from Ready Player Me to move the mouth, or we can add Oculus visemes and get much better-looking facial movement with the audio, to complete our smart NPCs. Have a nice weekend.
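
For anyone following along without the video, the steps described in the captions could be put together roughly like this. It is a minimal sketch reconstructed from the spoken description, not the author's actual file. The names TextToSpeech, WriteIntoFile, audioSource, and audio.mp3 come from the narration. The key strings are placeholders, and the file:// URI conversion, the result check (which assumes Unity 2020.2 or newer), and the [SerializeField] attribute are additions assumed here rather than things stated in the video.

```csharp
using System.IO;
using System.Threading.Tasks;
using Amazon;
using Amazon.Polly;
using Amazon.Polly.Model;
using Amazon.Runtime;
using UnityEngine;
using UnityEngine.Networking;

public class TextToSpeech : MonoBehaviour
{
    // Assigned from the Inspector (attribute is an assumption of this sketch).
    [SerializeField] private AudioSource audioSource;

    private async void Start()
    {
        // Placeholders only. As the video warns, keys embedded in a build
        // can be extracted, so do not ship real keys in a client.
        var credentials = new BasicAWSCredentials("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY");
        using var client = new AmazonPollyClient(credentials, RegionEndpoint.EUCentral1);

        var request = new SynthesizeSpeechRequest
        {
            Text = "Testing Amazon Polly in Unity",
            Engine = Engine.Neural,
            VoiceId = VoiceId.Aria,
            OutputFormat = OutputFormat.Mp3
        };

        var response = await client.SynthesizeSpeechAsync(request);

        // Write the returned stream to a concrete MP3 file first.
        string path = Path.Combine(Application.persistentDataPath, "audio.mp3");
        using (var audio = response.AudioStream)
        {
            WriteIntoFile(audio, path);
        }

        // Assumption: a file:// URI is passed instead of the raw path,
        // since some platforms expect a URI for local files.
        string uri = new System.Uri(path).AbsoluteUri;
        using (var www = UnityWebRequestMultimedia.GetAudioClip(uri, AudioType.MPEG))
        {
            // Start is async, not a coroutine, so poll the operation
            // instead of using yield return.
            var operation = www.SendWebRequest();
            while (!operation.isDone)
            {
                await Task.Yield();
            }

            // Assumption: UnityWebRequest.result requires Unity 2020.2+.
            if (www.result != UnityWebRequest.Result.Success)
            {
                Debug.LogError(www.error);
                return;
            }

            audioSource.clip = DownloadHandlerAudioClip.GetContent(www);
            audioSource.Play();
        }
    }

    // Copies the stream to disk in 4 KB chunks, as described in the captions.
    private static void WriteIntoFile(Stream stream, string path)
    {
        using var fileStream = new FileStream(path, FileMode.Create);
        byte[] buffer = new byte[4 * 1024];
        int bytesRead;
        while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            fileStream.Write(buffer, 0, bytesRead);
        }
    }
}
```

To try it, the three DLLs mentioned in the captions would need to be in the Plugins folder, the script attached to a game object, and an AudioSource assigned to the exposed field. Changing VoiceId.Aria to VoiceId.Ayanda should give the second voice heard at the end of the video.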
Info
Channel: Sarge
Views: 3,574
Keywords: openai, chatgpt, ai npc, chat gpt, open ai, game ai, dalle, dall-e, unity open ai, unity game ai, gpt3, gpt4, ai game, game npc, how to use gpt, how to use chat gpt, chat gpt in unity, gpt in unity, use gpt in unity, gpt3 in unity, gpt4 in unity, speech to text, ai speech, aws polly, aws polly unity, ibm watson, azure cognitive services, aws text to speech, amazon text to speech, amazon polly, amazon polly unity, how to open aws account, how to open amazon account
Id: rdHqRRzltTo
Length: 29min 33sec (1773 seconds)
Published: Sat Apr 08 2023