Unreal Engine 5 - Ultimate Voice AI Tutorial - Masterclass from scratch

Captions
Hello folks, I am Marvel Master's personal AI assistant, and today he is going to teach you the basics of how he created me — although my voice sounds way nicer. I'm now handing over to Marvel Master. Have fun watching the video!

Hello folks, I'm Marvel Master, and welcome to this tutorial. In today's video I'm going to teach you how you can create an AI assistant with Unreal Engine 5, ChatGPT, and ElevenLabs text-to-speech. The assistant will work similarly to the one in front of me: I will press a button, ask the assistant something, and it will respond. Let me show you. "Hello, how are you today?" "I am still functioning properly and ready to assist you with whatever you need. Please let me know how I can help you."

The tutorial will contain the setup for voice recognition, ChatGPT, and the ElevenLabs voice response, as well as all requirements and limitations. It will not include the hand movement or the lip sync, but if you are interested in the whole project file, I will upload it to my Patreon, so you can head there through the link in my description and consider subscribing — then you will have access to the project I'm sitting in front of.

First, let's start with the limitations and requirements. You will need an account for OpenAI and ElevenLabs. Creating an account on OpenAI for the first time grants access to a five-dollar free credit during the first three months, so I think you can use the AI system without any problems — because, let me show you, I used the OpenAI API key a whole lot in May and I only paid 15 cents for it. So don't be afraid: even when the three months are over, you can still enter the billing options and pay for it, but as I said, in my case it's really cheap. With ElevenLabs it's a slightly different story. You can create an account for free and you will get 10,000 characters free per month, which will be enough for a little testing. If you need more, you can subscribe on their servers and get even more characters per month. For me personally it's not enough at the moment, so I am working on the account of a friend — but if you need it, you might consider subscribing to a higher tier.

Now let's move to the requirements. I'm going to use three plugins for the whole process: the Runtime Speech Recognizer to recognize my speech through my microphone, the Runtime Audio Importer to import the audio I am recording into the engine, and the VaRest plugin for the communication with the APIs. All three plugins are completely free and you can download them right away. One thing you have to keep in mind: if you already have one of these three downloaded, you have to check whether they are up to date. You can do this by going to your Epic Games launcher, to your engine versions — below your engine version it says "Installed Plugins". Click on that, and once you've installed all of them you will see them in this list: the VaRest plugin, the Runtime Speech Recognizer, and the Runtime Audio Importer. They need to be up to date; if there is an update, there will be a little button on the right to update it. Please make sure that you have the latest version, otherwise you might run into problems. I myself had the problem that during my development the Runtime Audio Importer had an update, and that broke my whole process — but I figured it out.

Alright, then let's actually start with creating the AI assistant from scratch. As you can see, I am in the Epic Games launcher, and I'm going to press Launch on Unreal Engine 5.1. Here I'm going to click on Games, choose Third Person so we have a mannequin, then I call my project "AISystem" and hit Create.
Once the engine has opened, I make sure that all plugins we will need are installed. I go to the top right and press Plugins. Here I search for the VaRest plugin, then I search for "audio importer" and select the Runtime Audio Importer, and then I search for the speech recognizer, select the Runtime Speech Recognizer, and press Restart Now. Once the engine has restarted, I can close the plugin window. Let's first check if my character is walking — yes, it is — and now let's create our speech recognition.

For speech recognition to work you of course need to have a microphone, so make sure you have one, otherwise this part won't work. However, later in the tutorial there will be a ChatGPT section where basically string commands are entered, so if you don't have a microphone, you can skip the voice recognition and start at that later section.

All the code for voice recognition I'm going to put into the third person blueprint, so I head to the third person blueprint folder, open the third person character, and start coding right away. The plan is: press a button, record my voice, output it to a string, and send it to ChatGPT later. I right-click, and because I'm using the # key for recording, I search for it, press Enter, and here is my key event. Whenever I press or hold the # key, my recording will be started.

Most of the code we need now is based on the speech recognition plugin, so I head to the Unreal Engine Marketplace page of the Runtime Speech Recognizer. Down in the description there is a documentation link, which I click, and in the documentation on the right there is a section on how to use the plugin. I click on that, and a little further down there are two examples. In our case we are going to use "streaming audio input" — of course, we are streaming from a microphone and not from a file. In this little text there is a link to copyable nodes. I click on that and it directs me to blueprintue.com, where I zoom out, select all these nodes, hit Ctrl+C, then go to my engine again and hit Ctrl+V, and all the code is pasted into my graph.

Now there will be a few errors if you try to compile, but let's fix them. The custom event is our # pressed, so I can replace the custom event with my key event. The speech recognizer variables are missing, so I right-click on each missing variable and create it. Once I have done that, all the errors are gone and we can check if it's working. As you can see, there is an On Recognized Text Segment event, and it prints the text I have spoken into the microphone as a string. So let's test this: I hit compile and save, minimize, play my game, and now when I press # and hold it, my voice will be recognized and printed out as a string. "Hello, can you hear me?" As you can see, that worked — my voice has been recognized and printed out as a string.

One important thing to mention here is that the Runtime Speech Recognizer has settings. You can access them via the button on the top right, then Project Settings, and on the bottom left there is a Runtime Speech Recognizer section. There are not a lot of settings, but they are very important — for example, the language model. It's very important to know that the larger the language model you choose, the more computation time your PC will need, and the computation time rises exponentially when you choose a higher value than Tiny. If you for example use Small or Large, the plugin will automatically download these models and use them, but when you execute the code, your PC might hang for a few minutes. So I'm going to use Tiny. However, if you have an extremely powerful PC — meaning 64 cores or more — you can try Small or Medium, which will increase the voice recognition quality by a lot; but keep in mind the computation time also rises by a lot. So I'm sticking with Tiny at the moment. Another thing you can set here is the model language. You can set it to English only or to multilingual, which means you could also speak other languages — Spanish, French, German — and the voice will be recognized in these languages and printed out as a string. However, the multilingual option will also increase the computation time and the delay in the whole process, so I'm going to stick with English for now.

Now, for better debugging, let's open the third person character again and put a little print in front: it prints a string that says "Recording started", so you know the button press is working and the recording has started. Then let's add some more prints: one for when the recognition has started — a string "Voice recognized", make it green and four seconds — and another one for when the speech recognition has been stopped, so we know when the process has been stopped; let's make that orange and three seconds.

Another thing I want to do is stop the voice recognition once I release the # key. At the moment it only captures five seconds, but I change it so that when I release my # key it immediately stops capturing, and I delete the Delay. Now when I hold my # key, the voice recognition fires continuously, and once I release it, it stops.

At some point you might notice that if you try to speak a longer text, it will bug out. This is because your speech is broken into text segments: each text segment can be about five seconds long, and after that a new segment is recorded and recognized. If you need to speak a longer text, you can right-click and search for recognition parameters — Make Speech Recognition Parameters — and set the step size higher: for example, if you need to speak 15 seconds, you can set it to 15000. You can also change some other settings here; refer to the documentation on the Runtime Speech Recognizer website if you need to change anything — everything is explained there. For our purpose the setup I have here is completely fine. So let's hit compile and save, and test it again. Now when I hit # and hold it, the recognition fires, and when I release it, the recognition stops. "Hello world, I am Marvel Master." Alright, that was the first part: recognizing your voice and printing it out as a string. I select everything, press C, and call the comment "Voice Recognition".
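As an aside, the recognizer's model names (Tiny, Small, Medium, English-only vs. multilingual) match the open-source Whisper model family, so the same size/speed trade-off can be sketched outside the engine with the whisper Python package — an assumption on my part about the plugin's internals, and the file name is just a placeholder:

```python
# pip install openai-whisper
# A minimal sketch of the model-size trade-off, NOT the Unreal plugin's own API.
import whisper

# "tiny.en" keeps computation cheap; "small"/"medium" improve accuracy
# but raise processing time sharply, as described above.
model = whisper.load_model("tiny.en")      # English-only tiny model
result = model.transcribe("recording.wav") # placeholder audio file
print(result["text"])
```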
Okay, now let's move on with the ChatGPT integration. When our recognized voice string has been created, we can fire an event and run the conversation with ChatGPT. To do that I create a new event: I right-click, type "custom event", add a custom event, and call it Enter Voice Prompt. From here I drag a line and, for debugging purposes, print a string again — call it "Prompt ChatGPT", make it green and maybe five seconds, so we know what is happening. Then I create a new variable: on the bottom left, click the plus button, call it Input Prompt, and make it a String on the top right. Once that variable is created, I connect it by dragging out the line from the print string, typing "input prompt", and choosing Set Input Prompt. I'll leave a little space here because we will add something in between a little later. From the Input Prompt I connect the input to the event, which creates an input pin on the event, and this event I call after the voice recognition has happened: after On Recognized Text Segment I drag a line, type "enter voice prompt", and connect the Enter Voice Prompt to the recognized words. When I double-click on Enter Voice Prompt, I am teleported to the position where we were before and can move on here.

Our voice prompt is now written into this Input Prompt variable, and from here I'm going to make use of the VaRest plugin. I connect the Input Prompt to Construct Json Request — if you don't see anything like that, you have to uncheck the "context sensitive" checkbox at the top and it will appear. Inside the Construct Json Request I change the verb to POST and the content type to JSON, and for the target I drag a line, type "varest subsystem", and select the very bottom entry (not the very top one, sorry). From the Construct Json Request I connect a node called Set Request Object and connect it to the return value, and for the JSON object I create a node called Make Json. When you click on it, you can add inputs to the Make Json; we need three. The first one is the model — the GPT model we are going to use — of type String. The second one is messages, and the type is an object as an array — very important. The third one is max tokens, and this is a Number. Once the Make Json is set up, we can add the model, which is gpt-3.5-turbo, and for max tokens I use 100. This is basically how many tokens ChatGPT may use to answer you: if you set this value too low, the answers might be cut off at some point, and if it's too high, the answers might be very extensive — but keep in mind this uses your tokens, which you might pay for someday. So I keep 100 for the moment.

The messages input is an array, so I drag out a line from it and choose Make Array. I add a pin; the first element will be a Make Json, and the second one will also be a Make Json. Again we have to add inputs to these JSONs. The first JSON needs two inputs: one is the role, a String, and the second is content, also a String. The role tells ChatGPT what it is, and the content tells ChatGPT what purpose or personality it has. For the content I tell it: "You are an AI named Assistant. You answer both Unreal Engine development and general questions, and you answer with a maximum of 50 words." This is a little trick: as you might know, ChatGPT tends to write very extensive answers, and with that you keep the answers compact and save on tokens. For the role I type "system". We have to do the same for us as the user: the second Make Json also needs two inputs — role as a String and content as a String — and for the role I type "user", and the content will be the Input Prompt. Hit compile and save.

You might ask yourself where I got this information from. All of it comes from the OpenAI API reference: when you head to the OpenAI website and are logged in, at the very top there is an API reference, and on the left there is a Chat section. Here is the crucial information — for example, which roles we can have: "system" or "assistant" is usually ChatGPT, and we are the "user". We added the model, which is gpt-3.5-turbo, and we added an array of messages: one for the role "user" with its content and one for the role "system" with its content. We also added max tokens to the JSON object.

Let's move on with the next section: we have to add headers, and the header info also comes from the API reference — content type and authorization. I connect the Set Request Object to a Set Header node; the header target will be the Construct Json Request, the header name will be "Authorization", exactly as in the API reference — copy and paste it into the header name — and the header value will be your API key with "Bearer " written in front. So let's drag out a line and build that string: connect it to an Append node (uncheck context sensitive again), and into A type "Bearer " — don't forget the space after it; I'll even note it here: space after Bearer. B is the API key, which we promote to a variable; I'm calling it OpenAI API Key. Hit compile, and when you click on the OpenAI API Key you can fill in its content on the right.

To get an OpenAI key, head to openai.com and sign up for an account. Once you have signed up and logged in, you might see this website; we want to access the API, so I click on API, and at the top right where it says "Personal" you can view your API keys. Click on it — note that you can view a key only once: if you haven't written it down, you can't access it again, but you can create a new one if you need to. I'm going to call mine "UE5 tutorial", create this new secret key, copy it, and note it down, because I will not be able to access it later. Then I head to my engine again, click on the OpenAI API Key variable, and paste it into this field. Now the whole string for the authorization reads "Bearer ", a space, and then the OpenAI key.

Now let's set another header: Set Header, the target again the Construct Json Request. The header name is again in the API reference — "Content-Type" — so I select, copy, and paste it into the header name, and the header value will be "application/json", so copy this and paste it into the Set Header value. From here we call Process URL, and again the target is the Construct Json Request. The URL is the endpoint we are using for the ChatGPT response: head to the API reference again, select the link (https://api.openai.com/v1/chat/completions), copy it, and paste it into the URL. Hit compile and save in between, so if the editor crashes you don't lose everything.
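Outside the engine, the request we just built node by node corresponds to a single HTTP POST. Here is a minimal Python sketch of the same call, assuming the requests library; the key and prompt values are placeholders:

```python
# pip install requests
import requests

OPENAI_API_KEY = "sk-..."                 # placeholder: your own key
input_prompt = "Hello, can you hear me?"  # placeholder: the recognized speech

payload = {
    "model": "gpt-3.5-turbo",             # lowercase, exactly as in the API reference
    "max_tokens": 100,                    # caps the length (and token cost) of the reply
    "messages": [
        {"role": "system",
         "content": "You are an AI named Assistant. You answer both Unreal Engine "
                    "development and general questions, with a maximum of 50 words."},
        {"role": "user", "content": input_prompt},
    ],
}

headers = {
    "Authorization": "Bearer " + OPENAI_API_KEY,  # note the space after 'Bearer'
    "Content-Type": "application/json",
}

response = requests.post("https://api.openai.com/v1/chat/completions",
                         json=payload, headers=headers)
```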
From here we Bind Event On Request Complete, and after that we Bind Event On Request Failed. If our request is completed — if our string is sent to ChatGPT and it has an answer for us — we can run a custom event. Add a custom event and call it GPT Request Complete. To check whether you have correctly set up your OpenAI account, you'd better add some debugging here, so just follow along: Get Response Content As String — this gives us the whole response ChatGPT sends to us as a string; the target is the request. Then, with a Contains Substring node, we check whether the response contains anything that says "exceeded" — meaning you have no tokens left, have not paid, or have not set up your account correctly — and if so, we branch and print a string that says "OpenAI: no tokens left", red and 10 seconds. If at any point you get this notice, you might have set up your account wrong, have no (free) tokens left, or have made an account on the wrong website.

Now let's hit compile. Okay, there's an error, because the target is not set for the bound complete and fail events: the target is again the Construct Json Request, the same for the fail event. Then I hit compile again, and the fail pin errors because no event is connected to it, so I drag out the line, add a custom event, and call it GPT Request Failed. From here I just print a string, "GPT request failed", red again, duration again 10 seconds — this should only happen when there's no connection to the server or something. Then I hit compile and save, and we are nearly finished.

Let's move on: if the GPT request is complete and there is no error about exceeded tokens, we can print a string that will be our GPT response. For the GPT response we need the response object, so from GPT Request Complete I connect Get Response Object. From there I connect Get Object Array Field; the field name is "choices", and the return value is an array, from which I Get (a copy) at index 0. From the Get I use Break Json, and this JSON has outputs again: this time the output is named "message", and it is an object. This "message" we connect to another Break Json, this time named "content", a String. This string we plug into our print string — let's make it pink, so we know this is the answer from ChatGPT, or rather the OpenAI API. Hit compile and save, and let's test it.

Alright, now when I hit play, hold # and say something, it should respond in pink text. "Hello, can you hear me?" As you can see, nothing is happening, and we have some errors when I close the play mode — so let's find out what happened. This is also a good opportunity to show you what can go wrong and how to debug it. We already get the response content as a string, so we can use it to find out what happened: I drag out the line, add a Print String, and connect the Get Response Content As String to it. Make the print string red, maybe a duration of 10 seconds, hit compile and save. Let's try again. "Hello, can you hear me?" Okay, there's an error message: "'messages' is a required property". Let's head to the section where I set this option — it's in the Make Json where the messages are set up — and I already see the problem: I didn't write it exactly as it's written in the API reference. Let's head to the API reference again — Chat, Create chat completion — and here it says "model" and "messages", with no capital letter. This is very important: write it exactly as it's written there. So I change the model name to a small "m" and the messages to a small "m" as well, hit compile and save, and try again. "Hello, can you hear me?" And here we get a response, with content that says "Yes, I can hear you."
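Continuing the hypothetical Python sketch from above, the Blueprint's Break Json chain (choices → index 0 → message → content) and the "exceeded" guard map to a few lines:

```python
# Continuation of the request sketch above ('response' comes from requests.post).
if "exceeded" in response.text:
    # Mirrors the Contains Substring branch: quota or billing problem.
    print("OpenAI: no tokens left")
else:
    data = response.json()
    # choices[0].message.content is the assistant's reply —
    # exactly the chain of Break Json nodes in the Blueprint.
    reply = data["choices"][0]["message"]["content"]
    print(reply)  # the pink print string
```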
It seems to be working now, so I can delete my red debug print string from the GPT Request Complete and connect the Get Response Content As String to the branch again. Now when I do the same again, I should get the response in pink. "Hello, can you hear me?" "Yes, I can hear you. How can I assist you today?" Alright, so the voice input is working and we get a response from ChatGPT. I select this whole section, press C, and call the comment "GPT Response".

Everything seems to be working, but now we have a little problem: the AI has no way to remember what we are talking about. Let me show you what I mean. "Can you tell me what the biggest ocean is?" "The biggest ocean is the Pacific Ocean, covering approximately..." blah blah blah, okay. Now I will ask a question referring to the question before. "And what is the smallest animal there?" — sometimes your voice does not get recognized properly, so you have to try again — "And what is the smallest animal there?" "The smallest animal in the world is the fairyfly, which is a type of..." blah blah blah. So it doesn't see the context of what I'm talking about, and this is a big problem: for every question you ask, ChatGPT responds with a fresh answer, not knowing what we were talking about before. We're going to fix this by storing our conversation in a variable.

Inside the blueprint again, we go to where our voice is recognized and set as a string — the On Recognized Text Segment event. Here we have to store our text somehow, so let's drag this aside; in between, we right-click, type "add array", and reconnect this Add node between these two nodes. Then right-click again and type "append string"; the Append's return value plugs into the bottom pin of the Add node, and from the array target you drag out the line, promote it to a variable, and call it Conversation Array on the bottom left. This is our conversation array where the whole conversation will be stored, and the value written into it will be "user: " — us as the user — plus the actual text recognized by the speech recognition. So the line "user:" plus the text we speak into the microphone is added to the conversation array.

This also means we don't pass our prompt directly anymore: double-click to go to the GPT response again and disconnect the Input Prompt; the new input will be the conversation array — right-click, Get Conversation Array, Join String Array, and connect this to the Input Prompt, with just a space as the separator. Everything we say now is written as a line, for example "user: " plus whatever we spoke into the mic — that is the line we send to ChatGPT for our inputs. Now we need another line for the responses from the system, i.e. from ChatGPT. So let's head to the end of the GPT response, where the GPT request completes, and add a few nodes after the print string (which is still the response from ChatGPT) to store the response in the array too. From the Break Json we Append the text again — but this time into the B pin — and into the A pin we write "system: ", because the system is ChatGPT; then we Add the line ChatGPT answered to the conversation array. So the content of the array after these nodes is: first "user: " plus whatever we spoke into the mic, then "system: " plus whatever ChatGPT responded.
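As a plain-language model of what those nodes do, here is a minimal hypothetical Python sketch of the same memory scheme (the function names are mine, not the plugin's):

```python
conversation = []                      # the Conversation Array variable

def on_recognized(text: str) -> str:
    """Store the user's line, then build the joined prompt sent to ChatGPT."""
    conversation.append("user: " + text)
    return " ".join(conversation)      # Join String Array with a space separator

def on_gpt_reply(reply: str) -> None:
    """Store ChatGPT's answer so follow-up questions keep their context."""
    conversation.append("system: " + reply)
```

Note that the official chat API could also carry this history as separate entries in the messages array; the flattened single-string prompt above simply mirrors the video's approach.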
Now let's test this: in theory the AI assistant should be able to refer to questions and topics we talked about before. "Hello, can you tell me what the biggest ocean is?" "The biggest ocean is the Pacific Ocean, covering..." blah blah. Okay, now the follow-up question. "And what is the smallest animal there?" — again my voice does not seem to be recognized properly, so I try again — "And what is the smallest animal there?" "The smallest animal in the Pacific Ocean is likely a type of plankton or a small fish like a goby..." blah blah blah. As you can see, the AI assistant now responds properly to the topic we discussed before. So this seems to be working: we can speak to ChatGPT and it responds as text. Let's move on with creating the voice reply with ElevenLabs.

To get an ElevenLabs voice reply, let's create a custom event again — Add Custom Event, call it Run Speech Reply. From here we print a string for debugging purposes, "Creating voice reply" — make it yellow and five seconds, so we know this is being executed. The voice reply will be executed after our GPT response — the GPT response is finished once the response has been added to the conversation array — so let's call Run Speech Reply there and double-click on it to land in the ElevenLabs section.

From here we need an input for the voice reply: we must send a text to ElevenLabs that it can convert to speech. So click on the event, add an input, call it GPT Reply, and make it a String. We promote it to a variable, and then proceed like in the GPT setup: add a Construct Json Request (context sensitive disabled again), connect the target to the VaRest Subsystem, and set the verb to POST and the content type to JSON again. From here, Set Request Object, and the JSON object will again be a Make Json. For this Make Json we need three elements: the first is "text" as a String, the second is the model ID, and the third is "voice_settings", which is an object. The text will be our GPT Reply — what we are sending to ElevenLabs. The model ID we need to get from the API.

For the voice response I like ElevenLabs the most, because in my opinion it has the best text-to-voice system. I head to Google, search for ElevenLabs, and click the first link. I create a new account — button on the top right, Sign Up — enter my email address and password, accept the terms of service, and sign up. I have received a mail and confirm it quickly. Okay, I confirmed the link in the mail, and now I can click Sign In and enter my credentials. Let's test the voices quickly — I choose Bella, because I like Bella the most. "Hello world, I am Bella." And as you see, this works.

When you have created an account at ElevenLabs, there's also a tab that says Resources, and there is an API reference, same as for the OpenAI API. What we want to use is "text-to-speech stream", and here again is the information we need — for example the model ID, which in this case is "eleven_monolingual_v1". I select and copy it, move to my engine, and paste it into the model ID. The voice_settings is another Make Json, which has two inputs: "stability" and "similarity_boost". Where do I know these from? Again, from the ElevenLabs API reference — we can copy and paste these into the engine too, along with their values of 0.5 each.

A few side notes here on stability and similarity boost. Stability determines how emotional the ElevenLabs voice will sound: set it to 0 and the voice will be very emotional; set it to 1 and it will be very clear and robotic. I say let's set it to 0.5. Similarity boost determines how consistent the voice will be: if you ask the same question multiple times and this is set to 1, the voice reply will always sound the same; if it's set to 0, the reply will always sound different. So we stick with 0.5 here as well. And as for the model ID: ElevenLabs recently introduced a multilingual model, but for this purpose we only use the English model. If you'd like different languages as output you can use the multilingual model, but keep in mind it might also have a bad influence on the delay of the whole assistant — so we stick to the monolingual model for now.

Then let's move on: we again Set Header; the target is again the Construct Json Request return value, the header name will be "accept", and the header value will be "audio/mpeg". Where do I know these from? Again from the API reference — as you can see, we need three headers in total. Then the next header: Set Header, target again the return value of the Json request, the header name will be "xi-api-key" — I copy and paste it — and the header value will be the ElevenLabs API key. So let's drag out the line (uncheck context sensitive), promote it to a variable, and call it ElevenLabs Key on the bottom left; compile and save. And where do you get your ElevenLabs key? I can show you: on the ElevenLabs website, click on the top right, go to Profile, and here you can reveal your API key. This is mine — don't try to use it, I will delete it once the video is finished, so don't even try. Unlike the OpenAI API key, you can always access and copy it again. Then head to the engine, click on the ElevenLabs Key (you must have compiled before), paste it in, and the ElevenLabs key is entered.

Then we set one more header — target again the Json request — this time with "content-type" as the header name and "application/json" as the header value. From here we call Process URL again; the target is again the Json request, and the URL also comes from the API reference. It's under text-to-speech stream: copy this link without the brackets and paste it here. In the URL there is a section for the voice ID, and as I said before, I'd like to use Bella; Bella has a specific voice ID. Where can you find these IDs? For example on the Auto-GPT GitHub — I did not find them on the ElevenLabs website; I'm sure they are hidden there somewhere, but I found them on the Auto-GPT GitHub, and I will post this link with the table in the description of the video. So you can use Bella or other voices, or, if you have an ElevenLabs subscription, you can even create your own voice and get your own voice ID, which is pretty cool in my opinion. For this video I use Bella, so I paste her ID into the link at the voice ID position.

And here is some additional information that I couldn't even find in the API reference: ElevenLabs seems to have five different modes for latency, and you can adjust the latency by adding a little string behind this link — I will paste it below the video in the description. Basically, optimize_streaming_latency ranges from 0 to 4: the higher the number, the lower the latency but also the lower the quality; 0 gives the best quality, 4 the lowest latency. You can choose for yourself, but I prefer lower latency for a better assistant experience, so I choose 4 in this case.
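Put together, the ElevenLabs request mirrors the OpenAI one. Here is a minimal hypothetical Python sketch of the same call — the key and the voice ID are placeholders (look the real ID up in the table linked in the description):

```python
# pip install requests
import requests

ELEVENLABS_API_KEY = "..."            # placeholder: revealed in your ElevenLabs profile
VOICE_ID = "<bella-voice-id>"         # placeholder: from the Auto-GPT voice-ID table

url = (f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}/stream"
       "?optimize_streaming_latency=4")       # 4 = lowest latency, 0 = best quality

payload = {
    "text": "Yes, I can hear you. How may I assist you today?",
    "model_id": "eleven_monolingual_v1",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.5},  # numbers, not strings
}

headers = {
    "xi-api-key": ELEVENLABS_API_KEY,
    "Content-Type": "application/json",
    "accept": "audio/mpeg",                   # we want MP3 audio back
}

audio = requests.post(url, json=payload, headers=headers)
with open("reply.mp3", "wb") as f:            # the Blueprint imports this buffer instead
    f.write(audio.content)
```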
Alright, then let's move on. Same as with ChatGPT, we Bind Event On Request Complete and also Bind Event On Request Failed; the targets are again the Construct Json Request. If it fails, we again add a custom event, Speech Request Failed, and this one just prints a string — "Speech request failed" — in case the server is not reachable or something. What we really need is the On Request Complete: to that event we bind a custom event called Voice Request Complete. Here, again for debugging purposes — because you might run out of tokens or have done something wrong — we add Get Response Content As String (the target is the request), and if the response contains a string that says "exceeded", we branch and print a string saying "Ran out of ElevenLabs tokens", red and 10 seconds again.

But if everything is fine and the response is an MP3 file, then we create a Runtime Audio Importer, because we want to import the file from ElevenLabs and play it as sound. So: Create Runtime Audio Importer, promote it to a variable, call this variable Importer. From there we Bind Event On Result, and then Import Audio From Buffer; the target of Import Audio From Buffer is the Importer, and the audio data comes from Voice Request Complete — Get Response Content, whose return value goes into the audio data. The audio format should be MP3, since ElevenLabs sends the audio file as MP3. Now, for the Bind Event On Result, add a custom event — we call it On Result — and for debugging purposes we again print a string, "Playing voice reply", maybe green and five seconds. Okay, let's test whether "Playing voice reply" is printed: hit play and ask something. "Hello, can you hear me?" — "Playing voice reply" — nice, so the import seems to be working.

Now we can create a character that will play our voice reply. Inside the content browser, right-click, create a Blueprint Class, choose Character, call it GPT Assistant, and drag it into the world. Then I open my GPT Assistant and assign the mannequin skeletal mesh — SK_Mannequin, the new one — put it into the right position and rotate it a little: basically just a person that can talk to us. Then I go into the event graph, delete everything there, and create a new custom event — Add Custom Event — which I call Start Voice Assistant. Start Voice Assistant will play an audio file, so let's add an input to this event, call it Voice Response Audio, and make the type a Sound Base object reference. This Voice Response Audio is actually the imported sound wave from ElevenLabs in our third person character, so let's find the assistant in the world — Get Actor Of Class, GPT Assistant — run the Start Voice Assistant event, and connect the imported sound wave to Voice Response Audio; compile and save. Then I double-click on Start Voice Assistant again, and from here we can just play the sound by adding an Audio component. I call it Audio for now, drag it into the Blueprint graph while holding Ctrl, then Set Sound, connect it to Start Voice Assistant with the New Sound being the Voice Response Audio, and then I just Play the audio.

This way, when our third person character gets a result from ElevenLabs, the MP3 file is sent to the GPT Assistant actor and the sound is played there — the GPT Assistant actor is the one I placed here in the world. When I press play now, hold the # key and ask something, it should respond. "Hello, can you hear me?" If you listen closely, there was a sound, so we are getting a response from ElevenLabs — but maybe there's a problem with the settings. Let's look at the settings again: stability and similarity boost, 0.5 each. Let's check the API again — in the ElevenLabs text-to-speech stream reference, under voice_settings, the type of stability and similarity_boost is a number. Okay, there's the problem: I accidentally set their type to String, but it has to be Number. I change each to Number, set them to 0.5 again, and give it another go. "Hello, can you hear me?" Alright, there was a voice reply, but still not what I expected, so let's search further. There was a voice reply and the settings are correct, so maybe the text we are sending to ElevenLabs is wrong — the text is supposed to come from ChatGPT. Let's head over to the GPT response, at the very end. Okay, the Run Speech Reply does not have its input connected: what we want the assistant to say is actually the response from ChatGPT, so I connect this — this is the text response from ChatGPT, and it will be sent to the ElevenLabs system here. Now it should finally work; let's test again. I hold # and say something. "Hello, can you hear me?" "System: Assistant: Yes, I can hear you. How can I assist you today?"

As you can see it's working now and giving a voice reply — but there's a slight problem: the way we set it up, it's like a conversation with the speaker's name always written in front of the text; "system" says something and "user" says something. To prevent the reply from including "system", "user", or "assistant", we exclude them from the string that we send to ElevenLabs. To do that, we go into the GPT response, and right at the end we drag a line from the Break Json and add a Replace node: everything that says "system:" is replaced with just a space, so whenever "system:" is written in front of the text we send to ElevenLabs, it will never be spoken. We do the same for "user:" and "assistant:" on the GPT reply that will be sent to ElevenLabs — whenever "system:", "user:", or "assistant:" appears, it won't be sent to ElevenLabs. So let's hit compile and save again, and try to have a conversation. "Hello, can you hear me?" "Yes, I can hear you. How may I assist you today?" "Can you tell me the color of grass?" "The color of grass is typically green." "Other colors too?" "That's correct — grass can come in various shades of green depending on factors such as species, climate, and nutrient levels; however, it can also turn yellow or brown when it's not healthy or during seasonal changes." "Okay, thank you, that's all for today." "You're welcome! Feel free to reach out if you have any other questions in the future."

Alright, as you can see it's basically working. There are some minor issues with the voice recognition — maybe it's just on my side because I'm not a native English speaker — but if you also have problems with voice recognition, you can tweak the settings of the Runtime Speech Recognizer and choose a higher model, or check your microphone input.
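By the way, the three Replace nodes from a moment ago amount to a simple string cleanup before text-to-speech. A minimal Python sketch (the helper name is mine):

```python
def clean_for_tts(reply: str) -> str:
    """Strip the conversation-role prefixes before sending the text to ElevenLabs,
    mirroring the three Replace nodes in the Blueprint."""
    for prefix in ("system:", "user:", "assistant:"):
        reply = reply.replace(prefix, " ")
    return reply

print(clean_for_tts("system: The color of grass is typically green."))
# -> "  The color of grass is typically green."
```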
So it seems to be working now, and I would say the whole system is finished. The whole ElevenLabs section I can now comment with "ElevenLabs Voice Response"; hit compile and save. And now you see: we have the voice recognition here, a GPT response, that response sent to ElevenLabs and coming back to us as MP3, which we play as sound. A little side note: this is the very basic setup — you can do a lot more with it. And one side note about delay: we are using the ElevenLabs text-to-speech streaming endpoint, but we don't really make use of the streaming — we wait until the response is complete. Maybe there is a way to play the sound while the response is still streaming, which would reduce the delay a lot; so if anybody figures out how to do this, please share it with all of us — the ElevenLabs voice response is actually the part that adds the most delay to the whole process. Alright, that's it for this tutorial; I will let my personal AI close this video. I hope you liked it, and I'll see you next time.

Thank you for watching the video. I hope you learned a lot today. If you have any questions or suggestions, just write a comment below or join the Discord server — see the link in the description. Even with the help of ChatGPT, creating an AI assistant like me was extremely difficult and time-intensive, so if you feel the need to support Marvel Master's channel, consider subscribing to his Patreon — the link is also in the description, and the complete project file, as you see me here, will be available there for patrons to download. This video is not sponsored by anyone, but if OpenAI, Epic Games, or ElevenLabs feel the need to sponsor Marvel Master, feel free to contact him. Have a good day and see you in the next video. Bye!
Info
Channel: Marvel Master
Views: 13,693
Keywords: UE5, AI, Unreal Engine, NPC, ChatGPT, Tutorial, next gen, future tech, education, masterclass, ultimate, from scratch, integration
Id: xBs-nXzXwoM
Length: 73min 3sec (4383 seconds)
Published: Fri Jun 02 2023