Speech to Text with OpenAI Whisper in Unity!

Captions
Hi everyone. In this video I would like to show you how the Whisper API in OpenAI works. I have this empty project, and I assume you already have the OpenAI Unity package installed. You can click the Update button to get the latest version, like I'm doing right now. I will drop the GitHub URL in the description so you can download the package.

In the Samples part of the package you can see Whisper at the bottom. We are going to import it and play with the sample. So let's go to the Project window, open Samples, OpenAI, Whisper, and open the scene. You will see this basic UI: a microphone selection, a fill bar, a text area, and a button. You might have multiple microphones, so you need to pick one. That is the case for me, and it was breaking sometimes.

Let's run it. I will pick my microphone, which is the last one, say something, and see what happens: "Test the OpenAI Unity package." So we are waiting, and yep, the transcription comes through. What happens is that the transcription API endpoint receives the English audio and turns it into text.

Let's go to the code and see what is going on. In the Start method I go through the microphone devices and put them into a dropdown so we can pick the microphone we want to use, and I also set up the button for recording. After that, when recording starts, I set the UI things, start the microphone, and store the recording in an AudioClip. Then, when the recording ends, we actually make the request to OpenAI. I'm using a script I found called SaveWav, which helps me turn the audio clip into a byte array. I couldn't find a native way to do the AudioClip to byte array conversion, so I use that script and send the data in the OpenAI request. Normally the request only takes file paths; I added byte array support as well, so we can handle everything in memory and not save any files. We are going to use whisper-1 as the model, and the language, the transcription language, is English.

Another thing we can do is get a translation of the audio. So I'm going to create a CreateAudioTranslationRequest. Let's just turn the existing request into that, no need to write the code again, and call CreateAudioTranslation. It takes the same things, only the language part won't be there, and we make the request. The Update method has the recording-related logic, just filling the UI.

Now I am going to run the scene. In this case I'm going to speak Turkish, because it translates whatever I say into English. [speaks Turkish] Let's wait a couple of seconds, and yes: it took the Turkish audio, turned it into English, and I can see the translation right now. It does not work regardless of which language you speak, but many languages are supported, and Turkish is one of them; it was able to understand that and turn it into English.

However, I noticed that some languages do not do that well. Let's try Estonian, for example. I'm not a very fluent Estonian speaker, but I'll try to say a couple of words and see what it does. [speaks Estonian] Let's wait: "need my reading assistance." So it didn't really work. I'm speaking Estonian, and it turned it into something kind of close to what it would sound like in English, I do not know. Let's try again. [speaks Estonian] Let's wait: "My name is Nicholas." Well, as you can see, it doesn't really do Estonian that well. It works with Turkish, and it works with English almost all the time.

You can check the OpenAI documentation to see what more you can do. I just wanted to quickly go through this and show you. The sample is available in the package itself, and details are in the description, so make sure to check that. Now you can actually send audio, get the text from it, and maybe use that text in your text completion or chat requests. Have a nice weekend!
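
The in-memory flow described in the video (turn recorded samples into WAV bytes, then send them with the model name and, for transcription only, a language) can be sketched as follows. This is a minimal illustration in Python, not the package's C# code: the helper names `samples_to_wav_bytes` and `build_request` are hypothetical, the upload name `audio.wav` is an assumption, and no network call is made. The two endpoint URLs and the `model`, `language`, and `file` form fields follow the public OpenAI audio API.

```python
import io
import math
import struct
import wave

# Public OpenAI audio endpoints (the sketch never actually calls them).
TRANSCRIPTIONS_URL = "https://api.openai.com/v1/audio/transcriptions"
TRANSLATIONS_URL = "https://api.openai.com/v1/audio/translations"


def samples_to_wav_bytes(samples, sample_rate=44100, channels=1):
    """Encode float samples in [-1.0, 1.0] as 16-bit PCM WAV, entirely in memory.

    Hypothetical helper: it plays the role the SaveWav script plays in the video.
    """
    clamped = (max(-1.0, min(1.0, s)) for s in samples)
    pcm = b"".join(struct.pack("<h", int(s * 32767)) for s in clamped)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


def build_request(wav_bytes, translate=False, language="en"):
    """Return (url, form_fields, file_part) for a transcription or translation call.

    Hypothetical helper: it only assembles the pieces of a multipart request.
    """
    fields = {"model": "whisper-1"}
    if translate:
        # Translation always produces English, so no language field is sent.
        url = TRANSLATIONS_URL
    else:
        url = TRANSCRIPTIONS_URL
        fields["language"] = language
    file_part = ("audio.wav", wav_bytes, "audio/wav")  # assumed upload name
    return url, fields, file_part


if __name__ == "__main__":
    # Half a second of a 440 Hz tone stands in for microphone input.
    tone = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(22050)]
    url, fields, file_part = build_request(samples_to_wav_bytes(tone))
    print(url, fields, len(file_part[1]), "bytes")
```

To actually send the request, the returned pieces could be passed to an HTTP client together with an `Authorization: Bearer <API key>` header, with the file part submitted under the form field name `file`. Keeping everything as bytes mirrors the choice made in the video: nothing is written to disk between recording and upload.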
Info
Channel: Sarge
Views: 13,780
Keywords: openai, chatgpt, ai npc, chat gpt, open ai, game ai, dalle, dall-e, unity open ai, unity ai, unity game ai, gpt3, gpt4, gpt, madewithunity, made with unity, indiegame, indiedev, indie game, ai game, game npc, chat app, chat gtp, indie dev, unity package, how to make unity package, how to use openai, how to use gpt, how to use chat gpt, chat gpt in unity, gpt in unity, use gpt in unity, gpt3 in unity, gpt4 in unity, speech to text, text to speech, ai translation, ai speech
Id: hZVXklCA1Us
Length: 5min 31sec (331 seconds)
Published: Sat Apr 01 2023