OpenAI Whisper C++ on Raspberry Pi 5 - Real-time translation and transcription

A bunch of people have been asking me, both online and in person, whether the Whisper model supports French, Japanese, or other languages. I didn't really know the answer until now, and the answer is yes. Interestingly enough, Whisper does speech-to-text transcription, but it also does language translation.

Here are the different models available. There are four English-only models, whose names end in .en, and five multilingual models. The Raspberry Pi 5 will handle up to the small model, and the Raspberry Pi 4 will only do the tiny, so I'm going to try the small and base models on the Raspberry Pi 5. I'll put links to everything in the description, including the previous install video for the Raspberry Pi 5. Also, here's the Whisper page on OpenAI's website, which talks about translating to English. It's pretty interesting stuff, so let's give it a shot.

To install the model, the process is the same as before; just use small at the end here instead of small.en, and then replace this with base. It takes about a minute for each one. When we're ready, we're going to run these, and these are the multilingual models.

All right, let's run our base model. We have a French weather video all queued up. [Music]

As you can see, it does a decent job. It's definitely not 100% accurate, but it'll give you the gist of what you're listening to and translate from French to English. Pretty cool.

All right, now we're going to try the same thing with Japanese, and we're going to switch to the small model. [Music]

We're going to give the base model a shot again. This is the multilingual model, and we're going to try Japanese this time. [Music]

In general, I noticed that the Japanese translation worked better with the base model than with the small model.

I'm going to try the same thing, just for fun, with Spanish. We're running the same setup here, with the Raspberry Pi 5 and a USB mic. Let's give this a shot and see what we get. [Music]

I'm pretty impressed by this, to be honest, for a few reasons. Number one, this is all happening on the device. The computation is done locally, so nothing's being sent to the cloud or coming back from the cloud. Number two, it's automatically translating from the detected language to English. Whether it's Spanish, French, or Japanese, it's detecting that, then translating, and then transcribing it. So I'm very impressed.

Now, the accuracy isn't quite as good as when we're doing English to English. For example, base.en.bin is going to be more accurate because we're doing English to English; the .en means English-only. But with the multilingual model, I'm still pretty amazed that this can all be done locally. I'll put all relevant links in the description. As you can see, this will translate from many different languages to English. Thank you for watching.
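
For anyone following along, here is a minimal sketch of the model naming and the commands involved. It is not the exact setup from the video. It assumes a whisper.cpp checkout with the `models/download-ggml-model.sh` script and a `./stream` binary that accepts `-m`, `-t`, `-l`, and `-tr` (translate), which is how I recall the stream example working; newer builds may name the binary differently, and `-l auto` for language detection is also an assumption. The model names beyond tiny, base, and small come from OpenAI's published model list, and the helper names are purely illustrative.

```python
import shlex

# Model names: the four English-only models end in ".en".
ENGLISH_ONLY = ("tiny.en", "base.en", "small.en", "medium.en")
MULTILINGUAL = ("tiny", "base", "small", "medium", "large")


def is_multilingual(model: str) -> bool:
    """A model without the ".en" suffix is multilingual."""
    return not model.endswith(".en")


def model_path(model: str, models_dir: str = "models") -> str:
    """File name used by whisper.cpp, e.g. models/ggml-base.bin."""
    return f"{models_dir}/ggml-{model}.bin"


def download_command(model: str) -> list[str]:
    """Command that fetches one model (assumed script location)."""
    return ["bash", "./models/download-ggml-model.sh", model]


def stream_command(model: str, translate: bool = True,
                   language: str = "auto", threads: int = 4) -> list[str]:
    """Command for live microphone input (assumed binary name and flags)."""
    if translate and not is_multilingual(model):
        # English-only models cannot translate from other languages.
        raise ValueError(f"{model} is English-only; use a multilingual model")
    cmd = ["./stream", "-m", model_path(model), "-t", str(threads), "-l", language]
    if translate:
        cmd.append("-tr")  # translate the detected language to English
    return cmd


if __name__ == "__main__":
    for name in ("base", "small"):
        print(shlex.join(download_command(name)))
        print(shlex.join(stream_command(name)))
```

The script only prints the commands so they can be checked before running them by hand on the Pi. Swapping `"base"` for `"base.en"` with `translate=True` raises an error, which mirrors the point above: the .en models are for English-to-English transcription only.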
