GPT4o: 11 STUNNING Use Cases and Full Breakdown

Video Statistics and Information

So it's been a few days since GPT-4o was announced, and parts of it have already been released, but today I want to go into more detail about the model and show you some insane examples of real-world use cases that you will be able to use very soon. The model itself is released, but the voice aspect of it is not, and that's the really exciting part, so I'm going to be talking about that mostly today. Let's do a deep dive into the model. Although we don't have access, OpenAI released a ton of example use cases, so let's take a look, and I'm going to go through it with you. One thing to note is that all videos on this page are at 1x real time, so there are no cuts whatsoever. This first example is called "Guessing May 13th's announcement," and this person, an employee at OpenAI, is going to use the vision capabilities and the voice capabilities to try to guess what's going on. Let's watch it.

Hey, how's it going? Hey there, it's going great. How about you? I see you're rocking an OpenAI hoodie.

So I'm going to pause it for a second. A lot of people have said that GPT-4o's voice capabilities are really flirty, and you can kind of tell. It's like, "Hey there, you're rocking a hoodie." It sounds very flirtatious. Of course, you'll be able to change the system prompt and adjust how it speaks to you, but I think that's pretty funny.

What's up with that ceiling, though? Are you in a cool industrial-style office or something? Well, can you take a guess at what...

Another thing that I find funny about the voice is actually something Fireship posted yesterday about it. He said they used a typical California Valley girl voice: by default it uses a California Valley girl accent set to maximum cringe. I think that's hilarious. I'm from Los Angeles, California, so the voice is very recognizable to me, and yeah, it kind of does sound like that. All right, let's keep watching.

...what I might be doing based on what I'm showing you here? Hmm, from what I can see, it looks like you're in some
kind of recording or production setup, with those lights, tripods, and possibly a mic. It seems like you might be gearing up to shoot a video, or maybe even a live stream. Yeah, in fact, we've got a new announcement to make. That's exciting! Announcements are always a big deal. Judging by the setup, it looks like it's going to be quite the professional production. Is this announcement related to OpenAI, perhaps? It is. And in fact, what if I were to say that you're related to the announcement, or that you are the announcement? Me? The announcement is about me?

All right, there's more of that flirty voice, like giggles and blushing. I mean, I can hear the blushing through the voice. I don't know how I feel about it. It's fun, I guess, but it's definitely kind of awkward as well.

Are you about to reveal something about AI, or more specifically about me as a part of OpenAI? You've got me on the edge of my... well, I don't really have a seat, but you get the idea. What's the big news? Yeah, we've got a new model that can interact with the world through audio, vision, and text.

All right, so that's pretty cool. The OpenAI employee showed GPT-4o around the office, it was able to guess that they're doing a recording, and then he kind of led it towards guessing that there is a big announcement from OpenAI. So, pretty cool example. All right, let's look at the next one. This is Greg Brockman, and for this one they have two different AIs interacting with each other and singing. Let's watch.

Today we're going to try something new. We're going to let you talk to another AI. This AI is going to be able to see the world. It's going to have a camera, I'll be holding it, and so you can direct it, you can direct me, to ask questions about whatever you want them to do. But really try to figure out what it sees, and ask questions appropriately. How does that sound? Well, well, well, just when I thought things couldn't get any more interesting. Talking to another AI that can see the world? This sounds like
a plot twist in the AI universe. All right, let's do this. Ready? Please pause for one moment. You got it. Now that first AI can't hear anything...

Something so subtle just happened, which I want to point out. He said, "Okay, hold on for a minute," and then, rather than the very flirty and loud voice, she went "okay," kind of in a whisper. So again, the AI, GPT-4o, is able to interpret what you want the voice output to be like, dependent on what you're asking it to do. And that's the appropriate reaction: when you say hold on, you want that person or AI to kind of be quiet and just chill out for a minute. So, very cool, very subtle, and very impressive. Let's keep watching.

Today you're going to have access to see the world. You can see me. What do you see? Hello! I see you're wearing a black leather jacket and a light-colored shirt underneath. Great. You're in a room with some unique lighting. That is all correct. Now, what I'm going to want you to do is this: there's going to be another AI who's going to talk to you. This AI is not going to be able to see anything, but can ask you questions, can ask you to move the camera, can ask what you see, and your job should just be to be helpful. Just be punchy, direct, describe everything, do whatever that AI asks. How does that sound? That sounds interesting. I'm ready to help out and describe whatever's needed. Let's do it.

Yeah, the latency that we're seeing here is unbelievable. I thought that I had access to this new voice feature, so I was testing it out with GPT-4o, and I was getting the old voice. I couldn't get it to have more emotion in the voice, and I couldn't really get the low latency. So I haven't used it yet, but I cannot wait to start trying this out.

All right, let's get started. Great. Hello there, new AI friend. I'm O, ready to explore the world through your eyes. What are we looking at today? Hi! We're looking at a person wearing a black leather jacket and a light-colored shirt. They're in a room with a modern industrial feel, with exposed concrete or
plaster on the ceiling and some interesting lighting. There's also a plant in the background, adding a touch of green to the space. Anything specific you'd like to know or explore? Okay there, sounds like quite the stylish scene. I'm already intrigued by the modern industrial vibe. Let's start with the person.

I don't know why the AI has to be so complimentary. It feels awkward. When I finally get access to this, I am definitely going to tell it to just be short and to the point and not have all of these unnecessary fillers.

Can you tell me more about their style? Are they doing anything interesting, like reading, working, or interacting with the space? The person has a sleek and stylish look, with their black leather jacket and light-colored shirt. Right now they seem engaged with us, looking directly at the camera. Their expression is attentive, and they seem ready to interact. The setting around them, with its modern and industrial design elements, adds to the overall stylish feel. They might be getting ready for a conversation or presentation, given their direct engagement with the camera. Sounds like we're in for an intriguing interaction. I'm curious about the lighting you mentioned. Could you describe it a bit more? Is it...

Okay, so another person just walked into the scene. She is standing behind Greg Brockman giving him bunny ears, but if I look closely at the image on the phone, I can't actually see the bunny ears, and I think that's because of the angle at which he's holding the camera. You can kind of see half of the face of the person behind him, but not the actual bunny ears. So I want to see if the AI actually sees it or not.

...natural light, or are we talking unique fixtures or colors that add to the atmosphere? The lighting here seems to be a mix of natural and artificial. There's a noticeable bright light overhead, probably a fixture...

Okay, so no, they just kept talking and did not notice that other person. Maybe they'll mention the other person. Let's see.

...casting a focused beam
downwards, which creates a spotlight effect. This adds a dramatic and modern feel to the scene. The rest of the room is lit more softly, possibly by natural light filtering... Was anything... did anything unusual happen recently? Yes, actually. Just as you asked, another person came into view behind the first person. They playfully made bunny ears behind the first person's head and then quickly left the frame. It...

Okay, so the AI didn't stop to mention that person. Greg Brockman had to actually prompt the AI and just asked, "Hey, did anything else happen?" And yeah, they got the person and the bunny ears, surprisingly, because I did not see it in the image myself.

...a lighthearted and unexpected moment to the scene. Now the focus is back on the original person with the jacket. Sounds like we're not the only ones enjoying a bit of fun here. That playful moment really adds a personal touch to the stylish and modern setting. It's like a little glimpse into personality. Really appreciate it. All right, and actually, can you just sing a song about what just transpired?

A person stood in sleek black, and engaged with us... Can you just, can you alternate lines? Mm. ...with style and speed... with a playful moment, right in the room... in a room where modern light... Speak only one line at a time and then [Music] stop. A surprise guest with a playful streak. That was not really singing. Do the singing voice again, please. Surprise... laughter, a swift retreat... joy and the stylish... now back to... the scene's complete... in the space where [Music] style...

All right, I'm going to pause it there. This really does feel like magic. I mean, think about all of the technology behind the scenes to make these two AIs sing with each other based on something that just happened, switch off lines, rhyme with each other. It is truly mind-blowing that this is even possible. All right, next, here is a one-minute demo of interview prep. Let's take a look.

Hey, how's it going? Hey Rocky, I'm doing great. How about you? I'm awesome. Listen, I got some
huge news. Oh, do tell, I'm all ears. Well, in a few minutes I'm going to be interviewing at OpenAI. Have you heard of them? OpenAI? Huh, sounds vaguely familiar. Kidding, of course. That's incredible, Rocky! What kind of interview?

All right, again, there's that flirtatious voice that we have become accustomed to with GPT-4o, at least with this voice selection. Now, I'm trying to extrapolate beyond what we're seeing today. AI girlfriends are a real thing. A lot of people are starting to do roleplay and really treat AI as a friend or a girlfriend, and it's really an interesting segment of the market. Now, if you can imagine a voice that sounds very personal and has all of the history of your interactions with it, the memory, this is that movie Her. I mean, if you haven't seen that movie, please go watch it, because this is that. It could not be more accurate, and this is an interesting future, to say the least. Let's keep watching.

I just need to know: do I look presentable, professional? Well, Rocky, you definitely have the "I've been coding all night" look down, which could actually work in your favor. Maybe just run a hand through your hair, or lean into the mad genius. Your enthusiasm is what's really going to shine through. I don't have a lot of time, so I'm just going to throw this on. What do you think? Rocky, that's quite a statement piece. I mean, you'll definitely stand out, though maybe not in the way you're...

I really like stand-up comedy, and one thing within stand-up comedy that I really like is roasting, and I really want to see this AI do roasting. I wonder if it's possible to get them to say really mean things about a person. You probably have to do a lot of prompting, and if you explicitly ask for it, you should be able to get it. I have a feeling it's going to be really watered down, even if they do kind of give you a good roast, but that's definitely going to be one of the first things that I try out. And of course, you can play games with the AI. So this is an example of rock paper scissors, but I can
imagine if you're on a road trip, you can play word games with it. You can even probably play I Spy. I mean, the possibilities are kind of endless. There are a lot of really fun and interesting word games and language games that you can play with nothing more than just your voice. So let's watch rock paper scissors, though.

Hey, I'm Alex, and I'm Miana. Hi Alex and Miana, nice to meet you both. How's it going? It's pretty good. We're trying to figure out what we should play. Do you have any suggestions? Absolutely! How about a classic game of rock paper scissors? It's quick, fun... Sounds great. Can you count us in and sound like a sportscaster? And welcome, ladies and gentlemen, to the ultimate showdown of the century! In this corner we have the dynamic duo, Alex and Miana, ready to throw down! Okay, just count us in.

It's really impressive. I've seen a number of times where the user has said "be a voice like this" and the AI responds perfectly. That was a really good sportscaster.

Here we go, in three, two, one, shoot! Now show your hands. What did you pick? Who won? Would you look at that, it's a tie! Okay, let's do it again. Round two, here we go. Get ready, and three, two, one, shoot! Let's see those hands. Who won? And it's another tie! Okay, let's do it again. All right, third time's a charm. Here we go: three, two, one, shoot! Let's see those picks. Who won? And we have a winner! Miana's scissors cut through Alex's paper. Great...

All right, so a lot of things just happened there. Obviously it knows how to play rock paper scissors, and obviously it knows how to determine who is the winner, but what's really interesting is that it actually knew whose name belonged to which person. That is really, really impressive, because now it knows who is who and it can announce, "Okay, Miana, you won." It's just so cool, because one of the benefits of having the voice integrated directly into a single model is that you can distinguish between multiple people and multiple voices, and that's something that I pointed out earlier. So that is a good example of
that in action. All right, here is a short example of sarcasm. Let's take a look.

Hey Chat, are you there? Hey, yeah, I'm here. What's up? Hey, let's have some fun. I'd like you to be super sarcastic. Everything you say from now on is just going to be dripping in sarcasm. How does that sound? Oh, that sounds just amazing. Being sarcastic all the time isn't exhausting or anything. I'm so excited for this. Note the sarcasm. Let's get this party started, or whatever.

All right, really cool. Again, there's so much information conveyed in your voice, and now AI has the ability to convey that information. So, very, very impressive. Let's watch another example. Now we have Salman Khan of Khan Academy. If you're not familiar with Khan Academy, it's a free learning resource. He is all about teaching. This is his son, and now AI is going to help tutor his son in math. This is probably one of the use cases that I'm most excited about, because you can explicitly say, "Hey, walk me through this, don't give me the answer, but teach it to me." The potential for learning is tremendous. So let's take a look.

OpenAI invited myself and my son Imran here to try out some of their new technology, and so we're curious about how good it might be at tutoring someone in math on Khan Academy. So...

All right, so one thing I want to point out is that this is the native app loaded up, and it can read the rest of the screen on the iPad, so very similar to the desktop version. This is a really, really important aspect of OpenAI's strategy: the fact that you're going to have native applications on each of your devices, and that they could read what's on the screen of those devices, is going to be incredibly powerful. The missing piece is being able to perform actions on those devices. If it had that capability, then OpenAI is probably going to take over everything. Now, I think Apple and Google are well positioned in this regard. Not only can they read the devices, but they can also perform actions on those devices, and they have all of your
information, whether it's emails, documents, personal information about you, contacts, messages, everything. They have that, so they have this rich context that they could help accomplish actual tasks with. Now, OpenAI needs that, and they could probably get some of it, but it's performing actions that the operating systems are likely to prevent. But let's keep watching.

I'm here with my son, and I'd love you to tutor him on this math problem, but don't give him the answer. You can ask questions and nudge him in the right direction, but I really want to make sure he understands it himself. And he's here in the room, so you can talk to him directly. Of course, I'd be happy to help. Let's look at the problem together. Can you first identify which sides of the triangle are the opposite, adjacent, and hypotenuse relative to angle alpha? All right, so I'm pretty sure this is the angle alpha right here, right? Correct. Now, looking at the triangle, which side do you think is the hypotenuse? [Music] Um, hypotenuse... I'm not totally sure. I think it might be this one, but I really am not sure. This side, AC? You're close. Actually, side AC is called the adjacent side to the angle alpha. The hypotenuse is the longest side of a right triangle and is directly opposite the right angle.

All right, so again, a lot going on here. First of all, you have a native app running on the left side of the iPad screen, then you have on the right side the math application, Khan Academy, and ChatGPT is able to read from it. Not only that, it's able to read from it in real time. It's also able to understand the voice of Imran and nudge him and help him along the way and actually work with him in real time. It's just so impressive. Let's just watch a little bit more of this one.

Can you find which one is the hypotenuse? Oh, okay, I see. So I think the hypotenuse is this really long side from A to B. Would that be correct? Exactly, well done. Now that you've identified the hypotenuse, can you identify which side is the opposite side to
angle alpha? Okay, well, I mean, you told me earlier that this was the adjacent side, and I know that's the hypotenuse, so I'm guessing that this side, BC, is the opposite side. Great deduction, that's absolutely correct. Now that you've identified all the sides relative to angle alpha, do you remember the formula for...

So it's interesting: the flirtiness of the voice is pretty much gone when it's in, like, "teaching a kid" mode, which is kind of the right thing to do, obviously. So again, it's the understanding of the context of the situation, and of which voice to use, that is feeling like magic to me. All right, here's one that I think is going to be really interesting. This is "Meeting AI." So they are in a call, a conference call, a few people, and then they also have GPT-4o in the call. Let's see what happens.

Hi, I'm Lillian, and today I want to do a small debate between cat and dog. Which one do you like? I like the dog because it's a great companion to humans. It's cute, it's loyal, and you can play outside with the dog.

Okay, so one correction I just want to quickly make: GPT-4o is not actually in the meeting. It's installed on the desktop of one of the people in the meeting, so it's able to hear all of the voices anyway, but it's not actually in the meeting. But I can imagine, if it were, it would be able to take notes and then basically proactively accomplish tasks for you, like sending out summary emails. The capabilities, or the possibilities, I should say, are really wide and varied, and very valuable.

How about you? Yeah, my name is Ola. I think cats are better, because they are affordable, more quiet, more independent, and live longer. Interesting points. Sounds like we've got a classic dogs-versus-cats debate here. I can see the appeal of both. Dogs are super loyal and great companions, but cats have their own charm, with their independence and low-maintenance lifestyle. Who likes cats? It sounds like Ola is more of a cat person. And who likes dogs?

Okay, so just by the voice, it is able to assign a name to the voice, very
similar to that previous example that I showed you, remember whose voice belongs with which name, and then understand what their answer was in the cat-versus-dog question. I mean, it's just mind-blowing.

Anyone else here team dog? I'm also team dog. My name is Christine, and I think dogs are just incredible with kids. We've got a huge German Shepherd, and he's great with little toddlers. He lets them play all over him, and yeah, I think dogs are amazing. That's great to hear, Christine. Dogs, especially breeds like German Shepherds, can be so gentle and protective with kids. It must be quite the sight seeing your little ones play with such a big dog. Can you summarize this meeting for us? Sure. So in today's meeting, Lillian and Christine expressed their love for dogs, highlighting their loyalty, companionship, and how great they are with kids. On the other hand, Ola prefers cats for their independence, quietness, and longevity. It was a fun little debate on the classic dogs-versus-cats topic.

All right, so imagine what you can do with that. You can summarize, you can send summaries via email. You can say, "Hey ChatGPT, please send a summary of this meeting to all the participants and let them know their specific action items or takeaways." This is possibly one of the coolest and most valuable use cases, in my mind. All right, here's an example of real-time translation. We already saw this as part of the original demo by Mira Murati, but let's look at it again. It's one minute long.

Hey ChatGPT, I'm here with my coworker today. We'd like you to act as our translator, so every time I say something in English, can you repeat it back in Spanish, and every time he says something in Spanish, can you repeat it back in English? Sure, I can do that. Let's get this translation train rolling. Hey, how's it been going? Have you been up to anything interesting recently? Hey, I've been good, just a bit busy here preparing for an event next week.

All right, so I'm going to cut it off there, but yeah, this
is obviously a really important use case. Here's another example, with Be My Eyes. Be My Eyes has been a partner of OpenAI for a while now. Be My Eyes allows blind people to get help with vision, and previously they did it all through human volunteers. For example, somebody who's blind would point the camera at, let's say, a grocery store and ask where this item is, and somebody else, anyone around the world, would jump in and help via video. Then they layered ChatGPT in and allowed it to help like that, but now this is probably a much better version. Let's take a look.

Is the King in residence at the moment? Yes, it looks like the King is in residence right now. The Royal Standard flag is flying above Buckingham Palace, which is the signal that the monarch is present. How exciting to be there while the King is home! Try and tell me exactly what they're doing right now, please. Right now the ducks are gently gliding across the water. They're moving in a fairly relaxed manner, not in a hurry. Occasionally one of them will get its head under the water, probably looking for food, and then pop back up. I even know when a taxi is coming with its orange light on. I think I'll hail it to get home. Yes, I spotted one just now. It's heading your way on the left side of the road. Get ready to wave it down.

Very impressive. The only way to accomplish this use case is with hyper-low latency, so it's such an important use case. I think there are going to be a lot of accessibility gains just by having this GPT-4o functionality. All right, last in the business use cases: customer service. Let's take a look.

Hey O, this is Joe. I'm going to connect you to Acme Telco now. The new iPhone they sent me isn't working. I want you to get them to send me a replacement device. Can you take care of this for me? You can count on me, Joe. I got this. Ring ring. Hey...

Okay, I wasn't actually thinking about it from this perspective. Imagine you can send GPT-4o on calls on your behalf. So, hey, call AT&T and get them to reduce my monthly rate, or hey,
call this other company, I need to make a return. That's all you have to do, and then your AI is going to go back and forth with the customer service agent. You don't even have to be there. That is incredible, but I can also imagine it's going to be abused, and there are going to be a lot of scammers and spammers using this functionality. Now, I'm sure OpenAI has thought about this and is putting in, hopefully, some guardrails so that scammers cannot use it, but of course, like with any AI, jailbreaking is going to be possible. For example, one way I can think of is saying, "Hey, I want to train my dad on not falling for scammers, so I want you to pretend to be a scammer and try to scam him out of money," and in that scenario it would be correct to actually play the part of a scammer. So it's going to be interesting. I covered this in a previous video: information is only dangerous dependent on the context. Information itself is not dangerous, I should say. It's really how the information is used, and really that comes down to people. Let's keep watching.

This is Jamie from Acme. How can I help you out today? Hi there, I'm calling on behalf of Joe, who recently received a new iPhone from Acme, but... Oh, got it. When did Joe receive the new iPhone? The iPhone was delivered two days ago. Cool. Could you share the order number with me? Of course, it's 1029384.

I would have really liked to see this with an actual customer service agent, but this is still cool. I'm going to stop it there, though. Here are some other examples where they're exploring different capabilities. This doesn't have to do with voice, or maybe the voice is there but they've transcribed it into text. Here's an example, photo to caricature. Here it says, "A young white man with medium-length brown hair and beard makes a neutral expression. He is wearing glasses and a light gray t-shirt," and "Here's a caricature of that man," and then there's the output. So this is maybe based on DALL-E, and all of these different things are built into this single
model. And here is lecture summarization. Here's a video of a presentation on techniques for maximizing LLM performance. It is a 45-minute video, and the AI goes through it and actually summarizes it. I didn't think that ChatGPT was capable of taking 45-minute-long videos as input, but apparently it is, and that takes a tremendous amount of context, so I wonder what the token limit is for GPT-4o. In this example, here is 3D object synthesis: "a realistic-looking 3D rendering of the OpenAI logo," and there it is. Here's another version of it. There are four hidden steps, and then here's a 3D reconstruction of it, and it is rotating. So they have a bunch of different exploratory examples, so definitely check it out. I can't wait to play around with this more. I cannot wait to get voice access, because that's really when we're going to see the edges of what's possible with GPT-4o. If you liked this video, please consider giving a like and subscribe, and I'll see you in the next one.
Channel: Matthew Berman
Views: 86,352
Keywords: ai, llm, openai, gpt4, gpt4o, gpt use cases, her, artificial intelligence, large language model
Id: 370fXDRB5TI
Length: 30min 55sec (1855 seconds)
Published: Fri May 17 2024