OpenAI CEO STUNS Everyone With Statements On GPT-5 (GPT-5 Update)

Captions
So Sam Altman, in a recent interview, spoke about many different topics, including GPT-5, and I believe that many people missed his recent statements about what the future models that OpenAI is building are going to be like. So in this video I'm going to tell you exactly what he was talking about and everything we know about GPT-5 so far, because there's been a decent amount of information released in the past week.

"For a while we could predictably say: this much compute, this big of a neural network, this training data, and these will be the capabilities of the model. Now we can predict how it'll score on some tests. What we're really interested in, which gets to the latter part of your question, is: can we predict the qualitative new things, the new capabilities that didn't exist at all in GPT-4 but do exist in future versions like GPT-5? That seems important to figure out. But right now we can say, here's how we predict it'll do on this eval or this metric."

So essentially what he's saying is that since we know how these models have performed on previous evaluations, we can loosely predict what GPT-5 will be capable of on certain tests. What I'm going to show you right now is a screenshot of how GPT-3.5 and GPT-4 performed on some standard benchmark tests. Now, if you don't know what these benchmark tests are, let me read one of them: MMLU is a benchmark designed to measure knowledge acquired during pre-training by evaluating models exclusively in zero-shot and few-shot settings. This benchmark covers 57 subjects across STEM, the humanities, the social sciences and more; it ranges in difficulty from an elementary level to an advanced professional level, and it tests both world knowledge and problem-solving ability. Tests like these let us understand exactly what the AI is capable of, and it becomes increasingly easier to predict these models in terms of their capabilities. Now the only
problem that we're going to have with GPT-5 and future models is that we won't know the emergent capabilities, and this is what Sam Altman is talking about. You see, emergent capabilities are abilities we simply can't predict in advance. In a recent AI talk, which has now gained over 2.1 million views, there's a small clip that I want to show you all, because it is very key to understanding exactly what these emergent capabilities are like and how we came to find them:

"Some people use the metaphor that AI is like electricity, but if I pump even more electricity through the system, it doesn't pop out some other emergent intelligence, some capacity that wasn't even there before. So with a lot of the metaphors that we're using, paradigmatically, you have to understand what's different about this new class of generative large language model AIs. This is one of the really surprising things talking to the experts, because they will say these models have capabilities, and we do not understand how they show up, when they show up, or why they show up. What you see here, and I'll move on to some other examples that might be a little easier to understand, is that you ask these AIs to do arithmetic, and they can't do it, they can't do it, they can't do it, and at some point, boom, they just gain the ability to do arithmetic. No one can actually predict when that will happen. Here's another example: you train these models on all of the internet, so they've seen many different languages, but then you only train them to answer questions in English. So a model has learned how to answer questions in English, but you increase the model size, you increase the model size, and at some point, boom, it starts being able to do question-and-answer in Persian. No one knows why."

What we see here is of course truly fascinating, but at the same time it is quite concerning. Imagine creating something when you cannot predict what it
is going to be able to do, and what we're creating is, of course, potentially going to be superintelligent in the future. Now, there are more examples that they delve into, and the next one is honestly very interesting, because we didn't know about certain capabilities until we devised the tests for them. It's very interesting to see what these models are capable of once we gain an understanding of different and new concepts, like theory of mind, for example; they talk about how theory of mind has increased since 2018-2019. Here's another example:

"So, AI developing theory of mind. Theory of mind is the ability to model what somebody else is thinking; it's what enables strategic thinking. In 2018, GPT had no theory of mind. In 2019, barely any theory of mind. In 2020, it starts to develop the strategy level of a four-year-old. By January 2022, it's developed the strategy level of a seven-year-old, and by November of last year, it's developed almost the strategy level of a nine-year-old. Now, here's the really creepy thing: we only discovered that AI had grown this capability last month."

Definitely thought-provoking. But the most important thing I found in Sam Altman's recent statements about GPT-5 was where he talked about future modalities. We all know that everyone has been so focused on text, but across many different AI developments we know that text is only one modality; across many research papers and many companies, people are starting to raise the quality of other modalities. This essentially means things like audio, video and other ways of communicating. Now, what we need to understand is that Sam Altman here talks about AGI, and he also discusses how future models, namely GPT-5, are likely not to be just text-based. We know that GPT-4 already did show some image capabilities in the developer livestream, which we'll talk about later, but take this clip from Sam: there are a lot of things
about coding that I think make it a particularly great modality to train these models on, but that won't, of course, be the last thing we train on. I'm very excited to see what happens when we can really do video. There's a lot of video content in the world, and there are a lot of things that are, I think, much easier to learn from video than from text. There's a huge debate in the field about whether a language model can get all the way to AGI. Can you represent everything that you need to know in language? Is language sufficient, or do you have to have video? I personally think it's a dumb question, because it probably is possible, but the fastest way to get there, the easiest way to get there, will be to have these other representations, like video, in these models as well. Text is not the best for everything, even if it's capable of representing everything.

Sam just told us that text is not the best way to represent everything, and of course this is true: people don't just talk to each other; we have body language, we have music; there are many different forms of communication that help us understand the world today. So we know that future models, maybe GPT-4.2, maybe GPT-5, are likely to have video features, or photo features like GPT-4 will have, down the line, and it's very interesting to see how this is going to be developed, because this is going to be a new frontier. Now, there are some companies working on combining these modalities. Like Meta: a couple of days ago they released something called ImageBind, where they try to combine all the different modalities and essentially create one unified way of communicating between them, which is really interesting. And we know that OpenAI already created DALL·E 2, an AI system that can create realistic images from a description in natural language. So we know that the possibilities are there, and we also know that some companies, like Microsoft, have actually gone ahead and
released small projects like Visual ChatGPT, in which they explore how talking about an image can actually work when integrating it with a large language model and using OpenAI's DALL·E 2. So this is something called Visual ChatGPT; it is a free demo that you can actually use (we've done a video on it before), and it is really exciting to see what these future models are going to be like. Now, this isn't something that is 100% streamlined, meaning it is quite slow to respond, but this is likely what the future of ChatGPT is going to be like, because, as we know, text is only one modality; we are missing several other modalities, such as audio, image and of course video.

Now, something we also need to remember is that in the GPT-4 livestream, when GPT-4 was announced, they also announced that GPT-4 is of course going to be multimodal. The only problem is that they haven't actually released this yet. Many are speculating as to when this update is going to arrive; the only thing we're thinking is that they need to fine-tune it, because, as you know, people may submit many different images, and there could be many privacy issues and other issues that we haven't thought of yet. So this is likely to be released at some point, but we aren't too sure when, because OpenAI haven't given us a timeline just yet.

Something that is interesting, when you want to talk about timelines for GPT-5, is that GPT-4 finished training in August of 2022. That means they held it for around seven months before they decided to release it, because one thing that you know about these AI teams is that they always want to ensure these AI models and large language models are extremely safe, since when they release them to the public, they're going out to a hundred million people in many different countries. But it just goes to show that these things do take a long
time to train, namely around six months, and remember, it was released in March of 2023, so that is a pretty long time. Now, Sam Altman did recently talk about when he's going to start training GPT-5, or rather, he actually said he's not currently training GPT-5 and won't be for the next six months, which means that, speaking in terms of timeline, it's likely that we won't get GPT-5 at all this year:

"Do you agree with that? Would you pause any further development for six months or longer?" "So, first of all, after we finished training GPT-4, we waited more than six months to deploy it. We are not currently training what will be GPT-5; we don't have plans to do it in the next six months."

That statement was from when Sam Altman had to testify before Congress about the rapid rise of AI development, and essentially, as you know, he just stated that they won't be training GPT-5 for at least the next six months. So it's very likely that we won't get GPT-5 this year. But we do now know, from Sam's recent statements, that GPT-5, and the later versions of GPT-4 leading up to it, are likely to focus less on text and more on other modalities, including image and video.

Now, there's one major thing that I think nobody has taken into account just yet: I do think the modality of audio is likely going to be a feature of GPT-5, or of the later versions of GPT-4, and here's why. If you read OpenAI's research blog posts, they have one post which hasn't really been covered by anyone that I've seen, and it's called Whisper. It says: "We've trained and are open-sourcing a neural net called Whisper that approaches human-level robustness and accuracy on English speech recognition." The applications for this could be pretty much anything, and we know that with future modalities, including image, video and audio, it is likely that OpenAI already has the infrastructure to be able to build
this. So I wouldn't be surprised if, in future updates, ChatGPT, GPT-4 or GPT-5 includes the ability to hold a conversation from audio, or for us to be able to talk to ChatGPT in real time, and it seems like the software they've been able to build is really, really good. So let me know what you all think about GPT-5, whether you are anticipating it, or whether you are just patiently waiting and want more features for GPT-4.
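For readers curious how benchmark scores like the MMLU numbers discussed above are actually produced, here is a minimal sketch of a few-shot multiple-choice evaluation loop. The prompt layout and the `stub_model` stand-in are illustrative assumptions, not OpenAI's real evaluation harness.

```python
# Sketch of an MMLU-style few-shot evaluation. The A/B/C/D prompt format
# and the stub model are hypothetical, for illustration only.

FEW_SHOT = [
    ("What is 2 + 2?", ["3", "4", "5", "6"], "B"),
]

def format_question(question, choices, answer=None):
    """Render one multiple-choice question in a typical A/B/C/D layout."""
    letters = "ABCD"
    lines = [question]
    lines += [f"{letters[i]}. {c}" for i, c in enumerate(choices)]
    lines.append(f"Answer: {answer}" if answer else "Answer:")
    return "\n".join(lines)

def build_prompt(question, choices, shots=FEW_SHOT):
    """Few-shot prompt: k solved examples, then the unsolved question."""
    parts = [format_question(q, c, a) for q, c, a in shots]
    parts.append(format_question(question, choices))
    return "\n\n".join(parts)

def evaluate(model, dataset):
    """Accuracy of `model` (a fn: prompt -> letter) over (q, choices, gold)."""
    correct = sum(model(build_prompt(q, c)) == gold for q, c, gold in dataset)
    return correct / len(dataset)

def stub_model(prompt):
    """Toy stand-in for a real LLM call, so the loop is runnable."""
    return "B"

dataset = [
    ("Which planet is third from the Sun?", ["Mars", "Earth", "Venus", "Jupiter"], "B"),
    ("What is H2O commonly called?", ["Salt", "Water", "Sand", "Air"], "B"),
]
print(evaluate(stub_model, dataset))  # stub answers "B" everywhere -> 1.0
```

Swapping `stub_model` for a real API call is all it takes to score a model this way, which is why such scores are comparatively easy to predict ahead of time.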
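The "boom, it just gains the ability" pattern from the quoted talk is often drawn as a scaling curve that sits near zero and then jumps. A small sketch with made-up numbers shows what such a curve looks like after the fact; the point of the quote is precisely that the jump cannot be predicted beforehand.

```python
# Synthetic (invented) scaling data for an arithmetic-style task:
# (model size in billions of parameters, task accuracy).
scaling_curve = [
    (0.1, 0.01), (0.4, 0.02), (1.3, 0.02), (6.7, 0.03),
    (13, 0.05), (70, 0.62), (175, 0.78),   # sharp jump between 13B and 70B
]

def emergence_point(curve, threshold=0.5):
    """Return the first model size whose accuracy crosses `threshold`, else None."""
    for size, acc in curve:
        if acc >= threshold:
            return size
    return None

print(emergence_point(scaling_curve))  # 70
```

Note that this only locates a jump in data you already have; nothing here extrapolates where the next capability will appear, which is exactly the concern raised in the talk.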
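On the audio speculation: a rough sketch of how a speech front-end could sit in front of a chat model. `transcribe` and `chat` below are stubs with made-up outputs; with the real open-source Whisper package, the transcription step would look roughly like `whisper.load_model("base").transcribe(path)["text"]`.

```python
# Hypothetical voice-to-chat pipeline shape; both steps are stubbed so the
# sketch runs without any model downloads.

def transcribe(audio_path):
    """Stub speech-to-text step (a real system would run Whisper here)."""
    return "what is the capital of France"

def chat(prompt):
    """Stub language-model reply."""
    return f"You asked: {prompt!r}"

def voice_turn(audio_path):
    """One voice turn: audio file -> transcript -> model reply."""
    text = transcribe(audio_path)
    return chat(text)

print(voice_turn("question.wav"))
```

The design point is that speech recognition composes cleanly in front of a text model: the chat side needs no changes, which is why open-sourced Whisper plausibly supplies the audio modality the video speculates about.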
Info
Channel: TheAIGRID
Views: 286,320
Id: ucp49z5pQ2s
Length: 11min 41sec (701 seconds)
Published: Fri May 26 2023