GPT 5 Unveiled: Everything We Know So Far (Release Date, Parameter Size, Predicted Abilities)

Video Statistics and Information

Captions
GPT-5 will be the next level in artificial intelligence, and it will be developed by OpenAI. This video will cover everything you need to know about GPT-5, including timelines, how smart it's going to be, its different modalities, and many things you may not have thought of before. Since the inception of GPT-4, there have been hundreds of research papers that will fundamentally shape the way GPT-5 is created, and it is going to be built and trained very differently from GPT-4. This includes, but is not limited to, the thought process, the risks, the limitations, and the regulation of this model. So let's get into exactly what GPT-5 is going to be, based on everything we know, including statements from Sam Altman himself.

If we're going to talk about GPT-5, we first need to talk about the release date, one of the things people most want to know. It isn't impossible to gauge when GPT-5 will be released, because Sam Altman has made some key statements that indicate when we could expect it. So let's look at the timeline of GPT-4 and at what he has said about GPT-5, and use that to estimate when GPT-5 is likely to arrive.

First, you need to understand the three stages of building a large language model (or any AI system like it): stage one is data collection, where you gather the sources the model will be trained on; stage two is training the model on that data; and stage three is fine-tuning and aligning it. With GPT-4, the model was finished well before it was released. Although GPT-4 was released in March 2023, OpenAI actually finished training it in August 2022; they then spent seven months aligning the model and making it safe for public use. In addition, they started data collection in 2021, which means the inception of GPT-4 began around two years before its initial release, and we are likely to see
the same kind of timeline for GPT-5. So the question is: when are they going to start training GPT-5? I think I have a bit of an answer. If we look at this clip of Sam Altman testifying at a Senate hearing on artificial intelligence, he actually talks about GPT-5. In the hearing, he is asked about the open letter calling for a delay in developing any artificial intelligence tool more powerful than GPT-4, driven by concerns about how capable such a model would be, and this is where he gives us a glimpse of when OpenAI might start training GPT-5, or collecting data for it. Take a look at this clip: "What about you, Mr. Altman? Do you agree with her? Would you pause any further development for six months or longer?" "So, first of all, after we finished training GPT-4, we waited more than six months to deploy it. We are not currently training what will be GPT-5; we don't have plans to do it in the next six months. But I think the frame of the letter is wrong. What matters is audits." So what Sam Altman said there is that OpenAI will not train GPT-5 in the next six months. If we look at the dates, we can try to work out what this means: if it is May and they are not going to train GPT-5 for the next six months, training could begin around December, in the seventh month. It also suggests that data collection for GPT-5 may have already started, because the only thing he ruled out is training the model on the data they have collected. In addition, if you go to OpenAI's help pages, you can see that OpenAI does use consumer data to train future models, and they state that you can opt out of this via a specific setting,
although many may not do so. This is indicative of the fact that they might already be collecting data for GPT-5, and I do think that has already started. Based on training likely starting at the end of the year, around December, and assuming it takes another eight to nine months to train the model and another eight to nine months to align it, we could expect GPT-5 to be released sometime in late 2025. Of course, this is just a rough estimate, but it is a conservative one based on the training times, the dates Sam Altman himself has stated, and what we know. The one thing that might lead to GPT-5 being released earlier is increased competition: we know that Google is building something called Gemini, and it is expected to rival or even surpass GPT-4.

Then, of course, we have the context window. In the context of a large language model, the context window refers to the number of tokens the model considers as context when generating or predicting the next token in a sequence. Tokens are units of text, such as words or sub-words, that the model processes, and the context window is the parameter that determines how much context the model can take into account. For example, if the context window is set to 128 tokens, the model considers the previous 128 tokens when making predictions. Currently, GPT-4 exists in two versions: an 8,000-token context window and a 32,000-token context window. At the time, this was revolutionary, but since the inception of GPT-4 there have been major advancements in the ability to process large volumes of text. For example, we have already seen Anthropic release an AI with a 100,000-token context window. If you don't know it, their AI is called Claude; it's quite similar to ChatGPT 3.5, but this 100,000-token-context version can process entire novels and
entire books. This means that if you wanted to input a whole trilogy, an entire movie script, or multiple books, and then ask it about a specific word, it simply could, and the applications for that are going to be incredible. So it is likely that GPT-5, when it is released, will have a much larger context window. Additionally, since the inception of GPT-4 there has been a research paper called "Scaling Transformer to 1M Tokens and Beyond with RMT." In this paper, the authors demonstrated the ability to reach one million tokens and beyond with a recurrent memory Transformer, and it will be interesting to see whether this research is used in developing GPT-5, because a larger context window enables a much wider range of tasks. There is one small limitation, though: the model's ability to memorize, detect, and reason drops substantially after around 500,000 tokens, so I think there will likely be separate versions from what we currently have, perhaps at around a hundred thousand tokens.

Then, of course, we need to talk about different modalities. As you know, GPT-4 was announced with two modalities, text and image recognition, but these large language models haven't actually all had that functionality rolled out yet. As of recording this video, the only evidence we have seen of GPT-4 analyzing images came from a very small test group on Microsoft Bing, which means they are still rolling out GPT-4 with images as we speak.

Now that we've discussed the different modalities GPT-5 is likely to have, we also need to discuss the parameter size and how it is going to be trained. One graphic that was scattered around the internet is the famous image where you see GPT-3's parameter count and then GPT-4's. However, this simple image, where you see 175 billion and
100 trillion, isn't true. GPT-4's actual parameter count isn't publicly available, although many estimate it to be around 1 trillion parameters. This is partly because GPT-4 in its current state is much slower to respond than GPT-3.5, so that is not an outlandish estimate. However, recent developments in the artificial intelligence landscape have shown that a larger parameter count doesn't automatically make a language model better: upping the parameter count doesn't mean anything if your data is low quality. So the parameter count of GPT-5 is likely to remain unknown; GPT-4 was trained only on text, and if GPT-5 is also trained on images, that adds a huge number of parameters that we can't account for. As Sam Altman explains in a clip: we can predictably say, this much compute, this big of a neural network, this training data, and these will be the capabilities of the model; we can predict how it will score on some tests. What we're really interested in, which gets to a lot of your question, is whether we can predict the qualitatively new things, the new capabilities that didn't exist at all in GPT-4 but do exist in future versions like GPT-5. That seems important to figure out, but right now we can only say, here's how we predict it will do. There are a lot of things about coding that I think make it a particularly great modality to train these models on, but it won't be the last thing we train on. I'm very excited to see what happens when we can really do video; there's a lot of video content in the world, and a lot of things that are, I think, much easier to learn from video than from text. There's a huge debate in the field about whether a language model can get all the way to AGI: can you represent everything you need to know in language, is language sufficient, or do you have to have video? I personally think it's a dumb question, because it probably is possible, but the fastest way
to get there, the easiest way to get there, may well be to have these other representations, like video, in these models as well; text is not the best medium for everything, even if it's capable of representing it.

So, depending on the number of modalities GPT-5 is trained on, it could be larger or it could be smaller, but we do know that if GPT-5 were just text-based, the parameter count could be significantly smaller. That is because recent papers have shown that the quality of your data matters much more when training a large language model than raising the parameter count. Let me show you some examples of a smaller parameter count being more effective than a larger one when high-quality data is used for training. If they are going to train GPT-5, we can refer to a paper recently released by Microsoft called "Textbooks Are All You Need." In it, the authors essentially state that using high-quality data instead of low-quality data increased the effectiveness of their large language model roughly threefold, and the only thing they changed was the training data. This means that if OpenAI switches to the training methods we're about to talk about, we could see a similar jump in the quality and responsiveness of GPT-5 even if the parameter count does not increase. To summarize the phi-1 paper: they trained a language model with 1.3 billion parameters and it achieved results comparable to, or better than, large language models with 16 billion and 175 billion parameters, including GPT-3.5. It performed on par or better with significantly fewer parameters, which suggests GPT-5 doesn't need a huge number of parameters to be effective; what it needs is high-quality data. Additionally, something else that is likely to be done with GPT-5 results from a paper about showing your working. OpenAI did
release a paper which talked about how they increased the ability of the raw version of GPT-4 just by using a different training method. Essentially, they trained two reward models: one providing feedback on the final answer to a math problem, and another rewarding the intermediate reasoning steps. By rewarding good reasoning, the model achieved a surprising success rate of 78.2 percent on a math test, almost doubling the performance of GPT-4 and outperforming models that were rewarded only for correct answers. This approach of rewarding good reasoning steps extends beyond mathematics and shows promise in domains like calculus, chemistry, and physics. The paper highlights the importance of alignment and process supervision: training models to produce a chain of thought endorsed by humans, which is considered safer than focusing solely on correct outcomes. In short, when you get these large language models to think step by step, they roughly double their effectiveness simply through this chain-of-thought reasoning, which means the output you get is going to be about twice as good. GPT-5 is therefore likely to incorporate this into its mechanism for producing outputs: even if you put in a simple prompt without saying "let's think step by step," it will have that thought process natively, and the output is going to be much better.

Then, of course, we have another research paper which blows everything out of the water. As we discussed, the way in which you prompt GPT-4 or any large language model can improve its performance by 2x or 3x, and with GPT-5 they are trying to increase that capability further. This paper, called "Tree of Thoughts," increased GPT-4's reasoning ability on certain tasks by around 900 percent over the base model, just by changing the words you input to it. Essentially, in this paper they talked
about how they used Tree of Thoughts prompting. Essentially, every time the model can make a decision, there are about five different possible continuations; the large language model is asked to rate every candidate from five (the best) down to one (the worst). The most promising candidates are then expanded, and that same process is repeated to the end, ranking the different outputs and searching for the best result by exploring the possible paths. This is what increased reasoning performance by around 900 percent. So if Tree of Thoughts is implemented in GPT-5, which it most likely will be, it could increase GPT-5's reasoning by a huge amount. Along with the data and the very different training methods, the parameter size is hard to pin down, but I do think the quality of GPT-5 will be absolutely incredible.

Now, if we look at how smart GPT-5 is going to be: as we discussed earlier, there are countless research papers, with new ones coming out every single week, that showcase ways to increase the effectiveness of large language models without changing the models themselves. We know that better data is going to increase the capabilities, but one thing we haven't talked about is how GPT-5 is going to perform. GPT-4 was a huge leap from GPT-3.5 and is absolutely outstanding: it was able to pass the bar exam and scored around the 90th percentile on various tests that serve as benchmarks for artificial intelligence. Knowing this, it is currently estimated that if GPT-5 succeeds in its ability to reason and think critically, and incorporates this Tree of Thoughts way of thinking, it could theoretically score around the 99th percentile on pretty much every test there is. We know that it's already
great at math and already covers every subject there is; the only thing left is to fine-tune everything, which is why many people think GPT-5 will truly be very close to AGI. In addition, remember that GPT-5 will have images embedded in it, and we know that GPT-4's performance increased greatly when vision was added: many of the exam questions GPT-4 took were run both with and without vision, so in some cases it could see the diagrams and in others it could not, and when it could see them, GPT-4 with vision improved significantly.

Then, of course, we need to talk about the various limitations that GPT-5 will inherit. Although GPT-5 is going to be absolutely insane when you consider everything we've discussed, from larger context windows to image and audio input to new ways of thinking and prompting, GPT-4 and GPT-3.5 still struggle with the most basic concepts. You might think that's a ridiculous statement, but look at this TED talk, which documents how AI can be incredibly smart and also shockingly foolish because it cannot grasp basic common sense. The video will explain it better than I can: it shows simple common-sense questions that it doesn't take a genius to answer, but that GPT-4 continually gets wrong. AI is passing the bar exam; does that mean it is robust at common sense? You might assume so, but you never know. Suppose I left five clothes to dry out in the sun, and it took them five hours to dry completely. How long would it take to dry 30 clothes? GPT-4, the newest, greatest AI system, says 30 hours. Not good. A different one: I have a 12-liter jug and a 6-liter jug, and I want to measure six liters. How do I do it? Just use the six-liter jug, right? GPT-4 spits out some very elaborate nonsense: step one, fill the six-liter jug; step two, pour the water from the six-liter jug into the twelve-liter jug;
step three, fill the six-liter jug again; step four, very carefully pour the water from the six-liter jug into the twelve-liter jug; and finally you have six liters of water in the six-liter jug, which should be empty by now. With that, it's going to be interesting to see how they solve this issue. I'm not aware of any solutions yet, but it will be interesting to see if they even focus on it, because largely we gloss over these problems and just focus on the interesting stuff.

Next, now that you know we don't really understand exactly what AI is doing, we also need to talk about emergent capabilities. This is something we've spoken about previously, but you have to understand that GPT-5 is likely to be a few echelons better than GPT-4. This means that even if the parameter count stays the same, some emergent capabilities are going to appear in GPT-5 that we simply cannot predict, just as they appeared in GPT-4. One of GPT-4's most notable emergent capabilities was theory of mind, which is where an AI is able to reason about what other people are thinking in certain situations. This is particularly worrying if you consider how an AI could potentially manipulate humans into doing things for it, because these large language models have access to almost every piece of text on Earth, including books on persuasion, manipulation, and persuasion tactics. Now take a look at this clip, which perfectly explains emergent capabilities. You've likely seen it before, but for those who haven't, you really need to understand it, because this emergent-capabilities phenomenon is likely to be one of the reasons they don't release GPT-5 on time: if an emergent capability appears, the AI researchers at OpenAI will need to learn how to effectively contain it or potentially remove it. Some people use the metaphor that AI is like electricity, but if I pump even
more electricity through the system, it doesn't pop out some other emergent intelligence, some capacity that wasn't even there before. So a lot of the metaphors we're using break down; paradigmatically, you have to understand what's different about this new class of golem-class generative large language model AIs. This is one of the really surprising things when talking to the experts, because they will say these models have capabilities, and we do not understand how they show up, when they show up, or why they show up; again, not something you would say of the old class of AI. Here's an example: these are two different models, GPT and a different model by Google, and there's no difference in the models themselves; they just increase in parameter size, they just get bigger. What are parameters? Essentially, the number of weights in a matrix; so it's just the size, you're just increasing the scale of the thing. And what you see here, and I'll move on to some other examples that might be a little easier to understand, is that you ask these AIs to do arithmetic and they can't do it, and they can't do it, and at some point, boom, they just gain the ability to do arithmetic. No one can actually predict when that will happen. Here's another example: you train these models on all of the internet, so they've seen many different languages, but you only train them to answer questions in English. The model has learned how to answer questions in English, but you increase the model size, and you increase it again, and at some point, boom, it starts being able to do question-and-answer in Persian, and no one knows why. We've already seen that clip on emergent capabilities, but I think the next example will show you exactly why AI has these emergent capabilities. You have to understand that although we can see the output of artificial intelligence models like GPT-5 and GPT-4, we still don't know what they're
actually doing. Take a look at this tweet: these engineers never speak a word or document anything; their results are bizarre and inhuman. This person trained a tiny Transformer to do addition, then spent weeks figuring out what it was doing, one of the only times in history someone has understood how a Transformer actually works (and Transformers are essentially the building blocks of these large language models). You can then see the algorithm it created to add two numbers: a large, complex calculation just to add two simple numbers, which is pretty crazy if you ask me. It means that these AIs think completely differently from us. This example shows that the model represented basic addition as rotation around a circle, which goes to show that although it might tell us an answer, it doesn't tell us how it got there. That is what's so unnerving about AI: we would never have known it was thinking about rotations around a circle when performing simple addition, but it was. This is why we need to ensure these artificial intelligences are completely aligned, because if you release something like that into the public, the risks could be existential.

That brings us to one of the last points we need to talk about: regulation. Currently, there are many challenges in regulating AI given the speed of its development, but recently there has been an announcement that offers a little bit of hope: the UK is set to get early or priority access to AI models from Google and OpenAI. The UK Prime Minister stated that they are working with the frontier labs, Google DeepMind, OpenAI, and Anthropic, and that these labs have committed to give early or priority access to models for research and safety purposes, to help build better evaluations and help us better understand the opportunities and risks of
these systems. Additionally, the European Union is working on the AI Act, a global first that could set the benchmark for other countries. The legislation aims to regulate automated technology broadly, including algorithms, machine learning tools, and logic tools. The AI Act has been criticized by some European companies, such as Renault, Heineken, Airbus, and Siemens, for its potential to jeopardize Europe's competitiveness and technological advantages. The US has also proposed a Blueprint for an AI Bill of Rights, which covers aspects such as safe and effective systems, algorithmic discrimination protections, data privacy, notice and explanation, and human alternatives. The US is making progress in developing domestic AI regulation, including the National Institute of Standards and Technology's AI Risk Management Framework and existing laws and regulations that apply to AI systems.

And of course, many people are currently trying to restrict GPT-5. If you haven't heard already, there is that open letter, "Pause Giant AI Experiments," which calls on all labs to immediately pause, for at least six months, the training of AI systems more powerful than GPT-4. They state that recent months have seen AI labs locked in an out-of-control race to develop and deploy ever more powerful digital minds that no one, not even their creators, can understand, predict, or reliably control; therefore, they call on all AI labs to immediately pause, for at least six months, the training of AI systems more powerful than GPT-4. But as we know, this is very unlikely to happen, because we live in a capitalistic world, which means there is a lot of incentive to provide the best products. There are even reports that China's Baidu has claimed its Ernie Bot beats ChatGPT on a key artificial intelligence test as the AI race continues to heat up, and this will be something we cover in another video, because if other countries are
going to be working on trying to beat GPT-4, there isn't really an incentive to slow down unless there is some sort of global AI regulation body that can ensure they all slow down; and even if you do get the large companies to slow down, there is no guarantee there aren't solo coders in their rooms working on large language models that eventually surpass the bigger ones. In addition, we made another video where we talked about how GPT-5 is extremely risky: Google DeepMind has discussed how, with emergent capabilities and AI being able to learn rapidly, we have literally no idea what these models will be capable of, and Google has deemed any model greater than GPT-4, namely GPT-5, to be extremely risky. So that leads us to the question: are you excited for GPT-5, or are you more afraid? Because although GPT-5 is likely to be a huge advancement, there are a number of unfortunate consequences that could arise from it, such as job loss and the possibility of bad actors using these large language models, with jailbreaks, to harm society.
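The Tree of Thoughts procedure the transcript describes (propose several candidate "thoughts", rate each from one to five, expand only the most promising ones, and repeat) can be sketched as a simple beam search. This is a toy illustration of the idea, not the paper's implementation: in the actual paper, both the `propose` and `score` functions are played by the language model itself, whereas here they are stand-in Python callables.

```python
from typing import Callable

def tree_of_thoughts(
    root: str,
    propose: Callable[[str], list[str]],  # generate candidate next thoughts
    score: Callable[[str], int],          # rate a partial solution, 1 (worst) to 5 (best)
    depth: int = 3,
    beam_width: int = 2,
) -> str:
    """Breadth-first search over thoughts, keeping the best `beam_width` at each step."""
    frontier = [root]
    for _ in range(depth):
        # Expand every surviving state into its candidate continuations.
        candidates = [c for state in frontier for c in propose(state)]
        if not candidates:
            break
        # Rank candidates by the model's own rating and keep only the best few.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)

# Toy demo: build a three-character string where the scorer prefers 'b' over 'a'.
best = tree_of_thoughts(
    "",
    propose=lambda s: [s + "a", s + "b"],
    score=lambda s: min(5, 1 + s.count("b")),
)
# best == "bbb": the highest-rated path survives the pruning at every level
```

The key contrast with plain chain-of-thought prompting is that bad partial reasoning gets pruned early instead of being committed to, which is where the large reported gains on search-like tasks come from.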
Info
Channel: TheAIGRID
Views: 19,334
Id: mKpdsQe0sO8
Length: 27min 21sec (1641 seconds)
Published: Fri Jul 07 2023