How ChatGPT Works Technically For Beginners

Captions
So I've been using ChatGPT for the past two months. I'm a software programmer, and now about eighty percent of my daily coding is generated by ChatGPT and a few other AI code-generation tools I use alongside it, but mostly it's been ChatGPT. It's amazing how it has completely transformed the way I work every day. Throughout this journey I've felt excited and relieved, because some of the coding I do every day is repetitive and I've been doing it for almost 20 years, but at the same time scared, because in some cases when I tell it to code something it actually codes better than me, and I get to learn from it while I'm using it. So with all these feelings inside of me, I finally took the time to get to know how it works and how it's built: technically, but at a very high level. I didn't want to get into the math; I just wanted somebody to explain to me, in layman's terms, how it's built, how the scientists build it, and how it's produced. I couldn't find anything online that did that really well, so I spent a lot of time learning it myself and decided to make my own video: a complete beginner's guide to how ChatGPT is researched, discovered, built, developed, and released for everybody to use. So let's get right into it. I'm going to put these slides in the description of my video, with a link to my YouTube channel, so you can share them and use them if you want.

So what is ChatGPT, and why did it shock the world? ChatGPT is a conversational AI that is capable of carrying out an intelligent conversation with a human. You can think of it almost like the scene from the Iron Man movies where Iron Man says, "Hey Jarvis, I need to build this — can you run some diagnostics on this and this?" and two seconds later Jarvis comes back with the results, and Iron Man says, "That's surprising, can you look into it in more detail?" and Jarvis does more research and comes back, and they collectively build the thing together in a conversation. That means AI is now able to have intelligent, human-like conversations with a human.

AI research itself has been around for almost 80 years, so it's a pretty old field, and during all that time everybody tried to crack conversational AI, and everybody failed; it turned out to be an incredibly difficult task. The reason it's difficult is that natural languages — English, Korean, whatever other languages you speak — are not precise languages like mathematics. They are full of nuance, both in their grammatical structure (the sequence in which words appear in a sentence matters) and in context: the same word, pronounced the same way, means different things depending on what you're talking about, so you have to take in the whole context of what's being said in order to interpret each word. This is exactly why it takes even human babies a long time to develop their brains enough to learn to speak and carry on an intelligent conversation.

So when ChatGPT was able to carry on such a conversation — and in some cases, for certain topics, do it better than humans — it stirred up a lot of emotions in people: amazement, excitement, and fear. You can get better control of those emotions once you get to know what ChatGPT is and isn't capable of. If you're excited, you don't want to be over-excited and get disappointed when you find the limitations later; if you're scared, you may be needlessly scared simply because you don't know what it can and can't do. So let's get into the details of how it works; once we understand that, we can understand what it is and isn't capable of, and some of the clear differences between AI and human beings, and all these emotions get cleared up.

So how did the AI scientists crack conversational AI? It turns out the only animal on planet Earth that can do this massive processing of natural language and carry on a conversation is the human. So the scientists basically said: if we can't manually model this with mathematics, let's literally look at how our brain works, simulate that in a computer, and see what happens.

This program is completely reactionary. I say that because once you build this computer program that simulates our brain, it just sits there waiting on a server until it's thrown into a conversation. A user logs into the ChatGPT website and starts typing a paragraph or paragraphs of something they want to talk about, and at that moment the program takes all the text data the user typed on their
keyboard, runs it through the program, and gives out a response. It's completely reactionary: it sits there waiting to react. There's the input part and the generated-response part — input, output. That's all the AI program is.

So let's go back a little to how the scientists began looking at our brain. There was a scientist who won the Nobel Prize for this work. He took a rat's brain, sliced it very thinly, and dropped in a stain — like a watercolor — so he could better see the structure of the brain slices. He would put a slice under the microscope and observe different parts of the brain, which at first seemed to be a pretty random structure. What you see is neurons — you can think of them as cells — and in between the neurons there are thin connections where electrical signals travel. This is why MRI scans show this complex structure lighting up: there are electrical signals traveling through the neurons. Are the neurons just randomly connected? That's what I thought at first, but it turns out different patterns emerge. A part of the brain dedicated to memory is structured one way; a part dedicated to logical processing is structured another way. The actual neurons are the same, but their connections look different in different parts of the brain. That's what he found. I don't know exactly what he did to win the Nobel Prize, but these are some of the sketches he made.

In reality, neurons connect with each other in 3D space, because the brain is a 3D object: one neuron is surrounded by neurons in all directions, linked by those thin connections. Input signals come from all parts of your body — the optical receptors in your eyes, the touch receptors in your fingers, the smell receptors in your nose. Whatever your body parts sense gets converted into electrical signals and sent to your brain. When the input signals reach the brain, the initial input-receiving neurons get activated, and they propagate signals onward to the next connected neurons, so you see the brain lighting up, like an animation of lights going on. And they light up in different patterns: even when the signals come from the same source — your eyes — looking at a peach versus looking at a mountain sends different strengths of signal, and your brain lights up in different ways.

This is way too complex, especially too complex to program in a computer. So how did computer scientists simplify it into a very simple model? Forget about all the connections; just look at one single neuron and ask how we can boil it down to a simple concept. They took one neuron and started with a single input (neurons can have multiple inputs, but let's just talk about one input here). The input can receive an electrical signal from, say, zero to nine: zero being very weak and nine being very strong. Say it gets a signal of three. Every neuron behaves and activates differently, but this particular neuron, when it receives a three, activates and outputs a five, a one, and a nine along its three connections to three other neurons. So even though this neuron received a three, the first downstream neuron receives a five, the second receives a one, and the third receives a nine. If we have many, many neurons like this, all connected, in a computer program, are we mimicking what the brain is doing? Yes. You have not just one input neuron but multiple input neurons; you simultaneously send signals to them, they all light up, and the output neurons at the end get activated and give you human-understandable answers.

So let's talk about that. One of the most successful AI cases — the one that really made AI a thing — is image recognition, and this is where we started seeing the importance of neural networks. The scientists had thousands and thousands of images of dogs, birds, and cats. When they present a picture of a bird, it gets converted into signals at the initial input layer. The diagram here is a very small neural network, just to demonstrate what it looks like; in reality, to understand an image like this you would need far more.
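Going back to the single-neuron model for a moment: the behavior described above — a neuron that activates when its input signal of 3 arrives and then sends strengths 5, 1, and 9 to three downstream neurons — can be sketched as a toy function. This is only an illustration of the video's simplified example; real artificial neurons use weighted sums and smooth activation functions, and the threshold of 3 here is an assumption taken from the example, not a real parameter.

```python
# A toy "neuron" in the spirit of the single-neuron example above:
# it receives one input signal (strength 0-9), decides whether to
# activate, and if so sends a fixed strength along each of its three
# outgoing connections. All numbers are from the video's illustration.

def toy_neuron(input_signal, out_strengths=(5, 1, 9), threshold=3):
    """Return the signals sent to each downstream neuron."""
    if input_signal < threshold:            # too weak: neuron stays quiet
        return tuple(0 for _ in out_strengths)
    return out_strengths                    # activated: pass signals on

print(toy_neuron(3))   # activated -> (5, 1, 9)
print(toy_neuron(1))   # not activated -> (0, 0, 0)
```

Connecting many of these little decision units together — outputs of one becoming inputs of the next — is all a simulated neural network is at this level of description.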
You could potentially have hundreds of neurons — I don't know whether this many neurons is enough to decipher dog, bird, and cat. Whatever the number, you've got an input layer; those neurons decide to light up and activate and pass signals on to the next layer, and that layer passes signals to the next, and so on. At the end, because as humans we're interested in only three answers — just tell me whether this is a dog, a bird, or a cat — we have three output neurons. The question is: given the input image, what signals do we get from the neuron responsible for "dog," the one for "bird," and the one for "cat"? Our hope is that, given the image of a bird, we get a very strong signal from the neuron responsible for "bird" and very weak signals from the neurons responsible for "dog" and "cat." Then we can safely say, as a human being: okay AI, tell me what this is — oh, it's a bird.

So what is "training" an AI? We hear all the time that ChatGPT had to be trained. Like I said, we have thousands of images of dogs, birds, and cats. Initially the neurons are connected — in this case each neuron has a connection to every single neuron in the next layer — and when you input an image of a bird, the signals travel, and some neurons decide to activate while others don't: maybe this one passes signals to these two and ignores the rest. It's all random in the beginning. You start with a completely random neural network, you give it an image of a bird, and it answers "dog." That's usually what happens when you start randomly: it gives you the wrong answer almost every single time. The idea is that you have a training data set of, say, a thousand pictures of dogs, birds, and cats, and every time the network gives a wrong answer, you tell it: that was not the right answer, change your activation behavior a little bit. The neurons all adjust their activation behavior, you feed in the images again, test whether it gives the right answers, and you keep doing that over and over until it more or less starts giving the right answers. Then whatever is happening inside — what seemed to be purely random activations — starts to form patterns: feed in a picture of a dog, and certain neurons light up, eventually leading to the "dog" output lighting up. That's what training means. You start with a random set of connections and activations, you keep reiterating and telling the network "you're wrong, you're wrong," until it starts getting answers right — okay, do more of that. Eventually the activations and connections become the smarts inside an AI.

This particular design of neural network — input layer activating the next layer of neurons, which activates the next, and so on through to the output layer — is particularly good for image recognition, but it fails miserably at other tasks, especially natural language, like carrying a conversation. By the way, I think Google has the most advanced AI scientists — it's not even OpenAI, who made ChatGPT. The OpenAI scientists actually learned from Google's papers.
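The training loop described above — start random, punish wrong answers, nudge the behavior, repeat — can be shown at its absolute smallest with a single-layer perceptron. The two-feature "bird vs. not-bird" data, the learning rate of 0.1, and the update rule are all illustrative assumptions, not how ChatGPT itself is trained; but the shape of the loop (predict, compare to the right answer, adjust weights, repeat over the whole data set) is the same idea.

```python
import random

# Start with random weights ("a completely random neural network"),
# then every time the answer is wrong, nudge the weights toward the
# right answer, and repeat over the data set until it mostly works.

random.seed(0)
weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
bias = random.uniform(-1, 1)

# toy dataset: (features, label) where label 1 = "bird", 0 = "not bird"
data = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.1, 0.9], 0), ([0.2, 0.8], 0)]

def predict(x):
    s = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if s > 0 else 0

for _ in range(20):                       # go over the whole data set many times
    for x, label in data:
        error = label - predict(x)        # 0 if right, +1 or -1 if wrong
        if error:                         # "you're wrong -- change your behavior"
            weights[0] += error * x[0] * 0.1
            weights[1] += error * x[1] * 0.1
            bias += error * 0.1

print([predict(x) for x, _ in data])      # -> [1, 1, 0, 0]: all correct now
```

The "smarts" at the end live entirely in the adjusted numbers `weights` and `bias` — the toy equivalent of the connections and activations the video talks about.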
What the Google people realized is that we can come up with more than this one pattern of neural network: you can structure it so that sometimes the output of a layer even feeds back into an earlier layer. That's kind of crazy — here you only see one direction, from left to right, until you eventually get an answer — but instead, a neuron can look back at itself, or send its signal back into a previous layer, and the answers you get out of that start to become very interesting. They found these different patterns of networking the neurons, and a lot of it is based on observing the neurons in our own brain: how do the neurons in our brain actually connect? Is it really just unidirectional, left to right, like the simple case? No. The neurons in our brains are very complex; a neuron makes connections to other neurons, and a neuron five or seven steps down the road connects back to this one, and it's all jumbled up. So they simplified those crazy patterns and came up with some patterns that seem to work pretty well in a computer program. I don't know too much about these diagrams — it seems like the red is the output neurons where you get the final answer, the yellow is the input neurons that the initial input data hits, and whatever is in the middle does its crazy magic — but the important thing is that a lot of these patterns are borrowed from our biological brain. All we need to understand here is that, depending on the networking of the neurons, you can get very simple behavior and also very complex behavior.
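The feedback idea mentioned above — an output being fed back into an earlier layer — can be illustrated with a single recurrent unit. This is a deliberately stripped-down sketch (one "neuron," an arbitrary feedback weight of 0.5, no learning), just to show what the loop buys you: the same input can produce different outputs depending on what came before, which is exactly the kind of memory a conversation needs.

```python
# A minimal recurrent unit: each step's output depends on the new input
# AND the unit's own previous output (the feedback connection).

def run_recurrent(inputs, feedback_weight=0.5):
    state = 0.0                               # previous output, initially silent
    outputs = []
    for x in inputs:
        state = x + feedback_weight * state   # new input + feedback loop
        outputs.append(state)
    return outputs

# One pulse followed by silence: the input 0 keeps producing different
# outputs because the echo of the earlier pulse is still circulating.
print(run_recurrent([1, 0, 0, 0]))            # -> [1.0, 0.5, 0.25, 0.125]
```

A purely feed-forward network would output the same thing for every 0; the feedback connection is what lets the network carry context forward in time.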
Now then: how did ChatGPT come about? This is a paper from Google scientists, and I don't understand it in detail either, so I'm not going to try to explain it in detail. But the left-hand side is the initial part: when a human types an input chat message, it hits the part of ChatGPT's neural network that tries to understand what's going on — the context of the input. The next part generates the answer. So they broke it down into two parts: understanding the input, then generating the answer. I'm guessing the neural networks involved in each part are based on very complex patterns — nothing as simple as the diagrams earlier.

But it's also about the way you train them, and it's very similar to how humans learn. Human babies learn to speak their language even before they get formal education in school. You see kids at seven or eight years old talking to their mom and dad — in a very rudimentary, uneducated way, but they can still communicate. How are they doing that? It's not like mom and dad sit there teaching grammar to the kid. Mom and dad just speak to them; their uncle and all the other people around them talk to the baby about different things — sometimes a TV show, sometimes buying chocolate at the supermarket, whatever the topic may be. The baby sits there receiving all this information from the different adults and starts to find patterns — that's the complex neurons forming connections with other neurons — and if that
connection was a mistake, they disconnect it, form connections with other neurons instead, and change their activation behavior. All of that development happens inside the baby's brain, completely unsupervised: nobody is guiding the kid on what is right and what is wrong. The baby just takes it all in and then speaks out to test its current brain state, and if the feedback from the adults is not what it expected, it feels stressed, which causes the brain to drop its current neuron connections, reorganize itself, and test again in the real world. That process repeats over and over — just like the scientists training what starts as a completely random simulated neural network. So before going to school, kids go through this unsupervised find-patterns-in-chaos phase. Then they enter school and get formally educated, with very strict guidance: through the teacher or the exam scores, the school tells the kid which answers are better and which are worse. That is completely supervised learning — the fine-tuning. Most of the work has already been done by the unsupervised learning, where the baby took in all this complex information and found rough patterns of what conversations are about; school is the icing on top that fine-tunes the baby's brain with guided education.

ChatGPT does exactly the same thing. The part of its neural networks that has to understand the context of what's being talked about is trained with no human involvement: they gather all kinds of information from the internet and just throw it at the neural network, and the network starts to find rough patterns. This set of blog posts seems to be about something tech-related; these are about traveling; these are about marriage and relationships. It starts to find these patterns and group the blog posts and all the texts. OpenAI claims they scraped everything text-based from the internet — websites, blog posts, everything — and dumped it right into this neural network. So this network was trained completely unsupervised, and given any input text — whatever a user starts typing on the ChatGPT website — it can find the patterns and context of what that person is talking about, really fast. That's almost like what a kid does.

The actual answering part — talking the response back to the human on the ChatGPT website — is the task of another neural network, one that has been trained entirely with human supervision. What that means is that OpenAI literally hired thousands and thousands of people. The output of the context-understanding neural network — "here's what I understood this person to be talking about" — becomes the input to the response-generating neural network, and the response-generating network gives out a response in text, because it's a chatbot at the end of the day. That text gets read, judged, and scored by humans — the equivalent of school teachers. The human judges will say: no, that's not an acceptable answer. Sometimes it's flat-out wrong — one plus one is not three, it's two — and the humans tell ChatGPT it's wrong so it corrects itself. There are also cases where ChatGPT would give an answer like how to kill another human being, and the human judge says: no, that's immoral, you cannot give this kind of answer — rejected. With this kind of supervision, the response-generating network learns the ethics and the morals and starts to put out human-like responses. So it's important to understand that there are, by and large, two neural networks going on inside ChatGPT.

Now, the current state of ChatGPT. For the unsupervised part — the understanding-the-input network — it takes in, like I said, all the text data from the internet up to the year 2021, and training that network by dumping in all those blog posts and texts takes about one year. That's crazy: you have a training program tweaking the connections and activations of all the neurons in the network, running endlessly, 24/7, on vast numbers of GPUs, over the course of a year. The response-generating part, where the human judges are involved, takes about six months — I guess the humans work nine to five, because humans get tired. Once these two neural networks are trained and complete, they serve the customers — the users of the ChatGPT website, where we literally just go and type — for a period of time, until a new version is released.
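The human-judging loop described above can be caricatured in a few lines. Everything here is invented for illustration — the candidate responses, the scores, the `record_judgment` helper — and the real process (reinforcement learning from human feedback, where human rankings train a separate reward model over huge numbers of comparisons) is far more involved; but the core signal is the same: approved answers get pushed up, wrong or harmful answers get pushed down.

```python
# A cartoon of the supervised phase: human judges approve or reject
# candidate responses, and a running preference score accumulates.

preferences = {}   # response -> running preference score

def record_judgment(response, approved):
    """Nudge a response's score up or down based on one human judgment."""
    preferences[response] = preferences.get(response, 0) + (1 if approved else -1)

record_judgment("1 + 1 = 2", approved=True)    # correct -> rewarded
record_judgment("1 + 1 = 3", approved=False)   # flat-out wrong -> penalized
record_judgment("Here's how to harm someone...", approved=False)  # immoral -> rejected

best = max(preferences, key=preferences.get)
print(best)   # -> 1 + 1 = 2
```

In the real system the model's weights, not a lookup table, absorb this signal — but the table makes the direction of the feedback visible.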
So they're working on GPT-4, and I don't know how long GPT-4 will take; my guess is they'll follow the same pattern — a neural network for understanding the context of the input chat, and a neural network for generating the response. I don't know how many neurons and connections they're going to use, but they say way more. By the way, the more connections and neurons you use in these simulated neural networks, the smarter the AI gets. Compared to the number of neurons and connections humans have in our organic cells, we have way, way more — I heard something like ten thousand times more than what's simulated in ChatGPT at the current time. But GPT-4, the next version, is supposed to have way more neurons and connections, so it's supposed to be more sophisticated. And in between releases, these trained neural networks are set in stone: they're fixed, they don't change.

So let's compare ChatGPT and humans. Now that we understand what's involved in simulating our brain and creating this artificial intelligence, we can start to see the limitations — what it's good at and what it's not. Humans need to spend about 25 years to have a fully developed brain, which means the neurons' activations and the connections they make with other neurons are more or less stabilized. In a kid, the disconnecting, connecting, and all these changes happening in the neurons are very rapid and radical, all the time. There's even a case where a kitten got into an accident and the surgeon had to take out half of its brain; because it was a small kitten, its brain was able to reorganize itself completely, so that the surviving half took on the responsibilities of the missing half. That's amazing. But in a similar case with a mature cat, where part of the brain had to be carved out in the hope the cat would survive, the mature cat's brain could not reorganize, and it died. So for humans it takes about 25 years, and the neurons keep changing even after the brain is fully developed: we're constantly making these minor tweaks. It's very fluid — our neurons disconnect, connect, and change their activation behaviors every second we live.

Our neurons are also completely autonomous and super energy-efficient. Autonomous means every single individual neuron decides to make the necessary changes to itself, by itself, without any kind of governing body: some external stimulus — some signal — arrives, and the neuron decides, "I should disconnect this one now," with all that decision-making happening at the cellular level. And because it's organic hardware — neurons are just cells — it's super energy-efficient. There are electrical signals flowing through my body right now, but they're tiny, and the transfer of this electricity is very efficient in our body (probably because our body is largely made up of liquid — I don't know). All you have to do is bite into a potato, eat it, and you've fueled your body; it just works on low energy.

ChatGPT, on the other hand, takes about 1.5 years to be trained and released as a version — which is an advantage compared to the 25 years a human needs. AI is going to get faster and faster, but for now it's 1.5 years, and if you
compare 1.5 years to 25 years, of course you'd rather spend 1.5 years. That's the advantage of simulation: you don't have to live out an entire life; you can simulate it and run it quickly in a computer program.

ChatGPT doesn't change its neural network once it's built, unlike real human brains — though there's a lot of research happening right now to make it less rigid. Between releases — say, ChatGPT version 1 and version 2 — it's fixed. But maybe, without restructuring the whole thing too much, if a user in the middle of a conversation says, "You're incorrect — actually, this is the right way," that feedback could be used to train what is right now a fixed version of ChatGPT, letting it pick that up and make minor changes to its neural networks. That's being researched right now, but for now you can assume the vast majority of the neural network is fixed, and you just have to wait for the next release for the AI to get smarter and smarter.

And of course, you've seen these days that lots of people are using ChatGPT and the website is constantly down or slow. Running these servers requires massive amounts of electricity, and training the neural networks requires a lot of GPUs; all of this eats up a lot of energy. I think AI is consuming a lot of electricity right now — I wouldn't be surprised if some tech-driven U.S. states were putting 30 or 40 percent of their entire electricity consumption into AI, or even more. It's just massive.

So, in a nutshell: current AI is very rigid, just because of the way the simulated neurons work right now — it's very fixed — and it consumes a lot of energy. As humans, having our brains as the hardware under the hood, we have a clear advantage: we can be autonomous, independent, intelligent beings that don't have to be hooked up to a massive server. You can throw somebody onto a deserted island and that person will figure their way out, using their intelligence to survive, make decisions, face problems, and solve problems in a remote location without being connected to anything else; if that person's brain needs to function, they just grab a potato and eat it, and they get energy and function. If you wanted to send an exploration team into outer space, humans would probably be very adaptable: they can run on very low-energy fuel and make very intelligent decisions. Maybe that will change in the future, like the movie Prometheus, but for now those are the big differences.

Hopefully this video can encourage some people — maybe some young computer scientists out there who got excited about ChatGPT, started researching it, and were overwhelmed by the scientific papers and the mathematics. Hopefully this video made it simple, and encourages some young people to take the next step from this presentation: start learning about all the bits and pieces I explained at a high level, go deeper, learn the math and everything related to neural networks, and make our lives better with AI. Thanks.
Info
Channel: Kurdiez Space
Views: 489,954
Id: uCIa6V4uF84
Length: 33min 11sec (1991 seconds)
Published: Sat Feb 04 2023