But How Does ChatGPT Actually Work?

Captions
ChatGPT reached a record-shattering 1 million users in its first five days of existence and is without a doubt the most talked-about topic in tech right now. In this video you will learn how it works, which brings many benefits: it helps you use the model more effectively, evaluate its outputs more critically, and stay informed about the latest developments in the field, so that you are better prepared to take advantage of new opportunities.

ChatGPT is a type of natural language processing model known as a generative pre-trained transformer, developed by OpenAI. These are the two big terms we will focus on in this video. On top of that, you will also get a basic understanding of common machine learning techniques like supervised learning and reinforcement learning, which were used to make ChatGPT as good as it is.

While doing the research for this video, I actually met an old friend of mine, or should I say foe of mine: Proximal Policy Optimization. This reinforcement learning algorithm, also developed by OpenAI, was used to train ChatGPT. I wrote my computer science thesis about this thing, but don't worry, we won't talk about math at all in this video. Instead, we will talk more about language.

Natural language processing is the cross-section of linguistics and computer science, and you are already confronted with NLP all the time: autocorrect is NLP, text suggestions are NLP, and things like automatic plagiarism checks also make use of natural language processing. But what exactly is it? NLP is a subfield of artificial intelligence that focuses on enabling computers to understand, interpret, and generate human language. This is at the core of ChatGPT: it needs to be able to understand what you are saying and then generate a human-like response.

Let's start with the understanding part. Computers don't know English or German or Spanish or any other language; all they know is numbers and vectors. So how do we get a computer to understand what we are saying in the first place? Here is an example of how seven steps of natural language processing might be applied to the sentence "I am learning about ChatGPT right now."

First is segmentation: we split the sentence into smaller units, or segments, that can be processed by the NLP model. We can split it into individual words, or tokens. The term "token" pops up a lot here; even OpenAI's pricing model is based on tokens, with prices given per 1,000 tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 35 tokens.

Next up is tokenization: converting the words to a standard format, for example lowercase. The step after that is to remove common stop words which don't really contribute anything to the meaning of the sentence, such as "about" and "right". This can help reduce the amount of data that needs to be processed by the NLP model and improve its performance. In this case we are left with "I am learning ChatGPT now", which still has the same meaning as a sentence.

We can also reduce the words to their base form, or stem, in order to improve the model's ability to generalize and recognize patterns. This might involve using algorithms that remove suffixes from the words, leaving only the core meaning. For this sentence, the stemming step might reduce the token "learning" to the stem "learn". We can go even further with lemmatization, which is quite similar to stemming but reduces words to their dictionary form: "am" turns into "be". Our sentence is now reduced to "I be learn ChatGPT". Not really grammatically correct, but you get what it means.
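As a concrete illustration, here is a minimal sketch of these preprocessing steps using Python's NLTK library. NLTK is used here purely for illustration, an assumption on my part; the video does not name a specific toolkit, and ChatGPT's own pipeline works differently.

```python
# A minimal sketch of segmentation, normalization, stop-word removal,
# stemming, and lemmatization with NLTK (illustration only).
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time downloads of the required NLTK data packages.
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")

sentence = "I am learning about ChatGPT right now"

# Segmentation: split the sentence into individual tokens.
tokens = nltk.word_tokenize(sentence)

# Tokenization to a standard format: lowercase everything.
tokens = [t.lower() for t in tokens]

# Stop-word removal: drop common words that add little meaning.
# (NLTK's list is more aggressive than the example above and also
# removes words like "i" and "am".)
stop_words = set(stopwords.words("english"))
content = [t for t in tokens if t not in stop_words]

# Stemming: strip suffixes, e.g. "learning" -> "learn".
stemmer = PorterStemmer()
stems = [stemmer.stem(t) for t in content]

# Lemmatization: reduce to dictionary form, e.g. "am" -> "be"
# (when told via pos="v" that the word is a verb).
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(t, pos="v") for t in tokens]

print(stems, lemmas)
```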
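And on the earlier point about tokens and pricing: token counts can be reproduced with tiktoken, OpenAI's open-source tokenizer library. The encoding name below is an assumption for illustration, since different OpenAI models use different encodings and exact counts vary between them.

```python
# Counting tokens with tiktoken (the encoding choice is an assumption).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "I am learning about ChatGPT right now"
token_ids = enc.encode(text)

print(len(token_ids), "tokens for", len(text.split()), "words")
# Rule of thumb from above: ~1,000 tokens per ~750 words of English text.
```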
The final two steps concern grammar. In part-of-speech tagging, we assign a tag to each token indicating its role in the sentence, for example noun, verb, adjective, etc. This can help the NLP model understand the syntactic structure of the sentence. And lastly, named entity recognition: in this final step we identify and classify any named entities in the sentence, such as proper nouns that refer to specific people, organizations, or locations.

Once all the steps of the NLP process have been applied to the sentence, the model has a more complete and accurate understanding of its meaning and content. By running through steps one to seven for a vast amount of data, the computer looks for patterns and structures that help it understand the meaning of the words and sentences. In other words, it starts learning; how exactly it learns, you will see in a bit. This understanding is typically represented using numerical or vector-based data structures that can be easily processed by the model; here is what it could look like for the part-of-speech tagging, as an example. The better the model understands, the better it can use this knowledge to generate its own text that is similar to human language.

This leads us perfectly into how GPTs, generative pre-trained transformers, work. "Generative" means the model can generate something, in this case human-like language or text. "Pre-trained" refers to the fact that the model has been trained initially on the vast amount of data we just discussed. Keep in mind that for this specific model, it also means it doesn't get any better the longer you chat with it: it has been pre-trained, and what you get is based on a snapshot of the internet with a data cutoff in September 2021. The last part of the term is "transformers", and these transformers are the crazy part which made this whole technological breakthrough happen in the first place. As one of OpenAI's co-founders put it in an interview clip: "That for us was like, okay, this is going to work. The Transformer came out late 2017, and my co-founder Ilya immediately was like, that's the thing, that's what we've been waiting for. So you take this sort of very early, nascent result, put in a Transformer, and then that's GPT-1."

Transformers consist of an encoder and a decoder. The seven steps we just described all happen in the encoder, so the output of the encoder is a numerical or vector-based representation of the input sentence that captures the meaning and structure of the sentence in a compact and efficient form. This is what we work with in the decoder. The goal here depends on the specific use case of the transformer, but for us it is sequence-to-sequence transformation: feeding in one sequence, like a sentence or a question, and getting back another sequence, like another sentence or an answer to our question. For example, if we take our input sentence, the decoder might generate an output sequence like "As a result, I now have a better understanding of how ChatGPT works and what it can do", which is similar to human language and captures the meaning and structure of the input.

Transformers are a fairly new machine learning technology: they were introduced in 2017 in the paper with the fantastic title "Attention Is All You Need". Transformers use a self-attention mechanism, which allows the model to focus on the most relevant parts of the input when generating its output. It does so by calculating how much attention each vector should receive based on the other vectors in the input. That is why transformers are so good at summarizing content: the model understands context, since each vector's attention is based on the other vectors of the sequence.
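To make that calculation less abstract, here is a bare-bones sketch of the scaled dot-product self-attention from "Attention Is All You Need" in NumPy. The dimensions and the random projection matrices are placeholder assumptions; a real transformer learns these weights and uses multiple attention heads over much larger dimensions.

```python
# Scaled dot-product self-attention, stripped to its core (sketch only).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

seq_len, d_model = 5, 16                   # 5 tokens, 16-dim embeddings
X = rng.normal(size=(seq_len, d_model))    # token vectors from the encoder

# Learned projections (random placeholders here) map each token vector
# to a query, a key, and a value.
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Each token's query is compared against every token's key: this is where
# the model computes how much attention each vector should pay to the others.
scores = Q @ K.T / np.sqrt(d_model)
weights = softmax(scores, axis=-1)         # each row sums to 1

# The output for each token is an attention-weighted mix of all the values,
# so every output vector depends on the whole sequence (context).
output = weights @ V
print(weights.round(2))
```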
This breakthrough was essential for technologies like ChatGPT because it allows the model to process input sequences of any length in a parallel and efficient manner. It vastly outperforms recurrent neural networks and convolutional neural networks on these tasks; those architectures are simply too slow for our use case.

But we need to take a step back here, because we are already talking about the outputs of the model. How was it trained in the first place? OpenAI got creative here, using a variety of different machine learning techniques to get the best results. An initial model was trained through supervised fine-tuning: a lot of prompts were created, and a human labeler demonstrated the desired output. The language model basically looks at that to understand what is going on. It is like a math teacher explaining something on the blackboard while you sit there, watching him and trying to figure out how it works.

OpenAI also trained a reward model with a similar process: a prompt produces four different answers, and a labeler ranks the outputs from best to worst. This data is then fed to the model. Finally, it is about optimizing a policy against the reward model using reinforcement learning. This is the fun part, because here the model tries to teach itself how to get better.

Say you have a dog and you want it to sit on command. You start by showing the dog a treat and saying the command "sit". When the dog sits, you give it the treat as a reward for performing the desired action. Over time, the dog learns to associate the command "sit" with the action of sitting and the reward of getting a treat. Humans also naturally learn skills that way. You don't learn how to ride a bike by watching someone ride a bike all day; you learn by getting on a bike, falling on your face a few times, and figuring out what went wrong and how to do it better, until it feels natural. Falling on your face is the punishment for a bad action; the dopamine from learning the skill and being able to move faster is the reward for a good action. And this is how machines can learn through reinforcement learning as well.

I will give you a concrete example from my thesis. I built a super simple game in Unity in which the AI controls the green character; it can move and shoot. Enemies spawn from the sides of the map and follow the character. Now the reward model could be: shoot an enemy, plus 1; collide with an enemy, minus 1, and you lose; take a shot, minus 0.001, to simulate ammunition, which is not infinite. The AI then starts playing the game and tries to maximize this reward. Through a lot of trial and error, it learns: it adapts its own policy, which is its own rule set of when to perform which action given the information it has, and its performance gets way better over time. And this is exactly what happened in the training of the ChatGPT model as well.
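Here is a hedged sketch of that reward structure in Python. The event names and the toy episode are hypothetical stand-ins, not the actual thesis code, which ran inside Unity and trained the policy with PPO.

```python
# Sketch of the reward structure described above (hypothetical stand-in).
# +1 for shooting an enemy, -1 for colliding with an enemy (episode ends),
# -0.001 per shot fired to simulate finite ammunition.
REWARDS = {
    "enemy_shot": +1.0,
    "collision":  -1.0,
    "shot_fired": -0.001,
}

def episode_return(events):
    """Sum the rewards for one playthrough, given a list of game events."""
    return sum(REWARDS[e] for e in events)

# Example episode: three shots fired, two enemies hit, then a fatal collision.
events = ["shot_fired", "enemy_shot", "shot_fired", "shot_fired",
          "enemy_shot", "collision"]
print(episode_return(events))  # 2.0 - 1.0 - 3 * 0.001 = 0.997

# During training, an algorithm such as PPO plays many episodes, observes
# these returns, and nudges the policy (the mapping from game state to
# move/shoot actions) toward behavior that earns more reward. In ChatGPT's
# case, the reward comes from the learned reward model instead of a game.
```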
I believe ChatGPT will usher in a new paradigm of AI-powered tools that will do incredibly impressive things over the next few years. I also believe that we are on the verge of a second Renaissance in technology, business, and art, even without taking AI into consideration. Check out this video next to learn why. Thank you for watching.

Info
Channel: Till Musshoff
Views: 163,403
Id: aQguO9IeQWE
Length: 10min 3sec (603 seconds)
Published: Mon Dec 12 2022