GPT Prompt Strategy: Latent Space Activation - what EVERYONE is missing!

Captions
What do "take a deep breath", "let's think through this step by step", tree of thought, and chain of thought all have in common?

One thing I've noticed is that out in the scientific literature and the prompt engineering space there are all kinds of techniques that have been elucidated by papers such as "Large Language Models are Zero-Shot Reasoners", the one behind "let's think through this step by step". That was a popular one. I don't think it was worthy of a paper, but that's my personal opinion. The most recent one is "take a step back", which basically just says: let's take a step back and think about what information or techniques we need in order to achieve this. Another one is telling the model to take a deep breath, which elicits very different behaviors from the model. I think that's fundamentally a flaw in the way the model is trained rather than something about the model itself: if you train the model to just blurt out an answer without thinking it through, that's the result you're going to get, so this could be fixed with training schemas. Then there's tree of thoughts, where you basically use iteration and brainstorming to think through various possibilities, and, similarly, guided tree of thought.

So what is the underpinning idea behind all of this? What is it that all of these papers and techniques are missing, the thing that has not yet been generalized? The reason I haven't made a video about this is that, to me, having worked with models since GPT-2 and GPT-3, it seems rather obvious. So I'm adding a little bit to the conversation: what's missing is the idea of latent space activation.

Here's what you need to understand about the way intelligence actually works. If you think about the way human brains work, you have intuition: if you have a question or a problem, many brain structures will just give you an instantaneous possible answer. That's intuition, knowing something without knowing how you know it. One single inference from an LLM is equal to human intuition. It hasn't thought about it; it just says, this is my gut instinct, this is my knee-jerk reaction. However, humans can also think through things in order to get the right answer: let me consider what I actually know, let me think about the proper techniques to work through this.

There's a really famous book that is popular in many intellectual circles called Thinking, Fast and Slow by Daniel Kahneman, where he talks about system one thinking, the instant, knee-jerk, intuitive thinking that a single inference of a large language model is the equivalent of, and system two thinking, which is slow and very deliberative: you jot down your thoughts, you take notes, you recruit all the stuff that you have, and you're very systematic about how you approach things. This is what all these other prompt strategies do. Any time you have multiple steps, whether it's LangChain, tree of thought, chain of thought, chain of reasoning, or "let's think through this step by step", you're using the model to prompt itself in order to get better material into the context window.
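To make the system one versus system two distinction concrete, here is a minimal, hypothetical sketch (my addition, not something from the video): the same question asked once as a raw single inference and once with the zero-shot step-by-step suffix from the paper above. It assumes the OpenAI v1 Python client; the model name and prompt wording are placeholders.

```python
# Hypothetical sketch: one "intuitive" inference vs. a deliberate,
# step-by-step inference. Assumes the OpenAI v1 Python client is
# installed; the model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    """Run a single chat completion: the LLM equivalent of a gut reaction."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

question = "Who was emperor during the absolute apogee of Roman power?"

# System one: one inference, no deliberation.
intuitive = ask(question)

# System two: the zero-shot chain-of-thought suffix nudges the model
# to lay out intermediate reasoning before settling on an answer.
deliberate = ask(f"{question}\n\nLet's think through this step by step.")

print(intuitive)
print(deliberate)
```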
So how do you actually use all this? That's a question I've been seeing lately, because I made a recent video about sparse priming representations and I didn't realize that a lot of people wouldn't understand how to use it; maybe that's because I've been in this space so long that I forgot to explain it. Basically, you use the same techniques that your brain uses, consciously or unconsciously, to prompt the model. What you're doing is what I call latent space activation.

What you need to understand about latent space activation is that these models are trained on infinitely more knowledge than you will ever personally possess, but it's not all going to be activated at all times. Likewise, you might have heard that humans only ever use 10% of our brains, which is more or less true at any given moment: most of your brain is not actively participating, because if your brain went to 100% activation you would basically go into a coma and die; it would overload itself. In the same way, large language models have a tremendous amount of latent space that is just not being used at any given moment. This is embedded knowledge and embedded capability, and the model is really only going to do one thing at a time.

This is another place where large language models are similar to human brains: we have a conscious spotlight. In psychology and neuroscience there is what's called the spotlight of consciousness: your brain is filtering out all kinds of information in real time. There are all kinds of thoughts and memories, and your brain is actively ignoring most of the information it has access to at any given moment. This is basically a biological attention mechanism. In the same respect, you need to use that spotlight of consciousness with large language models, to sequentially scan over and figure out what the model needs to know in order to bring the correct things into its quote-unquote consciousness, that is, into the context window and into activation.

So I thought, let me do a quick demonstration. I started working on a second demonstration, but I'm just going to show you this first one. In this example, rather than having a fixed set of prompts, you just think through how you would answer a general-purpose question: who was emperor during the absolute apogee of Roman power? The first thing you do when you ask yourself this question is think: hmm, what do I know about Rome? What do I know that is relevant to this question? The next thing you do is ask: how do I define the answer? What criteria am I looking for in order to judge this? And finally you say: okay, based on activating everything I have in my brain, what is the answer I'm going to settle on?

So let me show you the script that I wrote and how this works. The script is technique01_dialogue. Basically, I have it ask the user "what is your main query or question?", and then I have some general-purpose placeholder questions that abstract this process:

1. What information do I already know about this topic? What information do I need to recall into my working memory to best answer this?
2. What techniques or methods do I know that can answer this question or solve this problem?
3. How can I integrate what I already know and recall more valuable facts, approaches, and techniques?
4. With all this in mind, I will now discuss the question or problem and render my final answer.
This is a very, very simple chain-of-thought sequence, but the point is that the model understands how to approach the problem, and if you look here, it actually accumulates everything in one conversation. One thing that I've noticed, and that plenty of other people have noticed, is that the more valid or salient information you have in the context window, the more latent space it activates.

In this case I also have a system message, which basically tells the model what it is: "You are an internal dialogue iterator for an LLM (large language model) neural network. LLMs possess latent space (embedded knowledge and capabilities). You will be given a main query as well as a sequence of questions. Your role is to answer the queries as a way of activating the latent space inside your own neural network. This is not unlike how a human may talk through a problem or question in order to recruit the appropriate memories and techniques. The ultimate goal is to answer the main query listed below." I actually had to fix this in my local code, because the main query was hardcoded, so I'll update the repo. Then there's the interaction schema: "The user will play the role of interrogator. Your answers will be thorough and comprehensive in order to get the best possible latent space activation. Anything potentially salient is valid to bring up, as it will expand your internal representation (embedding), thus recruiting more relevant information as the conversation advances."

Okay, cool. Let me show you what this actually looks like. Here's the repo, so: python technique01_dialogue.py (like I said, I fixed the system message locally). It's going to ask me for a main query or question, and I'll show you the one I did originally so you can see the process. "Who was emperor..." Actually, no, let's change it up to be a little more specific: who were some of the senators who were important during Rome at its peak power? This is not as simple a question as "who was the emperor", because it probably just knows that; this one is going to require it to think through what it has.

Okay, so: "what information do I already know about this topic?" Remember, instead of writing a prompt that is specific to any given query, I framed it generally, like "let's think through this step by step". Let's see what it says: "To answer this question, I need to recall information about the Roman Empire during its peak... the structure of the Roman government, its Senate..." and "additionally, I need to remember...". It didn't give anything specific, so in this case it's a little disappointing; what I would have hoped is that it would have listed out specific information. And finally: "with all this in mind, I'll now discuss the problem and render my final answer." Interestingly enough, it actually provided the answer there: it knew Marcus Tullius Cicero (he was a big speaker), Gaius Asinius Pollio, and Marcus Vipsanius Agrippa (the Agrippa family, that was big), plus Marcus Aurelius and a few others. So in this case it was able to basically dial in and give me a valid answer. Now, I would have hoped that it would have listed some of these people during the recall step, but because it activated all of this context, it ultimately gave me a pretty good answer; and of course you can run a control and ask the question sight unseen, but the point is that this very simple prompting strategy worked.
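For reference, here is a minimal sketch of what a script like this might look like. This is a reconstruction, not the actual repo code: the system message and questions are paraphrased from what's shown on screen, and the use of the OpenAI v1 Python client is an assumption.

```python
# Reconstruction of the internal-dialogue technique described above.
# Not the actual repo script: prompts are paraphrased from the video,
# and the OpenAI v1 Python client is assumed.
from openai import OpenAI

client = OpenAI()

SYSTEM = """You are an internal dialogue iterator for an LLM neural network.
LLMs possess latent space (embedded knowledge and capabilities). You will be
given a main query as well as a sequence of questions. Your role is to answer
the questions as a way of activating the latent space inside your own neural
network, much as a human talks through a problem in order to recruit the
appropriate memories and techniques. The ultimate goal is to answer the main
query. Be thorough and comprehensive; anything potentially salient is valid
to bring up, as it will expand your internal representation and recruit more
relevant information as the conversation advances."""

QUESTIONS = [
    "What information do I already know about this topic? What information "
    "do I need to recall into my working memory to best answer this?",
    "What techniques or methods do I know that can answer this question or "
    "solve this problem?",
    "How can I integrate what I already know and recall more valuable facts, "
    "approaches, and techniques?",
    "With all this in mind, I will now discuss the question or problem and "
    "render my final answer.",
]

def main() -> None:
    main_query = input("What is your main query or question? ")
    # Accumulate every exchange in one conversation: the more salient
    # material in the context window, the more latent space is activated.
    messages = [{"role": "system", "content": f"{SYSTEM}\nMain query: {main_query}"}]
    for question in QUESTIONS:
        messages.append({"role": "user", "content": question})
        reply = client.chat.completions.create(model="gpt-4", messages=messages)
        answer = reply.choices[0].message.content
        messages.append({"role": "assistant", "content": answer})
        print(f"\n>>> {question}\n{answer}")

if __name__ == "__main__":
    main()
```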
Here's another one that I tried. Let's clear the screen and run it again: calculate the exact coastline of Britain. For some background, the reason this is a challenging question is that the coastline of Britain is very jagged and angular, it changes with tides and over time, and it also depends on how you define a coastline. So here we go. It says: "I don't have working memory in the same way humans do." I really wish OpenAI would stop stuffing in this kind of arbitrary, asinine logic; "I don't have working memory in the same way humans do" is not necessarily true, but they're teaching their models to axiomatically believe it.

Anyway, what's going on here? Okay, cool: the coastline paradox. It understands the coastline paradox, Britain's coastline, measurement techniques, and units of measurement: "to calculate the exact coastline we would need a detailed and up-to-date map or satellite data". So it's figuring out what information it needs, and it says, "I don't have access to this information". In a cognitive architecture, this is where you would use retrieval augmented generation: the model recognizes that it needs to search for information. Then: "what techniques or methods do I know of that can do this?" Cartographic measurement, satellite imagery, GIS software, fractal analysis; "I don't have the capability to perform these measurements directly". So it's aware of its own limitations, which is a good agent model that I do approve of, although complaining about working memory is a waste of time and energy. And finally: "with all this in mind, I will now discuss and render my final answer. The exact coastline of Britain is challenging to determine because of the coastline paradox", and so on and so forth; "the coastline of Britain is often reported to be about 12,000 km; however, this is an estimate". So in this case it pretty much failed, but then again, this question was a gotcha.

So here's the other technique I started working on. This is something I have done before, and I have consulted for people doing something similar: a brainstorm-search-hypothesize-refine loop. The BSHR loop is basically just what humans do, and this is another reason I haven't commented on this before; I apologize, because I recognize that not everyone is familiar with the information foraging techniques that humans use. Basically, in the BSHR loop you brainstorm a list of search queries. If you've used tools like Perplexity, you've seen that this is what it does, and honestly, Perplexity could be infinitely better, because all it does is generate really basic Google queries.
Let me show you what I mean by generating valid search queries. If we go to the playground I can show you; I haven't finished this, because the coding is a little tedious, but anyway. Mission: "You are a search query generator. You will be given a specific query or problem by the user, and you are to generate a JSON list of questions that will be used to search the internet. Make sure you generate comprehensive and counterfactual search queries. Employ everything you know about information foraging and information literacy to generate the best possible questions."

As I've mentioned in other videos, this is what's called priming. With priming, large language models will recognize certain concepts or terms, and the network will activate in a different way; the quote-unquote activation, the internal representation or embedding, is going to look different. So I said "comprehensive and counterfactual", and I said "employ everything you know about information foraging", which is a very specific term, "and information literacy, to generate the best possible questions". Now I'm going to ask: calculate the exact coastline of Britain. In this case it generates a bunch of relevant questions: how often is the coastline measured, what impact does erosion have, what are the longest and shortest estimates. You can see that it's generating a whole bunch of questions so that, once it searches, it has something to work with; imagine you run all of these Google searches and then you take notes. This is the brainstorm phase. (And this prompt is actually super valuable, so I'm going to keep it as a hypothesis generator. Sorry about that, I just realized, hey, I just did a good thing.)

So then, the brainstorm-search-hypothesize-refine loop: what you do next is take each of these queries, put it into a Google search or a DuckDuckGo search or even a Wikipedia search, and then, with the main query in mind, you take notes. Large language models have a really great ability to do this: you search a document, you take notes, and then you generate a hypothesis. I've also started working on a script for this part, and you can look at some of what I've got out here. It basically says "generate a hypothesis": you give it a question and some sources of information, you ask it to generate a hypothesis, and then you update that hypothesis over time. That's the refine step. This brainstorm-search-hypothesize-refine loop is how humans answer questions. What you might add to the loop is another step, like perform a test, an experiment, or a calculation; that's where you get into more formalized cognitive architectures. Ideally you create something that can construct these loops automatically, fully in real time, but in the immediate future, a brainstorm-search-hypothesize-refine loop is going to be good enough for information literacy.
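Here is a hedged skeleton of what a BSHR loop could look like in code. This is a sketch of the process described above, not the actual script: web_search() is a stub you would wire to Google, DuckDuckGo, or Wikipedia, the prompts are paraphrases, and the OpenAI v1 Python client is assumed.

```python
# Skeleton of a brainstorm-search-hypothesize-refine (BSHR) loop.
# A sketch, not the actual script: web_search() is a stub, the prompts
# are paraphrases, and the OpenAI v1 Python client is assumed.
import json

from openai import OpenAI

client = OpenAI()

BRAINSTORM = """You are a search query generator. You will be given a specific
query or problem by the user, and you are to generate a JSON list of questions
that will be used to search the internet. Make sure you generate comprehensive
and counterfactual search queries. Employ everything you know about information
foraging and information literacy to generate the best possible questions."""

def llm(system: str, user: str) -> str:
    """One chat completion with a system and a user message."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content

def web_search(query: str) -> str:
    """Stub: return raw search results for one query."""
    raise NotImplementedError("wire this to a real search API")

def bshr(main_query: str, max_iterations: int = 3) -> str:
    hypothesis = "No hypothesis yet."
    for _ in range(max_iterations):
        # Brainstorm: comprehensive, counterfactual search queries.
        # (Assumes the model returns a bare JSON list; real code would
        # validate and retry on malformed output.)
        queries = json.loads(llm(BRAINSTORM, main_query))
        # Search, then take notes with the main query in mind.
        notes = []
        for query in queries:
            results = web_search(query)
            notes.append(llm(
                "Take concise notes on the following search results, "
                f"keeping this main query in mind: {main_query}",
                results,
            ))
        # Hypothesize / refine: fold the new evidence into the answer.
        joined_notes = "\n".join(notes)
        hypothesis = llm(
            "Given the main query, the current hypothesis, and new notes, "
            "produce an updated, refined hypothesis.",
            f"Main query: {main_query}\n"
            f"Current hypothesis: {hypothesis}\n"
            f"Notes:\n{joined_notes}",
        )
    return hypothesis
```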
This is honestly one of the reasons I canceled my Perplexity subscription: all it does is what I would call a naive search. A naive search doesn't really think about the question you're asking; it doesn't generate good questions like what I just showed you. It just translates your question into a really, really dumb Google query, and it doesn't use the model's intrinsic information literacy and information foraging knowledge to ask actually good questions. So I figured, I can ask better search questions myself, because I've got better Google-fu; whatever, I'm not going to use it.

Let me show you what I mean when I say Perplexity doesn't have good Google-fu. I'll ask it the same question: what is the exact coastline of Britain? It will actually show you the search it ran, and literally all it did was search "exact coastline of Britain". That is not good information foraging: it took a really complex question and spat out something really basic; okay, I could have done that myself. Then all it does is aggregate the results. And because it's not engaging in counterfactuals, because it's not engaging in good information literacy, if its search query pulls up bad information, it will just tell you patently false information that it reports from the internet.

For instance, here's something I did before; let me start a new thread: tell me about AI destroying jobs. If you follow my channel, you know that I occasionally document AI destroying jobs. Here it generated a couple of search queries, and then it says things like "while AI might reduce some jobs, it can also create new employment opportunities". It didn't actually generate any counterfactual searches; it did a very naive search and just gives you that information sight unseen, without looking at the validity of the sources or even having asked good questions.

Anyway, there are all kinds of ways to improve on this if you understand the general principles and concepts of language models, such as information literacy, information foraging, and latent space activation. I hope you got a lot out of this video. Thanks for watching. Cheers!
Info
Channel: David Shapiro
Views: 63,477
Keywords: ai, artificial intelligence, python, agi, gpt3, gpt 3, gpt-3, artificial cognition, psychology, philosophy, neuroscience, cognitive neuroscience, futurism, humanity, ethics, alignment, control problem
Id: N8p6u1OtARs
Length: 20min 53sec (1253 seconds)
Published: Tue Oct 24 2023