Tree of Thoughts LLM Framework Explained by a Rube

Captions
Yesterday evening I was browsing YouTube, as I often do when I'm just wasting time, and I saw this cool YouTube video about a paper called Tree of Thoughts. It describes a framework for increasing the effectiveness of using large language models to solve problems. So I watched this whole video, which was 13 minutes long or so, and I didn't really understand what was being talked about. I didn't understand what Wes was trying to communicate, and I didn't understand the paper, but I did understand that there was probably something really cool here. So I found the white paper and plugged it into one of those chat-with-PDF bots, and then I talked with the bot for about two hours. I also tried to read through the paper, which was very hard, because I don't know how to read these equations, and there are a lot of words in here I don't understand. I don't have a background in machine learning; I don't even have a hardcore computer science degree or anything like that. A lot of this stuff was going over my head, so I really had to chat with the AI bot for quite a bit, but eventually I got it, and it's not as complicated as it appeared to be at first. A lot of people in the comments section on this video were asking for it to be explained like they were five years old, and I found a Reddit thread with a lot of comments asking the same thing, so I guess I'm not the only one who didn't immediately understand, from watching this video, how the tree of thoughts framework functions. Although, I'll say, there were a lot of people in the comments section who seemed to immediately understand it. I was saying to various people on the forum that I simultaneously felt smart, because I was finally able to understand something that to me is complicated, yet also felt kind of stupid, because there were so many people in the comments section on a YouTube video
who just understood it immediately. So I'm going to explain the tree of thoughts algorithm as if you were five years old. Here we go. There are two papers: there's this more detailed, more refined paper that came out two days after this other paper published by one lab. I don't know if there were multiple people at this lab, but here we've got Princeton University and Google DeepMind working together (isn't that cool?), and here we just have a lab from San Jose. Maybe it's just one guy from San Jose; I think it'd be hilarious if one guy from San Jose managed to come up with more or less the same thing two days prior. I'm guessing a lot of people were probably discussing this before anyone wrote a paper up, so maybe these people aren't the inventors, but these different publication sources put out these papers, and what I found is that both of them basically describe the same framework, the same problem-solving approach. The main difference is that the problem-solving approach in the DeepMind and Princeton paper uses a breadth-first approach, which I think they talk about here, whereas the approach outlined in the other paper is more of a depth-first approach, which is outlined in this picture right here. If you don't know what breadth-first and depth-first mean, because you don't remember it from computer science class or you just never took that, that's okay; I'm going to explain it. All right, so I have two examples for you today: on the left-hand side I have a breadth-first search approach, and on the right-hand side I have a depth-first approach. Both of these are valid approaches, and you could come up with other approaches that use different traversal algorithms that are not breadth- or depth-first, like A* or I don't know what else, and that's just assuming you're using a tree. Something I thought about was that you could potentially use a different data structure that's not even a tree.
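Since the two papers differ mainly in traversal order, here's a minimal sketch of breadth-first versus depth-first search over a toy tree of "thoughts." This is purely my own illustration, not code from either paper; the tree data previews the boiling-water example I'll get to in a minute.

```python
from collections import deque

# Toy tree of "thoughts": each state maps to its candidate next steps.
TREE = {
    "start": ["water in pot", "water in kettle"],
    "water in pot": ["hike up volcano", "pot on stove"],
    "water in kettle": ["turn kettle on"],
}

def children_of(state):
    return TREE.get(state, [])

def bfs(root, is_goal):
    """Breadth-first: look at every thought on one level before going deeper."""
    queue = deque([root])
    while queue:
        state = queue.popleft()
        if is_goal(state):
            return state
        queue.extend(children_of(state))
    return None

def dfs(root, is_goal):
    """Depth-first: follow one branch all the way down, then backtrack."""
    stack = [root]
    while stack:
        state = stack.pop()
        if is_goal(state):
            return state
        stack.extend(reversed(children_of(state)))  # keep left-to-right order
    return None

print(bfs("start", lambda s: s == "turn kettle on"))  # → turn kettle on
print(dfs("start", lambda s: s == "turn kettle on"))  # → turn kettle on
```

Both find the same goal here; the difference is only the order in which intermediate thoughts get visited, which matters once each visit costs an LLM query.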
But I don't know what that would look like, so let's get to it. Let's start with the breadth-first search approach. By the way, they talk about it in the abstract: they say this is all inspired by the way people think in general, and same here, they say the same thing in the abstract of the other paper as well. When we think about things, we come up with different solutions, trial them in our mind, and go with the one we think is best. That's what the tree of thoughts algorithm does in a nutshell: it lets you have different solutions, try them out, weigh the pros and cons, and pick the best. And this is as compared to other approaches. In the one-shot approach on the left-hand side (the Princeton paper calls this IO, input-output, prompting), we just pose a problem and come up with a solution directly, and this is the least successful in all of their benchmarks. Then we have the chain-of-thought approach, where we break our thinking into steps: we pose a problem, but rather than saying "please produce the output from the get-go," we say "can you please think through this in a few different steps," and this has a much better chance of success in their benchmarks. It's also been around for months, and people have been finding great success with it. There's another method (self-consistency with chain of thought) that generates a set of different strategies based on the initial input, follows through with chain of thought for each one, then compares which outcomes are the most common and says: those are the most common outcomes, that's probably the best one, let's go with that. And then there's the one proposed in this paper, the tree of thoughts, where instead of just generating multiple approaches at the start and carrying each one through to completion, we generate multiple approaches at the start, and then for each approach we generate multiple approaches, and at any point in time we might say: this approach is not valid (it's too expensive, it doesn't work, it's broken, it's not feasible, whatever), and so we'll actually backtrack and try different approaches until we find one that wins. This is basically making a whole bunch of chains of thought from each of the previous thoughts, stopping at intermediary points, and going back to try other solutions before possibly doing a longer-range backtrack. For example, in this green box here (can I move this over?), we come up with three solutions and go with the first one, and we say: that worked, but both of the next steps were failures, so let's backtrack, twice I guess; maybe it tried some more and they were failures too. So then we try this one: that was a failure, that was a failure, this was a success, that was a failure, that was a failure, but this was a success, so, output. And maybe this one over here was a failure from the get-go. So here we're showing an example where we come up with multiple thoughts at once, carry them through toward completion, stop if they're invalid or not feasible before even getting to further logic jumps, and come up with an output. Okay, that's all in the paper, and it was in the video, but I didn't understand it, and maybe you didn't either, so here's my example. Let's say you're on vacation in Hawaii and you want to boil some water. You tell the large language model that you've got an electric kettle on your kitchen counter, an electric stove with a stovetop in your kitchen, and a propane camping stove in your car's trunk. We'll do a breadth-first approach: we'll just generate some different thoughts that would be the subsequent next steps. You might say: well, because I've got all these stoves, let's put some water in a pot. Or you might say: well, because I have a kettle, let's just put the water in the kettle. So those are two solutions.
So far, neither one of them is invalid, so let's carry out their next steps and see what we get. With the water in the pot, we might want to hike up a volcano, because we're in Hawaii; okay, that's not totally invalid yet. Maybe we want to put the pot on the electric stove, or in the oven. And on the other hand, from our kettle, we can just say: well, we turn the kettle on. Okay, so now we're going to pull chains out of thin air and lower the pot of water down the volcano until it comes within a few meters of the magma; technically this is possible, so let's keep going. For our electric stovetop, we turn it on; for our oven, we turn the oven on; and for our kettle, hey, we already turned it on and the water's boiling. Cool, we could kind of just stop here, but maybe we want to carry all these solutions out to their completions or failures, so that we can find all the successful solutions and compare them based on what we're going for. Are we going for the quickest, most convenient method? Are we going for a method that can boil the largest amount of water? There are different things we can optimize for. So let's look at the next steps: the water's boiling because the magma boiled it, the stove and the oven boiled their water, and there's no next step for the kettle. So then we can ask: what are we optimizing for? We can go back over these and calculate. Are we trying to find the coolest way to boil water, the most awesome way? Well then, obviously, lowering the water over a volcano is the most epic, so in that case this one would win. Maybe we're trying to cook, and we want to add stuff to the water while cooking; then it's on the stovetop, it's in the kitchen, the pot doesn't have a lid: this is the best for cooking. Maybe we're trying to optimize for Thanksgiving dinner and the stovetop's full, so we say: you know what, let's use the oven; this might be good for Thanksgiving. And finally, maybe we just want to make some tea before going to sleep; we don't need a whole bunch of hot water, we just care about getting a few cups boiling ASAP with the least cleanup; if we're optimizing for convenience, the kettle might win. Now, this introduces a question I was wrestling with quite a bit: do you calculate the score of the heuristic while you're generating the thoughts and thought steps, or do you generate all of the thoughts first and then go back over them and calculate the cost, or the merit, or whatever, in separate steps? That's actually up to you; in the paper, the authors say you can decide that for yourself based on the problem you're solving. You could say: "hike up a volcano? I'm optimizing for convenience, and that's not convenient," and rule that one out immediately, nope. Then for the stovetop you might say, "actually, I'm afraid of stovetops," nope; for the oven, "actually, my oven's broken," nope; and then, "I've got this kettle, cool, turn it on, that works." So if you're evaluating your options while you're generating your thoughts, you can short-circuit and potentially save yourself time and calculations; that's an advantage of the tree of thoughts framework. If your problem space is small and you know you're not going to generate too many thoughts, you can do it all at once, go back, evaluate them, and weigh all the options against each other; in that situation you can optimize for whatever you want and compare different solutions. And then there's the whole idea of local maxima and global maxima.
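The breadth-first, score-as-you-go version of this can be sketched in a few lines. This is purely my own illustration (it's neither paper's code); `generate_thoughts` and `evaluate` are hard-coded stand-ins for what would really be LLM calls, and the convenience scores are made up for the example.

```python
def generate_thoughts(state):
    # Stand-in for an LLM proposing next steps from the current partial plan.
    options = {
        (): ["water in pot", "water in kettle"],
        ("water in pot",): ["hike up volcano", "pot on stove"],
        ("water in kettle",): ["turn kettle on"],
    }
    return options.get(state, [])

def evaluate(state):
    # Stand-in for an LLM scoring a partial plan for convenience (0 to 1).
    scores = {
        ("water in kettle", "turn kettle on"): 0.9,
        ("water in pot", "pot on stove"): 0.6,
        ("water in pot", "hike up volcano"): 0.1,
    }
    return scores.get(state, 0.5)

def bfs_tot(max_depth=2, keep=2, threshold=0.2):
    """Breadth-first tree of thoughts: expand a level, score it, prune it."""
    frontier = [()]
    for _ in range(max_depth):
        candidates = [s + (t,) for s in frontier for t in generate_thoughts(s)]
        scored = [(evaluate(s), s) for s in candidates]
        # Short-circuit: drop low-scoring thoughts, keep only the best few.
        scored = [pair for pair in scored if pair[0] >= threshold]
        scored.sort(reverse=True)
        frontier = [s for _, s in scored[:keep]]
    return frontier[0] if frontier else None

print(bfs_tot())  # → ('water in kettle', 'turn kettle on')
```

Because every surviving plan on a level gets scored before anything is committed to, the kettle beats the volcano here; that's the "compare the global solution set" behavior discussed above.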
You might evaluate a short-term solution and think it's the best, but if you evaluated all the options, there might be some other option, which you haven't evaluated yet, that's actually the best. For example, if we weren't generating all of these at once and then going back to evaluate them, we might say: that wasn't a success, but this one right here was, so we stop here. But if we're optimizing for convenience, then by stopping here we never would have tried the one with the kettle, and we'd be in a local-maximum situation: we found a solution, the best among the ones we calculated, but not the best overall, because if we looked at the global solution set we'd see there's this other one over here which is faster, better, more convenient; we just didn't calculate it, because we stopped at the first solution. So by going breadth-first we can evaluate lots of different options and then weigh them against each other; that's an advantage of the breadth-first search algorithm. And I noticed that the authors of the Princeton paper seem to be doing a breadth-first search in their examples. In this example here, look at the red, erroneous thoughts all the way at the bottom of the tree: if the system were short-circuiting at the first erroneous thought, it wouldn't generate subsequent thoughts. So we can see they probably generated all of them at once, and because this one was a failure, they stopped; but at each level they generate all the thoughts at once, and for these two failures they stop, while for the successes they generate all the next thoughts at once: these two are bad, this one's good. The reason I don't think this is a depth-first search example is that in the second thought on the first level, and in the third thought on the second level, this one here, we generate a successful thought but we also have these failure thoughts. If the successful thought were on the right-hand side, and assuming the algorithm generates thoughts from left to right, then I'd think: that failed, that failed, so we generated this third one, and the third time was the charm. But going from left to right, this one was successful, and we also generated these others, evaluated them, and saw that they were failures. Let me think about that one more time. Say we're evaluating from right to left with depth-first: failure, no; success, yes. If we were doing a depth-first search, then yeah, we'd do failure here, failure here, success, cool. But if we were doing a depth-first search, why would we continue to generate these two failures once we found this successful one? Why would we go back and calculate another approach? Maybe my recollection of the depth-first search algorithm from computer science is just rusty, but I think this is a breadth-first search example. And again, the authors do highlight, down in the paper, that they've got two example search algorithms, breadth-first and depth-first, and they say there are other ways to do it, like A* or MCTS (Monte Carlo tree search), and I'm sure there's more; they want to explore that in future work. Okay, cool. So now let's talk about the depth-first search, which I found in the paper by the other author (I don't know how to pronounce his name). Anyway, what have we got here? Yep, depth-first search.
It tries something, fails, and goes back with the memory module; tries something that succeeds, then something that fails, and goes back with the memory module; tries something, fails, goes back with the memory module; and hey, the max number of tries per child is two, so let's backtrack again. That's the big strength of the tree of thought framework: you can backtrack not just a single time, you can backtrack multiple times, and this is how we really construct a tree structure. Then we try, try, and succeed. So this is a depth-first strategy. Here's my example with the depth-first approach. It's the same setup: we want to boil some water because we're on vacation in Hawaii, and we're going depth-first, left to right, because that's how I think. So we say: all right, let's put some water in a pot. Cool. Then what happens? Well, we hike up a volcano. Now, if we're evaluating these solutions as they're generated, we might just say that's too prohibitive; maybe my leg's broken, maybe I just don't want to hike up a volcano. So we stop there and short-circuit, and we won't even look at lowering the water by a metal chain down to the magma, which is pretty epic, right? Super epic. But no, we just stop there: short circuit, nope, failure. So let's try the next one; maybe we have a maximum of three thoughts per thought step (this would be a thought step, this would be a thought). All right, we put the water in the pot and then we put it on the stovetop. Cool. Let's try praying to the fire gods; maybe the fire gods will make the water boil. Well, no, the water didn't boil; sorry, fire gods, that didn't work. All right, then let's try turning the stovetop on. Hooray, the water is boiling! But maybe you're afraid of stovetops, or maybe the stovetop is broken, and so instead of "water is boiling," maybe the water's not boiling. This is also something the authors talk about in the paper: because the tree of thoughts framework involves multiple trips to the large language model during these intermediary steps, you can have side effects and interact with the real world. For example, say the stovetop was broken. You could have an intermediary step that goes and checks: is the stovetop actually working? Oh, it's not. We don't have to rely only on assumptions from the initial input; at any time we have a programming framework, literally the whole computer at our disposal; we can make connections to the internet, we can get input from sensors. So we can say: the water's not boiling because, I don't know, the stovetop's broken; or maybe we don't know the stovetop's broken, but we somehow determine the water is not boiling even though it should be. So we backtrack, and we say: we're allowing a max of two thoughts per thought step or whatever, and these both failed, so let's backtrack again. Again, this is the big advantage of the tree of thoughts framework: we can backtrack not just once but as many times as we want, all the way back up to the initial thought. Okay, so we backtracked two times; this is called long-range backtracking, or something like that. It's in the paper; if you search for "long range" in there, they use the term several times. Now we say: we put the pot in the electric oven, we turn the oven on, and it's not boiling, because we forgot to close the oven door, I guess. So let's backtrack and close the oven door, and all right, the water's boiling. But you know what, maybe the oven is also broken (man, this oven just sucks), so it's not boiling. Because the tree of thought framework allows for a memory module and lets you backtrack up the tree to previous steps, to rethink other solutions when the ones you've tried have failed, we can backtrack from closing the oven door, to turning the oven on, to putting the pot in the electric oven. And let's just assume, since I said it's two thoughts per thought step before you backtrack, that we try something else besides turning the oven on and that also fails; then we've exhausted all the options, so we backtrack to the water in the pot and say: all right, let's try something else. We put the water in the electric kettle; cool, that works. Turn the electric kettle on; cool, that works, the water is boiling, and it's super convenient. So this is a depth-first search. In this example I set it up to be the worst case: everything we tried failed until the last thing, which succeeded, and that makes depth-first search look pretty bad. But suppose the first thing we tried was just putting water in the electric kettle, and we didn't do any of the rest because the first thing we tried succeeded: that is the potential advantage of the depth-first search. And again, the authors of this other paper highlight a depth-first search, and we can see that the example I just went over, where from left to right I tried things that failed until I found one that succeeded, is kind of what's going on here as well. Then, looking back at the other paper, which as I was saying is breadth-first: here's a cool breadth-first problem. They use this mathematical game (because, oh boy, I love playing games with math) where you're given four numbers and some mathematical operations, and I think you're allowed to use each number a single time, although I'm not sure about that.
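Before digging into that game: the depth-first walkthrough above (generate a few thoughts per step, short-circuit infeasible ones, backtrack on failure) can be sketched like this. Again, this is purely my own illustration; `generate`, `feasible`, and `boiling` are hard-coded stand-ins for what would really be LLM calls or real-world checks, and I've pretended the stove and oven are broken so only the kettle works.

```python
MAX_THOUGHTS_PER_STEP = 3

def generate(state):
    # Stand-in for an LLM proposing next thoughts from the current plan.
    options = {
        (): ["water in pot", "water in kettle"],
        ("water in pot",): ["hike up volcano", "pot on stove", "pot in oven"],
        ("water in kettle",): ["turn kettle on"],
    }
    return options.get(state, [])[:MAX_THOUGHTS_PER_STEP]

def feasible(state):
    # Stand-in for an evaluator; the volcano hike is ruled out immediately.
    return state[-1] != "hike up volcano"

def boiling(state):
    # Stand-in for checking the real world ("is the water actually boiling?").
    # Pretend the stove and the oven are broken, so only the kettle works.
    return state[-1] == "turn kettle on"

def dfs_tot(state=()):
    """Depth-first tree of thoughts: dive into one branch, backtrack on failure."""
    for thought in generate(state):
        candidate = state + (thought,)
        if not feasible(candidate):
            continue  # short-circuit: don't even expand an infeasible thought
        if boiling(candidate):
            return candidate  # success: stop immediately
        result = dfs_tot(candidate)  # go one level deeper
        if result is not None:
            return result
        # falling through to the next loop iteration here is the backtrack
    return None

print(dfs_tot())  # → ('water in kettle', 'turn kettle on')
```

If the kettle branch had been generated first, the search would have returned after two steps without ever considering the pot; that's the best case I described a moment ago.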
I could just Google it, but you know I'm not going to do that. You have to calculate 24. So they give 4, 9, 10, and 13 to the large language model, and I guess it generates three different examples, or thought steps; two of them failed, but one of them was somehow valid. They took 10 and 4 and used the subtraction operator, and they're left with 6. Because they used 10 and 4, they're left with 9 and 13, and they also get to use the number they output, which is 6; that's why they have 9, 13, and 6. So then they try out 9, 13, and 6 (I guess they can reuse the mathematical operators): they try subtraction here and it fails; they try 13 and 9 with subtraction, and that's valid, I suppose. You know, I really should look up the rules to this game, because I don't get why this one is invalid; 13 minus 6 is 7, yes, "left: 7," I don't get why that's invalid, but whatever. So they generate all the solutions at each level, and eventually they find one that succeeds, so this is an example of a breadth-first approach. Okay, I'm trying to explain this like you're five, and I hope I kind of did that with these examples; in them, you would use language models to determine what's going on. The cool thing about the other paper is that they use Sudoku as their example. Come on, where is the Sudoku... okay, here we go. They've got 3x3, 4x4, and 5x5 Sudoku puzzles, and as the dimensions of the puzzles increase, the difficulty also increases: 3x3 is three rows and three columns, 5x5 is five rows and five columns; those are the dimensions. They have a zero-shot approach, which is where you just give the input and expect an output to work; they have one-shot (what is one-shot? I forgot; I literally can't remember what one-shot means, and man, I wish I did, but I'm just not a machine learning guy, I'm just interested in this stuff because I'm a general programmer); then they have few-shot, where they give it some instructions and some examples of how to do it before it tries to solve the problem; and then they have tree of thought. And we can see with the success rate, 1 being 100% and 0 being 0%, that on the 3x3 puzzles the zero-shot only succeeded four out of ten times, whereas the tree of thought, run on the same ten puzzles, had a 100% success rate. On the 4x4 it was 90% versus something like 20%, and on the 5x5 the zero-shot only succeeded one in ten times, whereas the tree of thought succeeded eight out of ten. As we can see, that is a huge, stark difference in success rate from employing the tree of thought solution. Now, the tree of thought solution is more computationally expensive than the zero-shot, because the zero-shot is just one single query to the large language model, whereas with the tree of thought solution, each one of these bubbles on this diagram entails one or possibly multiple queries to a large language model. I don't remember exactly where in the other paper they say it; I think in the results... yes, in the results section, the authors say their tree of thoughts solution had a 74% success rate versus, I don't know, something like seven percent; it's low, and the tree of thoughts is much better. We can see here that the chain of thought succeeds almost zero percent of the time, I guess that's like seven percent, while almost eighty percent of the time the tree of thoughts succeeds.
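By the way, the successful Game-of-24 branch they walk through can be replayed with plain arithmetic. This sketch is my own (and, as I said, I'm not certain of the game's exact official rules); it just confirms the path I described: 10 − 4 = 6 leaves {9, 13, 6}, then 13 − 9 = 4 leaves {4, 6}, and 4 × 6 = 24.

```python
def apply_step(numbers, a, b, op):
    """Consume a and b from the pool of remaining numbers, add op(a, b) back."""
    pool = list(numbers)
    pool.remove(a)
    pool.remove(b)
    result = {"+": a + b, "-": a - b, "*": a * b, "/": a / b}[op]
    pool.append(result)
    return pool

pool = [4, 9, 10, 13]
pool = apply_step(pool, 10, 4, "-")   # [9, 13, 6]
pool = apply_step(pool, 13, 9, "-")   # [6, 4]
pool = apply_step(pool, 4, 6, "*")    # [24]
print(pool == [24])  # → True
```

Each thought step in the paper's figure corresponds to one `apply_step` call, which is exactly why a rule-based checker can validate the intermediate states without any LLM involvement.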
So in situations where you really want to get the right answer, I think using tree of thoughts is the right move, because even though it's going to be more computationally expensive, maybe that cost pales in comparison to the accuracy of your result. And, you know, everyone in AI is always scaremongering; everyone's saying it's all bad, AGI is going to kill us. Somewhere in this paper they say this is really bad, or could be really bad; I'm trying to find it... yeah, here it is, blah blah blah, it "could bring potential danger or harmful uses by facilitating LLMs." Basically they're saying that this tree of thoughts framework is such an effective abstraction layer to build on top of language models, it amplifies their effectiveness so much, that it could really facilitate harmful uses. I mean, we could already do harmful things with large language models, but I get it; this is just the authors doing their typical artificial-intelligence-people thing, which is scaremongering about artificial intelligence, which is perfectly valid, by the way; I'm sure it might kill us all, Skynet and all that. I think they said at the top of the paper that it was like a 900% logic improvement. Let's see, can I do a text search? No, but if I look at the clickbait header right here, "Tree of Thoughts: GPT-4 reasoning is improved 900%," I don't remember where the 900 comes from, but it seemed realistic. I'm pretty sure this isn't how percentages work, but where was it, the 74 number... yeah, 74 is over ten times seven, so what is that, over a thousand percent better? Maybe that's where the YouTuber, Wes, came up with the 900 statistic. Cool. Oh, that's right, there was one last thing I wanted to cover, which is that I actually looked at the code. The authors here, I don't think they linked any code, but the author of the other paper, Jieyi Long, actually linked to a code repo on GitHub, so I went and read that this morning (what a great use of my time), because I've been very interested in this since last night. You can essentially just feed it your own Sudoku and it will try to solve it. I read through all of the code, and I just want to show you two parts of it. The first part is the tree of thought framework itself. There are some FIXMEs or whatever in here, but basically this is the loop that generates the tree structure. It says: for the max number of rounds (which I looked up in their config, it's like 100), so 100 times, or until it succeeds; there's a piece in here, yeah, "if solution found, return solution." A hundred times, because we don't want to query our large language model a million times; that would actually cost us some dollars, and that wouldn't be good, unless you're the government and you have unlimited money (which is probably why they were talking about the scaremongering stuff). It says: if you should repeat, because maybe the prompt was incomplete, just continue; find out if the solution succeeded or not; if it didn't succeed, you failed, try again; and if the number of tries is too many, get your rollback steps, and those rollback steps are where we go back up the tree and try other solutions. So this is Python code that more or less implements the tree of thought framework. The other piece of code in this repo that I wanted to show you is, I think it's in "actors," yes, the checker: a rule-based Sudoku state checker. You feed in the board, the current board, the Sudoku puzzle (the 5x5 grid of numbers), and there's just some Python code that doesn't use language models; it just checks whether the Sudoku solution that's in progress is valid.
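The main loop I just described (run up to a max number of rounds, check each candidate, roll back up the tree after too many failed tries) might look roughly like this. To be clear, this is my own reconstruction of the idea, not the repo's actual code; `propose`, `check`, and `is_done` are toy stand-ins, with the "a" branch dead-ending on purpose so the rollback gets exercised.

```python
from collections import defaultdict

def solve(start, propose, check, is_done, max_rounds=100, max_tries=2):
    """Sketch of the controller loop: propose a thought, check it, descend on
    success, and roll back up the tree after too many failures at one node."""
    path, tries = [start], 0
    for _ in range(max_rounds):          # cap the number of (costly) LLM queries
        candidate = propose(path[-1])    # stand-in for a query to the LLM
        if candidate is not None and check(candidate):
            path.append(candidate)       # valid thought: go one level deeper
            tries = 0
            if is_done(candidate):
                return candidate         # solution found
        else:
            tries += 1
            if tries >= max_tries and len(path) > 1:
                path.pop()               # rollback step: back up the tree
                tries = 0
    return None

# Toy stand-ins: build the string "bc"; the "a" branch only produces invalid
# states ("ax", "ay"), forcing a rollback to the root before trying "b".
options = {"": ["a", "b"], "a": ["ax", "ay"], "b": ["bc"]}
calls = defaultdict(int)

def propose(state):
    opts = options.get(state, [])
    if not opts:
        return None
    i = calls[state]
    calls[state] += 1
    return opts[i % len(opts)]

check = lambda s: not any(c in s for c in "xy")   # rule-based validity check
is_done = lambda s: s == "bc"

print(solve("", propose, check, is_done))  # → bc
```

The `max_rounds` cap plays the same budgeting role as the repo's config value of roughly 100: every round is a model query, and queries cost money.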
If it's not valid, the checker spits out "not valid," and then the framework might try a different approach; and if it tries a bunch of approaches and they're all invalid, because maybe it's an impossible Sudoku setup that seems like it should be valid but isn't working, we can backtrack more. The authors actually do say, somewhere in this paper, why they were only able to get about an 80% success rate; where was it... I don't remember exactly, and I'm not going to read the whole thing, but basically they say this 80% metric would be higher if the checker code were able to determine that a current board state which seems valid will, in reality, never result in a solution. That would be like walking up a staircase that has a bottomless pit in it, 100 meters deep: you keep going down one step and back up one step, encountering that fathomless hole every time, and you keep saying, "well, I'd better backtrack a step and retry." You're never going to succeed, because that staircase has a bottomless pit you'd fall down and die in. In the same way, you can have a Sudoku board that passes all the rules in this checker, so it's "valid," but you'll never actually be able to complete it, because the filled-in cells, in combination, just can't satisfy the Sudoku rules all the way to the end. So the authors say that a better checker would make this 80% metric higher.
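To make the "rule-based checker" idea concrete, here's a minimal validity check over a partially filled board. This is my own sketch, not the repo's actual checker, and it assumes the small Sudoku variants only constrain rows and columns (no duplicate digits in either), with 0 marking an empty cell.

```python
def rows_and_cols_valid(board):
    """board: NxN list of lists, 0 = empty cell. Checks the basic rule that
    no filled-in digit repeats within any row or any column."""
    lines = board + [list(col) for col in zip(*board)]  # rows, then columns
    for line in lines:
        filled = [v for v in line if v != 0]
        if len(filled) != len(set(filled)):
            return False  # a duplicate digit: this partial board is invalid
    return True

ok = [[1, 2, 0],
      [0, 1, 2],
      [2, 0, 1]]
bad = [[1, 1, 0],   # duplicate 1 in the first row
       [0, 2, 0],
       [0, 0, 3]]
print(rows_and_cols_valid(ok), rows_and_cols_valid(bad))  # → True False
```

Note that this check has exactly the "staircase with a pit" blind spot described above: a board can pass it on every step and still be unsolvable, which is the limitation the authors say holds their success rate at around 80%.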
The last thing I wanted to mention comes from the authors of this paper, the one from Google DeepMind and Princeton, in the section where they talk about heuristics for evaluating these nodes. In my original example I was saying you could evaluate for different things: is it the coolest, is it the best for cooking, the best for Thanksgiving, the best for saving time and being convenient. Those are all different metrics or use cases you can evaluate for, and they're all subjective, so you'd use a large language model to apply that subjectivity. But the authors of the other tree of thought paper, the one that uses Sudoku as its problem space, don't need a large language model to evaluate whether a partial Sudoku solution is valid, invalid, or a complete success. As I showed, that's just a plain old Python program, a rule-based Sudoku state checker. I was trying to find which of these papers outlines this more clearly, the idea that you have two approaches: the language model approach and the rule-based approach. One paper covers the search side, the thought generator, thought decomposition, and so on; the other, more verbose one has an architecture section with a memory module, a controller module, and a checker module. Here it is, the section I mean: the author says the checker can either be rule-based, like the Sudoku puzzle, where you can literally find any Sudoku program on GitHub and copy-paste the rules, and the same goes for chess and checkers and any game with rules, where the rules can be enforced
strictly. Or you can use a deep neural network to contextually evaluate things. In the examples I gave about boiling water, you wouldn't use a rules checker; you'd use a language model to generate and evaluate these thoughts. How are you going to evaluate, using a computer, how costly it is to hike up a volcano versus turning on an electric water kettle? A language model is fit for that task, so that's what you'd use. So that's basically what I wanted to cover: how the tree of thought framework works with breadth-first and depth-first approaches on an ambiguous problem, where the heuristic or evaluation method uses a large language model to weigh the cost or trade-offs of each thought; rules-based versus language-model-based heuristic evaluation; and the fact that these different publication sources came up with basically the same concept within two days of each other, or at least put their white papers out within two days of each other. And then, while I was Googling to find this repo (I had it open on my iPad but not on my computer), I found that I'm not the only one who had a stroke of inspiration last night: Kye Gomez took the diagram out of the DeepMind and Princeton paper, this one right here, put it in a repo, and described what they're going to do. I anticipate that within a week we'll probably see tree of thoughts plugins for LangChain, and we'll probably see complete examples.
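As an aside, the rule-based versus language-model evaluation split I just described can be sketched like this. Everything here is hypothetical; `fake_llm_score` stands in for a real language-model call:

```python
# Sketch of the two evaluation styles the papers describe: a strict
# rule-based check for games like Sudoku, and a model-based score for
# subjective criteria like "convenient" or "cool". fake_llm_score is a
# hypothetical stand-in for an actual LLM query.

def rule_based_eval(thought):
    # Deterministic: a thought either follows the rules or it doesn't.
    return 1.0 if thought.get("valid", False) else 0.0

def fake_llm_score(text, criterion):
    # Stand-in for asking an LLM: "rate this thought for <criterion>".
    return 0.9 if criterion in text else 0.2

def llm_based_eval(thought, criterion="convenient"):
    # Subjective: no rulebook exists, so we ask a model to judge.
    return fake_llm_score(thought["text"], criterion)

def pick_evaluator(problem_type):
    # Games with enforceable rules get the checker; open-ended problems
    # (boiling water on vacation) get the language model.
    return rule_based_eval if problem_type == "rule_based" else llm_based_eval
```

The point is just that the evaluator is pluggable: the tree search doesn't care whether a score came from a Python rulebook or a model.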
My biggest question, when I was conversing with the AI chatbot that I'd fed this paper's PDF into, was this: say you're trying to create an artificial general intelligence. You don't know in advance what question or problem you'll be faced with; it might be trying to boil water on vacation in Hawaii, or it might be solving a Sudoku puzzle. Based on the context of the situation, you don't know whether you're optimizing for correctness, speed, or coolness, and you don't know whether you'll be querying a language model or using a rule-based checker. Another cool thing about the tree of thought framework is that, because it involves multiple trips to the language model, at each intermediary trip you can run other code, including other queries to the language model, to stop and ask: should I even continue using the tree of thoughts framework? Should I switch over to chain of thought? Should I just go zero-shot? If someone asks "is the sky blue" instead of my Hawaii water-boiling problem, you might say, this is really simple, I don't need complex problem-solving skills, let's just do zero-shot, and you'd come back with a good answer about why the sky is blue. Versus a question where you'd say, I'm going to need to try a lot of different things and be creative, so let's try tree of thoughts. Anything else? Nope, it's escaping me; I think I've talked for almost 40 minutes now and nearly exhausted everything. Oh, one more thing: in the example I gave, we expect the water to boil when we turn on the stovetop, but it's not boiling. At each of these intermediary checks we can make network requests to other services, HTTP services, web servers, whatever; we can read instrument data; we can get camera data.
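That "which strategy do I even need?" check from a minute ago might look like this toy dispatcher. The difficulty estimate here is a hypothetical stand-in for one cheap LLM query:

```python
# Sketch of the "should I even use tree of thoughts?" idea: spend one
# cheap query deciding how much machinery a question needs.
# estimate_difficulty is a hypothetical stand-in for that query.

def estimate_difficulty(question):
    # Stand-in for a quick LLM call that rates the question from 0 to 1.
    return 0.1 if question.endswith("?") and len(question.split()) < 6 else 0.8

def choose_strategy(question):
    d = estimate_difficulty(question)
    if d < 0.3:
        return "zero-shot"          # "is the sky blue" -- just answer
    if d < 0.6:
        return "chain-of-thought"   # some reasoning, but one path is enough
    return "tree-of-thoughts"       # ambiguous, worth exploring branches
```

The toy heuristic is obviously too crude to use, but it shows where the escape hatch sits: before each round, not after the whole search.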
Think of Tesla's cars with their self-driving cameras, or the robot with vision, Optimus I think it's called. At each of these steps we can recalibrate against what real-world sensor data shows us, against what our past actions have actually resulted in. You can't do that with zero-shot: when you send one query to the large language model and expect a single response, you can't perform intermediary checks. You could say, I'm going to put the pot on the electric stovetop, and then you might wait ten minutes before realizing it didn't work. So this approach is much more suited to the AGI applications, BabyAGI or Auto-GPT and all that stuff, because we can actually pause after each thought or thought step, reevaluate what's going on, and based on whether it's succeeding or not, backtrack and try other things. So yeah, that's it. I think this is the first time I've even read a white paper for computer science stuff; it took me like two hours before I understood this, and I'm sure I'm still misunderstanding some things. Oh, one more thing I wanted to mention, a little thought I had last night: the authors gave examples of these trees, and when I was looking at this diagram, what popped into my head was that it seems like an evolution of complexity in terms of data structures, or algorithmic approaches, or framework approaches, whatever you want to call it. What if we went even further than this tree? The tree here is a two-dimensional drawing, but what if we used a vector database with, I don't know, 1,536 dimensions, like Pinecone? Instead of a two-dimensional drawing where each child can have however many children but we can only
backtrack to the parent, what if the thoughts that are generated were put into a roughly 1,500-parameter coordinate space? The illustration here is obviously just three-dimensional, x, y, z, but it still conveys the concept. What if we used that kind of data structure to organize our tree of thoughts, and then, instead of backtracking only to the parent, we could backtrack toward anything within a certain radius, some distance of the parent thought? The other thought I had was: what if you saved your trees of thought over time, for many different inputs, and created a vector database where each embedding represents a thought step, so you're essentially caching the tree of thought results? Then in the future, if the same question is asked, or a different question is asked but one of its subsequent thoughts ends up similar or identical to a previously saved thought from a different problem, I wonder what that would do; maybe it would save you some computation time. Theoretically, if you had all the questions that could ever be asked and all the situations that could ever happen, with all their thought steps and thoughts, you wouldn't really need to run any logic for generating thoughts; you could just check your previously-thought thoughts. Well, that's not quite accurate: you'd still need to generate a thought, look up whether you'd already generated it, and if so, reuse those results. But I think we're going to see new data structures beyond trees here.
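The thought-cache idea can be sketched like this. `toy_embed` is a stand-in for a real embedding model (which might produce, say, 1,536-dimensional vectors stored in something like Pinecone); here it's just three hand-picked features:

```python
# Sketch of caching thought steps in a vector space: embed each generated
# thought, store it with its result, and before re-generating, look for a
# previously saved thought within some distance -- from ANY past tree,
# not just this thought's parent. toy_embed is a hypothetical stand-in.

import math

def toy_embed(text):
    # Stand-in embedding: 3 dimensions instead of 1,536.
    return [len(text) / 10.0, text.count("water"), text.count("sudoku")]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ThoughtCache:
    def __init__(self, radius=1.0):
        self.radius = radius
        self.entries = []  # list of (embedding, cached_result)

    def save(self, thought, result):
        self.entries.append((toy_embed(thought), result))

    def lookup(self, thought):
        # Return a cached result from any past thought within the radius.
        v = toy_embed(thought)
        for emb, result in self.entries:
            if distance(v, emb) <= self.radius:
                return result
        return None
```

A real version would use approximate nearest-neighbor search instead of a linear scan, but the shape is the same: generate a thought, check the cache, and only pay for an LLM call on a miss.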
And I think we're going to see huge advancements in the effectiveness of artificial intelligence based on this tree of thoughts framework. So thank you, Google DeepMind and Princeton, and Jieyi Long. Very cool stuff; it was interesting.
Info
Channel: pbrunner
Views: 4,833
Keywords: LLM, Large language model, ChatGPT, Tree of thoughts, Ai, Machine learning
Id: QLJtfH8oGjk
Length: 45min 10sec (2710 seconds)
Published: Mon May 22 2023