Fusion Chain: NEED the BEST Prompt Results at ANY COST? Watch this…

Video Statistics and Information

Captions
It all starts with the prompt. By training on vast amounts of information, state-of-the-art large language models are able to derive meaning and understand the world as we know it. With a single prompt, we can use LLMs as a new tool of computation; the prompt is the new fundamental unit of knowledge work.

But it doesn't stop here. We can chain together multiple prompts to multiply the reasoning abilities of LLMs. We call this the prompt chain. The prompt chain is exactly what it sounds like: you have one prompt, and the output of that prompt becomes the input of the next prompt, and so on. Prompt chaining is a powerful technique that gives your large language models the ability to reason by chaining one prompt to the next to the next. We can also refer to previous prompts' outputs in any subsequent prompt, so prompt three can refer to our inputs or to the outputs of any prompt that came before it. This creates our chained output.

The prompt chain is critically important for building agentic workflows, because it represents how you and I work on a daily basis: we have different tasks that we operate on throughout the day, and there's a flow to it. It may not be a straight line, but there's almost always a sequential flow to the tasks you and I execute, start to finish. Sequences are not the only way to operate, though. We've explored multi-agent workflows on the channel in the past, where you have one prompt firing off a loop or a set of prompts in a graph-like format. Tools like AutoGen, LangGraph, and others allow you to build workflows like this. As we've discussed, I recommend staying close to the metal: you don't want to give the most important piece of technology away to a library just yet, or at least not without knowing what's going on under the hood. These workflows can also drive very powerful results for your tools, applications, and products. We've explored all these ideas in previous videos, and we're actively building out
agentic workflows on the channel right now. In this video, I want to show off a new chain that is slowly starting to make the rounds and gain traction, one that can help you truly utilize state-of-the-art models near their maximum capacity.

Traditionally, we have a prompt chain operating from left to right. We can take this and simply multiply the chains: we take our inputs and pass them to sets of model chains. The core idea here is that by multiplying the number of chains you have running on a diverse set of models, you're able to get a multi-chain output that beats a single chain running alone. There's been a lot of talk of throwing more agents at the problem, throwing more prompts at the problem; effectively, "think step by step" and tree of thought on steroids. By passing your inputs to multiple chains running the same or even different prompts, you're effectively getting different opinions from the models you've chosen, and when you're running the state-of-the-art models from OpenAI, Anthropic, and Google, what you're going to end up with is the best possible result you can get from the state-of-the-art models.

After every model chain runs, we take the final layer and merge the results together in the evaluator step. The evaluator takes the final layer of every prompt chain and decides which one best fits the result you're looking for. This is a powerful pattern because it also forces you to decide what it means to get a great result from your prompts and your prompt chains. It's also important to note that your evaluator can itself be another prompt call, or even a prompt chain call, but its end purpose is to take all the results from your prompt chains and fuse them. This is called the fusion chain, also known as beam or the competition chain, depending on how you evaluate and combine your last layer of prompts. So this is the powerful prompt chaining technique that I want to share with you
today. But first, I want to dig into a couple of key questions. If you've been working with LLMs, writing prompts and prompt chains, you've probably had one or more of these questions yourself. Let's start with some of the heavy hitters. Does adding two or more chains actually improve performance? We're going to answer this question, and we're also going to answer: won't GPT-5, Claude 4, and Gemini 2 outperform prompt chains outright, making them completely irrelevant? We're also going to look at the question: what's the optimal flow of prompts for agentic workflows? This is something I think about more and more every day. With every agentic flow that I build out, with every prompt chain, AI agent, tool call, and function call, I keep wondering: what's the optimal flow here? Are prompt chains important? Will they be relevant in the future? These are questions we're going to answer in just a second. For the last question, we're going to look at a concrete example of fusion chaining.

Let's go ahead and answer some of these questions based on all of our experience here on the channel and in the background, as we build out prompts, prompt chains, agentic workflows, et cetera. First things first: does adding two to N chains truly improve performance? The answer here is a concrete, 100% yes. You can check out previous videos to see exactly how this happens, but when focusing on the fusion chain specifically, you'll see something really interesting. You can think about it like this: instead of hiring one team of consultants to build something for you, you're hiring three teams. You're then concretely defining what it means to solve the problem you're trying to solve, and then you force these consultants you've hired, every version of them, to compete against each other. Then you're taking the best of every single team, the best of every single workflow, and the best of every single prompt
chain. I can confidently tell you that using multiple prompt chains increases performance.

So here's another really powerful, important question. The answer to this is yes: GPT-5, Claude 4, and Gemini 2 will absolutely outperform prompt chains, but the key here is that they'll outperform prompt chains of the previous generation. I can almost guarantee you that a GPT-5 "let's think step by step, work through the problem" chain over three prompts will still outperform a single GPT-5 prompt. So when we ask this question, we have to ask: will a GPT-5 prompt chain beat a single GPT-5 prompt, or a Claude 4 or Gemini 2 one? Then the answer completely changes, right? This is a really important insight, because it means that the prompt chain is a critical abstraction, a critical pattern, a higher-level composition of the prompt that isn't going to go away with time, at least not without a lot more time. In the end, all technology fades away to something greater; the question is how long it will last. Betting on the prompt chain as a key piece of technology is an absolute win. Multi-conversation chains and fusion chains are, I think, all the right direction. This is how we build powerful agentic workflows that work while we sleep.

Now, that leads us to another question: if prompt chains are the answer, what is the optimal flow? What's the right chain for building great agentic workflows? I have to be honest, guys, I'm not there yet. I do not know the answer to this question. I cannot confidently say whether a prompt chain of length three or eight is better. I can't say if a fusion chain with five chains is better than one with two. All I can say for sure right now is that using a prompt chain is definitely better than using a single prompt, and I can tell you that a context-filled prompt, a BAP or "big ass prompt" with 10K-plus tokens, will in most cases outperform a single plain prompt. I can't give you a prescription for, you know,
the right flow of agents to solve your specific problem. I do think a lot of that is going to be case-by-case specific. It's like asking the question: what's the optimal way to build out an API? The answer to that question is: well, what problems are you trying to solve? What information do you need to be readable, writable, updatable, deletable? It very quickly gets into the use cases of your specific domain. Right now on the channel, we're on a hot streak building out agentic workflows; if that interests you, hit the like, hit the sub. This is how we figure out what the optimal flow is. It's very likely that there's no one flow that fits all, but I'm very certain there are a couple of workflows that will cover 60 to 80% of all use cases.

All right, last thing here: let's hop into a concrete example of using this fusion chain. What are we doing with the fusion chain? Why is this important? Why is this so different? Basically, we're multiplying the capabilities of our LLM once again. We can keep doing this horizontally by adding additional prompts from left to right, but we're exploring this new idea of blowing it up top and bottom, going vertical if you will, and adding additional models to run the exact same chain to see which model wins. We're also really digging into this idea of the evaluator. A pattern that keeps coming up while building these agentic workflows is that it's super important to evaluate your responses, because something magical is going to happen here: once we truly understand how to build these great evaluators, the evaluator is going to loop back to the beginning of the chain and provide concrete feedback. This is where our agentic workflows are going to start thinking and operating for themselves, and self-improving. Definitely let me know in the comments if you agree with my answers to these questions, or if you have tweaks or other ideas that you think need to be
considered. I do think ultimately prompt chaining is a powerful abstraction that isn't going away. The big question I see is: what's the optimal flow of prompts to push your agentic work close to the very limit with whatever current state-of-the-art models exist?

So let's take a look at an API for the fusion chain. As I mentioned on the channel, I don't like to use LLM libraries, because I like to keep the distance between myself and the prompt as thin as possible; I recommend you do the same. Here's the classic prompt chain that we explored on the channel before. You can easily read this and understand exactly what's happening: you have a model, you have some context that your prompts can use (these are the inputs we saw in the diagram), and then we have a list of prompts. The prompts can reference the context passed in and also the outputs of previous prompts. You can see here we're using output[-1].title; this is coming from the exact previous prompt that just ran. That happens again in our third prompt, where we reference both the first and the second prompt. The link is going to be in the description for the classic prompt chain.

Let's move on to the fusion chain. The fusion chain looks very similar to the prompt chain. Instead of passing in a single model, we're passing in N models, and now for every prompt we're going to run our N models, so this will run a total of nine times. Then we're passing in this evaluate function, which is going to determine the best response based on the final layer of responses from every model. We're going to rank the outputs from 0 to 1 so we know who performed the best. We'll take these final responses, do any merging, fusing, or ranking, and just return the best result from the evaluator function that gets passed in to the fusion chain. You can find the fusion chain and the simple prompt chain in the description.

Let's go ahead and look at a concrete example. We're continuing to improve the zero-noise application, which allows
us to, without going out to look for specific information ourselves, hit run workflow, and this is going to go out, scrape these websites, and alert us if there are any new updates. This allows us to create a library of information and prevents us from going looking ourselves, wasting time visiting every single link. You can imagine having 10, 20, 30 different endpoints that you want to visit and keep track of; this tool helps you look, in a very low-noise way, for updates on specific tools and blog posts from specific engineers that you're interested in. So you can see here we have an update from Simon Willison's blog; it does a light summary of all the changes since the last-seen date. And then of course we have a new update from Cursor: it looks like they just recently launched the beta of multi-file editing. I just saw this go out right before this video; I'm pretty excited to check that out. But anyway, you can see this tool working in action. It's a very low-noise way to get information from blogs, from changelogs, from places that you really care about.

We have an agentic workflow that runs this portion; check out the previous couple of videos to see exactly how we built that. In this video, we're going to look at some of the core details of this learn method, because it runs the fusion chain. So shout out to big-AGI: they were one of the only websites where I could find anything on the fusion chain outside of hardcore research. You can see their version of beam here: they basically run your prompt on multiple models and then fuse the responses. Very similar, but it's not quite in chain form. I'm going to link them in the description as well; check out big-AGI, they have some interesting stuff brewing. We're going to come in here, paste this in, and hit enter. It's going to run a fusion chain to determine what the best HTML selectors are in order to properly fetch the updates, and then it's going to add it to our rolling
configuration file that has all these URLs. So you can see it added that there. We can now hit run workflow again; we now have that big-AGI blog in there, and we're going to see if we can get any fresh updates. Basically, our learn flow is going to find the selectors we need to run our fetch process. You can see here we got no updates, because the last-seen date is of course going to be right now.

So let's look at how this workflow runs. You can see here we have the two workflows; this is a simple Streamlit application. The key piece I want to show you is inside of the agentic function. This is the agentic workflow class structure I've been using: first you retrieve all the information that you need (this does the initial scraping), then you run your LLMs, your prompts, your AI agents, and then you act on that information. So let's hop into the core of this. In the agentic function, you can see we're making sure we have raw scraped content, we're setting up the context for the chain (this is the information we pass into our prompt chains), and then we have our individual prompts that are doing the scraping; we're going to skip over this for now. Then we have our evaluator function, which is really interesting, but here you can see the actual fusion chain. You can see we're passing in the state-of-the-art models, Sonnet 3.5, GPT-4o, and Gemini 1.5 Pro, and then we're running the fusion chain in parallel, so these all run at the same time over the two prompts that we have here. So we have two parsing prompts, and we're running those on every model. Then we have our evaluator function; this is where we fuse the responses. In this case, we're just looking for the best response out of the last responses from each model chain. You can see here I'm running some validation, and what I'm going to end up with is a list like this: we're going to end up with our best response, and then we're going to end up with some
scores. For every model, we're going to end up with something like this: 0.25, 0.25, or 1, so it's from 0 to 1. We can look at the fusion results right here, and if we collapse this, you can see that we have the model names and then our individual performance scores. You can see every model performed at 0.95, based on my evaluator function. In this evaluator function, we reduced each score by 5% for every header that the model provided. So you can see here I basically created a ranking system that said: I want the fewest number of headers, and whoever gives me the fewest number of headers wins. We can see the top response here, but if we dive into all prompt responses, you can see we have a variety of responses. This one is coming from Sonnet; we have another set of responses here, this one coming from GPT-4o, which gave us this. We can go ahead and just copy one of these out; these are all likely going to be right. On the Elements tab, I can just search for this path, and you can see here these are the first five, just like I asked: the first five elements that contain both a date and a summary of the blog or the changelog, whatever updates they have. It's going to contain that information for me. So this agentic workflow automatically parsed through all the HTML here, gave me just this tag, and I was able to get this through the fusion chain. The fusion chain allowed me to say: hey, this is what it means to really get at this result. So that's GPT-4o, and then down here you can see this is going to be Gemini 1.5 Pro. Okay, so that's basically it. You can see it gave a totally different answer; let's see if this one works as well. We'll just paste this in. Yep, that answer works as well. So, really interesting: something that happens with the fusion chain is that you also get to see how the models perform and what different
results they all give. Some of them are going to give you insights into better answers and, frankly, worse answers, and that will allow you to refine and strengthen your evaluator, which then decides which one of the prompt chains has the best answer.

So that's going to be it for this one, guys. This was going to be a longer video, and I tried to cut it down as much as possible for you, but I just really think that this idea of focusing on the higher-level compositions, like the prompt chain and the fusion chain, is important: these are going to be really important ideas and abstractions for you to use to build really powerful technology with generative AI. So if you liked the video, you know what to do. Thank you so much for watching, and I will see you in the next one.
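To make the pattern from the video concrete, here is a minimal sketch of a fusion chain in plain Python. This is a hypothetical illustration, not the code shown in the video: call_model is a stub standing in for a real model API, prompt references use {output[0]}-style indexing rather than the video's exact template syntax, and fewest_headers_evaluator implements the header-penalty ranking described in the demo (start at 1.0, dock 0.05 per header tag, fewest headers wins).

```python
from typing import Callable

def call_model(model: str, prompt: str) -> str:
    """Stub standing in for a real LLM API call (OpenAI, Anthropic, Google)."""
    return f"[{model}] {prompt}"

def prompt_chain(model: str, context: dict, prompts: list[str]) -> list[str]:
    """Classic chain: each prompt can reference the shared context and
    the outputs of previous prompts by index, e.g. {output[0]}."""
    outputs: list[str] = []
    for template in prompts:
        prompt = template.format(**context, output=outputs)
        outputs.append(call_model(model, prompt))
    return outputs

def fusion_chain(
    models: list[str],
    context: dict,
    prompts: list[str],
    evaluator: Callable[[dict[str, str]], tuple[str, dict[str, float]]],
) -> tuple[str, dict[str, float]]:
    """Run the same prompt chain on every model, then fuse the final layer:
    the evaluator ranks each chain's last output and picks the best."""
    final_layer = {m: prompt_chain(m, context, prompts)[-1] for m in models}
    return evaluator(final_layer)

HEADER_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")

def fewest_headers_evaluator(
    final_layer: dict[str, str],
) -> tuple[str, dict[str, float]]:
    """Score each model's final answer from 0 to 1: start at 1.0 and
    subtract 0.05 for every header tag it used; fewest headers wins."""
    scores = {
        model: max(0.0, 1.0 - 0.05 * sum(ans.count(t) for t in HEADER_TAGS))
        for model, ans in final_layer.items()
    }
    best_model = max(scores, key=scores.get)  # highest-scoring chain wins
    return final_layer[best_model], scores

best, scores = fusion_chain(
    ["sonnet-3.5", "gpt-4o", "gemini-1.5-pro"],
    {"url": "https://example.com/blog"},
    [
        "Find selectors for update entries on {url}.",
        "Refine the selectors in: {output[0]}",
    ],
    fewest_headers_evaluator,
)
```

Swapping the evaluator is where the "competition" flavor comes in: a stricter scoring function, or even another prompt call, can replace fewest_headers_evaluator without touching the chain itself.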
Info
Channel: IndyDevDan
Views: 3,259
Keywords: prompt chain, prompt chaining, fusion chain, beam, beam chain, multi-chain, llm chain, agentic, agentic workflow, ai agent, indydevdan, gpt-4o, gpt-5, gemini 1.5, gemini 2, anthropic, claude 3.5, claude 4
Id: iww1O8WngUU
Length: 19min 35sec (1175 seconds)
Published: Mon Jul 15 2024