Why Agent Frameworks Will Fail (and what to use instead)

Captions
In this video, I'm going to give you my take on agent frameworks: why I think they will fail, and what to use instead. If you don't know who I am, my name is Dave Ebbelaar. I'm the founder of Datalumina, and I've been building custom data and AI solutions for the past five years. Next to that, I create educational content like this to help you do the same and ultimately start freelancing. So let's dive in.

With the rise of large language models, so-called agentic workflows and frameworks became really popular, and as a result we saw a lot of them pop up: AutoGen, CrewAI, LangChain (which also has a way to build agents), and many more. The problem with all of these frameworks is that most of them are probably way too complex for what you're trying to do, and at the same time not robust enough. Let me explain.

Most of these tools and frameworks are built around the core idea of chaining agents together in a way where they can reason and figure out the next step within some kind of workflow. If you look at LangChain's definition, the core idea of an agent is to use a language model (an LLM) to choose a sequence of actions to take in a chain: the language model is used as a reasoning engine to determine which action to take. CrewAI follows a similar approach: you can very easily design agents and give them tasks, and I believe there's also an allow-delegation parameter you can set. Agents have backstories, roles, and goals, and depending on the overall system they can decide the next path to take. The result of that is really cool and creative, in the sense that every time you run it you can get different outcomes, new stories, really interesting processes. But the problem I've found is that most of the processes in the real world right now that you want to automate for a business, for
example, do not require that much room for creativity. Mostly it's the other way around: you want to take a process that is very clearly defined (and if it's not defined, you want to define it first), then figure out the sequence of steps, the sequence of actions, to automate that workflow. Then, whenever you need AI to solve a particular step within that chain of problems, that is where a large language model comes in.

So let's take a look at an example. I've been building apps using large language models literally since the day GPT-3.5 came out, and in general, the app flow of all of these applications follows this process. You have inputs in the form of your data and your prompts. Then there is a processing layer, which can be one simple processing step (one LLM call), or a chain of events: multiple LLM calls, intermediate function processing steps, external API calls, whatever. And finally, there's always some sort of output, because when you're using generative AI, these models create something, and typically you want to store that output in a database, make it available to a front end, a chat application, or whatever. Input, processing, output: that is how these projects are set up.

Now, if you look at how agentic frameworks try to solve this same problem, it looks something like this (and this is a very simplified version). They take that input, and the processing layer is where all of the agents come in. Typically, there is some kind of manager or orchestrator in between that can interact with all of these agents, who all have specific goals, backstories, and tasks. So let's come back to CrewAI: here you can see "initiate the crew", so you have the crew, the agents, and the tasks. That is what I would refer to as the manager; that is where you bring everything together. But when we look at all of these agents
and the manager as a whole, everything is typically interconnected. It doesn't have to be the case (you could also make it sequential), but in general this is the agent framework philosophy: you have a lot of agents working together. Agent one might do something and pass it on to agent two, but then agent three might need something from agent one, and so on. If agent three thinks the output is not good enough, we go back to agent one, and eventually, if everyone agrees, we pass it on to the manager, and then we have the output. That would result in the flow you see over here. AutoGen and building agents with LangChain all follow similar processes.

The problem with that, as I see it, is that we are all collectively trying to figure out the best way to build applications around large language models. We're trying to figure out the best workflows, how to do this at scale, how to manage hallucinations, and all of these tools have their own opinionated way of going about it. Don't get me wrong, I don't want to bash these tools; they're great for certain use cases. I just want to point out that, for most business automation use cases right now, I wouldn't recommend using them, because, like I've said, you're building on top of abstractions that other developers came up with in a new field which we are all still trying to figure out. And because you're building on top of abstractions like this, you probably don't know what's going on behind the scenes, because that's really hard to understand when you dive into a new, bloated library that you didn't write yourself.

What I recommend instead is keeping things really simple and building these applications up from the ground, from first principles. Really consider, for your situation and your application, what it is that you need and what steps are required to solve the problem. And if
we go back to the whiteboard, the way I currently do that (I will show you an example in a bit) is that I view the generative AI app flow not as an agentic problem to solve, but rather as a data pipeline. If you look at the general flow (input, processing, output), it's very similar to a regular ETL pipeline: extract, transform, load. The cool thing about data pipelines is that they have literally been around for decades; ever since we've had computers, people have built data pipelines, and there are far more solid principles, design patterns, and approaches that we can leverage when we follow a data pipeline flow rather than an agentic workflow.

Also, rather than designing your workflow using circular patterns, where for example agent three can jump back to agent one, then to two, and then back to one, I design my pipelines using a sequential approach, following what's known as a directed acyclic graph, or DAG, which is also the design principle that tools like Airflow are built on. It means that data can only flow one way and never go back, and this ensures the reliability of your system. In my opinion, you should design your workflows such that if the flow ends up at step three, steps one and two have already been processed and their outputs are available, so you can always continue with step three. It's all about how you frame the problem, and the sequential order really helps with that: you always know it's first this step, then that step, and in between you can add all kinds of logic and validation. If you can't draw your business or automation process out like this, using simple sequential steps, I recommend you go back to the whiteboard and try to make it simpler or split it up even further, because almost every process, for most business problems you're trying to solve, can be broken down like this. Now, the cool thing about viewing the
problem you're trying to solve with an LLM as a data pipeline rather than as an agentic workflow is that it becomes much simpler to solve it in code, without needing any fancy frameworks or tools. So let me quickly show you an example. I'm going to show it in Python, but the cool thing is you can do this in any language: if you follow the same steps of getting your data, chaining together steps in a data pipeline, bringing it all together, and then pushing it to whatever kind of output you need, it doesn't matter what language you use, and you don't need any framework.

This is a project template that I'm building out, called the generative AI project template. It's pretty complex, there are a lot of moving parts, and we're building it internally for our company, but I'm going to show you the pipeline process in there, because that's what we're talking about right now. We're currently using the example of creating a system that can take an incoming email, classify it, and then generate a reply. This could be what you'd call an agent workflow, or simply a problem you can solve using a large language model.

Now, the cool thing about designing these pipelines is that you can have one step or a hundred steps; it doesn't matter. Ideally, you want to build your code in such a way that you can easily add, remove, and change steps. A very common design pattern for this is the chain of responsibility pattern. I've used that pattern (I will show you in a bit what it looks like) and given it my own spin by including some pipeline elements, which makes it very easy to define sequential steps and then call the large language model in between when it's time. So let's see what's going on over here: we're simulating an incoming ticket from a ticketing
system. Let me zoom in a little bit. Let's assume we're getting in some data, and the ticketing system identifies that this is an email from info@datalumina.com asking for a potential collaboration. (The sender here is Datalumina reaching out to me, so that would be a little bit weird, but you get the idea: we're getting some input data.) Throughout this project we have identified some pipelines, and for this ticketing system we're going to define two pipelines at first, just as an example: tickets coming in from email, and tickets coming in from Instagram. The way to view that is that you have one data pipeline for email and another for Instagram, and you could duplicate this indefinitely to expand your system. The whole idea with this project is that everything has a nice and tidy place.

So let's see what that looks like. When we come in here and process the task, we take that data (we're using Pydantic models here) and call the processing function, which in turn calls a pipeline registry. The registry uses the registry design pattern: again, a design pattern that is solid and proven. Depending on whether the channel is email or Instagram, it returns the right pipeline. Data comes in, the system figures out which pipeline is needed, and different requests get routed to the right place.

Okay, let's go one step further and look at what a pipeline actually is. Consider the email pipeline: it's a sequence of steps, first "classify email" and then "generate response". In this case we're only using two steps, but like I've said, you can make this as complex as you want by simply adding more steps to the system, and this is where the chain of responsibility pattern comes in. So let me come back to the registry; let's look at
the email pipeline and the base pipeline. The base pipeline is configured with a run function that loops over all of the steps in the pipeline and calls each step's process method, which processes the data. You can see there is an abstract method in here called process, and it's used to pass the data between the different steps. I'm covering this at a high level; if you want to know more, let me know in the comments, or research these design patterns on your own. The idea here is not to go into great detail on how this works, but to show you how simple it can be to combine two design patterns and put them together in a structured way.

Following all of that, let's look at the two processing steps we have in here. First there's "classify email", which leverages the Instructor library. If you don't know Instructor: you can use it to patch large language model clients and validate their output by defining a response model. This is really powerful and will completely change the way you build applications around large language models; I have a video on that as well, which I will link afterwards.

Now, what this all allows you to do (let me zoom out a little bit) is have one simple input over here, and if I run this, we can have a look. We have the function over here, and now we can just process this. So let's process it and see what happens. It triggers the pipeline: it runs step one, then step two, passes down the data, and runs the processing steps as defined in the classes over here. But now let's look at the cool part. First we have the classification: it says it's a collaboration request, and it also adds a confidence score, because I've asked it to, which we can validate with Pydantic. We also ask for reasoning, which is something I really like to do. So
next to just defining an output, you also ask the model to give its reasoning, and that can then be used in your system, via logging or in a database, to backtrack something, to debug, and to reduce hallucinations. Then over here we have the response: "Thank you for reaching out. We appreciate your interest; however, we're currently not pursuing any collaborations." That's because there's a prompt folder over here, where we've created the prompt and can introduce some guidelines for the system to consider.

Okay, so that's the output, but now let's look at what's happening behind the scenes and what we get back. The result is a task result: instead of a single text output, we have a structured object. If this runs in production, it holds a task ID from Celery, the status (completed), the input data (the original data), the processing context, and the output data. The processing context holds any intermediate data you need throughout the system. For example, in step one you might determine the category of the email and save that as an intermediate result, then use it in step two to fetch the right context through a RAG system. So you store any intermediate data (for example, the category) in there, and the output data in this case is the response.

We're building this out so that we follow Pydantic models throughout. Let me actually come in here (this is still a work in progress): we have a predefined structure for input, processing, output, task results, and the events that come in, so that we can reuse it for all of our generative AI problems. If you want to know more about this, I plan to do a whole video exploring it, but we're still working it out, and that's also not the goal here, because right now I'm also, in a sense, creating my
own abstractions and my own systems here. The goal really is to show you that you probably don't need someone else's framework. Figure out what your problem needs, how to solve it using a simple data pipeline, and then build it up from the ground, from first principles, so that you fully understand it and it doesn't become too bloated.

Now, to quickly demonstrate that, if we come over here and process the Instagram task: you can see the channel is now Instagram, we have a username, we process the task, and we can look at the results. The reply is currently hardcoded, but you can see that it correctly picks the right pipeline depending on the incoming user data. Having built tons of generative AI applications, this is typically what you want: you filter the input, you decide what comes in (what kind of data, what kind of user, what kind of platform), you follow sequential processing steps, and then you output to another system.

So that's it for this video. By the way, if you're a developer and you want to get started with freelancing but struggle to find clients, you might want to check out the first link in the description. It's a video of me going over how my company can help you solve that problem, and in all transparency, it's a funnel designed to get leads for my company, so please keep that in mind. You don't have to click it, but if you want to get started with freelancing and don't know how, go check it out. If you found this video helpful, please leave a like and consider subscribing. And if you want to learn more about building reliable systems with large language models, for example using the Instructor library, make sure to check out this video next, where I go over my entire workflow, really deep-diving into how you can set this up for yourself.
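The sequential, chain-of-responsibility pipeline described above can be sketched roughly like this. This is a minimal sketch, not the template's actual code: the class names (`PipelineStep`, `Pipeline`, `ClassifyEmail`, `GenerateResponse`) are my own assumptions, and the LLM call is replaced with a trivial keyword rule so the example runs standalone.

```python
from abc import ABC, abstractmethod


class PipelineStep(ABC):
    """One link in the chain of responsibility: receives data, returns data."""

    @abstractmethod
    def process(self, data: dict) -> dict: ...


class Pipeline:
    """Runs steps strictly in order: a simple DAG where data flows one way only."""

    def __init__(self, steps: list[PipelineStep]):
        self.steps = steps

    def run(self, data: dict) -> dict:
        for step in self.steps:
            data = step.process(data)  # each step's output feeds the next step
        return data


class ClassifyEmail(PipelineStep):
    def process(self, data: dict) -> dict:
        # Placeholder for an LLM call; here, a trivial keyword rule instead.
        body = data["body"].lower()
        data["category"] = "collaboration" if "collaborat" in body else "other"
        return data


class GenerateResponse(PipelineStep):
    def process(self, data: dict) -> dict:
        # Uses the category produced by the previous step.
        data["response"] = f"Thanks for your {data['category']} inquiry."
        return data


email_pipeline = Pipeline([ClassifyEmail(), GenerateResponse()])
result = email_pipeline.run({"body": "Interested in a collaboration?"})
print(result["category"], "-", result["response"])
```

Adding or removing a step is just editing the list passed to `Pipeline`, which is exactly the flexibility the video argues for.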
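The registry pattern used to route tickets to the right pipeline by channel could look something like the following. Again, a hypothetical sketch under my own naming assumptions (`PIPELINE_REGISTRY`, `register_pipeline`, `process_task`), not the template's real API.

```python
# Maps a channel name ("email", "instagram") to a pipeline class.
PIPELINE_REGISTRY: dict = {}


def register_pipeline(channel: str):
    """Decorator that registers a pipeline class under a channel name."""
    def decorator(cls):
        PIPELINE_REGISTRY[channel] = cls
        return cls
    return decorator


@register_pipeline("email")
class EmailPipeline:
    def run(self, data: dict) -> dict:
        return {"handled_by": "email", **data}


@register_pipeline("instagram")
class InstagramPipeline:
    def run(self, data: dict) -> dict:
        return {"handled_by": "instagram", **data}


def process_task(task: dict) -> dict:
    """Look up and run the right pipeline for the incoming ticket's channel."""
    pipeline_cls = PIPELINE_REGISTRY.get(task["channel"])
    if pipeline_cls is None:
        raise ValueError(f"No pipeline registered for channel: {task['channel']}")
    return pipeline_cls().run(task)


routed = process_task({"channel": "instagram", "username": "someuser"})
print(routed)
```

Supporting a new channel then only means registering one more class; the routing logic never changes.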
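The structured classification output described above (category, confidence score, reasoning) can be expressed as a Pydantic response model. With Instructor you would pass such a model as `response_model=` to a patched LLM client; here I simply validate a simulated model reply locally, so this sketch runs without any API call. The field names are illustrative assumptions.

```python
from pydantic import BaseModel, Field


class EmailClassification(BaseModel):
    """Response model the LLM output must conform to."""

    category: str  # e.g. "collaboration", "support", "spam"
    confidence: float = Field(ge=0.0, le=1.0)  # out-of-range scores are rejected
    reasoning: str  # why the model chose this category; useful for debugging


# Simulated raw LLM output; in practice this comes back from the model.
raw_output = {
    "category": "collaboration",
    "confidence": 0.92,
    "reasoning": "The sender explicitly asks about a potential collaboration.",
}

classification = EmailClassification.model_validate(raw_output)
print(classification.category, classification.confidence)
```

If the model returned a confidence of 1.5 or dropped the reasoning field, validation would raise immediately instead of letting bad data flow further down the pipeline.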
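Finally, the task result structure mentioned above (task ID, status, input data, processing context, output data) might be modeled like this. This is a hypothetical shape inferred from the video's description, not the template's actual schema.

```python
from typing import Any

from pydantic import BaseModel, Field


class TaskResult(BaseModel):
    """Hypothetical shape of the structured pipeline result."""

    task_id: str  # e.g. the Celery task ID when running in production
    status: str  # "completed", "failed", ...
    input_data: dict[str, Any]  # the original incoming ticket
    # Intermediate values shared between steps, e.g. the email category.
    processing_context: dict[str, Any] = Field(default_factory=dict)
    # The final generated output, e.g. the reply text.
    output_data: dict[str, Any] = Field(default_factory=dict)


task_result = TaskResult(
    task_id="abc-123",
    status="completed",
    input_data={"channel": "email", "body": "Interested in a collaboration?"},
    processing_context={"category": "collaboration", "confidence": 0.92},
    output_data={"response": "Thank you for reaching out."},
)
print(task_result.status, task_result.processing_context["category"])
```

Keeping input, intermediate context, and output in one object is what makes backtracking and debugging a multi-step run straightforward.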
Info
Channel: Dave Ebbelaar
Views: 15,908
Keywords: python, vscode, artificial intelligence, ai, tutorial, how to, llm, openai, rag, vector, vector database, saas, fastapi, pydantic, gpt, gpt4, claude, anthropic, database, data, data science, machine learning, freelance, freelancing, ai agency, data freelancer, datalumina, dave ebbelaar, pipeline
Id: KY8n96Erp5Q
Length: 19min 21sec (1161 seconds)
Published: Thu Jun 27 2024