AI Leader Reveals The Future of AI AGENTS (LangChain CEO)

Video Statistics and Information

Captions
Let's talk about agents. Harrison Chase, the CEO and founder of LangChain, gave a talk at the Sequoia event that I made another video on a couple of weeks ago, where Andrew Ng also gave a talk. Harrison's talk is also about agents: the current state of agents, what to expect from agents in the future, where they work really well and where they don't. So let's watch it together, and I'll comment on it as we go.

A quick note before I get to the video: if you want a chance to win a Rabbit R1, all you need to do is subscribe to my newsletter. You'll get awesome AI updates twice a week and stay up to date on the world of AI. I'll drop the link to subscribe in the description below, so check it out, subscribe to my newsletter, and maybe you can win this Rabbit R1. Now back to the video.

For those of you who are not familiar with Harrison, he is, as I mentioned, the co-founder and CEO of LangChain. If you haven't heard of LangChain, let me tell you quickly what they do. LangChain is a super popular coding framework that lets you take a bunch of different AI tools and plug them all together really easily; that's the "chain" part. Really, this was agents before agents had a term, so of course Harrison is incredibly knowledgeable about agents. Now let's watch the video.

Thanks for the intro, and thanks for having me. Excited to be here. Today I want to talk about agents. LangChain is a developer framework for building all types of LLM applications, but one of the most common ones that we see being built is agents. We've heard a lot about agents from a variety of speakers before, so I'm not going to go into too much of a deep overview, but at a high level it's using a language model to interact with the external world.

All right, I actually want to stop it right away. One thing that I've heard quite a lot, though less so lately now that agents have really become mainstream, is that agents are just
prompts, they're just complex prompts. But that's not necessarily true, and even if it were, there's so much going on around the prompt, and that is what makes agents so special. This is a great graph for understanding what's going on with agents. You can think of the large language model as this one little piece right here, the agent itself. Then you can give that agent tools: access to your calendar, to a calculator, to the web. They can use a code interpreter, which means they can actually spin up environments and write and run code. There's basically an unlimited number of tools that you can give agents.

Then we give agents memory, both short-term and long-term. Short-term memory means memory within a conversation, or between agents in a conversation, and long-term memory is something like RAG, retrieval-augmented generation: saving information to be used later. CrewAI, my favorite agent framework, just released both short-term and long-term memory and has shown that agent performance has significantly improved since adding these features.

Agents can also do planning, which is reflection, self-critique, chain of thought, and subgoal decomposition, and they can also perform actions. With all of these additional superpowers, an agent becomes so much more than just a large language model prompt. We're going to touch on planning in a moment, because Harrison says something really interesting about it. So let's keep watching.

Tool usage, memory, planning, taking actions is kind of the high-level gist. The simple form of this you can maybe think of as just running an LLM in a for loop: you ask the LLM what to do, you go execute that, then you ask it what to do again, and you keep on doing that until it decides it's done. Today I want to talk about some of the areas that I'm really excited about, that we see developers spending a lot of time in, really taking this idea of an agent and making it
something that's production ready and real world, and really, you know, the future of agents, as the title suggests. There are three main things that I want to talk about, and we've actually touched on all of these in some capacity already, so I think it's a great roundup: planning, the user experience, and memory. For planning, Andrew covered this really nicely in his talk. The basic idea here is that if you think about running the LLM in a for loop, oftentimes there are multiple steps that it needs to take.

I'm going to pause there for a second. I've already done a video all about the Tree of Thoughts paper, which is incredible, so be sure to check it out; I'll link it in the description below. I haven't actually done a review of the reflection paper, but the gist is that you allow a model to generate an initial response to a prompt, and then you simply feed it back and say, "Hey, what would you do better?" That's the very simple explanation of what it does. Essentially, what we're doing is giving the models the ability to reflect, to plan ahead, and to break complex tasks down into subtasks, and that's something the models alone can't do yet. I've made a few videos about Q*, and Q* has to do with giving the models the ability to plan and look ahead, but that's not something we have to play around with today. In fact, just today I released a video about the gpt2-chatbot large language model that was mysteriously released on LMSYS and then was recently taken down. A lot of people think that is a model that has the ability to power agents, because it has this planning ability more so than anything we've ever seen. I haven't necessarily seen that, but I also wasn't explicitly testing for it. The point is that agents and agent frameworks allow you to extract so much more quality, so much more performance, out of just a large language model prompt.

And so when you're running it in a for
loop, you're asking it implicitly to reason and plan about what the best next step is, see the observation, and then resume from there and think about what the next best step is right after that. At the moment, language models aren't really good enough to do that reliably, so we see a lot of external papers and external prompting strategies enforcing planning in some way, whether that's planning steps explicitly up front or reflection steps at the end to see if it has done everything correctly.

I've actually made a video about models that were explicitly trained to, quote unquote, think slowly, and Orca is a great example of that. Orca is a project out of Microsoft that teaches the model how to think slowly and use a lot of these techniques, whether reflection, tree of thoughts, or other slow-thinking techniques, automatically, without us having to prompt or code around the model to make it do that.

I think the interesting thing here, thinking about the future, is whether these types of prompting strategies and these types of cognitive architectures continue to be things that developers are building, or whether they get built into the model APIs, as we heard Sam talk a little bit about.

Yeah, so that's really a question still, and I'm not sure. My guess is it's going to take a new architecture, something completely new beyond just the Transformer architecture, to allow these models to really reason properly, to plan ahead, to think slowly, and that's just not what they do today. Maybe that's what GPT-5 is going to be, maybe that's what Q* is, but I haven't seen any evidence that we actually have a large language model that can do that. So for now developers are going to have to build these tools and strategies themselves, which is fine, because companies like CrewAI make it really easy to do that.
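
As a rough illustration of the "LLM in a for loop" idea described in the talk, here is a minimal Python sketch. Neither speaker shows code, so everything here is an assumption made for illustration: `fake_llm` is a scripted stand-in for a real model call, the `TOOLS` table and the `name:argument` action format are hypothetical, and none of this is LangChain's or CrewAI's API. The loop asks the "model" what to do, executes it, feeds back the observation, and stops when the model answers `DONE` or a step limit is hit.

```python
from typing import Callable, Dict, List

# Hypothetical tools for illustration; a real agent might expose a
# calendar, a calculator, web search, or a code interpreter.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


def fake_llm(history: List[str]) -> str:
    """Stand-in for a model call (assumption): replays a fixed script,
    choosing the next action from how many observations it has seen."""
    script = ["add:2+3", "upper:done adding", "DONE"]
    steps_taken = sum(1 for h in history if h.startswith("observation:"))
    return script[min(steps_taken, len(script) - 1)]


def run_agent(
    task: str,
    llm: Callable[[List[str]], str] = fake_llm,
    max_steps: int = 10,
) -> List[str]:
    """Ask the model what to do, execute it, record the observation,
    and repeat until the model says DONE or max_steps is reached."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action = llm(history)
        if action == "DONE":
            break
        name, _, arg = action.partition(":")
        observation = TOOLS[name](arg)
        history.append(f"action: {action}")
        history.append(f"observation: {observation}")
    return history
```

The `max_steps` guard is a design choice of this sketch, not something from the talk: because the model decides when it is done, an unreliable model could otherwise loop forever.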
And even when models are able to think more slowly and have these things inherently in them, agent frameworks are still going to be very valuable for coordinating different models and different agents, giving them different tools, and running a very consistent workflow.

And so for all three of these, to be clear, I don't have answers, I just have questions. One of my questions here is: are these planning prompting things short-term hacks or long-term necessary components?

Actually, let me know what you think in the comments. Do you think these types of prompting techniques, reflection and tree of thoughts, are short-term hacks, and eventually the models will just be able to do this without prompting or external techniques? Or are these techniques we're going to have to use forever?

Another aspect of this is just the importance of flow engineering. I heard this term come out of the AlphaCodium paper. It achieves state-of-the-art coding performance not necessarily through better models or better prompting strategies, but through better flow engineering: explicitly designing this graph or state-machine type thing. I think one way to think about this is that you're actually offloading the planning of what to do to the human engineers who do that at the beginning, so you're relying on that as a little bit of a crutch.

All right, that's a really good point, and again, that's why I'm so bullish on agent frameworks: they help you with the flow engineering piece. Beyond just prompt engineering, now we're talking about flow engineering, and that's a whole separate art and science in itself. It's still very early days. We're still trying to figure out what types of flows work well, how many agents work well together, whether there is a maximum or a minimum, how should
they plan, and what steps they should execute. So it's still early, and it's still really fun to watch.

The next thing that I want to talk about is the UX of a lot of agent applications. This is one area I'm really excited about. I don't think we've nailed the right way to interact with these agent applications. I think human in the loop is still necessary, because they're not super reliable.

I want to talk about human in the loop. I work with large companies, helping them with their AI strategy, and consistency, reliability, and quality are insanely important to them. When you're talking about large language models, hallucinations are almost guaranteed, so how do you avoid them? There are a few ways. Again, agent frameworks help you reduce hallucinations through things like caching, prompt libraries, and lowering the temperature of the large language model, but also through human in the loop. That's really important, especially to large enterprise companies, and I don't think human in the loop is going to go away anytime soon. But as Harrison says, if you have too much human in the loop, you're basically removing all of the automation. There's a fine balance in where you actually need human in the loop, and I think it's essentially whenever you have a deliverable: whenever the agents produce something substantial that will be delivered and relied upon within the organization. What the optimal human-in-the-loop strategy is, that's something I'm still experimenting with.

But if it's in the loop too much, then it's not actually doing that much useful work, so there's kind of a weird balance there. One UX thing that I really like is from Devin, which came out a week or two ago.

And speaking of Devin, as much virality as they had, and as much dunking as they got, with people calling it out for actually doing less than what was shown in the demo, the UX is
fantastic. Shortly after Devin we had Devika and OpenDevin, so obviously they did something right by showing all of the screens, the browser, the chat window, the terminal, and the code, all in one screen. That was obviously a really powerful UI, because it was copied and a lot of people like it.

So I think this was immediately one of the big contributions of the Devin demo: everybody realized, oh, this is a great way to structure the user interface. And, as Jordan put it nicely on Twitter, there is the presence of a rewind-and-edit ability. You can go back to a point in time where the agent was, and then edit what it did or edit the state that it's in, so that it can make a more informed decision. I think this is a really powerful UX that we're really excited about at LangChain, and we're exploring it more. I think it brings a little more reliability, but at the same time a steering ability, to the agents.

So let's talk about being able to rewind and change things. I agree this is a really incredible user experience, because there are times when you go off down a path and find that it was not the right thing to do, so you go back and start from an earlier state. I've seen one project do this incredibly well. They've actually been a sponsor of this channel, but it comes to mind because they really do it so well, and that's Pythagora, the AI coding assistant that I've shown you before. Pythagora has the ability to rewind to any step along the entire journey of a project, and you can edit it and continue on from there. So, really cool. That's kind of what Devin does, and that's also what Harrison is talking about. I think that's going to be a very strong piece of agent coordination, and I can't wait until all of the agent frameworks build that in.

Speaking of steering ability, the last thing
I want to talk about is the memory of agents. Mike from Zapier showed this off a little bit earlier, where he was interacting with the bot, teaching it what to do and correcting it. This is an example where, in a chat setting, I'm teaching an AI to write a tweet in a specific style. You can see that I'm just correcting it in natural language to get to a style that I want, and then I hit thumbs up. The next time I go back to this application, it remembers the style that I want, but I can keep on editing it and making it a little more differentiated, and when I go back a third time, it remembers all of that. I would classify this as procedural memory: it's remembering the correct way to do something.

I think another really important aspect is personalized memory: remembering facts about a human that you might not necessarily use to do something more correctly, but that you might use to make the experience more personalized. This is an example journaling app that we're building and playing around with for exploring memory. You can see that I mention that I went to a cooking class, and it remembers that I like Italian food. I think bringing in these personalized aspects, whether procedural or these personalized facts, will be really important for the next generation of agents. That's all I have.

So that is both long-term and short-term memory. In the short term, you should be able to go back and forth with an agent, or allow the agents to go back and forth with each other, and they can learn and improve along the way. That might also be where human in the loop comes in: you can steer them. But then we have long-term memory, which is also really important, not only for personalization but also within the context of businesses and enterprise: the ability for these agents
to learn things. Having the company's knowledge at hand at any time is obviously useful, but that's just RAG. Being able to learn something and use that memory for the foreseeable future is a really powerful feature that is being built into, or is already built into, many agent frameworks, and that's something I'm really excited about. Now, there's a lot of complexity there. How much do you store? How do you write the rules for when to forget something, or do you ever forget something? How do you change a memory? Businesses change all the time, so the memory has to evolve with the business's needs.

Again, all of this is very early; it's so raw right now. Giving an agent long-term and short-term memory, and using flow engineering and tools and all of these things, is possible, but I don't think there's a tried-and-true path yet. People are still figuring out the optimal combination of long-term memory, short-term memory, tools, number of agents, and different large language models. Should you use different ones in the same workflow? There are so many cool questions yet to be answered.

And so that's it, that's his whole talk. This was a great talk. Let me know what you think in the comments. If you liked this video, please consider giving it a like and subscribing, and I'll see you in the next one.
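
The short-term versus long-term memory split discussed above, along with the open questions about changing and forgetting memories, can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `AgentMemory` class, its method names, and the key/value format are hypothetical and do not correspond to any framework mentioned in the video. Short-term memory is a bounded window of recent messages, and long-term memory is a small store of facts that can be overwritten or dropped.

```python
from collections import deque
from typing import Deque, Dict, List


class AgentMemory:
    """Toy memory (hypothetical design): a bounded short-term window
    plus a long-term key/value store of learned facts."""

    def __init__(self, short_term_size: int = 3) -> None:
        # Short-term: only the most recent messages are kept;
        # older ones fall off the left end automatically.
        self.short_term: Deque[str] = deque(maxlen=short_term_size)
        # Long-term: facts that persist across conversations.
        self.long_term: Dict[str, str] = {}

    def observe(self, message: str) -> None:
        """Record a message from the current conversation."""
        self.short_term.append(message)

    def remember(self, key: str, fact: str) -> None:
        """Store a fact; writing to an existing key changes the memory."""
        self.long_term[key] = fact

    def forget(self, key: str) -> None:
        """Drop a fact; unknown keys are ignored."""
        self.long_term.pop(key, None)

    def context(self) -> List[str]:
        """Build what would be prepended to the next prompt:
        long-term facts (sorted by key), then recent messages."""
        facts = [f"{k}: {v}" for k, v in sorted(self.long_term.items())]
        return facts + list(self.short_term)
```

For example, calling `remember("tweet_style", "short, no hashtags")` after a thumbs-up would play the role of the procedural memory in the tweet demo, while a fact like `remember("food", "likes Italian")` would be the personalized kind. When to call `forget` is exactly the rule-writing question the video leaves open.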
Info
Channel: Matthew Berman
Views: 93,489
Keywords: ai, ai agents, agents, langchain, llm, openai
Id: 9ZhbA0FHZYc
Length: 16min 22sec (982 seconds)
Published: Thu May 02 2024