LangChain’s Harrison Chase on Building the Orchestration Layer for AI Agents

Captions
It's so early on. There's so much to be built. GPT-5 is going to come out and it'll probably make some of the things you did not relevant, but you're going to learn so much along the way, and this is, I strongly believe, a transformative technology, so the more that you learn about it the better.

Hi, and welcome to Training Data. We have with us today Harrison Chase, founder and CEO of LangChain. Harrison is a legend in the agent ecosystem as the product visionary who first connected LLMs with tools and action, and LangChain is the most popular agent-building framework in the AI space today. We're excited to ask Harrison about the current state of agents, their future potential, and the path ahead. Harrison, thank you so much for joining us, and welcome to the show.

Of course. Thank you for having me.

So maybe just to set the stage: agents are the topic that everybody wants to learn more about, and you've been at the epicenter of agent building pretty much since the LLM wave first got going. Maybe first, just to set the table: what exactly are agents?

I think defining agents is actually a little bit tricky, and people probably have different definitions of them, which I think is fair because it's still early in the life cycle of everything LLM- and agent-related. The way I think about agents is that an LLM is deciding the control flow of an application. What I mean by that is: if you have a more traditional RAG (retrieval-augmented generation) chain, the steps are generally known ahead of time. First you generate a search query, then you retrieve some documents, then you generate an answer, and you return that to the user. It's a fixed sequence of events. Things start to get agentic when you put an LLM at the center and let it decide what exactly it's going to do. Maybe sometimes it looks up a search query; other times it responds directly to the user; maybe it looks up a search query, gets the results, looks up two more search queries, and then responds. You have the LLM deciding the control flow.

There are some other, more buzzwordy things that fit into this. Tool usage is often associated with agents, and I think that makes sense: when you have an LLM deciding what to do, the main way it acts on those decisions is through tool usage, so the two go hand in hand. Memory is also commonly associated with agents, and that makes sense too, because when an LLM is deciding what to do, it needs to remember what it's done before. So tool usage and memory are loosely associated, but to me an agent is really an LLM deciding the control flow of your application.

Harrison, a lot of what I just heard from you is about decision making, and I've always thought about agents as action taking. Do those two things go hand in hand? Is agentic behavior more about one versus the other? How do you think about that?

I think they go hand in hand. A lot of what we see agents doing is deciding what actions to take, for all intents and purposes, and the big difficulty with action taking is deciding what the right actions to take are, so solving one leads naturally to the other. After you decide on the action, there's generally a system around the LLM that executes that action and feeds the result back into the agent. So yes, I do think they go hand in hand.
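To make the control-flow distinction concrete, here is a minimal sketch in plain Python (not LangChain's actual API): `llm` and `search` are hypothetical stand-ins for a chat-model call and a retrieval tool.

```python
# Hypothetical stand-ins: llm(prompt) -> str, search(query) -> list of documents.

def rag_chain(question: str) -> str:
    # Fixed control flow: the same steps always run, in the same order.
    query = llm(f"Write a search query for: {question}")
    docs = search(query)
    return llm(f"Answer {question!r} using these documents:\n{docs}")

def agent(question: str, max_steps: int = 5) -> str:
    # Agentic control flow: the LLM decides at each step whether to
    # search (and how many times) or to answer the user directly.
    notes = ""
    for _ in range(max_steps):
        decision = llm(
            f"Question: {question}\nNotes so far: {notes}\n"
            "Reply with either 'SEARCH: <query>' or 'ANSWER: <answer>'."
        )
        if decision.startswith("SEARCH:"):
            notes += str(search(decision.removeprefix("SEARCH:").strip()))
        else:
            return decision.removeprefix("ANSWER:").strip()
    return llm(f"Best-effort answer to: {question}\nNotes: {notes}")
```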
So Harrison, it seems like the main distinction between an agent and something like a chain is that the LLM itself is deciding what step or what action to take next, as opposed to those steps being hardcoded. Is that a fair way to distinguish agents?

Yeah, I think that's right, and there are different gradients as well. As an extreme example, you could have basically a router that decides which path to go down, so there's maybe just a classification step in your chain. The LLM is still deciding what to do, but it's a very simplistic way of deciding. At the other extreme, you've got these autonomous agent-type things, and there's a whole spectrum in between. So I'd say that's largely correct, although I'll note there's a bunch of nuance and gray area, as there is with most things in the LLM space these days.

Got it. It's like a spectrum from controlled to fully autonomous decision making and logic, and all of those are on the spectrum of agents. Interesting. What role do you see LangChain playing in the agent ecosystem?

Right now we're really focused on making it easy for people to create something in the middle of that spectrum, because for a bunch of reasons we've seen that that's the best spot to be building agents in at the moment. We've seen some of the more fully autonomous things get a lot of interest and prototypes out the door, and there are real benefits to fully autonomous agents: they're actually quite simple to build. But we see them going off the rails a lot, and we see people wanting things that are more constrained than that, yet a little more flexible and powerful than chains. So a lot of what we're focused on recently is being the orchestration layer that enables the creation of these agents, particularly the ones in the middle between chains and autonomous agents. I can dive into a lot more detail about exactly what we're doing there, but at a high level, being that piece of orchestration framework is where we imagine LangChain sitting.

Got it. So there are chains, there are autonomous agents, there's a spectrum in between, and your sweet spot is somewhere in the middle, enabling people to build agents.

Yeah, and obviously that's changed over time, so it's fun to reflect on the evolution of LangChain. When LangChain first started, it was actually a combination of chains plus this one class, the AgentExecutor class, which was basically the agent thing. We started adding a few more controls to that class, but eventually we realized that people wanted way more flexibility and control than we were giving them with that one class. So recently we've been heavily invested in LangGraph, which is an extension of LangChain aimed at customizable agents that sit somewhere in the middle. Our focus has evolved over time as the space has as well.
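For contrast with the loop sketched earlier, the most constrained point on that spectrum is an LLM used only as a router: a single classification call choosing between hardcoded paths (again a sketch, with the same hypothetical `llm` and `rag_chain`).

```python
def router_app(question: str) -> str:
    # The LLM makes exactly one decision: which fixed path to run.
    route = llm(
        f"Classify this question as 'docs' or 'chitchat': {question}\n"
        "Reply with one word."
    ).strip().lower()
    if route == "docs":
        return rag_chain(question)  # the fixed retrieval path from before
    return llm(f"Reply conversationally to: {question}")
```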
Fascinating. Maybe one final setting-the-stage question. One of our core beliefs is that agents are the next big wave in AI, and that we're moving as an industry from copilots to agents. I'm curious whether you agree with that take, and why or why not.

Yeah, I generally agree with that take. The reason it's so exciting to me is that a copilot still relies on having a human in the loop, so there's almost an upper bound on the amount of work you can have done by another system; it's a little limiting in that sense. I do think there's some really interesting thinking to be done around the right UX and human-agent interaction patterns, but I think they'll be more along the lines of an agent doing something and maybe checking in with you, as opposed to a copilot that's constantly in the loop. It's just more powerful and gives you more leverage the more the agent is doing, which is also paradoxical: the more you let it do things by itself, the more risk there is of it messing up or going off the rails. So striking the right balance is going to be really interesting.

I remember back in March-ish of 2023 there were a few of these autonomous agents that really captured everyone's imaginations, like BabyAGI and AutoGPT, and I just remember Twitter was very, very excited about them. It seems like that first iteration of an agent architecture hasn't quite met people's expectations. Why do you think that is, and where do you think we are in the agent hype cycle now?

Thinking about the agent hype cycle first: AutoGPT was definitely the start, and it's one of the most popular GitHub projects ever, so it was one of the peaks of the hype cycle. I'd say that ran from the spring of 2023 into the summer of 2023. Then I personally feel there was a bit of a lull or downtrend from late summer to basically the start of the new year. Starting in 2024, we've begun to see a few more realistic things come online. I'd point to some of the work we've done at LangChain with Elastic, for example; they have an Elastic Assistant, an Elastic agent, in production. We saw the Klarna customer support bot come online and get a lot of hype. We've seen Devin, we've seen Sierra, these other companies starting to emerge in the agent space.

With that hype cycle in mind, as for why the AutoGPT-style architecture didn't really work: it was very general and very unconstrained. That made it really exciting and captivated people's imaginations, but in practice, for the things people want to automate to provide immediate business value, there's usually a much more specific job they want these agents to do, and a lot more rules they want the agents to follow, or specific ways they want things done. So in practice, the agents we're seeing are much more what we call custom cognitive architectures, where there's a certain way of doing things that you generally want the agent to follow. There's some flexibility in there, for sure; otherwise you would just code it. But it's a very directed way of thinking about things, and that's most of the agents and assistants we see today.
That's just more engineering work: trying things out and seeing what works and what doesn't. It's harder to do and takes longer to build, and I think that's why it didn't exist a year ago.

Since you mentioned cognitive architectures, and I love the way you think about them: can you explain what a cognitive architecture is, and is there a good mental framework for how we should think about them?

The way I think about a cognitive architecture is basically: what's the system architecture of your LLM application? What I mean is, if you're building an LLM application, there are some steps in there that use LLMs. What are you using those LLMs to do? Are you using them to just generate the final answer? Are you using them to route between two different things? Do you have a pretty complex flow with a lot of different branches, and maybe some cycles repeating? Or do you basically run an LLM in a loop? These are all different kinds of cognitive architectures. "Cognitive architecture" is just a fancy way of saying: from the user input to the user output, what's the flow of data, of information, of LLM calls that happens along the way?

What we've seen more and more, especially as people try to get agents into production, is that the flow is specific to their application and their domain. There are maybe some specific checks they want to do right off the bat, maybe three specific steps it could take after that, and each of those maybe has an option to loop back, or has two separate substeps. If you think about it as a graph that you're drawing out, we see more and more custom, bespoke graphs as people try to constrain and guide the agent through their application. The reason I call it a cognitive architecture is that a lot of the power of LLMs is in reasoning and thinking about what to do. I have a cognitive mental model for how to do a task, and I'm basically encoding that mental model into some kind of software system, some architecture.
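As an illustration of drawing out such a graph, here is a hedged sketch using LangGraph's StateGraph interface as of mid-2024 (the node functions, state fields, and the `llm` helper are hypothetical, and the API may have changed since; see the LangGraph docs):

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    question: str
    draft: str
    approved: bool

def check_input(state: State) -> dict:
    # A domain-specific check that runs right off the bat.
    return {"question": state["question"].strip()}

def generate(state: State) -> dict:
    return {"draft": llm(f"Answer: {state['question']}")}

def review(state: State) -> dict:
    verdict = llm(f"Is this answer acceptable (yes/no)?\n{state['draft']}")
    return {"approved": verdict.strip().lower().startswith("yes")}

graph = StateGraph(State)
graph.add_node("check_input", check_input)
graph.add_node("generate", generate)
graph.add_node("review", review)
graph.set_entry_point("check_input")
graph.add_edge("check_input", "generate")
graph.add_edge("generate", "review")
# An explicit cycle: loop back to "generate" until the review passes.
graph.add_conditional_edges(
    "review",
    lambda s: "done" if s["approved"] else "retry",
    {"done": END, "retry": "generate"},
)
app = graph.compile()
```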
Do you think that's the direction the world is going? Because I heard two things in there: one, it's very bespoke, and two, it's fairly brute force, fairly hardcoded in a lot of ways. Do you think that's where we're headed, or is it a stopgap, with more elegant architectures, or a series of default reference architectures, emerging at some point?

That is a really good question, and one I spend a lot of time thinking about. At one extreme, you could argue that if the models get really good and reliable at planning, the best thing you could possibly have is a loop that calls the LLM, decides what to do, takes the action, and loops again, with all the constraints on how you want the model to behave just put in the prompt for the model to follow explicitly. I do think the models will get better at planning and reasoning, for sure, but I don't quite think they'll get to the level where that's the best way to do things, for a variety of reasons. One is efficiency: if you know you always want to do step B after step A, you can just put them in that order. Two is reliability: these are still non-deterministic things we're talking about, and especially in enterprise settings, you probably want a little more comfort that if it's always supposed to do step B after step A, it will actually always do step B after step A.

I think it will get easier to create these things, and they'll maybe become a little less complex. But here's maybe a hot take, or at least an interesting one. The architecture of just running the LLM in a loop you could think of as a really simple but general cognitive architecture, and what we see in production is custom and complicated cognitive architectures. There's a separate axis, though, which is complicated but generic cognitive architectures: something like a really complicated planning step and reflection loop, or tree of thoughts. I actually think that quadrant will probably go away over time, because a lot of that generic planning and generic reflection will get trained into the models themselves. But there will always be a bunch of non-generic planning, non-generic reflection, non-generic control loops that are never going to be in the models, no matter what. So I'm pretty bullish on those two ends of the spectrum.

I guess you can almost think of it as: the LLM does the very general agentic reasoning, but you still need domain-specific reasoning, and that's the sort of thing you can't really build into one general model.

100%. A way of thinking about custom cognitive architectures is that you're taking the planning responsibility away from the LLM and putting it onto the human. Some of that planning will move more and more toward the model and the prompt, but a lot of tasks are actually quite complicated in their planning, so I think it will be a while before we get things that can do that super reliably off the shelf.

It seems like we've simultaneously made a ton of progress on agents in the last six months or so. I was reading the Princeton SWE-agent paper, where their coding agent can now solve 12.5% of GitHub issues, versus I think 3.8% when it was just RAG. So we've made a ton of progress in the last six months, but 12.5% is not good enough to replace even an intern, right? It feels like we still have a ton of room to go. I'm curious where you think we are, both for general agents and for your customers that are building agents. Are they getting to, I assume not five nines of reliability, but the thresholds they need to deploy these agents in actual customer-facing deployments?

So the SWE agent is, I would say, a relatively general-ish agent, in that it's expected to work across a bunch of different GitHub repos. If you look at something like v0 by Vercel, that's probably much more reliable than 12.5%, right?
And I think that speaks to the fact that there are definitely custom agents, maybe not at five nines of reliability, but that are being used in production. Elastic, I think we've talked publicly about how they've done multiple agents at this point. This week is RSA, and I think they're announcing something new there that's an agent. I don't have exact numbers on reliability, but they're reliable enough to be shipped into production. General agents are still tough; that's where longer context windows, better planning, and better reasoning will help.

You shared with me that great Jeff Bezos quote: focus on what makes your beer better. It refers to the fact that at the turn of the 20th century, breweries were trying to generate their own electricity. A lot of companies are thinking through a similar question today. Do you think that having control over your cognitive architecture really makes your beer taste better, so to speak? Or do you cede that control to the model and just build the UI and the product?

I think it depends on the type of cognitive architecture you're building, going back to some of the discussion earlier. If you're building a generic cognitive architecture, I don't think that makes your beer taste better. The model providers will work on that general planning; they'll work on general cognitive architectures that you can try right off the bat. On the other hand, if your cognitive architecture is basically you codifying a lot of how your support team thinks about something, or internal business processes, or the best way you know to develop code, or this particular type of code or application, then yes, I think that absolutely makes your beer taste better, especially if we're heading toward a place where these applications are doing real work. The bespoke business logic, or the mental models (I'm anthropomorphizing these LLMs a lot right now) these things need to do the best work possible: that's the key thing you're selling, in some capacity. UX, UI, and distribution absolutely still play a part, but I draw this distinction between general and custom.

Harrison, before we get into some of the details of how people are building these things, can we pop up a level real quick? Our founder Don Valentine was famous for asking the question "So what?" So my question to you is: so what? Let's imagine that autonomous agents are working flawlessly. What does that mean for the world? How is life different if and when that occurs?

I think at a high level it means that as humans we focus on a different set of things. There's a lot of rote, repeated work that goes on in a lot of industries at the moment, and the idea of agents is that much of that will be automated away, leaving us to think at a higher level about what these agents should be doing, and to leverage or build upon their outputs to do more creative, higher-leverage things.
You could imagine bootstrapping an entire company where you outsource a lot of the functions you would normally have to hire for, so you could play the role of a CEO with an agent for marketing, an agent for sales, and so on, letting you outsource a lot of that work to agents while you do the interesting strategic thinking and product thinking. Maybe this depends a little on what your interests are, but at a high level, I think it will free us up to do what we want to do and what we're good at, and automate a lot of the things we might not necessarily want to do.

And are you seeing any interesting examples of this today, live and in production?

There are two categories or areas of agents that are starting to get more traction: one is customer support, one is coding. Customer support is a pretty good example of this. Often people need customer support; we need customer support at LangChain, and if we could hire agents to do that, it would be really powerful. Coding is interesting because, and maybe this is more philosophical, there are aspects of coding that are really creative and require a lot of product thinking and positioning, and there are also aspects of coding that get in the way of a lot of the creativity people might have. If my mom has an idea for a website, she doesn't know how to code it up, right? But if there were an agent that could do that, she could focus on the idea and the scoping of the website and automate the rest. So customer support is absolutely having an impact today. Coding attracts a lot of interest, but I don't think it's as mature as customer support; it would be the second area to call out where a lot of people are doing interesting things.

Your comment on coding is interesting, because I think this is one of the things that has us very optimistic about AI: this idea of closing the gap from idea to execution, or from dream to reality, where you can come up with a very creative, compelling idea but may not have the tools at your disposal to put it into reality. AI seems well suited for that. I think Dylan at Figma talks about this a lot too.

Yeah, it goes back to this idea of automating away the things that get in the way. I like the phrasing of idea to reality: it automates away the things you don't necessarily know how to do, or don't want to think about, but that are needed to create whatever you want to create. One of the things I spend a lot of time thinking about is what it means to be a builder in the age of generative AI and agents. Being a builder of software today means you either have to be an engineer or hire engineers. But being a builder in the age of agents and generative AI allows people to build a far larger set of things than they could today, because they have at their fingertips all this other knowledge, all these other builders they can hire and use very cheaply.
Some of the language around the commoditization of intelligence, the idea that these LLMs are providing intelligence nearly for free, does speak to enabling a lot of these new builders to emerge.

You mentioned reflection and chain of thought and other techniques. Can you say a word about what we've learned so far about what some of these cognitive architectures can do for agentic performance, and which you think are the most promising?

It's maybe worth talking about why the AutoGPT-style things didn't work, because a lot of these cognitive architectures emerged to counteract some of that. Way back, the basic problem was that LLMs couldn't even reason well enough about the first step they should take. Prompting techniques like chain of thought turned out to be really helpful there: they gave the LLM more space to think step by step about what it should do for a specific, single step. That then got trained into the models more and more, and they started doing it by default, which makes sense: everyone wanted the models to do that anyway, so you should train it into the models.

Then there was a great paper by Shunyu Yao called ReAct, which was basically the first cognitive architecture for agents. What it did was, one, ask the LLM to predict what to do, the action, but then add in a reasoning component, similar to chain of thought. It put the LLM in a loop and asked it to do this reasoning step before each action. That explicit reasoning step has become less and less necessary as the models have had it trained into them, just as chain of thought has been. So when you see people doing ReAct-style agents today, they're often just using function calling without the explicit thought process that was in the original ReAct paper, but it's still the loop that has become synonymous with the paper. Those were a lot of the initial difficulties with agents, and I wouldn't entirely describe the fixes as cognitive architectures; I'd describe them as prompting techniques.
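A sketch of that ReAct-style loop as it is typically written today, with function calling in place of an explicit written-out thought (illustrative only; `llm_with_tools` is a hypothetical helper, not the paper's or LangChain's implementation):

```python
# Hypothetical helper: llm_with_tools(messages, tools) returns either
# {"tool": name, "args": {...}} or {"answer": text}; tools maps names to functions.

def react_agent(question: str, tools: dict, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = llm_with_tools(messages, tools)
        if "answer" in step:
            return step["answer"]  # the model decided it is done
        # Take the chosen action and feed the observation back into the loop.
        observation = tools[step["tool"]](**step["args"])
        messages.append({"role": "assistant", "content": str(step)})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return "Stopped after max_steps without a final answer."
```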
OK, so now we've got that working. What are the remaining issues? The two main ones are planning, and realizing that you're done. By planning I mean: when I think about what to do, I put together, consciously or subconsciously, a plan for the order I'm going to do the steps in, and then I go do each step. Models struggle with that. They struggle with long-term planning, with coming up with a good long-term plan, and if you're running them in a loop, then at each step you're doing part of the plan, and maybe it finishes or maybe it doesn't. If you just run the model in a loop, you're implicitly asking it to first come up with a plan, then track its progress against that plan and continue along it.

So some of the planning cognitive architectures we've seen are: first, add an explicit step where you ask the LLM to generate a plan; then go through the plan step by step and make sure each step is actually done. That's a way of enforcing that the model generates a long-term plan and actually executes each step before moving on, rather than generating a five-step plan, doing the first step, and then saying "OK, I'm done."

A separate but related idea is reflection, which is basically asking: has the model actually done its job well? I could generate a plan to go get an answer, fetch something from the internet, and end up with completely the wrong answer or bad search results. I shouldn't just return that answer; I should think about whether I got the right answer, or whether I need to do something again. Again, if you're just running the model in a loop, you're asking it to do this implicitly. So some cognitive architectures have emerged to overcome that by adding it in as an explicit step: do an action or a series of actions, then ask the model to explicitly consider whether it did them correctly.

Planning and reflection are probably two of the more popular generic cognitive architectures. There are a lot of custom cognitive architectures too, but those are all tied to business logic and the like. Planning and reflection are generic, and I'd expect them to become more and more trained into the models by default, although there's a very interesting question of how good they'll ever get inside the models. That's probably a separate, longer-term conversation.
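A minimal sketch combining those two generic architectures, an explicit planning step plus an explicit reflection check (hedged: it reuses the hypothetical `llm` and `react_agent` helpers from the earlier sketches and is not a published implementation):

```python
def plan_and_execute(goal: str, tools: dict, retries: int = 1) -> str:
    # Explicit planning step: make the model commit to a long-term plan up
    # front instead of re-deriving it implicitly on every loop iteration.
    plan = llm(f"Write a short numbered plan to accomplish: {goal}")
    results = []
    for step in [line for line in plan.splitlines() if line.strip()]:
        # Enforce that every step actually gets executed, in order.
        results.append(react_agent(f"Goal: {goal}\nCurrent step: {step}", tools))
    draft = llm(f"Combine these step results into a final answer:\n{results}")

    # Explicit reflection step: ask whether the job was actually done well.
    verdict = llm(f"Goal: {goal}\nAnswer: {draft}\nCorrect and complete? yes/no")
    if retries > 0 and verdict.strip().lower().startswith("no"):
        return plan_and_execute(goal, tools, retries - 1)  # bounded retry
    return draft
```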
Harrison, one of the things you talked about at AI Ascent was UX, which we normally think about as being on the opposite end of the spectrum from architecture: the architecture is behind the scenes, the UX is the thing out front. But it seems like we're in this interesting world where the UX can actually influence the effectiveness of the architecture, for example with Devin, by letting you rewind to the point in the planning process where things started to go off track. Can you say a couple of words about UX and its importance in agents, or LLMs more generally, and maybe some interesting things you've seen there?

I'm super fascinated by UX, and I think there's a lot of really interesting work to be done here. The reason it's so important is that these LLMs still aren't perfect, still aren't reliable, and have a tendency to mess up. That's why chat is such a powerful UX for some of the initial interactions and applications: you can easily see what the model is doing, it streams back its response, you can easily correct it by responding, and you can easily ask follow-up questions. Chat has clearly emerged as the dominant UX at the moment. But there are downsides to chat: it's generally one AI message, one human message, the human is very much in the loop, and it's very much a copilot-esque type of thing. The more you can take the human out of the loop, the more the agent can do for you and work for you, and I think that's incredibly powerful and enabling. But again, LLMs are not perfect and they mess up, so how do you balance those two things?

Some of the interesting ideas we've seen, talking about Devin: one is a really transparent list of everything the agent has done. You should be able to know what the agent has done; that seems like step one. Step two is probably being able to modify what it's doing or what it has done: if you see that it messed up step three, you can rewind there, give it new instructions, or even just edit its decision manually and go from there.

There are other interesting UX patterns besides this rewind-and-edit. One is the idea of an inbox, where the agent can reach out to the human as needed. You've maybe got ten agents running in parallel in the background, and every now and then one needs to ask the human for clarification, so you've got something like an email inbox where an agent sends you "help, I'm at this point and I need input," and you go help it at that point.

A similar one is reviewing its work. We've seen a lot of agents for writing different types of things and doing research, research-style agents. There's a great project, GPT Researcher, which has some really interesting agent architectures, and that's a great place for this type of review: an agent writes a first draft, and then I review it and leave comments. There are a few ways that can happen. The least involved is that I leave a bunch of comments in one go, send them all to the agent, and it goes and fixes all of them. A different UX that's really interesting is collaborating at the same time, like in Google Docs, with a human and an agent working simultaneously: I leave a comment, the agent fixes it while I'm making another comment. That's a separate UX that's pretty complicated to set up and get working, but I think it's interesting.

There's one other UX question I think is worth thinking about, which is how these agents learn from these interactions. We're talking about a human correcting the agent a bunch and giving feedback; it would be so frustrating if I had to give the same piece of feedback a hundred different times. So what's the architecture of the system that enables the agent to start learning from that feedback? I think that's really interesting. All of these are still to be figured out; we're super early in the game, but this is a lot of what we spend our time thinking about.
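On that last point, one simple pattern (an assumption of mine, not a LangChain feature) is to persist corrections and prepend them to every future request, so the same feedback never has to be given twice:

```python
import json
import pathlib

FEEDBACK_FILE = pathlib.Path("feedback.json")  # hypothetical local store

def remember_feedback(note: str) -> None:
    notes = json.loads(FEEDBACK_FILE.read_text()) if FEEDBACK_FILE.exists() else []
    notes.append(note)
    FEEDBACK_FILE.write_text(json.dumps(notes))

def agent_with_memory(task: str) -> str:
    notes = json.loads(FEEDBACK_FILE.read_text()) if FEEDBACK_FILE.exists() else []
    # Standing corrections ride along with every request, so the human
    # does not have to repeat them.
    preamble = "Follow these standing corrections:\n" + "\n".join(notes)
    return llm(f"{preamble}\n\nTask: {task}")
```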
Well, that reminds me: you are, and I don't know if you know this or not, sort of legendary for the degree to which you're present in the developer community, paying very close attention to what's happening there and to the problems people are having. There are the problems that LangChain directly addresses, that you're building a business to solve, and then I imagine you encounter a bunch of other problems that are simply out of scope. So within the world of problems that developers building with LLMs are encountering today, what are some of the interesting ones you're not directly solving, that maybe you would solve if you had another business?

Two of the obvious areas are the model layer and the database layer. We're not building a vector database; I think it's really interesting to think about what the right storage is, but we're not doing that. We're not building a foundation model, and we're also not doing fine-tuning of models. We want to help with the data curation bit, absolutely, but we're not building the infrastructure for fine-tuning; there's Fireworks and other companies like that, and I think those are really interesting. Those are probably the immediate infrastructure gaps people are running into at this moment.

There's a second thought process, which is: if agents do become the future, what other infrastructure problems will emerge because of that? It's way too early for us to say which of those we will or won't do, because quite frankly we're not at the place where agents are reliable enough for a whole economy of agents to emerge. But identity verification for agents, permissioning for agents, payments for agents: there's a really cool startup doing payments for agents, actually, and the opposite too, agents paying humans to do things. I think it's really interesting to consider what tooling and infrastructure will be needed if agents become prevalent. That's a little separate from what the developer community needs for building LLM applications, because LLM applications are here, while agents are starting to arrive but aren't fully here. So it's just different levels of maturity for these types of companies.

Harrison, you mentioned fine-tuning and the fact that you aren't going to go there. It seems like prompting and cognitive architectures on the one hand, and fine-tuning on the other, are almost substitutes for each other. How do you think about the current state of how people should use prompting versus fine-tuning, and how do you think that plays out?

I don't think fine-tuning and cognitive architectures are substitutes for each other. The reason, and why I actually think they're complementary in a bunch of senses, is that when you have a more custom cognitive architecture, the scope of what you're asking each agent, each node, each piece of the system to do becomes much more limited, and that actually becomes really interesting for fine-tuning.
Maybe on that point, can you talk a little about LangSmith and LangGraph? Pat just asked what problems you're not solving; I'm curious what problems you are solving, as it relates to all the problems with agents we talked about earlier, the things you're doing to make managing state more manageable and to make agents more controllable. How do your products help people with that?

Maybe backing up a little and talking about LangChain when it first came out: the LangChain open-source project really tackled a few problems. One was standardizing the interfaces for all these different components. We have tons of integrations with different models, different vector stores, different tools, different databases, and that's always been a big value prop of LangChain and why people use it. LangChain also has a bunch of higher-level interfaces for easily getting started off the shelf with things like RAG or SQL Q&A, and a lower-level runtime for dynamically constructing chains, and by chains I mean DAGs, directed flows.

That distinction matters because LangGraph exists to solve a slightly different orchestration problem: you want customizable, controllable things that have loops. Both are still in the orchestration space, but I draw this distinction between a chain and these cyclical loops. With LangGraph, once you start having cycles, a lot of other problems come into play. One of the main ones is a persistence layer, so that you can resume runs and have them running in the background in an async manner. So we're thinking more and more about deployment of these long-running, cyclical, human-in-the-loop applications, and we'll start to tackle that more and more.
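A hedged sketch of that persistence layer, reusing the `graph` from the earlier LangGraph sketch (the checkpointer class and config keys follow LangGraph docs from around this period and may have changed since):

```python
from langgraph.checkpoint.sqlite import SqliteSaver

# Compile the graph with a checkpointer so state is saved after every step.
checkpointer = SqliteSaver.from_conn_string("checkpoints.db")
app = graph.compile(checkpointer=checkpointer)

# Each thread_id is an independent, resumable run: if the process stops
# (or pauses for human input), invoking again with the same thread_id
# resumes from the last saved state instead of starting over.
config = {"configurable": {"thread_id": "user-42"}}
result = app.invoke({"question": "Draft the quarterly report"}, config)
```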
Then the piece that spans across all of this is LangSmith, which we've been working on basically since the start of the company. That's observability and testing for LLM applications. From the start we noticed that if you're putting an LLM at the center of your system, and LLMs are not deterministic, you've got to have good observability and testing to have the confidence to put it in production. So we started building LangSmith; it works with and without LangChain. There are some other things in there, like a prompt hub so you can manage prompts, and a human annotation queue to allow for human review, which I actually think is crucial. In all of this it's important to ask what's actually new here, and the main new thing about LLMs is that they're non-deterministic, so observability matters a lot more and testing is a lot harder. Specifically, you probably want a human to review things more often than you'd want them to review a software test, and a lot of the tooling we're adding in LangSmith helps with that.

Actually, on that, Harrison, do you have a heuristic for where existing observability, existing testing, existing fill-in-the-blank will also work for LLMs, versus where LLMs are sufficiently different that you need a new product, a new architecture, a new approach?

I've thought about this a bunch, particularly on the testing side. On the observability side, it feels almost more obvious that something new is needed, maybe because of these multi-step applications: you need a certain level of observability to get these insights. Datadog is great for monitoring, but for specific traces I don't think you get the same level of insight that you can easily get with something like LangSmith. A lot of people spend time looking at specific traces because they're trying to debug things that went wrong on those traces, because of all the non-determinism that happens when you use an LLM. So observability has always felt like there's something new to be built.

Testing is really interesting, and I think there are maybe two new, unique things about it. One is this idea of pairwise comparisons. When I run software tests, I don't generally compare results against each other; it's pass or fail for the most part, and if I am comparing anything, it's maybe latency spikes, not a pairwise comparison of two individual unit tests. But if you look at evals for LLMs, the main eval people trust is the LMSYS Chatbot Arena style, where you literally judge two things side by side. So this pairwise aspect is pretty important and pretty distinct from traditional software testing. Another component is that, depending on how you set up your evals, you might not have a 100% pass rate at any given point in time, so it becomes important to track the score over time and see that you're improving, or at least not regressing. That's different from software testing, where you generally have everything passing. And the third bit is the human-in-the-loop component. You still want humans looking at the results; "want" is maybe the wrong word, because there are downsides, like the human time it takes, but humans are generally more reliable here than an automated system. Compare that to software testing: software can test whether 2 equals 2 just as well as I can tell that 2 equals 2 by looking at it. So figuring out how to put humans in the loop for this testing process is also really interesting, unique, and new.
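A minimal sketch of that pairwise, LLM-assisted eval (illustrative, not LangSmith's API; `llm` is the assumed judge, and the answer order is randomized to guard against position bias):

```python
import random

def pairwise_judge(question: str, answer_a: str, answer_b: str) -> str:
    """Return 'A' or 'B' for whichever answer an LLM judge prefers."""
    flipped = random.random() < 0.5  # randomize order to reduce position bias
    first, second = (answer_b, answer_a) if flipped else (answer_a, answer_b)
    verdict = llm(
        f"Question: {question}\nAnswer 1: {first}\nAnswer 2: {second}\n"
        "Which answer is better? Reply '1' or '2'."
    ).strip()
    prefers_first = verdict.startswith("1")
    return ("B" if prefers_first else "A") if flipped else ("A" if prefers_first else "B")

def win_rate(questions, old_system, new_system) -> float:
    # Track the score over time: the goal is improvement, not a 100% pass rate.
    wins = sum(pairwise_judge(q, old_system(q), new_system(q)) == "B" for q in questions)
    return wins / len(questions)
```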
I have a couple of very general questions for you.

Cool, I love general questions.

Who do you admire most in the world of AI?

That's a good question. I think what OpenAI has done over the past year and a half is incredibly impressive, so Sam, but also everyone there; across the board I have a lot of admiration for the way they do things. Logan, when he was there, did a fantastic job of bringing these concepts to folks, and Sam obviously deserves a ton of credit for a lot of what has happened there. On the research side, David Dohan is a researcher I think is absolutely incredible. He did some early model cascades papers, and I chatted with him super early on in LangChain; he's been incredibly influential in the way I think about things, and I have a lot of admiration for the way he works. Separately, and I'm touching all the different possible answers here, I think Zuckerberg and Facebook are crushing it with Llama and a lot of the open-source work, and as a CEO and a leader, the way he and the company have embraced that has been incredibly impressive to watch.

Speaking of which, is there a CEO or a leader you try to model yourself after, or from whom you've learned a lot about your own leadership style?

Good question. I definitely think of myself as more of a product-centric CEO, so Zuckerberg has been interesting to watch there. And Brian Chesky: I listened to him speak at the Sequoia Base Camp last year and really admired the way he thought about product and about company building. Brian is usually my go-to answer for that, though I can't say I've gone incredibly deep into everything he's done.

If you have one piece of advice for current or aspiring founders trying to build in AI, what would it be?

Just build. Just try building stuff. It's so early on; there's so much to be built. GPT-5 is going to come out and it will probably make some of the things you did not relevant, but you're going to learn so much along the way, and this is, I strongly believe, a transformative technology, so the more that you learn about it the better.

One quick anecdote on that, just because I got a kick out of that answer. I remember at our first AI Ascent in early 2023, when we were just starting to get to know you better, you were sitting there pushing code the entire day. People were up on stage speaking, and you were listening, but you were pushing code the entire day. So when the advice is "just build," you're clearly somebody who takes your own advice.

Well, that was the day OpenAI released plugins or something, so there was a lot of scrambling to be done. I don't think I did that at this year's AI Ascent, so I'm sorry to disappoint and regress in that capacity.

Thank you for joining us. We really appreciate it.
Info
Channel: Sequoia Capital
Views: 9,364
Id: 6XZLoW0-mPY
Length: 49min 50sec (2990 seconds)
Published: Tue Jun 18 2024