LangChain: How to Build ChatGPT for Your Data

Captions
Hi everyone, and welcome to "LangChain: How to Build ChatGPT for Your Data." My name is Greg Loughnane, and I'm the founder of the Machine Learning Makerspace, a brand new online learning community focused on empowering data scientists and machine learning engineers to build generative AI and LLM applications that create real value. We appreciate you taking the time to join us for the event today; please share in the chat where you're tuning in from. We're so happy to see you at our first kickoff event.

During our event today you'll learn not only how to build a ChatGPT-like interface on top of a single document using LangChain, but also what it takes to build a multi-document question-answering chatbot, complete with agent memory and backup Google search. If that already makes you feel overwhelmed, don't be: we're going to take it step by step, one piece at a time, and build everything up just like Lego blocks. We'll take it really easy when we get to the super advanced part. If you hear anything during our event that prompts a question, please follow the Slido link in the description box on the YouTube page; we'll do our best to answer the most upvoted questions during the Q&A portion of today's event.

Without further ado, I'm so excited to welcome my good friend and colleague Chris Alexiuk to the stage, as we'll be working as a team to deliver this lesson. Chris, what's up, man? "Hello! Yes, very excited to be here." Chris is the founding machine learning engineer at Ox, an experienced online instructor, curriculum developer, and YouTube creator. He's always learning, building, shipping, and sharing his work like a legend. He embodies what we aim to build here at the Machine Learning Makerspace, and I couldn't be more excited to share the stage with him today. One quick note on the flow: I'll share concepts, and Chris will do the demos. So with that, we're going to get right into it. Welcome to today's event! Hey Chris, what do you say we tell everyone a little about where we're headed, the data we'll be "ChatGPT-ing" today, and share a sneak peek of the final product?

Absolutely, Greg. Today we are going to be heading down the rabbit hole: we're going to be looking at some of the texts produced by Lewis Carroll, more specifically the Alice in Wonderland series. We'll be using those as the documents that we query across and chat with, and we have an agent that helps us do that using LangChain, which we've named the Mad Hatter. We'll just ask it a sample query, something like "How can you avoid potential pitfalls of falling down a rabbit hole?" You can see that the system uses our agent, as well as our supplemental chains, to produce a response, and eventually it outputs one: unfortunately, there's no specific information available on how to avoid the pitfalls of falling down a rabbit hole in Alice in Wonderland. We know that our main character does fall down the rabbit hole, so that makes sense.

Many lessons to be learned, and many things to ask the Mad Hatter from here. Chris, thanks so much for demoing that; let's see how we built this thing up, piece by piece. First off, when we're talking about LangChain, we're talking about chains. This is the fundamental abstraction, the innovation, that we want to keep in mind at all times. Beyond that, we want to build things up with the core components of the LangChain framework.
To do our single-document question answering, we need a few pieces that will be common to nearly anything we build: a model, a prompt template, a chain (maybe multiple chains), and an index. When we get into multi-document question answering, into agents, into things that are a little more complex, we'll add some additional layers, some additional pieces, but fundamentally these same core components remain the same, so we're going to spend most of today's event focused on them.

First up: LangChain is all about combining LLMs with other sources of computation and knowledge to create complex applications. That is its purpose, that is what it's doing for the world today, and that is what you should be thinking about having it do for you: how can you leverage an LLM to create something even better with chains? A chain is nothing more than a sequence of calls to other components, including other chains. It's a very generic abstraction, but it's also a very useful one. Single-document question answering essentially requires three simple chains, and we'll see how these are built up in the following slides and demos. We'll use the core components from the outline (models, prompts, chains, and indexes) and take them one at a time.

First, we need a model, and the type of model we're going to use is called a chat model. The chat style is a little different from the plain input/output, text-oriented way you may have been interacting with LLMs so far. Instead, the I/O schema is a list of chat messages as input, and a single chat message as output. As we get into the chat model, we need to differentiate between the types of messages we'll be using. We do that by defining what are called roles: the system, user, and assistant roles. These are core, these are fundamental; they come directly from OpenAI, and LangChain leverages the same thinking about roles.

Diving a little deeper into the roles: the system role provides a bit of context, maybe a voice, a persona, a character. We're telling the model a stance to take, some place from which to answer the question; in this case, "You are a helpful assistant" is the system message. The user message is simply the user of the program: it could be you, or anyone else using the application you've built. And the assistant message is essentially the AI message: it allows us to act in place of the AI answering questions, effectively producing a one-shot example in the prompt input.

Let's go a little deeper on that idea. Just note here that OpenAI and LangChain use very similar terminology, but not exactly the same. The system message is the system message: straightforward. The user message in OpenAI is called the human message in LangChain, and the assistant message is called the AI message in LangChain. Again, this lets us provide outputs from the perspective of the AI we're interacting with, essentially providing a few-shot example.
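For reference, here is roughly what that role schema looks like against the raw OpenAI API, using the example from OpenAI's own documentation that the event references. This is a minimal sketch assuming the pre-1.0 openai Python SDK that was current at the time of this event; the API key is a placeholder.

    import openai  # pre-1.0 SDK, e.g. openai==0.27.x

    openai.api_key = "sk-..."  # replace with your own key

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            # system: sets the stance the model answers from
            {"role": "system", "content": "You are a helpful assistant."},
            # user: the human's message
            {"role": "user", "content": "Who won the World Series in 2020?"},
            # assistant: a planted reply, acting as a one-shot example
            {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
            # the follow-up question we actually want answered
            {"role": "user", "content": "Where was it played?"},
        ],
    )
    print(response["choices"][0]["message"]["content"])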
So let's check out how this works with some Kraft mac and cheese. Chris?

You bet! Okay, so we'll pop into the code here. First things first, we just want to be able to interact with our LLM using LangChain. To begin doing that, we first have to grab some dependencies, so we start with openai and langchain, since, as Greg said, we'll be leveraging the OpenAI endpoint. We just have to set up our OpenAI API key to ensure we have access to that endpoint. We also have a helper function whose only job is to display the text in a way that's nice to look at; it's not doing anything fancy.

Down here is where we first set up our chat model. The first thing to keep in mind is that because we're using a chat model, gpt-3.5-turbo, we have to use ChatOpenAI from LangChain's chat models. And that's all we have to do to set up the chat model: just instantiate it with the model name gpt-3.5-turbo.

Now we can leverage what Greg was talking about. On the language used between LangChain and OpenAI: the system message, user message, and assistant message that Greg described are, in LangChain, the system message, the human message, and the AI message, respectively. They're the same things, so don't worry too much about it; it's just a naming convention. You'll notice that in our system message we can input some text: we have content, and it's "You are a food critic." We have some content in our human message (our user message), which is "Do you think Kraft Dinner constitutes fine dining?" And then we have our assistant message, the AI message in LangChain, with the content "Egads! No, it most certainly does not!"

So we've set up the system message, the user message, and the assistant message. We do this to guide how the LLM will respond when we give it a second user message: "What about Red Lobster, surely that constitutes fine dining?" We just need to combine these messages into a list: our system message, our first user message ("Do you think Kraft Dinner constitutes fine dining?"), our assistant message (the response we're prompting it with), and then our second user message, which is meant to elicit a response from the assistant, in this case the OpenAI endpoint. We call the chat model we built above with this list of prompts, and we get a response: "Ah, Red Lobster. Well, it may offer a casual dining experience with a seafood focus," and it declines to classify it as fine dining, since fine dining typically involves a higher level of culinary prowess; it goes on. The idea here is that we're able to provide all of this context to our LLM in different and interesting ways, to guide how it responds to us and how it behaves. We'll go back to Greg to learn about the next concept we'll be leveraging.
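In code, the demo just described looks roughly like this; a minimal sketch assuming the mid-2023 LangChain (0.0.x) API used during the event:

    from langchain.chat_models import ChatOpenAI
    from langchain.schema import SystemMessage, HumanMessage, AIMessage

    # Instantiate the chat model against gpt-3.5-turbo.
    chat_model = ChatOpenAI(model_name="gpt-3.5-turbo")

    # The system message sets the persona, the human asks, and the AI message
    # is our planted one-shot example that steers the second answer.
    messages = [
        SystemMessage(content="You are a food critic."),
        HumanMessage(content="Do you think Kraft Dinner constitutes fine dining?"),
        AIMessage(content="Egads! No, it most certainly does not!"),
        HumanMessage(content="What about Red Lobster, surely that constitutes fine dining!"),
    ]

    response = chat_model(messages)  # returns a single AIMessage
    print(response.content)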
Awesome, very cool Chris. So, Red Lobster: casual. Kraft: not fine dining. Check! Single-document QA prompts are what we need to look at next; we need to look a little closer at how we do prompting, beyond the idea of the chat model. Let's recall a couple of prompt engineering best practices. Many of you have done a lot of prompt engineering so far, and the core thing to always keep in mind is: give specific instructions. Beyond that, we want to provide some context, always giving some sort of role or voice or character, a place from which the AI can stand; again, in this case, "You are a helpful assistant." Additionally, we want to really specify the input. When we're prompting, zero-shot sometimes gives us decent results, but when we can give the model an example or two, we get better and better results. That's what we see in this OpenAI example, where we give it one example output in the assistant role: "Who won the World Series in 2020?" "The Los Angeles Dodgers won the World Series in 2020." Then we can ask it a follow-up question.

As we get into prompt templates: a prompt template lets us do all of this a little more easily, and do it over and over without copy-pasting prompt text every time. It's a straightforward tool that you'll use pretty much every time you build anything. In this case we've got "You are an expert in {subject}, and you're currently feeling {mood}," and we can provide any user prompt simply through "{content}." So Chris, walk us through how this prompt template works, with a quick example on sparkling water.

You bet! As Greg was discussing, one of the things LangChain is best at is reducing boilerplate. It does this in a number of ways: reducing boilerplate code (code you'd typically have to write over and over), as well as reducing prompt boilerplate. In this case we're using prompt templates to build pre-built prompts that we know will be effective for the task at hand, and then modifying them on the fly with specific user-provided information. You can think of it almost like building an f-string in Python: we set it up properly by including these additional pieces of context. Like we discussed: "You are an expert in {subject}, and you're currently feeling {mood}." The parts between the curly braces will be replaced by user-provided context, and the way we make that happen is with SystemMessagePromptTemplate.from_template, to which we provide this prompt template.

You'll notice these templates also have roles: there is a system message prompt template and a human message prompt template, so you can have a prompt template for each of the different roles your LLM uses. We'll build the human message prompt template to be just "{content}": the user's question or query, which is all we pass along to the LLM. To make this work in one shot, we create a ChatPromptTemplate from messages, where the messages are the two templates we've already set up: a system prompt template and a user prompt template. This is familiar from the list we created above, except this time we're able to format whatever we'd like in place of these variables.

Let's see an example. We use the format_prompt method to format our subject as sparkling water, the mood as joyful, and the content as "Hi, what are the finest sparkling waters?" We convert it to messages so we can send it directly to our chat model. All we're doing here is saying the subject becomes sparkling water, so: "You are an expert in sparkling water, and you're currently feeling joyful." That's what we're doing. We send that to our chat model, display it so it looks decent in Markdown, and we get this response: "As an expert in sparkling waters, I can assure you there are plenty of wonderful options to choose from," with suggestions including Perrier, San Pellegrino, Topo Chico, Gerolsteiner, and LaCroix.
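A sketch of the prompt-template setup just walked through, under the same API assumptions as the previous snippet (chat_model carries over from it):

    from langchain.prompts.chat import (
        ChatPromptTemplate,
        SystemMessagePromptTemplate,
        HumanMessagePromptTemplate,
    )

    # The {subject} and {mood} placeholders get filled in at format time.
    system_prompt = SystemMessagePromptTemplate.from_template(
        "You are an expert in {subject}, and you're currently feeling {mood}."
    )
    # The human template is just the user's raw question.
    human_prompt = HumanMessagePromptTemplate.from_template("{content}")

    chat_prompt = ChatPromptTemplate.from_messages([system_prompt, human_prompt])

    # Fill in the blanks and convert to the message list the chat model accepts.
    formatted_messages = chat_prompt.format_prompt(
        subject="sparkling water",
        mood="joyful",
        content="Hi, what are the finest sparkling waters?",
    ).to_messages()

    print(chat_model(formatted_messages).content)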
And that's basically how we do it. We could substitute anything we wanted for that subject or mood, and we don't have to rewrite the prompt; LangChain handles that for us through the prompt template. We'll pass it back to Greg to explore the next concept we'll be leveraging.

Nice, very cool Chris. Great to see how that prompt template can help us make decisions about not just what to eat, but what to drink as well. As we get into actually chaining things together: this is no more complex than simply putting things together. The LLMChain we're going to use is the most popular chain you'll come across within LangChain. It simply takes our chat model and our prompt template and links the two together, with code that really is as simple as what you just saw on screen. Chris, let's see exactly how this works.

You bet. Just as Greg said, this is very straightforward: we're just looking to chain our prompt into our LLM. Again, this is to reduce boilerplate. Sure, we could wrap all of this in a function and call it whenever we needed it with the chat model. Instead, though, we can build an LLMChain, which has knowledge of that prompt, chains it into our LLM, and returns the response through the LLM. To build this, all we have to do is provide our chat model, which we created in the first demonstration, and our chat prompt, which we created just a moment ago, and put them into an LLMChain. Now we can call chain.run, which, unsurprisingly, runs our chain, and we can include our subject, our mood, and our content. This time we're saying sparkling water again, just to stay on theme, but the mood is angry, and we're asking "Is Bubly a good sparkling water?" To which it responds: "Bubly? Are you kidding me? That stuff is a disgrace to the world of sparkling water. It's nothing more than a cheap imitation trying to ride the coattails of true sparkling water brands. The flavors are weak, the carbonation is lackluster, and don't even get me started on the aftertaste. It's watered-down disappointment."

So this is the idea: we're able to modify these things on the fly, and we're reducing the amount of boilerplate we have to write when creating these applications. And while it's very straightforward with this simple example, once things get more complex you'll need to leverage these tools to effectively keep track of your prompts and how information flows through your application. We'll pass it back to Greg to learn about the next concept we'll be leveraging.
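And the LLMChain version, a short sketch reusing chat_model and chat_prompt from the snippets above:

    from langchain.chains import LLMChain

    # Link the prompt template and the chat model into one runnable chain.
    chain = LLMChain(llm=chat_model, prompt=chat_prompt)

    # chain.run fills the template variables and calls the LLM in one step.
    print(chain.run(
        subject="sparkling water",
        mood="angry",
        content="Is Bubly a good sparkling water?",
    ))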
All right, so down the sparkling-water rabbit hole we found out that Bubly is disappointment in a can. Moving on, we've got indexing, and this is really where a lot of the magic happens beyond just tapping into the LLM when we're building applications. We need a few different components here, but this is where we get data-centric: this is where the data comes in, and where we put it into a form the LLM can interact with easily. We'll need things like document loaders, text splitters, text embedding models, vector stores, and retrievers, so let's break down some of this terminology before we look at the code, to get the big picture of exactly what we're doing.

When we're creating a question-answering tool to look through documents, we essentially need to first create our index, and our index is our vector store, our vector database. In fact, a vector database is simply one type of index; it's just the most common kind. We take our documents and split them into chunks (could be one document, could be many), then create a ton of embeddings, which essentially turns the words in the documents we're looking at today into numbers, into vectors. Then we store those vectors in the vector store. Simple enough. The retriever then lets us search that vector store: it's an interface for querying the vector store and getting back whatever inside our data is most similar to what we're looking for. The question-answering chain is built on top of the index and the retriever.

Let's double-click on "index" a little. An index is a generic term, so don't be scared of it: it's simply a way to structure documents so an LLM can interact with them. But really, the index you probably care about today is the vector database, a.k.a. the vector store. This is just what we talked about, where the vectors of numbers are stored, and the LangChain default is Chroma DB. That's the one we're going to use today, and we'll share a link in the chat with a bit more on why LangChain chose Chroma DB as the default. LangChain supports a ton of other vector databases, and that's something we'll get into in future events and future community content, but for today we'll focus on Chroma DB, the simplest possible index that is a vector store.

As we build up a single vector store, the canonical steps are: load documents; split the text (splitting text is more of a black art than a science, so Chris will walk us through that); create embeddings from the text (we'll use today's industry standard, OpenAI's Ada embeddings model, in our application); and store the vectors. This vector store is the backbone of the retriever, which simply wraps around the vector store and lets us query in natural language for something we're looking for; it finds something similar inside to pull out, and it does this really, really fast. That speed is the real benefit of the vector store. Not surprisingly, we're going to ask it about why the rabbit is running so late as Alice chases him down the rabbit hole. Chris, let's see how this all comes together.

Yes! Okay, so first things first, we have to get our documents. In order to query across documents, we have to have some documents, so we're just going to wget one, the first Alice in Wonderland book by Lewis Carroll, which we'll name alice_1.txt. The first step, every time, is getting the data into Python, so we load this into memory using just classic Python; nothing LangChain about this.
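The loading step really is plain Python; a minimal sketch, assuming the plain-text file has already been downloaded and saved as alice_1.txt (the exact download URL used in the event isn't shown here):

    # e.g. !wget <url-to-alice-in-wonderland-plain-text> -O alice_1.txt

    with open("alice_1.txt", "r", encoding="utf-8") as f:
        raw_text = f.read()

    print(raw_text[:200])  # sanity check: the opening lines of the book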
Once we have it loaded, we can start thinking about splitting it. There are a number of ways to split text, and in most cases we do need to split, because LLMs don't truly have infinite context windows: you can't shove everything into the context window of an LLM. The character text splitter helps us break our data down into bite-sized chunks that retain as much information as possible, but are otherwise there to ensure we grab only the most relevant context. We don't need a bunch of other context besides; if we include too much, it could potentially confuse the LLM, and we want to avoid that.

We're going to use the character text splitter here: we'll split on the newline character, with a chunk overlap of zero and a chunk size of 1,000. What this means is that every time there's a newline, we potentially split, if what follows the newline would push us past the chunk size. You can imagine we keep splitting apart by newlines until we've packed the most information possible into each 1,000-character window. All we have to do is call .split_text on the Alice in Wonderland text, and it splits it; we get 152 resulting chunks. This process is also called chunking.

Once we've finished chunking, we can create our OpenAI embeddings model. This is as easy as pulling it from LangChain: from embeddings.openai we get OpenAIEmbeddings, which uses that Ada embeddings model from OpenAI's endpoint, as Greg was discussing. We do have to grab a few dependencies, since we're using Chroma DB, plus tiktoken to tokenize correctly and make sure we have the right number of tokens when we run the embedding process. To embed, all we have to do is call Chroma from LangChain's vector stores and use .from_texts on it. We pass our texts (the chunks), we pass the embeddings model, and we add some metadata; this metadata just associates each passage with its position in the sequence of chunks. And then of course we have our retriever, the retriever wrapper Greg discussed, so we can query this to get relevant documents from our vector store. Again, we're storing everything as numbers, so any information that flows in will be embedded, and the relevant text will then be extracted; you interface with it entirely through text.

We can see an example of this by asking "What is the rabbit late for?" We use the get_relevant_documents method on our retriever, pass in our query, and we can see that it finds some context relating to the rabbit being late: "Oh! The Duchess, the Duchess! Oh! Won't she be savage if I've kept her waiting!" So we know the Duchess is going to be mad if that rabbit is late.

Finally, we're able to integrate this into a QA chain, which is again built using an LLM, and we're using the chain type "stuff," which just means we're going to stuff all of the relevant context we found into the prompt so our LLM can leverage it. Then we pass our query and run it with our input documents, the docs extracted from get_relevant_documents, and that's about it. We call the chain using chain.run, and we learn that the rabbit was late for something, but it's not specified what it's late for in the given context.
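Putting the whole indexing-and-retrieval pipeline together, here is a sketch under the same LangChain-0.0.x assumptions (it needs the openai, chromadb, and tiktoken packages, plus an OpenAI API key in the environment):

    from langchain.text_splitter import CharacterTextSplitter
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import Chroma
    from langchain.chains.question_answering import load_qa_chain
    from langchain.chat_models import ChatOpenAI

    raw_text = open("alice_1.txt", encoding="utf-8").read()

    # Split on newlines into chunks of at most 1,000 characters, no overlap.
    text_splitter = CharacterTextSplitter(separator="\n", chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.split_text(raw_text)

    # Embed every chunk with OpenAI's Ada model and store the vectors in Chroma,
    # tagging each chunk with its position in the sequence as metadata.
    embeddings = OpenAIEmbeddings()
    docsearch = Chroma.from_texts(
        texts,
        embeddings,
        metadatas=[{"source": str(i)} for i in range(len(texts))],
    )
    retriever = docsearch.as_retriever()

    # Retrieve the chunks most similar to the query...
    query = "What is the rabbit late for?"
    docs = retriever.get_relevant_documents(query)

    # ...then "stuff" them into the prompt of a question-answering chain.
    chain = load_qa_chain(ChatOpenAI(model_name="gpt-3.5-turbo"), chain_type="stuff")
    print(chain.run(input_documents=docs, question=query))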
So this is basically putting all the steps we discussed earlier together: we have our chain, we have our prompt, we have our retriever, and finally we get our response. We'll pass it back to Greg to continue with the learning.

Yeah, thanks Chris. It's really interesting to see how specific we have to be: we have to learn exactly which context to use and exactly how to interact with the data. How we chunk the data matters; each piece of how we build up this application really does affect the user experience, so it's fascinating to see it all come together. This, in an image, is what we just built: single-document QA. We enter a query, we use our templated prompt, and the system looks for the answer within the vector store we created with Chroma DB, the one we built by chunking our documents and converting them to embeddings. The interactions with the LLM occur on the search as well as when we run our prompt. This is one of the simplest, most common things you can do, single-document QA, and fundamentally it's a natural language interface to your data: ChatGPT for your data, right here in an image.

What we want to do next is give you some insight into how to build a much more complex application. We're not going to go through every piece of code: we'll share it all, but we won't walk through every single piece, given today's time constraints. However, if you do have questions, please share them in the Slido and we'll get to them at the end; we should be able to answer any and all questions you have today. So with that, we're going to take it to the next level. We're getting more advanced: multi-document QA with a Google-search-enabled chatbot. We'll share the Colab notebook with you now, although we won't go very deep into it; Chris will produce a Colab notebook run-through post-event that we will share.

This is a little more complicated, but again, we have the fundamental chains: the prompt chain, the tool chain, and the data indexing chain. We just have a few extra pieces in each. Before we get to those, let's talk about agents for a second. This is one of the most confused concepts within LLMs and generative AI right now, and I think one of the best ways to think about it is with a quote I pulled from a book I was reading recently, called Complexity. "Agents" is a generic term: agents might be molecules or neurons or species or consumers or even corporations. Molecules form cells, neurons form brains, species form ecosystems, consumers and corporations form economies, and so on. At each level, new emergent structures form and engage in new emergent behaviors; complexity, in other words, is a science of emergence. And what are we doing today? We're using LangChain to build complex LLM applications, and agents are a key way to take our complexity to the next level.

Agents in LangChain, as LangChain describes them (similarly to indexes), are used for applications that require a more flexible chain of calls to LLMs and other tools. Agents essentially have a tool belt: they have access to a suite of tools. You can think of Agent Smith from The Matrix with a tool belt, although it's not a perfect analogy.
Which tool from the tool belt to use is based on the user's input, and the output of one tool can be the input to another. There are two types of agents in LangChain; today we're going to focus on action agents to build our Mad Hatter agent chatbot. An action agent simply receives input, decides which tool to use (in this case, either a search of our index or a Google search), calls tools and records outputs, and decides the next step to take based on the history of the tools, the tool inputs, and the observations. Now, this history piece is what requires us to add a little more to our tool chain, specifically the memory buffer: we have to remember what we were doing so we can select the next best step. In addition to the memory buffer, we're also adding Google search to the tool chain, and on top of everything else, we're adding multiple types of documents (different file types) and multiple documents to our data indexing chain. But again, fundamentally we use a data indexing chain, a tool chain, and a prompt chain; those are the key components. There is some additional complexity when we implement the code, but what we want to show today is how this all comes together when we build, not just in a Google Colab notebook, but in an actual Chainlit application: a chatbot-like interface, a true "ChatGPT for your data" interface, on top of this multi-document question-answering agent system that can also do Google search. So Chris, can you walk us through a few examples, so we can learn not just what the Cheshire Cat is up to, but also interface a little with that agent to see how it's working, without digging too deep into the code for today's presentation?

Yeah, of course! Really quickly, I'll just go through a couple of the concepts we're adding. We added more data; it's basically the same process as before, just with new data, and we're persisting that vector store. The first real thing we leverage to get that ChatGPT-like experience is adding memory to our chain. We won't go too deep into the specifics, but the idea is that we can provide both a conversation buffer memory, which holds what's happened in the conversation so far, and a read-only shared memory, which is the idea that some tools can access the memory but can't actually change it. So the conversation buffer memory can be modified; the read-only shared memory cannot. This is useful for tools that should be able to leverage memory but shouldn't be able to commit anything to it. To add it to a chain, all we have to do is include memory=readonly_memory. You love to see it.

Then we set up a couple of tools. You can think of tools as an extension of chains that can be leveraged by the agent sitting on top, which gets to see the tools' descriptions and choose which tool is right for which job. In this case we have our main QA system, which is our index-powered retrieval QA chain, and then we have a backup, which is Googling. Obviously this won't fit every case (you won't always be able to just Google), but if you can, this is an example of that.
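A sketch of that memory-and-tools setup. The tool names and descriptions here are paraphrased rather than the event's exact strings, the Google tool assumes GOOGLE_API_KEY and GOOGLE_CSE_ID are configured, and docsearch and chat_model carry over from the earlier snippets:

    from langchain.memory import ConversationBufferMemory, ReadOnlySharedMemory
    from langchain.agents import Tool
    from langchain.chains import RetrievalQA
    from langchain.utilities import GoogleSearchAPIWrapper

    # Writable conversation memory for the agent, plus a read-only view
    # that tools can consult without being able to modify it.
    memory = ConversationBufferMemory(memory_key="chat_history")
    readonly_memory = ReadOnlySharedMemory(memory=memory)

    # Main tool: retrieval QA over the persisted vector store.
    qa_chain = RetrievalQA.from_chain_type(
        llm=chat_model,
        chain_type="stuff",
        retriever=docsearch.as_retriever(),
    )
    search = GoogleSearchAPIWrapper()

    tools = [
        Tool(
            name="Alice in Wonderland QA System",
            func=qa_chain.run,
            description="Useful for answering questions about the Alice in Wonderland books.",
        ),
        Tool(
            name="Backup Google Search",
            func=search.run,
            description="Useful as a fallback when the QA system cannot find an answer.",
        ),
    ]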
Next we create the actual agent. This is just showing you how you can go through this; we don't have to focus on much here, except that we have full control over how the agent acts and what it's supposed to do. We have the ability to give it chat history so it can make decisions based on its memory, the ability to ask it questions (the input), and the ability to let it, quote-unquote, think. You can leverage that thinking in more complex ways, but for right now we'll just let it do what it needs to do. We create our zero-shot agent with our tools and our prompt templates, and we tell it what the inputs should be. We create our chat model; this powers the LLMChain behind the agent, and then we set up our zero-shot agent with that LLMChain. It's the same kind of LLMChain we had before, with our chat model and our prompt, the prompt here being the agent prompt. Then we make an agent executor: basically, something that's allowed to call other tools, call other chains, and use those outputs to strategize, or to come up with a clearer, more concise answer.

Once we have all of this set up, we can ask it the question "What is the deal with the Cheshire Cat?" We can see that it enters a new chain and has a thought (this is the agent executor): it decides it needs to use the Alice in Wonderland QA System, which is our main retrieval QA system. We ask that system "What is the deal with the Cheshire Cat?"; it returns a response, and our executor makes an observation: the Cheshire Cat is a character in Alice in Wonderland known for its distinctive grin. The agent's next thought is that it knows the final answer, so it can just give it to us: the Cheshire Cat is a mischievous and enigmatic character in Alice's Adventures in Wonderland, known for its distinctive grin and its ability to disappear and reappear at will.

So then I asked: well, what makes it enigmatic? And this is where that ChatGPT-like experience comes in. We don't have to ask "What makes the Cheshire Cat enigmatic?" We ask "What makes it enigmatic?", and our agent knows. It isn't sure about the specific details that make the Cheshire Cat enigmatic, so it asks the QA index again and gets a response: the Cheshire Cat is enigmatic because of its ability to disappear, its riddles, and its knowledge of Wonderland. So then I ask: well, what are some of those riddles? The Cheshire Cat has riddles, but we want to know them. We query the context, and the context doesn't contain them. This is likely because the word "riddles" isn't literally present; the Cat probably doesn't say "I'm going to ask you a riddle now, Alice," he just asks one. So we have to go to our fallback tool, Google search. We can see the agent doesn't find any riddles through the QA retrieval chain, so it goes to the backup, Googles what the riddle is, and we get an example of a riddle: "What road do I take?"

Then: what is Alice's response to that riddle? We're not having to provide that context, because of the memory; we're in an ongoing conversation here. In this case, because the agent now knows what the riddle is ("What road do I take?"), we can actually watch the agent executor decide to ask our QA retrieval chain "What is Alice's response to the Cheshire Cat's riddle, 'What road do I take?'" And because the riddle itself was provided, we get the context, and the context gives us the answer: "I don't care... I don't much care where, so long as I get somewhere," which is Alice's response to the Cheshire Cat's riddle.
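The agent wiring described above looks roughly like this; a sketch closely following LangChain's shared-memory agent example from the same era, with an illustrative prefix and suffix rather than the event's exact prompt (tools, memory, and chat_model carry over from the previous snippet):

    from langchain.agents import ZeroShotAgent, AgentExecutor
    from langchain.chains import LLMChain

    prefix = (
        "Have a conversation with a human, answering the following questions "
        "as best you can. You have access to the following tools:"
    )
    suffix = """Begin!

    {chat_history}
    Question: {input}
    {agent_scratchpad}"""

    # Build the agent prompt from the tools plus our prefix/suffix, declaring
    # the chat history, the user input, and the agent's "thinking" scratchpad.
    prompt = ZeroShotAgent.create_prompt(
        tools,
        prefix=prefix,
        suffix=suffix,
        input_variables=["input", "chat_history", "agent_scratchpad"],
    )

    llm_chain = LLMChain(llm=chat_model, prompt=prompt)
    agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools)

    # The executor is the piece that actually calls tools, records the
    # observations, and feeds them back in until a final answer is reached.
    agent_executor = AgentExecutor.from_agent_and_tools(
        agent=agent, tools=tools, memory=memory, verbose=True
    )

    print(agent_executor.run(input="What is the deal with the Cheshire Cat?"))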
So with memory, with the agent executor, and with the tools, we're able to have a seamless chat experience, just as you'd expect from GPT-3.5, that includes the context we've given this particular agent through the QA retrieval chain. And with that, I'll pass it back to Greg to wrap us up.

Boom! That was awesome, Chris. So cool to see exactly how the chatbot works behind the scenes, what it's thinking, how it's making decisions. There's really a lot to take in as soon as we start playing around with agents; I really enjoyed you walking us through that. From today's event, we really hope we've shown you that the potential for building truly complex, truly emergent applications is there. It's all there; the question is just: what are you going to build? Single-document QA is a fantastic entry point for anybody getting started with this tech. It will teach you the foundational constructs within the LangChain framework, and it will get you that ChatGPT-like experience with your data for simple queries. If you want to take it to the next level after you master those basics, and you want that true chatbot, ChatGPT-style experience, you really can't beat adding agents, adding additional ways of finding information, and just getting creative with exactly which pieces you're chaining together.

So thank you so much, Chris, for showing us that, and thank you, everybody, for joining us. We've got the Google Colab links, we're going to share the slides, and we'll share the Chainlit demo you saw at the beginning as well. Reach out to us directly with any questions we don't answer today, at greg@mlmaker.space or chris@mlmaker.space. And with that, I'd like Chris to join me back on stage as we go through the questions of the day. If you have additional questions, please add them to the Slido; we've only got a few going today, so if there are more, we've got time for them. Otherwise, we'll break early.

All right, first off: "Love this, thank you," from Anonymous. "However, am I correct in assuming there's no way to, quote, ground these models without sending data to the OpenAI endpoint? I'm concerned about PII, etc."

Yeah, so, luckily, open-source models are now generally performant enough that you can use them to power this, which means you can run them on-prem. You could also use your own proprietary models, or use Azure services to provide a closed-source OpenAI instance in which you don't have to expose any PII. So there are many different ways to control the PII flow so that it's never exposed to public (or semi-closed) endpoints, ensuring your customers' information stays private and never leaves whatever data contracts you have; it never leaves that scope. So you are unfortunately incorrect, but also fortunately incorrect: you don't have to use OpenAI, though for this example specifically we are leveraging it.

Yeah, we're hearing that more and more, aren't we? People trying to move away from OpenAI. But we find it's a great teaching tool, a great entry point for playing around with some open-source data. If you're using your own data, it's certainly worth looking at some other options.
Okay, so from "not a number" we have: if raw tabular data with labels is presented as a text document, and the QA is something like "predict a new label for a row," where the row has new feature values, would this work?

Sure, why not? You can make it work. We can format the response from our LLM through LangChain using the response-formatting tools provided within the LangChain framework, so you can format it right back into the row if you want. There is a lot of research, and there are papers, indicating that LLMs can be used this way, to predict what the next thing will be. It's obviously not going to be great without some serious modification or without introducing new tools, but if you've built, say, a custom LLM that does this and you hook it up to LangChain, it can absolutely be leveraged as a tool the agent could use; you can put it into that flow. I'd say it depends on what performance you need and what you're using it for, but you can do it.

Yeah, absolutely. It's almost like: if you can dream it, you can do it with this stuff. It's about as generic as it gets. So, the next question is the question we keep getting everywhere we go, Chris: what open-source models do you recommend for on-prem, etc., etc.?

Yeah. Unfortunately and fortunately, as things exist right now for lightweight LLMs, or open-source LLMs (however you want to define "open source" here), they're all specialized in some way at this point, or can be specialized. So for different tasks, or different parts of the agent chain, there are different answers. A more general, well-performing instruction model is likely your best bet for the agent itself, because that has to have general knowledge and maintain a certain response format. And then within each of the tools, there are models that are better at, say, open QA versus closed QA. It's totally dependent. I would say, though, if I have to give an answer: start with something like OpenLLaMA or Falcon-7B. These models are doing well enough to fit wherever you want in the stack. You do have to keep in mind that they're going to error sometimes; this is not a deterministic task, and sometimes a model gives a response that doesn't make sense, so you'll have to build some custom error handling or parsing to get over that hurdle, which you might not have to do with a much bigger model, like Falcon-40B or the OpenAI endpoint. But the idea is, if you just need an answer to get started: Falcon-7B or OpenLLaMA are a great place to start, and Salesforce's new model is also fantastic if you're looking to use that.

Great insights. And I think the other layer of this is commercial availability versus no commercial availability, so definitely check the license on the models you're looking at. If you're just playing around and learning, it doesn't really matter, but as you start doing things for your company, for your business, this is another key question we're getting all the time. That's a deep dive for another day, but a definite emerging space: which model for which application, which size, which commercial availability, how much privacy. All of these are really interesting open questions.
All right, so from Anonymous: can we ask this bot to summarize the information from our knowledge base in a particular way? For example, can we set the system persona to be a product manager?

You sure can. Yeah; not an exciting answer, but yes!

Yeah, and I'm kind of thinking about how you'd train it to be more like your own product managers. You could fall down the rabbit hole a little there, but you could certainly start in that system message we saw within the chat model, with "You are a product manager," and go from there.

Okay, a tactical question: how do you control the size of the tokens sent to the model by LangChain?

There's a max_tokens parameter that the OpenAI endpoint accepts, and you can set it; that max_tokens governs the response from the model. If you want to limit what's sent to the model, you can also use LangChain for that: there are parameters you can set to help you do it, and you can build custom functions to organize or determine how many tokens you're willing to send. All of those things can be explored with LangChain on either side of the LLM chain.

As a follow-up to that: great presentation; how would you go about utilizing LangChain to create an application that generates documents three-plus pages long, for commercial use?

Yeah, at that point we have to think about what LLMs are good at, and what's the best tool for the job. If you want to do it all in one shot with very good context adherence, you're looking at some of the larger-context-window models: something like Anthropic's Claude, MPT, or even OpenAI, which now offers up to 16K tokens for ChatGPT 3.5 Turbo. There are a number of considerations. Another way is to break it down into parts and only do one part at a time, say paragraph by paragraph. There are a number of ways to approach this, but ultimately it comes down to the context window, and how much persistent context you need through the entire three-plus-page document. I would stick with your Claudes and your big-context-window GPT-4s if you really want it to shine, because they're the best at those long contexts right now.

And finally, our last question, from Deepak: other than the LangChain website, is there any other website you recommend for learning LangChain? Also, in your experience, what is the best way to make retrievers?

I'll just throw out the DeepLearning.AI course they recently put up with Harrison Chase; it's a great place to start and gives you some insight, some overview. I'd recommend taking the "ChatGPT Prompt Engineering for Developers" course first, to get the vibe and feel, before you head into that one, but you could have both done by the end of the day today; they're maybe one to two hours each. Any other thoughts on that, Chris? And then on retrievers?
I've got to do my catchphrase thing here: I would recommend just building with the tool. To learn LangChain, build stuff with it. You're going to find cool stuff in courses, and you're going to find cool stuff on websites, but if you don't have a reason you're building, or a thing you're excited to build, it's just not going to stick the way it would if you were trying to solve a problem. I would also say: really don't worry about the fact that you'll have over-solved a lot of the problems. This is potentially overkill for many tasks; you could do them with much simpler models. But just use the tool to solve a problem; even if it's over-engineering, it gets you used to the tool, it gets you into it.

And then, the best way to make retrievers: there is some trial and error involved. It's really about understanding your data, understanding how to chunk it, understanding what potential context you could be losing, and setting those overlaps, setting the kind of chain you'll use and the kind of retrieval you'll use. For example, we have normal cosine similarity, but maybe MMR (maximal marginal relevance) is better, because it takes into account what we've already retrieved, so we're expanding our potential context space. There are a lot of fiddly little knobs, but I'd say the best way is to really understand your data, and then leverage that understanding to set the correct parameters and use the correct chains. In other words: build, build, build.

Okay, awesome, Chris. That brings us to the end of today's event. Thank you, everyone, for your participation. This has been brought to you by the Machine Learning Makerspace; our community is just getting started, and we're so grateful you joined us for today's event. We're also excited to announce that our first-ever four-week live course, on LLM Ops: LLMs in Production, will be offered starting August 14th; that's the kickoff date. In the follow-up email you receive after today's event, we'll share not only a post-event survey but all the details on our upcoming LLM Ops course. Chris is also going to put together a long-form video explanation of that multi-document QA chatbot with agent memory and Google search, and we'll make sure you receive that as well. For everything else, please follow Machine Learning Makerspace (@ML Makerspace on Twitter, and Machine Learning Makerspace on LinkedIn and YouTube) to stay up to date on everything that comes next. Until then, we'll keep building, shipping, and sharing; we hope to see you do the same. Till next time, everybody. Later!
Info
Channel: AI Makerspace
Views: 2,310
Id: Azfc-TjG9Tg
Length: 56min 35sec (3395 seconds)
Published: Thu Jul 06 2023