Chat With Your Documents Using LangChain + JavaScript

Captions
Today's session is a walkthrough of how to build a ChatGPT-like clone in JavaScript. Next week we're doing one on SQL and how to query SQL with language models, and the week after that we'll have a yet-to-be-announced guest, which we'll announce on Friday. Mayo was here a moment ago and is hopefully coming back shortly, because he'll be doing most of the interesting talking here.

So why these two things? First, the ChatGPT-like part: that has been the first major application we've seen come out of these language models. Everyone has proprietary data, and there are different ways people want to interact with it, but the main one is chatting over it, so "ChatGPT over your data" has been one of the main applications we've seen. And why JavaScript? We recently released a TypeScript/JavaScript package, which has been really exciting because a whole new set of users has come in. Mayo has been doing a bunch of great work there, surfacing a lot of questions, and, if we're being honest, the package is a bit newer than the Python one, so there are more rough edges, and he has been helping out a bunch with that. So I'm really excited to have him here today and to go over all of this.

This session will be recorded. We'll do roughly 20 to 30 minutes of demo and then take questions from you for the remaining time. If you have questions, there's a dedicated Q&A channel; please use that rather than the chat channel, just to keep the two separate. Mayo, are you ready? Do you want to give a brief introduction of yourself, or just jump right into it?

"Can you hear me?" "Yep." "We've spoken about this before: I was probably one of the very earliest users of LangChain. I had a major issue where I was building an application for someone who wanted to summarize YouTube videos that were long, three, four, five hours, and I remember spending sleepless nights trying to figure out how to reduce and chunk the text because of the context window issues. It was also very expensive at the time to send all of those API calls; the costs really added up. That's when I stumbled across the repo, and I haven't looked back since. So here we are today. Hopefully we can answer most of the questions people have; I'll try to show a demo as well, and Chase, you can just jump in and we'll talk through it. I'm going to start with an overview of the architecture, just showing the visuals, and you can chime in as I go through." "Sounds great."

"Cool, let me get that going. Just let me know if you can see my screen." "I can see it, and hopefully everyone else can too. This is also being recorded, right?" "Yep."

"Cool. So, in a nutshell, the first thing we want to talk about is what LangChain is and what problems it is solving."
I just gave a personal example: I was trying to build an application, and it involved dealing with text that was very large, text beyond the context window, which is roughly 3,000 words. Most documents and most data we're trying to deal with are much bigger than 3,000 words. So one of the problems LangChain helps solve is this splitting of text into different chunks that we can then send to OpenAI to get a response. Let's say you have a 50-page PDF, or a ton of CSV files, or just a directory of different files: what LangChain has is a bunch of different loaders to help with that.

"It might be good to zoom in a little bit; I think some people, including myself, are having a bit of trouble reading some of the text." "Is that better?" "Yeah, that looks great."

Cool. Just give me a second to switch from my tab to my window so I can jump over to another example. Is it clear now? Good. So what LangChain has (these are the docs for LangChain.js) is a bunch of these things called loaders. What loaders do, effectively, is take the different formats of text or documents you have and convert them into text, which you can then store in a vector store, retrieve from, and use to get the response you want.

Let me show a quick demo of what this looks like. The LangChain docs already have a chatbot at the top right. Say you come in and ask, "how do I build a chatbot for a 50-page PDF?" If the chatbot cannot find what it's looking for, it gives its best estimation. Then say you follow up with something it does know about, like "how do I split the text of a large document?" If we press enter, we see that it gives you a bunch of functions, and it also gives you reference documents: "looks like what you're looking for can be found here", with verified sources. So not only do you get a response, you also get the sources for what you're looking for. That's just a brief demo of what's possible; you can imagine this for your own documents or for an application you're building for someone else. What's actually going on under the hood is what we want to cover today.

If I jump back to the diagram: what LangChain does, effectively, is take your documents and load them. These are the document loaders we spoke about, so you've got loaders for CSV files, JSON, PDFs, text files. Say you had a PDF file: you would run this function, and it will split the document into chunks, and you can define what goes into each chunk. For example, LangChain gives you these things called text splitters, which are in the indexes tab, right here. When you split a text, you basically split by characters or lines, but what's pretty cool is that you can be specific: you can say "I want chunks of 2,000 characters with some overlap", where overlap controls how much of the text is shared between neighbouring chunks. So you can split, say, a 50-page PDF or a bunch of markdown files into chunks of about 2,000 characters, and those chunks are what make up the different pieces in this diagram.
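As a rough sketch of that idea in LangChain.js (the chunk size and overlap values here are just illustrative, `longPdfText` is assumed to be a string you loaded elsewhere, and import paths have shifted a bit between releases):

```typescript
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

// Split one long string into overlapping ~2,000-character chunks.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 2000,   // max characters per chunk
  chunkOverlap: 200, // characters shared between neighbouring chunks
});

const longPdfText = "..."; // stand-in for text you extracted from a PDF or other file
const docs = await splitter.createDocuments([longPdfText]);
console.log(docs.length, docs[0].pageContent.slice(0, 100));
```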
Now, once we have these chunks, what we need to do is create these things called embeddings. So, zooming out a little, what are embeddings? An embedding is basically a numerical representation of text. You can think of it as a list of floating-point numbers, where each number is a value along one dimension. The best way to explain it, very simplified: imagine you've got three pieces of text, one is "lion", one is "pet", and one is "dog", and we represent each of them as one of these vectors, these embeddings. Say we're working with a two- or three-dimensional view of all of this. When we plot those vectors, we can see that the lion is far off, because what this is saying is that "pet" and "dog" have more similarity to each other than either has to "lion". That's effectively what happens when you create embeddings: all we're doing is turning text into numbers that we can then compare to one another.

These embeddings are then stored. First we create them using LangChain's OpenAI embeddings: OpenAI has an embeddings function that takes chunks of text like this and returns a bunch of numbers, except instead of the four dimensions in this toy example it has 1,536 dimensions. So you get those 1,536 numbers for each particular piece of text that you stored as a chunk; you have the chunks, and an embedding for each one.
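A minimal sketch of that step with LangChain.js (the 1,536 dimensions come from OpenAI's default ada-002 embedding model; import paths may differ slightly by version):

```typescript
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const embeddings = new OpenAIEmbeddings(); // reads OPENAI_API_KEY from the environment

// Embed a few standalone strings and one query.
const vectors = await embeddings.embedDocuments(["lion", "pet", "dog"]);
const queryVector = await embeddings.embedQuery("cat");

console.log(vectors.length);     // 3 embeddings, one per input string
console.log(queryVector.length); // 1536 numbers in each embedding
```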
Oh, it looks like there's maybe a freeze on Mayo's end. While we're resolving that, I've seen a few questions in the chat about document loading and splitting text, which are really good questions with a lot of nuance, so I'll try to speak a little bit about that.

First, document loading. These are all contributed by the community. There are a lot of sources you could possibly want to load documents from: you could load things from the internet, from local files, you could pull things from PDFs. A lot of the loaders in LangChain are contributed by the community, because there's such a long tail of places you might want to load text from that it's tough for us to add them all, so we really rely on people adding and sharing their own document loaders with other folks. There are also a bunch of really cool companies doing this as a more focused effort (this is just one part of LangChain), and we want to integrate and play really nicely with them. One in particular is Unstructured. Unstructured is mostly Python-based, but they're releasing a hosted, non-in-memory version of their platform that will let people use it from JavaScript as well, and we're really excited about that integration; it should unlock a lot of document parsing types. They have support for HTML, Word docs, PowerPoints, everything like that.

I also saw a lot of questions around text splitting, which is another nuanced point. First of all, why do we even need to split text in the first place? We need to split it because the overall documents by themselves are too big to feed into the model. (I'm going to take a second to put myself back on the screen.) So we need to split the documents into chunks that we can pass to the language model. How do we split them? The naive thing is to go every thousand characters or thousand tokens: the first thousand tokens go in this chunk, the next thousand in another chunk, and so on. What are the downsides of that? When you pass those chunks to the language model, a chunk may not have the full amount of information it needs, or it may not keep the semantically meaningful parts together.

I also saw a question about splitting a PDF into chunks based on page. "I just kept talking, I didn't realize the stream went off." "Yeah, we lost you for a bit, but you can just carry on. I was going through the diagrams to explain how things work under the hood, but carry on if you want." "I'm just finishing up the text splitting stuff, and then after that you can go back."

So the idea with splitting text is that you want to create meaningful chunks, and splitting by page or by paragraph is a pretty good way to do that, because that's generally what we as humans find semantically meaningful. There's also obviously some information that carries across paragraphs and across pages, so you may want to do something with the chunk overlap there. There's a lot of nuance in defining these chunks. By default we use something that first tries to split by paragraphs, then by sentences, then by words, and then, if it has to, by characters. But we also have more fine-grained splitters: for markdown we have a markdown-specific text splitter that splits based on the different headers and things like that, and for Python we have a Python-specific text splitter (if you're reading code files) that splits by classes, then methods, then lines within those. We don't have as many variations of text splitters as we do document loaders, and I think that's an under-invested area, so if people have ideas for different types of text splitters, we'd love to get those contributed and shared with the community; there's a bunch of interesting stuff to do there. That's pretty much all I wanted to say, so Mayo, if you want to take back over.
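For reference, a small sketch of the markdown-specific splitter mentioned above, assuming the `MarkdownTextSplitter` export that LangChain.js ships alongside the recursive splitter (chunk sizes here are just illustrative):

```typescript
import { MarkdownTextSplitter } from "langchain/text_splitter";

// Splits on markdown structure (headers, paragraphs) before falling back to smaller units.
const splitter = new MarkdownTextSplitter({ chunkSize: 1000, chunkOverlap: 100 });

const markdown = `# Getting started

Some intro text...

## Installation

Run npm install...`;

const docs = await splitter.createDocuments([markdown]);
console.log(docs.map((d) => d.pageContent));
```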
"Cool. Maybe just make some ad-libs as I'm talking, just so I know you're there and everyone can still hear me. What was the last thing you heard me say before I dropped off?" "I think you were talking about putting them in a vector store." "Okay, cool, let me go back to sharing my screen. You see the screen?" "Yep." "Did you see the section about the embeddings?" "Yeah, we did the embeddings part, and then you jumped back up to talk about putting the embeddings in the vector store."

Right, so I was just saying that once you have the embeddings, they go into the vector store. These things called vector stores are just a place where you store these number representations of your text. You have different choices, with pros and cons to each, which we can talk about later on; once you've chosen the vector store you want to work with, you're pretty much good to go. This whole phase is known as ingestion, and it's just the phase of transforming your documents into numbers that the computer can understand.

So what happens when a search is made? Here's a very brief overview. Take the example with the lion, the pet and the dog. Intuitively, before even seeing the numbers, if someone uses the word "cat", it should have more similarity with "pet" and "dog" than with "lion". What happens is we take the query, "cat", and transform it into an embedding, which you understand now; in this case it would be OpenAI's embeddings. We create that vector, then we go to your vector store, wherever you store things, and we basically say: hey, we have this new thing called "cat", and these are the numbers that represent it; can you please respond with anything you have that's similar? It goes in, runs some calculations (I'm not going to get into that here), and returns whatever is most similar. Chase, can you still hear me? "Yep."
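Here's a rough sketch of that lookup with LangChain.js, using the in-memory vector store purely for illustration (in practice you might use Pinecone, HNSWLib, or another store; the texts and metadata are made up):

```typescript
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Ingestion: embed a few tiny "documents" and keep them in memory.
const store = await MemoryVectorStore.fromTexts(
  ["lion", "pet", "dog"],
  [{ id: 1 }, { id: 2 }, { id: 3 }],
  new OpenAIEmbeddings()
);

// Query time: embed "cat" and ask for the 2 most similar stored texts.
const results = await store.similaritySearch("cat", 2);
console.log(results.map((doc) => doc.pageContent)); // likely ["pet", "dog"], order may vary
```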
Cool. So if this all makes sense, let's go back to a typical case. I'll scroll down here a little bit, and we'll come back and revisit this. This entire process, which is also up here, is what LangChain makes much easier and much faster to do for any documents you have, for the vast majority of file formats. It solves the problem of the context window: if you've ever tried to copy-paste a large document into OpenAI's ChatGPT, it's obviously going to say it's too big. Also, if you ask ChatGPT questions about your document, say "what does this employee handbook say about the policy for XYZ company", ChatGPT doesn't know anything about your company: it's trained on a dataset that is, first of all, pre-2022, and it doesn't have your data. It has character limits, fine-tuning is expensive, and you have the issue of memory: how do you keep track of previous conversations? LangChain helps resolve all of these issues through the different tools you can see here. We went over the document loaders; the indexes, which are the stores; you've got chains, which are effectively what I showed in the docs and which I'll go into in a second; you've got agents, which Chase can also talk about, but effectively you can think of an agent as a personal assistant that goes out into the world and interacts with other tools, maybe other APIs; and then you have chat, which is basically using the new GPT chat models for your application as well. All of this, which I had to figure out the painful way on my own last year, is just so much easier with LangChain.

If I jump into this diagram, this is effectively what the vast majority of chatbots are going to end up using; this is the architecture of a chatbot. You have a question, "how do I purchase a Premium plan?", a question to your document. This question is combined with the chat history. If you've ever used ChatGPT, you'll notice that you can ask a question, get a response, then ask another question that builds on the previous one, and you get a response from that context. How is it able to do that? Because it takes the chat history into account. What LangChain does is take the chat history, combine it with the question, and create a standalone question: the prompt is something along the lines of "given this question and the chat history, create a standalone question on this topic or context". Then you take that standalone question and run it through the process we just covered: we embed the question (remember, like we took "cat", embedded it, got the vectors, went to the store and retrieved similar things) and get back the relevant results. That's literally what's happening here. The user asks a question about your document; LangChain creates a standalone question; it checks for relevant docs that are embedded in your store, wherever you want to store them (you can even store them locally, you don't have to use a hosted solution); you get back the relevant docs; we combine the standalone question with the relevant docs; and GPT generates a response based on the standalone question and those docs. Because you're providing those relevant documents as context, that context results in an answer, something like "to upgrade to a Premium plan, please go to this URL".

We saw something similar earlier, when we asked "how do I split a large document into chunks I can embed?". That request went through exactly this process. And if we jump back in and say, okay, maybe I don't understand something, like "what do you mean by chunks?", and press enter, we'll see that it remembers the history of what we said. I can go back again and say "can you explain what a text splitter is, based on the first response?", press enter, and we get a response back. The reason it's able to do this is, like I said earlier, that each time a question is asked it picks up the previous history: you take the stack of chat history with the new question, create the standalone question, and then you get this response back. Chase, can you still hear me? "Yep, you're good."
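That whole loop is packaged up as a chain in LangChain.js. A minimal sketch, assuming the `ConversationalRetrievalQAChain` available around the time of this video, with an in-memory store and made-up content standing in for your ingested documents:

```typescript
import { OpenAI } from "langchain/llms/openai";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ConversationalRetrievalQAChain } from "langchain/chains";

const store = await MemoryVectorStore.fromTexts(
  ["To upgrade to a premium plan, go to Settings > Billing."], // stand-in for your real chunks
  [{ source: "help-center" }],
  new OpenAIEmbeddings()
);

const chain = ConversationalRetrievalQAChain.fromLLM(
  new OpenAI({ temperature: 0 }),
  store.asRetriever()
);

// First turn: no history yet.
const first = await chain.call({
  question: "How do I purchase a premium plan?",
  chat_history: "",
});

// Follow-up turn: pass the earlier exchange so the chain can build a standalone question.
const second = await chain.call({
  question: "Where exactly do I find that setting?",
  chat_history: `Human: How do I purchase a premium plan?\nAssistant: ${first.text}`,
});
console.log(second.text);
```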
So I might jump in here with a few comments, if you want to zoom back out to the larger architecture diagram. I've seen a bunch of questions as well, so maybe we can transition to those after. The first thing I wanted to highlight is the creation of the standalone question, and why we do that. ("Could you stay on the diagram, by any chance? Do you want me to zoom in?" "Actually, zooming out is maybe even better, just to get a sense of the whole thing.")

So you've got the chat history, and we want to look up relevant documents, which means we need some query, some embedding, to look up against the document store. One option would be to embed only the most recent question, but if it references previous things, then you're getting the numerical representation of something that doesn't have the full context of what it needs to look up. Likewise, if you embed all of the chat history plus the new question, and the new question is completely standalone and doesn't use anything in the chat history, then you're including parts in the embedding that are no longer relevant. This isn't the only way to do it, by the way; there are definitely other techniques. Rather than rephrasing the question, you could look up documents based on the question and then documents based on the chat history and combine those; that's an approach I've tried as well. You could also apply some logic to determine which of the previous responses are relevant, maybe even using embeddings for that: take the new question, create an embedding for it, look at previous messages to see if they're relevant, and then use all of those for the final lookup. I haven't tried that, although I'd worry a bit, again, about pronouns and referencing. So I just want to highlight that this is one particular way of doing that part, and there are a lot of other ways; hopefully LangChain provides enough modular components to build those other ways, but we also provide this end-to-end way of constructing and querying the vector store for related things.

Another question I've gotten a lot is about citing sources in the answer. There are two things to keep in mind when you're trying to get the language model to reference sources. One is that it has to have access to the sources itself: if you don't tell the language model what the sources are, there's no way it's going to know. That means that when you're passing the documents into the prompt, the documents need to have sources associated with them, and those sources need to be what you want cited. If you're trying to cite a specific web page, you want that as the source; if you're instead doing question answering over a YouTube video and you want to cite the minute mark, you need that somewhere in the metadata of the document. And then, once it's in the document, it needs to be put into the prompt. A lot of the prompts we have do this by default: they'll look for a source and put it in there. By "source" I generally mean a specific key in the metadata called "source".
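For example (a hedged sketch, with made-up document content and URLs), attaching a source to each document's metadata and asking the chain to hand the matched documents back alongside the answer might look like this:

```typescript
import { OpenAI } from "langchain/llms/openai";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { Document } from "langchain/document";

const docs = [
  new Document({
    pageContent: "Premium plans can be upgraded from Settings > Billing.",
    metadata: { source: "https://example.com/help/billing" }, // whatever you want cited
  }),
];

const store = await MemoryVectorStore.fromDocuments(docs, new OpenAIEmbeddings());

const chain = ConversationalRetrievalQAChain.fromLLM(
  new OpenAI({ temperature: 0 }),
  store.asRetriever(),
  { returnSourceDocuments: true } // matched docs (and their metadata) come back with the answer
);

const res = await chain.call({ question: "How do I upgrade?", chat_history: "" });
console.log(res.text);
console.log(res.sourceDocuments.map((d: Document) => d.metadata.source));
```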
But you can also edit those prompts to pull in more things, beyond just a single source key. For example, if you're doing question answering over a book and you want to cite the chapter and the page number and so on, you want those different components somewhere in the metadata and to be pulling them into the prompt. (I'm going to zoom back out, off the screen, since I don't think I'll be sharing any more.) The other thing is that, in the prompt, you want to tell the language model that it should cite its sources. If you don't tell it, it probably won't do it, or it will do it erratically, so it's pretty important to tell the language model, "hey, make sure to cite your sources". Mayo, is there anything you wanted to add before we transition to questions?

"Well, looking at the time, I guess it depends on whether people want to go through the code or go straight to Q&A. Maybe I could do a brief code overview and then we can go back to questions." "Sure, let's do a code overview for three or four minutes or so. We'll include a link as well, because I imagine it might be hard to follow live. And while Mayo is doing that, if people could go to the Questions panel and upvote the ones they'd like answered, we'll probably go through them in that order. There's a Q&A tab below the chat box on the right of your screen, so please upvote the questions you want answered and we'll prioritize those."

"Alright, let me share my screen again. Is that fine?" "Yep."

Cool. So this is just a more code-level representation of the basic components of LangChain that I covered in the diagram: you've effectively got your chains, your agents, your document loaders, your memory, prompts, indexes, and then the splitters and the other things we discussed earlier. I'm going to run through different examples, just to show you how much easier this is. What LangChain gives you are the models you'd typically have to wire up yourself. Say you wanted to use OpenAI: you would simply get OpenAI from LangChain's LLMs. You can go in here and choose Cohere, you can choose OpenAI; it gives you different choices, so you can swap providers in and out, which is a really good thing: if you change your mind about any external service you're using, you can just swap it. In this case we just have a template, "what is the capital city of the country", and then a new prompt template. A prompt template is effectively a way to help you structure your prompts. It's a bit of overkill here, because the prompt is so basic: you pass the template and then say you want this thing here, the country, to be the input variable. That input will typically come from a user; in this case it's the country, which gets passed in here.
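Roughly what that first demo looks like in code (a sketch; the exact wording of the template and the France example are reconstructed from the talk):

```typescript
import { OpenAI } from "langchain/llms/openai";
import { PromptTemplate } from "langchain/prompts";
import { LLMChain } from "langchain/chains";

const template = "What is the capital city of {country}?";
const prompt = new PromptTemplate({ template, inputVariables: ["country"] });

// Wire the prompt and the model together so the variable can be filled in at call time.
const chain = new LLMChain({ llm: new OpenAI({ temperature: 0 }), prompt });

const res = await chain.call({ country: "France" });
console.log(res.text); // should mention Paris
```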
But imagine a much more complicated situation where you're dealing with large documents, and you want specific prompts for different users, or you just want a lot more complexity and variation. Prompt templates are useful because they effectively let you create these inputs, these variables, for your text. So if I run this for France, let's see... looks like I'm in the wrong folder. "Chase, I think I'm targeting the wrong directory." "Okay. I think what may be interesting is to show the prompts that we use for the question answering. They're a little bit simple, but I think it will be instructive to see how they're put together. That would be a different repository, right?" "I'll jump into the repo right now, just give me a second." "I'm also happy to share my screen and jump through them, if we're just looking at the prompts." "Yeah, go ahead, that's fine."

Alright, so, can you see this? I'm going to assume that's a yes. Cool. This is the entire prompt we use for the question answering, and you can see that here is where we tell the language model to use the following pieces of context. This is where we put all the pieces of context into one prompt, and we tell the language model to use them. This is also where we address something I saw a question about: how do you stop it from hallucinating? There's a line that works generally pretty well, where you restrict it to this information and say "if you don't know, just say that you don't know, don't try to make up an answer". It sounds a bit silly, but whenever there's a question about how to get a language model to do certain things, the answer is generally that you just ask it, you tell it to. Then this is the context, and this is where the docs come in: it's put in as a variable between curly brackets because it's going to be different for every question; this is where the idea of retrieving relevant passages and putting them in comes into play, so it's not hard-coded, it varies at runtime, and the same goes for the question, which also comes in at runtime. And then "Helpful Answer". This is just prompting the language model to return a helpful answer, and there's some really interesting work here. It's probably less relevant with some of the chat models, but with a lot of the older language models, and even with some of the existing chat models, if you can put words in the language model's mouth and make it think it has already started talking a certain way, it will continue talking that way. So by putting "Helpful Answer" rather than just "Answer", we're having the language model pattern-match toward giving an answer that's actually helpful, and this produces better results than just having "Answer". Again, that's maybe a little less relevant with some of the chat models; this prompt was optimized for the generic completion models.
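Reconstructed from that walkthrough, the question-answering prompt is essentially the following (a close paraphrase rather than the verbatim library text; `{context}` and `{question}` are the variables filled in at runtime):

```typescript
import { PromptTemplate } from "langchain/prompts";

const qaTemplate = `Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:`;

const qaPrompt = new PromptTemplate({
  template: qaTemplate,
  inputVariables: ["context", "question"],
});
```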
Below that, we can see a prompt that's optimized for chat models, and I'm happy to chat more about the specific differences between chat models and regular models. Here we have the exact same prompt, but it's used as a system template: these are the instructions for the language model. Then we have a human message prompt template, and that's just the question. We no longer have "Helpful Answer", because with the existing ChatGPT API all new messages are new messages: you can't have the assistant start talking, you can't put words in the assistant's mouth, so to speak. I just wanted to show these as pieces of code, to show what the prompt templates look like.

Then if we look at some examples (this is all in the langchainjs repo), let's look in chains at something like question answering. Actually, that one is just over regular documents, so let's look at vector database question answering, which uses the vector database. We can see here that it's 20 to 25 lines of code to set this up and run it. Specifically: first we're initializing the language model we want to use. Then we're reading in a document; we're not actually using a document loader here, just for ease, because it's a simple text file. Then we're splitting it up into chunks using the recursive character text splitter, the one I mentioned before that splits into paragraphs, then sentences, then words. Then we're creating documents. Documents are a pretty simple concept: just text plus associated metadata, and what usually goes in the metadata is the source. Here, because everything comes from the same file, the source will be the same, but that's generally where it would go. Then we create the vector store, and we also pass in embeddings: this is saying "hey, use OpenAI embeddings, and create this type of vector store from these documents". Then we initialize the chain, and we give the chain the model and the vector store, because those are the two levers people most often want to control. You can also pass in custom prompts, so if you want to instruct the language model to respond in a certain way you can pass that in through this method as well. And now we can call the chain over these documents. I know we're a little bit short on time, so I'm going to stop sharing there; I think now is probably a good segue into the questions.
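A sketch of roughly that example (the file name, question, and chunk size are stand-ins; the repo example uses HNSWLib as a local vector store, which needs the hnswlib-node peer dependency, and import paths have moved around between releases):

```typescript
import * as fs from "fs";
import { OpenAI } from "langchain/llms/openai";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { VectorDBQAChain } from "langchain/chains";

// 1. Initialize the language model.
const model = new OpenAI({});

// 2. Read in a document (no loader needed for a plain text file).
const text = fs.readFileSync("state_of_the_union.txt", "utf8");

// 3. Split it into chunks.
const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
const docs = await splitter.createDocuments([text]);

// 4. Create the vector store from the documents, using OpenAI embeddings.
const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings());

// 5. Initialize the chain with the model and the vector store, then query it.
const chain = VectorDBQAChain.fromLLM(model, vectorStore);
const res = await chain.call({ query: "What did the president say about Justice Breyer?" });
console.log(res.text);
```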
"Okay, I can show a live demo of a finished product if you want, and then we can jump to questions, or if we're pressed for time..." "If it's going to take a minute, let's do that, but we're down to 14 minutes." "It'll literally just be a minute, so let me show you real quick. You see the screen?" "Yep."

Cool. So effectively, what we're able to do here with LangChain: imagine you've got loads of documents somewhere. In this case they're in Notion, and we're using Cron, which is a calendar app; they've got their support documents in here, each section has pages, and each page has text inside as well. What we're able to do with LangChain, if you imagine a document with loads of pages and loads of words, is just ask something like "how does Cron work?", and it runs through everything Chase just said: it does the Q&A, combines things into a standalone question, goes to the vector store, and gives you a response. Obviously we're short on time so I can't show more of this, but again, it just shows what's possible, alongside the chatbot you saw in the LangChain docs. "Awesome."

Alright, and we're joined by Nuno as well. (I can't seem to figure out how to add all three of us onto the big screen, so I'll put Nuno on the big screen.) Nuno has been leading a lot of the development for the JavaScript package, and probably knows more about it than anyone else, so we're honored to have him here. Let's go to the question answering. I'm going to go in order of what's upvoted, so if you want specific questions answered, please upvote them, and I'll probably defer to Nuno for a bunch of them.

The first one deals with ways to filter data in a Pinecone index: namespaces, metadata, and the Pinecone store created from an existing index. So, we don't intend to fully replicate all of a vector store's functionality inside LangChain; we wrap it. I'm not 100% sure what the status of the current implementation is. I know that in the Python one you can use metadata to filter, and I know there have been some updates to the client recently, Nuno; I imagine this is something that's on the backlog but hasn't made it in yet. "Yeah, this is a feature we can easily add; it's not there yet, but it's definitely something we can do."

The second one is about reducing hallucinations. I think just instructing the model to pay extra close attention to the documents and not make stuff up is generally what I do. Have you seen any other tricks for reducing hallucinations? "Someone said they lost you, Chase." "Can you hear me?" "Someone said they can't hear you." "I think we're good now. Do you have any other tricks for reducing hallucinations besides telling the model not to make stuff up?" "I guess, depending on the use case, sometimes reducing the temperature can also help, or some of the other more exotic parameters, but I think what you said about putting it in the prompt is usually the best solution." Cool.

Next: is there a way to log all API calls, in their entirety, from an LLM model in JavaScript/TypeScript? Nuno, do you want to tackle this? "We added the callback manager functionality one or two days ago, so that's going to be your best bet for achieving that. You can now subscribe to a bunch of different events, including LLM start, LLM new token, LLM error, and LLM end, which should give you all the detail about all the calls that go into your models, as well as a bunch of other events throughout the whole library. So definitely check that out."
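As a hedged sketch of what subscribing to those events looked like around the time of this talk (the option has since been renamed in newer releases, for example to `callbacks`, so treat the exact shape as an assumption):

```typescript
import { OpenAI } from "langchain/llms/openai";
import { CallbackManager } from "langchain/callbacks";

const model = new OpenAI({
  streaming: true, // needed for token-by-token callbacks
  callbackManager: CallbackManager.fromHandlers({
    async handleLLMStart(_llm, prompts) {
      console.log("LLM start:", prompts);
    },
    async handleLLMNewToken(token) {
      process.stdout.write(token);
    },
    async handleLLMEnd(output) {
      console.log("\nLLM end:", JSON.stringify(output));
    },
    async handleLLMError(err) {
      console.error("LLM error:", err);
    },
  }),
});

await model.call("Tell me a short joke about vector stores.");
```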
"Do we have a link to the callbacks docs that we can drop? This is a question I get asked a bunch, and I think it's in the documentation, right?" "Yeah, let me find it." Let's drop that link in here. I get this question a lot because you've got these complicated chains and complicated prompts, and it's tricky to know what's going into where and how they're all constructed. The main way we deal with this is the idea of callbacks, which get run at the start of an LLM call, at the end of an LLM call, at the start and end of a chain call, and so on. For logging, there are generally two places people want to log things. One is the console; in Python we call this the standard-out callback handler. I actually don't know what it's called in JavaScript. Is it the same, Nuno? "In JavaScript it's usually just console.log, but yeah." So we have a callback handler that logs things to the console, and then we also have a concept of tracing. Tracing uses a separate platform that logs all of these things and displays them in a UI for you to click through and look at. Both of these were added very recently, as of a few days ago, which is why this is so exciting. Nuno just shared a link, so definitely check that out; this is one of the biggest questions we get, and it's a new feature we're really excited about.

Next: how to split a large document per chapter in order to do embeddings, using JavaScript. Right now there's nothing that does this by default; this is one where you'd probably want to write some custom code. This is a case where there's a long tail of possible ways you'd want to split documents, so how do we think about that as a library? We think about it in two ways. One, we want to make things modular, so that if you need to write custom code for some part you can easily plug it in. We recognize there's a really long tail of use cases and we're probably not going to cover them all, but we want you to be able to write custom code and still use all the other parts, so we try to design the library to be as modular as possible. And two, I hope someone contributes this to the library; it would be awesome to get in, and there are probably different ways to do it depending on the type of book and things like that. So I'd love for an example of this to get in, but unfortunately, at the moment, it's not there.

Another question about loading documents: do we, or do we plan to, support docx formats in the future? This will come for free when we do the Unstructured integration. Nuno, I think we're still waiting on them for that, right? "Yeah, let me confirm the status of that."
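In the meantime, here is a hypothetical sketch of the kind of custom chapter splitter mentioned above. It assumes chapter headings that look like "Chapter 1", "Chapter 2", and so on at the start of a line, and it keeps the chapter title in each document's metadata so it can be cited later:

```typescript
import { Document } from "langchain/document";

// Split a book's raw text on "Chapter N" headings (an assumption about the book's formatting).
function splitByChapter(bookText: string, source: string): Document[] {
  const parts = bookText.split(/^(?=Chapter\s+\d+)/gm);
  return parts
    .filter((part) => part.trim().length > 0)
    .map((part) => {
      const title = part.trim().split("\n", 1)[0]; // first line, e.g. "Chapter 3"
      return new Document({
        pageContent: part.trim(),
        metadata: { source, chapter: title },
      });
    });
}

// Usage: const docs = splitByChapter(fs.readFileSync("book.txt", "utf8"), "book.txt");
```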
"Alright, Chase, before we wrap up, one frequently asked question is about production deployment and the options for that, so maybe we can cover that before we head off." "What in particular?" "I think people are asking about deployment generally: how memory works in deployment, whether splitting works in the serverless functions people talk about, on Vercel. I know Nuno was speaking about edge functions, and that streaming doesn't work with Vercel at the moment." "Yeah, I can quickly touch on that; let me also add a link to the docs. At the moment we support Node, but we're actively working on other environments like edge, Deno, and even the browser, so keep an eye on it, it will come soon."

And for the memory part, I saw a few questions about memory in production. Right now we rely on that being managed outside of LangChain. We're considering options to make it more baked in, but for now it's done outside of LangChain. We're going to be working on ways to help, or at least instructions, for how to manage that and then hydrate the memory with previous messages; that's definitely on the roadmap, and I'd expect it in the next few weeks. For fully managed memory, I think that will take a bit longer; that's something that just needs more time to work through.
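Until then, one hedged pattern for managing memory outside of LangChain is to keep the conversation turns in your own store (a database, Redis, and so on) and serialize them into the chat_history input on every request. The helper below is hypothetical, not a LangChain API, and assumes the string-based chat_history format used earlier:

```typescript
import { ConversationalRetrievalQAChain } from "langchain/chains";

type Turn = { question: string; answer: string };

// Rebuild the chat history string from turns you persisted yourself, then call the chain.
async function answerWithHistory(
  chain: ConversationalRetrievalQAChain,
  question: string,
  pastTurns: Turn[]
): Promise<string> {
  const chat_history = pastTurns
    .map((t) => `Human: ${t.question}\nAssistant: ${t.answer}`)
    .join("\n");
  const res = await chain.call({ question, chat_history });
  return res.text;
}
```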
Alright, I'm going to go through some rapid-fire questions. Thoughts on LLM-based splitters? I'm assuming this means using language models to split the text themselves. I think language models are getting pretty good at a lot of things, so I think it's entirely feasible; I haven't tried it myself, but I like it in theory.

How to deal with sensitive data if we don't want to share info with a third-party service? When using LangChain, everything runs locally, so you're not sharing any data with us; the way you'd be sharing it is if you're using models like OpenAI and sending data to them. I know they're working on ways to do private deployments, and there are also local models, like those on Hugging Face, that you could run yourself. Most of those have better support in Python, but you can run them in a separate Python server on the side and then use them from LangChain, so I would recommend that.

What is Mendable? Mendable is a great company; I interacted with those guys a while ago. They power the search bar, and they're doing this kind of search natively for Docusaurus builds, I believe, so any site using Docusaurus for its documentation can jump in there.

Under what circumstances can we use LlamaIndex alongside LangChain? That's more of a Python question, so we'll probably cover it in the Python session.

"I'm interested in using an agent that interacts with the vector store. Specifically, I'd like to answer technical questions with cited sources, augmented by specific knowledge provided by other tools. It appears the current agent implementation does not yet provide detailed sources for the information it retrieves from the vector stores. Is this feature on the roadmap?" Yeah, I think there are two ways to do this. One, you could use a question-answering tool that cites its sources as a tool itself and rely on that, but then you'd also probably want to change the default prompt of the agent to tell it to cite its sources. By default this is a generic agent; it won't know that it should be citing sources, so I would definitely change the prompt that instructs the agent what to do. And then you may also need to change the vector store implementation, however you're using it, to return the sources. Unfortunately I don't have a great surface-level answer for this, it's a more detailed one, but I would start by, one, making sure the agent has all the information, so making sure the agent executor is returning the sources and so on, and two, making sure it knows to cite them. So I'd probably make a custom agent in some regard.

We are out of time, and there are a lot of questions we didn't answer. We have a Discord, and we'll try to answer them in there; there are also other folks who can help answer them. This was a lot of fun, and we'll do more of it, so keep an eye out: we're doing one on SQL next week, and then we're going to announce a pretty cool one for the week after that. Any final thoughts, Mayo or Nuno?

"Yeah, a few first-time mistakes, but it's all good. The main thing is that people got an introduction, and there can always be follow-ups. I think the main thing is that people wanted to talk to you and get you to answer questions, because you're not the easiest person to reach, so hopefully everyone enjoyed this, and I look forward to the next one coming up soon. That's next week Thursday, right?" "Next Wednesday; we'll try to do these pretty consistently on Wednesday mornings." "Join the Discord; before you jump off, let me post the Discord link in case anyone doesn't have it. Any final words from you, Nuno?" "No, just that we release new stuff every day, so keep your eyes out for new features." "Yes, and there's a release coming today as well. Carlos, I love that suggestion of a channel for events; we'll definitely add that in the Discord so it's easier to keep track of these sessions. That's a great idea. Alright, we're over time. Thank you, everyone."
Info
Channel: LangChain
Views: 11,225
Id: AKsfHK_4tf4
Length: 58min 50sec (3530 seconds)
Published: Wed May 03 2023