Build a No-Code Chat-with-PDF LangChain app using Flowise and Bubble

Captions
Hey everyone. In this video, we will build a chat-with-PDF app where users can upload documents, filter the documents they search, and then ask questions directly. We will also see how to add a chat widget to any website, and look at different scenarios that you might have to build your chat application for. So let's get started.
The way we are going to build this chat app is with LangChain. LangChain is a very popular open-source library that makes it easy to build search applications for PDFs and other documents. It also lets you use agents to perform many different tasks. LangChain is a code-based library, so we will use it through Flowise, which lets you drag and drop these elements on a canvas and get started from there. Our interface to users is going to be Bubble, and we will connect Bubble to Flowise to build our no-code LangChain app.
Today we are going to cover two different ways of deploying Flowise. One is render.com, which we have been using in our previous videos. The second is Railway (railway.app), which is very similar to Render, and we are going to test it out as well. Remember, both of these are temporary solutions until the Flowise team releases their cloud-based solution, which will let us just log in and use our Flowise account.
The first one is the usual way, on Render. The first step is to go to the Flowise repo, which has different instructions. We are going to use the option to fork the repo. Since I already have a fork, it says it already exists for me, but you can just go ahead and create the fork. Once you do that, you'll see it's available under your account, and then we can go to Render and deploy.
If you already have the Flowise repo from previous videos, you'll probably see something here that says to update or synchronize the branch. You can click Sync fork, and then there will be another button to synchronize again. Remember, if you do synchronize, you get all the latest Flowise features, but it will delete your flows from previous videos or previous work. So you might want to download those flows first and upload them again later for the updated functionality.
Now head over to render.com and create an account. Link your GitHub and give all permissions; I gave permissions for two of my accounts. Then create a new web service. You'll see a page similar to this, where you select the repo, which is Flowise under your account. On the next page, I'll call this Flowise chat doc, and this is going to be your URL, which will be available in a little bit. Then we switch the runtime from Docker to Node. If you look at the Flowise repo, it has instructions for the commands to use: yarn install and then yarn build, followed by the start command, which is also in the repo.
This is something I have mentioned before as well: a free instance tends to sleep after 15 minutes of inactivity, so it's highly recommended to use at least the Starter plan. Some folks have complained that even with Starter, their flows sometimes disappear, so we are going to save our flows after every use. The other thing is to go into the advanced settings and make sure auto-deploy is off, or "No" in this case. That way, changes to the repo will not directly affect the deployment, but you can come back and deploy manually. With that, you can create the web service.
Once you do that, you'll notice that it runs the commands we provided. This takes a little while; in my case it took about five minutes before the service was live. Once it's live, you have the link you can follow to access your Flowise instance.
The second option mentioned is railway.app. It's very similar to Render: you bring in code and deploy. While making this tutorial, I realized that there are some errors when deploying with Railway, so we will probably cover it in a future tutorial. I'm exploring a few different options that can hopefully keep the Flowise flows persistent across uses, so they don't get deleted. If you happen to find a solution, please let me know so the community can benefit. For now, we'll keep the Render-based deployment and look into additional options in the future.
Let's start with a quick summary. I've covered in previous videos how document Q&A systems work. They basically have two arms: one is document ingestion and the other is search, or retrieval. In document ingestion, we take the document, PDFs in our case, and extract the text out of it. That text is split into small chunks, and those chunks are converted into vectors, or numbers, which are saved in a vector database. In our case, we'll be using the Pinecone vector database. Then, any time a user asks a question about the document, we convert that question to numbers as well and compare those numbers against the stored ones. There are usually a few relevant matches, which we send over to OpenAI for completion. All of this we are going to do in Flowise.
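The two arms described above can be sketched in a few lines of Python. This is only a toy illustration, not what Flowise runs internally: word counts stand in for OpenAI embeddings, an in-memory list stands in for Pinecone, and the function names are made up for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real flow would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion arm: store every chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 2) -> list[str]:
    """Retrieval arm: embed the question and return the closest chunks."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


index = ingest([
    "Congress shall have power to lay and collect taxes",
    "The executive power shall be vested in a President",
    "The judicial power shall be vested in one supreme Court",
])
# The retrieved chunks are what would be sent to the language model as context.
print(retrieve(index, "who holds the executive power", top_k=1))
```

In the real flow, the retrieved chunks are passed together with the question to OpenAI, which writes the final answer.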
To make it easy, we will break these two arms apart right at the vector database. We take the left portion and the right portion, flatten them out, and work on them as two flows in Flowise. Document ingestion will be the upload part, where we take a document from users and upsert it to Pinecone. The second flow is the search in the app.
One could always start with a new flow on a blank canvas. What I'm going to do is go through some of the available ones in the marketplace and build on that. I see there's one for Conversational Retrieval QA Chain, so let's explore what it does. It takes a text file, upserts (basically saves) the text embeddings in Pinecone, and then we can ask questions in a conversational fashion. This is very similar to what we want to do, except that we would like to use a PDF block. So I will use this template, and now it's available for changes. We delete the text file block, search for a PDF loader, and bring that over. Starting from something in the marketplace makes this a bit easier, and that pretty much completes our flow for PDF files.
Now you'll notice a bunch of blanks that we need to fill in. First, chunk overlap: when you split a document into pieces, this sets how much overlap you want between them. It helps to keep some overlap as context between chunks. Second are the API keys. I've shown in previous videos how to create an OpenAI API key as well as a Pinecone index and namespace. So I've filled out the API keys as well as the environment and index. You could name the namespace "test chat" just to see if everything works. I usually suggest uploading a file and testing the flow here, although later we will connect it with Bubble so you can dynamically upload any PDF to it.
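To make the chunk overlap setting concrete, here is a minimal sketch of character-based splitting with overlap. It is a simplification for illustration: the splitter block in Flowise may split on separators rather than fixed character positions, and the function name here is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size chunks where neighbours share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the context carried across the boundary. A larger overlap keeps more context but produces more chunks, and therefore more vectors to store.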
So let's do a test with a test PDF. I'm going to use the Constitution of the United States as the test file. Once you do that, you have to save the flow; I'll call this "PDF upsert". With that, you should be able to chat with the document. You can ask a question such as "What is this document about?" It will go through the flow as described before, upsert to Pinecone, and generate the response for us. If you go to your Pinecone account, you'll notice that the namespace "test chat" has been created, with vectors in it. That means we were able to run the flow.
Now that document ingestion is complete, the next step is search, or retrieval. We want to create another, similar flow. There is an option to duplicate a chat flow, so that's what we'll use. We duplicate it, and instead of the upsert block we use the other block, which is the search block, or I believe it's called the load index block. We delete the upsert block, bring the new one over, and connect it to our QA chain. You'll notice it only takes embeddings, so we give it OpenAI embeddings. It doesn't need any of the document-loading blocks, because we already upserted with the other flow, so we delete those, and this should suffice. Again, we fill out the API keys.
I filled out all the details as before, and I'm going to use the same "test chat" namespace for testing. It should reach the same Pinecone index and give us answers similar to what we saw previously. Let's save this; we'll call it "PDF query". Now it should work as intended. Let's see. Okay, it gives a similar answer, which means it connected to the proper source and is answering from there. So now both of our flows are built.
We want to take this over to Bubble and connect it with a Bubble-based app. The most important part for us is this portion: if you look at the cURL or Python tab, there is a URL that we will use in our application. Another cool thing that was released is the embed feature. We'll actually start with embed, and then connect the rest of the APIs.
There is an option to add authorization. To do that, go to this section and create a new key. Give it a name; in our case you could call it "bubble app". Then you have an API key that can be used for authentication in your application, so you are not getting calls from unauthorized sources.
The other aspect I would like to talk about is the input config. This is the configuration that will help us with many different aspects, especially upload and query. If you look at the configuration, you'll see nodes named OpenAI, OpenAI embeddings, and Pinecone. These correspond to the blocks you have on your flow. In the query flow, the configuration is a little different from the upsert flow, which has a few additional options. For example, take the Pinecone upsert block. It has three variables: the Pinecone environment, index, and namespace. Those are the ones we typed in ourselves. Being able to override them through our API call from Bubble gives us dynamic behavior: we can upload different documents into different namespaces and different indexes. That becomes possible with this new Flowise update. These might seem like quite a few settings, but we are only going to use some of them.
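As a rough sketch of what such a call looks like outside Bubble, the snippet below builds (but does not send) an HTTP request with an API key and an override. Several things here are assumptions rather than details from the video: the URL is a placeholder for the one shown in your flow's API dialog, the key is assumed to be sent as a Bearer token in the Authorization header, and `pineconeNamespace` is assumed to be the override key listed in your flow's input config.

```python
import json
import urllib.request
from typing import Optional


def build_prediction_request(
    api_url: str,
    question: str,
    api_key: Optional[str] = None,
    override_config: Optional[dict] = None,
) -> urllib.request.Request:
    """Build a POST request for a flow; nothing is sent until urlopen() is called."""
    payload = {"question": question}
    if override_config:
        # Values here replace what was typed into the blocks on the canvas.
        payload["overrideConfig"] = override_config

    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the key created in Flowise is passed as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"

    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_prediction_request(
    "https://example.invalid/api/v1/prediction/YOUR-FLOW-ID",  # placeholder URL
    "What is this document about?",
    api_key="YOUR-API-KEY",  # placeholder key
    override_config={"pineconeNamespace": "test-chat"},  # assumed key name
)
print(req.get_method(), req.full_url)
```

To actually call the flow, you would pass `req` to `urllib.request.urlopen` with your real URL and key.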
For that, I would like to start with a few scenarios. First, say you have a document base. This could be PDFs, text, or any sort of documentation. Your goal is to add a chat widget at the bottom of your website; if you click it, you can search through the documentation. For example, if you go to LangChain's website, you'll notice a chat widget that helps you search through the documentation, and you don't have an option to upload or change anything. It's just a simple widget. In that case, you do not need any input or upload from users, so just remember that.
The second scenario is where you are part of an organization or a team and you have many different documents. Another user also has different documents, and the end goal is for all of the documents to be in one vector database that can be accessed through a chat app. In this scenario, you don't necessarily have to separate the documents uploaded by user A from those uploaded by user B, because at the end of the day they belong to the same team or organization, and the goal is to help everyone search within the vector database.
The third scenario is where you have multiple users on a chat-based website or similar, where each user's documents need to be partitioned and separated from other users', and each user can upload documents as well as search within their own documents. In this case, we need to make sure each user's documents are in a separate namespace, and that's what we are going to do. So we have three scenarios to work on in this video, and we will design our system accordingly.
For the first use case, the good news is that Flowise provides embed code that you can just add to your website. It takes care of adding the widget as well as making the search available.
You take the embed code from Flowise. In our case, we are just going to use a blank page on Bubble as an example. To embed it, search for the HTML element and drop it anywhere on the page. Inside it, we paste the script that we got from Flowise. That should suffice for building the chat widget, so let's test it out. Once you hit preview, you can see the chat widget. It's partly hidden behind the "Built on Bubble" banner; of course, if you deploy on a paid plan, the banner disappears. If you ask the same question, it calls the same Flowise app and gets you the answer. So if you have a pre-built website, or your website is not about uploading documents, say a landing page or something like what we saw on LangChain's site, this already takes care of queries for users. You don't have to do anything after this.
For the second and third scenarios, we are going to benefit from the metadata filter available in Flowise, and we are going to use namespaces. Namespaces can be thought of as partitions that hide information in one partition from the others. If you have an organization and the goal is to make all of the information easy to access, you can upload everything into one namespace, with no additional partitions. That's what we are going to do in our example right now. Then we'll come back to the case with multiple partitions, or multiple namespaces, a little later. So for this scenario, we keep a single namespace and send files to this particular flow. There are a few examples given. It might be a bit tricky to configure in Bubble, so we'll walk through it.
I have built an app that has a section to upload documents and another section to chat with the documents. I'll make it available with the video so you can look at the workflows. In the document upload section, the main part is the file uploader, which is available as an input form. I have some text that shows which file was uploaded, plus the name of the document and a description. There could be many other fields you add to a file. Then there is a Process File button. Once this is clicked, I have separated the workflow into two scenarios, and we'll look at each of them.
In scenario two, we first create a document. In Bubble this is the "Create a new thing" action, where you select the type Document. I take the three inputs from the user and save them in a data table. The new type, called Document, is a table in your Bubble app, and I save those three user inputs to it. There could be more; for now we'll start with this and may modify it later.
Once you create a new document, you want to send it over to our Flowise API. To configure this API, first add a plugin, the API Connector, and name the API something like "Flowise API". Then authentication: if you have enabled it, you add an Authorization header, something along these lines, with your token value on the right. Since we are not going to use that, I'll just delete it. For scenario two, we take the URL that Flowise gave us, and we set the body type to form-data and the data type to text. This is what worked for me. There might be different ways, but this worked without errors.
Now we add the file we are going to send to the Flowise app, as well as the Pinecone namespace. I'm calling the namespace something like "org docs", since it's going to be a common namespace for everyone. If you initialize the call, you'll get a response like "I don't know". That's because the call is also asking the app a question, which we haven't supplied. To verify that it worked, we go to Pinecone, and we should see the "org docs" namespace available. This means our Bubble upload functionality is working.
Now we can go back to our workflow and use the plugin. For the file, I set the value to the result of step 1's file URL, since step 1 is where we saved the file, and we send that over. So any time the button is clicked, it creates a new document and sends it as scenario two describes. Let's test it out. Before I select and send a document, let's check what's going on in Pinecone: in the "org docs" namespace we have 448 vectors. This should change. Let's upload the Constitution again. It takes a second, then shows the file that is to be uploaded. Now we can give it a name and click Process File. This shows a progress bar, which indicates that our file is being processed. The form doesn't reset, since we did not add any logic for that. But you can see that the number of vectors has changed, which means it did upsert. To make sure the form resets each time an API call is made, we add a "Reset relevant inputs" step. That clears the form for the next use. So that covers uploading a document in the scenario where we send any document to that one namespace.
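For readers who want to see what a form-data upload like this contains, here is a minimal sketch that builds the multipart body by hand with the standard library. It does not contact any server. The field names are assumptions: use whatever file key and namespace key your flow's API dialog and input config show (here `files` and `pineconeNamespace`), and the helper name is hypothetical.

```python
import uuid


def encode_multipart(
    fields: dict[str, str],
    file_field: str,
    filename: str,
    file_bytes: bytes,
) -> tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []

    # Plain text fields, e.g. the namespace override.
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]

    # The uploaded document itself.
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/pdf",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    fields={"pineconeNamespace": "org-docs"},  # assumed key name, shared namespace
    file_field="files",                        # assumed file key
    filename="constitution.pdf",
    file_bytes=b"%PDF-1.4 placeholder bytes",  # stand-in for real file contents
)
print(content_type.split(";")[0], len(body), "bytes")
```

Bubble's API Connector builds this body for you when the body type is set to form-data; the sketch only shows what travels over the wire.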
Now let's see how querying works for that scenario. For querying, we come back to our PDF query flow. In the section where we found our API URL, we take this flow's URL; that's what we call from the query API. A few additional things: if you look at the input config, there are quite a few keys available, and we could use any of them, the same way we did with form-data earlier. In our case we set the namespace, so we'll use that, and we'll also need the question itself.
Let's go back to our API Connector. I have a query API call that uses the URL we just saw. Please make sure you have the header Content-Type: application/json; this gave me quite a few errors. The data type is text and the body type is JSON. For the body, we need a question key with some value, which I've set as question and value. In Bubble, wrapping a name in less-than and greater-than signs means we want to fill it in later in our workflow. So first there is question, then overrideConfig. If you look at how it's set up in the Flowise example, we take exactly that format: another pair of curly braces inside, holding the additional keys we want. The only key we use in overrideConfig here is the namespace, which I've defined in the call. I'm keeping it private, which locks it, so it's not available anywhere else in the application. You could make it not private, but in this use case we are assuming everything is saved in one namespace. Then you can initialize the call. I did, and I received this response back.
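The placeholder idea can be sketched in Python as a template whose angle-bracket names are filled in at call time. This mimics the behavior described above rather than Bubble's actual implementation, and `pineconeNamespace` is again an assumed key name; check your flow's input config for the real one.

```python
import json

# Body as it might be typed into the API Connector, with <placeholders>.
BODY_TEMPLATE = (
    '{"question": "<question>", '
    '"overrideConfig": {"pineconeNamespace": "<namespace>"}}'
)


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each <name> placeholder with a JSON-escaped value."""
    for name, value in values.items():
        # json.dumps escapes quotes and newlines; [1:-1] drops the outer quotes
        # because the template already supplies them.
        escaped = json.dumps(value)[1:-1]
        template = template.replace(f"<{name}>", escaped)
    return template


body = fill_template(
    BODY_TEMPLATE,
    {"question": 'What is "this" document about?', "namespace": "org-docs"},
)
print(json.loads(body))
```

Escaping matters in practice: a user question containing a double quote or a line break would otherwise produce invalid JSON and a failed call.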
Now we can build a workflow around it. I built this chat section, which is very similar to one of the apps from previous tutorials, so please feel free to watch that video. I'm also going to make this app available so you can study the workflows. As mentioned, it has messages that show up here. One can type any query or question, and that is sent to our API.
The Send button starts the workflow. The first thing we do behind that button is create a new message. This saves messages in our Bubble database before we send them over. I created a new thing of type Message, and I set two fields on it. If we look at the Message data type, "message" and "is user" are the two fields we use right now. "Type" is something we could use for saving history and sending history to the chat app. We are probably not going to do that in this tutorial, but we will in future ones. So I create a new message with "is user" set to yes, and then I reset relevant inputs, which clears the input box for the user. Then I call the query API. The only thing I send from here is the message, because the namespace is locked, as we saw. We get a response, save it to the database, and mark it as not a user message because it came from the AI, or the API. After that, we scroll down to the bottom of the page.
Here is how it looks: if I ask "What is this document about?", it calls the API and gets the response back. This is what we expect. That takes care of the second scenario we spoke about earlier, where all of the documents go into one namespace and a chat app searches through all of the documents in that namespace.
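The same workflow steps can be written out as a small Python sketch. The class and field names are hypothetical stand-ins for the Bubble data type, and the API call is replaced by a stub so the example runs without a server.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    text: str
    is_user: bool  # True for the person typing, False for the AI/API reply


@dataclass
class ChatLog:
    messages: list[Message] = field(default_factory=list)

    def send(self, question: str, ask: Callable[[str], str]) -> str:
        # Step 1: save the user's message before calling the API.
        self.messages.append(Message(question, is_user=True))
        # Step 2 in Bubble (reset relevant inputs) is UI-only, so it has no code here.
        # Step 3: call the query API with just the question.
        answer = ask(question)
        # Step 4: save the reply, marked as not coming from the user.
        self.messages.append(Message(answer, is_user=False))
        return answer


def stub_api(question: str) -> str:
    """Stand-in for the real Flowise call."""
    return f"(stub answer to: {question})"


log = ChatLog()
log.send("What is this document about?", stub_api)
for message in log.messages:
    print("user" if message.is_user else "ai", "|", message.text)
```

Saving the user message first means the chat window can show it immediately, even if the API takes a few seconds to respond.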
The third scenario is where one user saves to one namespace, and another user has a separate set of documents saved in a different namespace, with no overlap when searching. The way we design our system is that each user's unique ID, or unique name, will be the namespace. Then we can identify which partition belongs to which user. The second piece is metadata. This is available in our upsert flow: if you look at the PDF file block, there is a field called metadata, and it takes JSON, similar to the structure we saw before. We'll go over that as well. In that metadata, we give the document a name. So if our user wants to search document "XYZ" or document "constitution", they can do that, or they can search the whole partition without any filter, in which case it searches all the documents in their namespace. So we'll look at both filtering and searching.
To build that in our app, the initial setup stays the same, and we change the workflow a little. We start by creating a new document regardless: we save the name the user provides, the description, and the file URL. The API call is a little different from the one before. It takes the same URL, the upsert flow's URL, so we don't have to change that. Then we add the metadata, as discussed. The namespace key remains the same. One could start with a test doc and some test metadata, just to make sure the API Connector works. Let's do that.
I've already provided a document. Initially, once you select "send file", it gives you an upload button, and you can upload a document. I'm going to call the namespace "user a doc", name the metadata key "document name", and set its value to "constitution", since that's the document I'm sending. With that, we should be good to initialize. Okay, since we did not provide a question, it says "I don't know". If we look in Pinecone, we should see "user a doc". Good: that means we have created this partition, or namespace, for user A.
Now the call is available for our workflow, so I'll add the API call for scenario three. For the metadata, I'm adding only one entry, which is why you see one key and then one value after it, inside the quotes. If I wanted more, I would add a comma and then more keys and values. So I'm setting just the document name. Then I set the Pinecone namespace to Current User's unique id. Current User is something Bubble makes available, and the unique ID is generated by Bubble: each user who registers on our website gets one unique ID. It helps to send that, since it's unique across your app. And of course the link we send to upsert is the dynamic file link. Once that happens, we reset our inputs so another document can be uploaded afterwards.
Let's try again. We select the Constitution, and it shows up here. I'll name it a little differently, maybe "doc a", and I'll call the other field "test doc". Once we hit Process, it should call the API; the name of the document will be the metadata filter, and this user's ID will be the partition. Let's give it a try. Okay, it sent it and cleared the form, which means it should be available for us. There we go.
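A minimal sketch of the fields sent in this per-user upsert call is shown below. The field names `pineconeNamespace` and `metadata`, the metadata key `document_name`, and the sample ID are assumptions for illustration; use the keys listed in your own flow's input config.

```python
import json
from typing import Optional


def build_upsert_fields(
    user_id: str,
    document_name: str,
    extra_metadata: Optional[dict[str, str]] = None,
) -> dict[str, str]:
    """Form fields for a per-user upsert: one namespace per user, metadata per document."""
    metadata = {"document_name": document_name}
    if extra_metadata:
        # More keys and values can be added alongside the document name.
        metadata.update(extra_metadata)
    return {
        "pineconeNamespace": user_id,      # the user's unique ID is the partition
        "metadata": json.dumps(metadata),  # metadata travels as a JSON string
    }


fields = build_upsert_fields("1684250000000x123456789", "constitution")
print(fields)
```

These text fields would be sent together with the file itself in the same form-data request.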
We have the namespace with this ID. It belongs to the user, it has the vectors saved, and it also has the metadata. We're not able to see the metadata in this view, but we can search by it, so that's what we'll do now. I created a new page, "chat user", cloned from the original chat page. When you create a new page, there's an option to clone from an existing page, so that's what I did. Then I added a new section to select a particular document. This is a dropdown with dynamic choices, and a search for Documents decides which documents appear in the list. We want to show each user only the documents they have created, so I add a constraint: created by the Current User. You can sort the list as needed. When we look at the options, it only shows what this user has created. Everything else stays the same, including messages, and we will configure the API call in a second. This is the preview, and we see the documents created by this particular user.
Let's see what the workflow looks like. When a user clicks the Send button, the workflow starts by creating a new message, as we have seen, and saves the message with "is user" set to yes. The "type" field we will use in the future; it says whether it's a user message or an API message, which helps with chat history. The second step resets the relevant inputs to clear the input box. The third step is the API call. This is the most important step, where we call our Flowise API. As you see, there are a few different options. What I did was add a new API call, because our original call did not have the option for document metadata filters.
I created the new call, which adds document metadata filters, based on the first one. The only difference is that in overrideConfig I'm adding these two keys, whose values are configured in the body parameters. I used the values from testing for the API setup. Once we initialize it, it should search in that namespace for that document and tell us what the document is about, and that's what it did. Now that this is available, we have three values to take from the user. We set the namespace to Current User's unique id. For the document name, we select the dropdown from earlier, where the user chooses an option, and take its value. The question is what the user asked. We send that over. Once we get a response, we save it to messages again, this time marked as not the user but the API, or AI. Then we scroll down to the bottom of the window.
To test it out, I'll pick "doc a". We keep using the same document, but it works as a placeholder. Okay: "What is this document about?" Nice. It looked into "doc a" and gave us an answer. We could also configure the workflow so it does not reset this input.
With that, we have looked at all the scenarios. First, just a widget, which is the easiest. Second, all users upsert to one namespace. Third, each user upserts to their own namespace and can also filter within it. As shown, the filter lets them filter by document and search within it. If you do not select an option, it searches the user's whole namespace, so anything within it can be searched, which is also helpful.
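The query body for this third scenario can be sketched as follows. The video does not name the two override keys, so `pineconeNamespace` and `pineconeMetadataFilter` are assumptions here, as is the `document_name` metadata key and the shape of the filter value; check your flow's input config for the real names.

```python
from typing import Optional


def build_user_query(
    user_id: str,
    question: str,
    document_name: Optional[str] = None,
) -> dict:
    """JSON body for a per-user query, optionally narrowed to one document."""
    override = {"pineconeNamespace": user_id}  # always search only this user's partition
    if document_name:
        # Assumed key and filter shape: match vectors tagged with this document name.
        override["pineconeMetadataFilter"] = {"document_name": document_name}
    return {"question": question, "overrideConfig": override}


# With a document selected in the dropdown:
print(build_user_query("user-123", "What is this document about?", "doc a"))
# With no selection, the whole namespace is searched:
print(build_user_query("user-123", "What is this document about?"))
```

Leaving the filter out when nothing is selected reproduces the behavior described above, where the search covers every document in the user's namespace.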
We also configured various workflows and API calls. Of course, there are other API calls that could be added: sending chat history, which we will look at in the future, as well as deleting namespaces or deleting documents. There are many other possibilities. I just wanted to highlight the new functionality: embedding a chatbot, as well as uploading documents and keeping them separate for each user.
One last thing I would like to talk about is saving these flows. You can export a chat flow, and it downloads to your computer. Or you can go back to where all of your chat flows are and export the database. That exports all of them, so you can import them into your Flowise later. I will make this Bubble app available so you can look through all of the settings and mimic them.
That concludes our tutorial for today. If you know of any business or organization that would like to build these chat applications for their knowledge base or documentation, feel free to let them know about Menlo Park Lab. We are helping businesses and organizations build and deploy these apps. In addition, quite a few people have reached out asking for a course that starts with the basics of Bubble and goes on to no-code LangChain app development and deployment. If there is interest in such a course, please do let me know. Thank you so much, and see you next time.
Info
Channel: Menlo Park Lab
Views: 18,248
Id: kOwmPe8aLAA
Length: 36min 34sec (2194 seconds)
Published: Tue May 16 2023