LlamaIndex and Streamlit: The Ultimate Combo for LLM Apps

Captions
In this lesson we are going to cover Streamlit, a very popular framework used to create UIs for LLM applications, or really for any machine learning model. If you Google "streamlit", the first link takes you to the official website, where you can see how easy it is to build an application with just a few lines of code and get a UI for it. The site has several sections; our use case is most related to generative AI, so you can click on Generative AI and try some example code, and it will give you a chatbot with just a few lines of code. You can also see that they have collaborations with Anthropic and LangChain, and you can build a lot of applications with the LangChain framework on top of Streamlit. If you want to see sample applications, click on Gallery; most of those are built with LangChain plus Streamlit, but in our case we are going to use LlamaIndex.

So in this video we are going to create a web application using LlamaIndex, and here is how it is going to look. In the diagram we have a header at the top ("Ask LlamaIndex"), then a prompt, "What would you like to ask?". Any user can type their question into that field, hit the Submit button, and get the response displayed in the section below. That is the general layout of the application, and I have also highlighted the Streamlit components we are going to use to build it: st.title for the header or any kind of title in the web interface; st.text_input for the question box at the top; st.button to display the Submit button; and st.success to show the response once we get one.

Now let me show you how to write the code using all these components. Here I am in Visual Studio, and I will walk through what we are doing. The first step you are already aware of: we retrieve the key. After that, as in the diagram, we call st.title, which gives us the "Ask LlamaIndex" header on the interface. Next we use st.text_input with "What would you like to ask?"; this is the default prompt, users type their question into this box, and we capture it in our query variable. All these st. calls come from Streamlit: we import streamlit as st, so instead of writing streamlit.text_input every time we just use the st abbreviation wherever we need the library. After that we use st.button with "Submit" as the label. Then we check the input: if no query has been provided, we just display "Please provide a search query"; if a query has been provided, we fall into the else section of the code, and there I have written only what we have used earlier, nothing new: we invoke our custom LLM model, and our data is stored in the data folder, where you can see the sample.txt file, the same file we have been using all along. Then we create a ServiceContext so we can pass our LLMPredictor, after that the VectorStoreIndex, and we get a response from it. The only new thing is that to display the response we use st.success.

Now let me execute this code. To run it we have to use Streamlit: first install the library, with pip install streamlit, and then, once your code is written, run it with streamlit run followed by your application's .py file. Our code is in the streamlit_ui folder, so I go into that folder and execute streamlit run ui_streamlit.py. As soon as you run that, it automatically redirects you to your local browser, and you get the display we saw in the diagram: a header, then "What would you like to ask?". Let me ask the usual question, "What did the author do while growing up?", and hit Submit; I am hoping it gives me a response, and you can see it has. We can ask another question too, and we get a response for that query as well. Note the difference: this is not a chatbot. It is a web application with a section where users can come, ask their question, hit the Submit button, and get a response. In our next lesson we will create a chat application using Streamlit, but before going there I just wanted to show you how
easy it is to use Streamlit, even with LlamaIndex. We didn't have to do much here: we used a few components from the Streamlit library, wrapped our existing code in them, and the base code stayed the same. That is how you can use Streamlit with LlamaIndex to create a web application.

In this lesson we will create a chat application using Streamlit. Our chat application is going to look like this: there is a prompt where users can come and type their question; the prompt is then displayed next to a user icon; and the response from our LLM application is marked with another icon representing the assistant. So let's see which components we are going to use to build this application. There is st.title, which we already used in the web application to display the header. After that you will see some new components. First, st.chat_message: whether the message comes from the user or from the assistant, we need a chat message to hold that information; you can think of it as a container that holds each message we display. Then, to draw the box asking users for their question, we use st.chat_input. Beyond these, we are also going to use st.session_state, and we will cover later what benefit it provides.
Keeping this diagram in mind, let's dive into the code. The first part is the simple one: st.title, where we put "Ask LlamaIndex", and after that we retrieve our key. Then I have defined a function, query_response: when a user submits a question we take it as a query, and that query is what we pass to our query engine. It is the same code we have seen so many times, just wrapped in a function. One thing I want to highlight: I want a streaming response, just like in ChatGPT, where the answer gets printed word by word, statement by statement. To enable that I have modified the code to pass streaming=True, as we covered earlier, and streaming is set to true in our predictor as well. So that is the known part.

Now let's deep dive into what is needed to build this kind of chat application. The first new thing is that we initialize st.session_state. What exactly does it do? One trick I follow every time I see a new library or component: I copy it into Bing Chat and ask it to explain. You can do the same with any piece of code or any function you don't understand: take it out, paste it in, and ask Bing Chat what it really does. When I asked what st.session_state does, the answer was that it is a way of sharing variables between reruns. Where does that help? Suppose I am having a chat conversation with this bot and then refresh the page after logging in; I would want the chat conversation to still be stored, so that when I come back I can continue where I left off. st.session_state gives us that kind of functionality: it stores our sessions, so if you rerun the page again and again it keeps our messages and whatever conversation we had. And what happens if you don't use it? I asked that too, and the answer was: if you don't use session state, then every time you interact with your app Streamlit reruns your script from top to bottom, and each rerun starts from a blank state, with no variables shared between runs. Similarly, you can read more in the documentation, or keep asking Bing, to build more understanding around it.
Going back to the code: first, if there are no messages in st.session_state, we initialize a new, empty list; if there are messages, we need to display them. On every rerun we replay the messages from history, so if some conversation is already stored in session state, we display each stored message according to its role.

After that comes the main part, where we control the messages displayed in our app. First is st.chat_input, which draws the prompt box at the bottom of the page; we pass "What's up?" as the placeholder text, and you can control that wording to be whatever you want. Once we have a message, there are two blocks: one for the user and one for the assistant, and each display is controlled with st.chat_message. For the user we use st.chat_message("user"): we display the prompt we just received, and we also append it to session state under the user role, so that if the user logs in again and wants to see the whole conversation, they can view it just like a normal chat application.

The other block is for the assistant, which is where we expect the response to come out, using st.chat_message the same way. The response is going to come from the query engine, so we call our query_response function and store the result in a response variable. But we are expecting a streaming response, arriving word by word, so how do we display it while it is still coming in? First we define a message placeholder and initialize it empty; later we will write the messages into it. We also define a full_response variable, where we accumulate everything captured so far. Notice response.response_gen: it gives you the chunks of the message, just what we expect from a streamed response. Once you have the chunks you loop over them, `for chunk in response.response_gen`, keep appending whatever each chunk carries into full_response, and on every iteration render full_response into the placeholder with message_placeholder.markdown, adding a "▌" pipe at the end, which is what produces the typing effect; this method is important, because it is what lets us show a streaming response as it arrives. Once the full response is in, we write it into message_placeholder.markdown one last time, without the cursor. And at the bottom, just as we stored the user's chats and conversation, we have to store the assistant's response in session state too, so that it is still displayed even if you rerun.

Now it's time to run this application, and we follow the same steps as before: switch to the folder and run the app, in this case with `streamlit run ui_stream_chat.py`.
As expected, as soon as you run that command it routes to my local browser, and you can see this kind of display: the title at the top and the prompt, "What's up?", at the bottom. Now I am going to ask a question, and I am hoping that the response arrives in streaming fashion, with that pipe cursor at the end. You can see how we got this response: it comes in word by word, and you can also see the pipe. Let me ask another question, and one more. Notice that when I ask a question I get this icon, which is the default icon for the user, and this other icon is for the assistant, exactly what we covered in our diagram: we expected there to be a user and an assistant, and you can see your whole conversation here. That is how you can build a chat application using Streamlit with a streaming response.

In this video we are going to build another chat application using Streamlit, but without a streaming response. Let me show you first how it looks and how it works, and then we will go over the code. To run it I again use streamlit run with my Python file name. Up to this point everything is the same: we get the header and the chat prompt. But watch the slight difference when I ask a question and see how the bot responds to the query: earlier we were getting the streaming response, but now it shows a "Thinking..." state, and only once processing completes do we get the whole response all together. Let me ask another question; again it goes into thinking mode, and once processing is complete we get the response. How are we going to build this kind of application? It is mostly the same, using most of the same components, but there
are some differences in how we get the response out and how we display it. First things first: everything is the same as before, except that instead of using the LLMPredictor I am using llama_index's own OpenAI class. It defaults to the gpt-3.5-turbo model with temperature 0.1 and so on, so I am not spelling those out here; since the default already gives us gpt-3.5-turbo, this is one more option: you don't have to use ChatOpenAI, which comes from LangChain; you can rely on LlamaIndex alone. This OpenAI class comes from llama_index.llms, and you can import it directly from there. Everything else is the same, with one new thing: st.cache_resource. Let me take that to Bing and ask what it does. It is another Streamlit component: a decorator that can be used to cache functions that return global resources, such as database connections or machine learning models, which is why we are using it here. Its show_spinner parameter enables or disables the spinner displayed while there is a cache miss and the cached resource is being created; we do not want to show that spinner here, so it is set to False.

Let's go through the code step by step. Getting and capturing a response is the same, except that in the last video we used a query engine and here I am using a chat engine. It is up to you; I just wanted to show that you can use any of these methods: instead of the LLMPredictor you can use the llm parameter, and instead of a query engine you can use a chat engine. I am using the condense_question chat mode, and then simply capturing the response; you can see I am not passing streaming=True, because that is not what I need in this case. After that we again have session state: if there are messages, display them; if there are no messages, create a new list, exactly as we did in the last video. The user side is also the same as last time: we display the question prompt along with the user role, and we capture everything in session state for that user, the whole conversation and every prompt they feed in.

The only change you will notice is in the assistant section, because the requirement is different: here we are not concerned with a streaming response. So first of all we add a spinner: if the call is taking time, the "Thinking..." spinner comes up, which tells your users that it is still processing and they can just wait for the response to come out. After that everything is the same; since it is not a streaming response we do not have to loop over it, we just take the response and write it directly, displaying it all together under the assistant role. We store that message and append it to session state as well. At the end, if you run this, you again get a window where you can go and ask a question; you can see it thinking, which means the computation is happening, and after that it shows you the response. So that is how you can have another variant of a Streamlit application.

Looking at this series as a whole: we have covered how to create a simple web application using Streamlit, and how to create a chat application, one with streaming and one without. So we have covered everything related to Streamlit, and this is how you can create these applications.
Thank you for watching.
Info
Channel: TechyTacos
Views: 985
Id: IgVGNbAnVAM
Length: 19min 30sec (1170 seconds)
Published: Sun Feb 18 2024