Building a RAG Based LLM App And Deploying It In 20 Minutes

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
if you've ever worked on any AI application then chances are you've dealt with difficult documentation tons of trial and error lots of frustration getting your app to work and then once you actually get it working asking yourself how do I actually put this into production and use this in a way that's going to be scalable and can integrate with my application now in today's video I'm going to show you a platform that solves all of those issues lets you build applications super super quickly but better yet has them really automatically deployed for you and lets you integrate them with your apps seamlessly and that platform is called Vex they are the sponsor of this video but don't worry they are completely free and we're going to dive into building a full rag application with them starting right now so I've left a link in the description to the website that I'm on right now and you can simply go to it and sign up for a free account you don't need credit card details or anything like that now I'm going to show you a quick demo of what will be building in this video and you can see the power of using a platform like this to create your own custom AI app that it's really automatically deployed for you now the app that we're going to be building here is a rag based app that's meant to act as a travel assistant so for me I'm actually traveling all over the place for the next few months I have an itinerary and I have all of these different travel plans but they're kind of all over the place so what I thought is I'd build a simple application where I can give it access to all of my different travel information things like visas where I'm staying different flights Etc and then it would allow me to ask questions relevant to exactly what I'm doing so for for example I'm in Bali right now and I could ask it hey when am I leaving Bali or do I need to be aware of anything in Bali what should I go where should I go for dinner where should I go here Etc and it will know based on all my travel plans my accommodation Etc to give me a really relevant answer so let me show you a little bit about how this works and then we'll build it out so we start with a simple pipeline where we have a query just the user typing something in we then go to an llm for the llm we have a bunch of different options here I'm just using meta llama 3 and you can see that it says turn the question user ask below into a string of keywords that can be searched on Google then what it does is give me that string of keywords I then can go to a tool where actually search using dunk deck go search for information about what the user typed and then we can go to a vector store database where we can actually look up information from our various travel documents so itineraries accommodation flight tickets Etc and then we pass that all to a final llm that llm will have access to all of this information and it will act like a travel assistant and give us really good recommendations or answers based on the data it has access to so now that we've looked at that I'm just going to ask this a quick question so you can see how it works so I said hey can you tell me about my trip to Thailand and give me some recommendations on activities to do next month I'm going to Thailand I'm going to a few different places it's going to be able to see all the places I'm going when I'm flying there where I'm staying and then give me a personalized recommendation so you can see here it was able to find information about my various flights and where I'm staying and then give me personalized recommendations based on that now building out an app like this completely from scratch or using a framework like Lang chain would take a very long amount of time especially if you've never done this before meanwhile I was able to build this out in just a few minutes and I'm going to show you exactly how you can do the same and how to build something that's customized for your use case so with that said let's get into it all right so I'm on the dashboard of the Vex website here and to get to this page you just need to make a new account again it's completely free and the link is in the description now what we're going to do is start by creating a new project then I'll walk you through the platform a little bit show you how we upload our own data and then how we build up that pipeline you saw before which is really just going to take a few minutes so I'm just going to go here to create project and I'm actually going to use a template although we could build one on our own now what we'll do is we'll go to open source prompt with rag that's a template we can start with to use some open source models which are free now we can use closed Source models as well if we want to but in this case we're just going to use the open source ones now you'll see that it starts us off with a pretty simple rag based app rag is retrieval augmented generation and really all that means is that we're retrieving some information from a data set that we have access to and then injecting that into the model so that's where the vector database comes in that implements the rag component where it can actually go it can vectorize all of our data and it can really quickly look up similarities in data and pull that out and feed that to our model I have a bunch of videos on rag I'll link one on the screen right now if you want to learn more about it but for now let's build up this app so you see that we have a query we have a vector database and then we have an llm now what I want to do here is use an additional tool this tool is going to be the internet so we can actually get recommendations based on upto-date information and we can search for specific places that we're staying or going or whatever it may be so the basic pipeline that I want to walk us through is we're going to start with a query I then want to give this query to an llm and I want to tell the llm to create some different keywords that we could use to search on Google or on any search engine that we want this way we can take what could be a long prompt and we can convert it into something that would work well for a search engine we then can get those results back and later on we can pass that to a final llm as well as information relevant from our database of travel information or our itinerary so we have kind of a complete answer so it's going to go like this query llm llm is going to give us some keywords that's going to go and search a search engine we're then going to go to the vector database the vector database is where we're going to have all of our different itinerary flight details Etc we're going to get out whatever is relevant based on the users prompt we're then going to pass that as well as the results from the search engine to a final llm and that's going to give us a result so in order to do that let's just start with the query you can see that in order to send a query we can actually do this using an API endpoint we're going to look at that later on or we can just click on the playground tab here and we can mess with it in this visual editor next we have a vector database now the vector database will take some kind of input in this case it's the payload you can see the different variables down here and then it will return to us different information based on whatever the similar information was from the payload I'm not going to explain exactly how Vector databases work here but the idea is you put in some kind of search text and it will go in the database and find anything relevant to it now what we want to do is add some steps in between here where we're actually going to go to Google or another search engine in this case duck duck goo search and uh get out some results so I'm going to add an llm to start and what we need to do is give this a prompt and tell it hey convert whatever the payload is into some keywords so we can then use that in the next step so what I did here is just pasted in the prompt that I used before and it says turn the question the users ask below into a string of keywords that can be searched on Google to effectively find results about travel recommendations we then are able to use the payload notice that if you just click here so down on the variable it just automatically adds it in and we have access to any variables that have been used before us in the chain that we're writing so at this point we have access to the current date and time the chat history if we're keeping track of that and then the payload which is what the user actually asked or the question or prompt and then I said only provide the keywords nothing else and now we need to select our model now for something like this we can use a simple model the model I'll use will just be meta llama and then we'll go with 70 billion but we will use a different model later on and you can see there's a bunch of different options here for the open source models so we're going to do that and also notice you could bring in your own llm if you want to do that again we're not going to do that for this video and now we have our llm next step is to go and actually search on duck. go search for these keywords and get some result back so what I'm going to do here is go to my tools and when I click on tools I have access to a bunch of different ones so I have basic math open Weather Wikipedia Etc we also have duck. go search so I'm going to click on duck. go search and now we're going to give this some input and the input we're going to give this is the action one output now the action one output is the variable from the previous step you can see kind of behind here it's labeled action one so whatever the llm gave us is stored in action one output and that's the kind of string of text or the keywords that we're going to search here and then get some result back so you can see it's quite simple to follow along here and just all of the different results are stored in variables which we can just click on to use okay so now we have our query llm tools and now we have the vector database now for the vector database to work we need to give it some data so let's look at that let's give it a few different pieces of data in this case it'll be our travel plans or itinerary and then we can move on so to do that we're going to click on manage data set here and it's going to bring us to a separate tab where we can upload some different files for our project so let me find a few different files I'll show you what they look like we can upload them here then they'll automatically get added to our data set all right so you can see here I have a notion document open and there's a bunch of different pages in this document that actually represent all of my different travel plans and my full itinerary now I'm not going to leak all of this to you because I don't want to expose everywhere I'm going all my flights Etc you can see here we have one page P Bali Indonesia I'm here from April 22nd to May 20th it shows my flights my accommodation is down below and what I'm going to do is simply go here and just export this as a PDF so if I go to export I can choose a bunch of different formats I'm just going to choose PDF and then I'll save to my downloads and then I can upload that to my Vector store database now this supports a ton of different file types you can use other things like text documents but PDFs are probably going to be the best so if you have different flight tickets if you have itineraries if you have booking confirmation emails whatever you want you can simply download those or export them as some kind of PDF and then upload them to this platform like I'm going to do right now now to add a new source here we'll simply click on ADD Source we go to select data source and notice there's a bunch of different options we have plain text upload a file page crawler this will actually go and scrape the HTML of a page which is pretty cool we have co- fluence Google Drive Etc Confluence are so I'm going to go to upload file and then what I can do is select the file I want to upload so let me find that and upload it all right so I uploaded a few different documents here these are just different notion pages that I had and what's actually going to happen is when we upload them they're automatically going to be parsed out by Vex and it's going to identify anything like tables or more advanced structures within the PDFs so we get a way better Vector search result when we actually start to look for information so in case you're not aware what will typically happen in a rag based app is we're going to take something like a PDF and we're going to chunk that out into a bunch of different text X strings or a bunch of different pieces of information and typically if you use just a normal parser it's not going to be able to identify something like a table or structured data inside of that PDF that you might want to keep together so what's going to happen here is we're going to use a more advanced parser automatically you don't need to enable it or anything it will just happen for you so it's going to keep that data together and just generally give you a much better result again I don't need to get into this a ton it's just kind of a nice perk of this platform that a lot of other platforms don't have regardless what we're going to do now is go back to our project so let's click on AI projects here and let's go into the one that we're working on I have two of them named the same thing so I should probably change the name here so in fact let's do that let's call this travel planner okay perfect and now let's continue with our pipeline so we have our llm our tools and our Vector database and for our Vector database let's just look at what we're passing in here so we're passing in the payload and then notice it has these three different files that it's going to be searching through now we can also pass in maybe the action one output if we wanted to give the keywords from the first llm but in my case I'm just going to stick with the key uh with payload sorry we then have top K now what top K is going to do is determine how many results it should return to us from the vector search so in my case I'm just going to bring this up to three but you can adjust this uh based on what might work best for your application and if you think you're going to have a lot of different results then you could make this be something like 6 7 8 Etc you're just going to take take in this case the three closest answers because we have top k equal to 3 okay next thing we'll do here after our Vector database is we're going to go to our final llm and for this llm we need to write a prompt and give it all of the information it needs access to to be able to give us that final recommendation or that final answer so in the name of time I'm just going to paste in a prompt here that I've already written it's pretty basic and we can kind of walk through it and I'm also going to change the model here to be mistal 8 * 7 billion just because this one will perform better for this specific type of query okay so you can see here that it says you are a sophisticated AI travel assistant equipped with access to a vast Vector search database Your Role is to help users plan their Travels by providing personalized recommendations answering queries and assisting with itinerary planning based on their preference and past travel history then what we do is provide some additional information so it said you found the following information that could be helpful and that's action 3 output now that is coming from our Vector store database and we can see that because the vector database is labeled action three we then have here is some uh data from the internet and that's from action 2 output which is from our tool here which went to Duck Duck Go search and got us some results and we can give it additional information like the current date and time the chat history we could give it the original question if we want which is actually what we do down here and you can see that we can make this as advanc as we want and create kind of this prompt template now in our case uh the reason why we're using this I NST and this SLS and stuff is because this is how the prompt needs to be formatted for this mixt tral model but for different models you can use different types of prompt templating and to do that if you click here on the question mark you can see that will actually give you some documentation that explains how to write the prompt template based on the model you're using in our case again we're using mixl 8 * 7 billion okay so I'm going to save this and now what we can do is actually test this model out and see if it's working that's it we've built this entire AI application obviously it took a little bit longer because I was explaining it step by step but you can see in just a few minutes we have a full AI app so what I can do now is test this and I want to see if it's using my Vector search database so I'm going to say something like when am I going to Thailand question mark and it's going to go through this entire Pipeline and it's going to give me an answer and I'm going to show you how we can look at the logs and dive into the details of what's actually happening so here you go it says you're going to Thailand from May 27th to June 6th which is correct at least that's part of the time that I'm in Thailand and what I can do is go to C log and when I click on C log here it's going to bring me to this logs page and we can view all of the logs for this app so we can see that we have the timestamp identifier query output and then if I go into log detail here it will actually give me the thought process and show me what happened at every single step so you can see the first llm says please respond to the following question blah blah blah and that says Thailand travel dates planning okay that makes sense from what I asked here and then we go to our tool duck. go search where we do Thailand travel dates planning okay Thailand offers a diverse array of experiences blah blah blah gives us some information and then we go to the vector store database where it says when am I going to Thailand and it pulls up all of the different information of when I'm actually going to be in Thailand and where I'm going then we give that to the final llm and it gives us the result you're going to Thailand from May 27th to June 6th so there you go we built a full AI application in just a few minutes here obviously we can continue to test this fine-tune it and make it better but what I really want to show you now is how you can take this and actually use this now in a production environment one of the most challenging things is once you built the AI app okay how do you actually utilize it now and how do you allow maybe your website or other services you have to ingest this and actually make requests now this is automatically built into the platform and what you'll see is that if we go to the query here or we go to the output it actually shows us a way where we can send a request to this and get a result back so it's automatically hosted in the cloud as soon as we build a project like this we can enable or disable it using this kind of ticker that I have over here but if we want what we can do now is write some basic python code where we can send request to this app all we need to do is pass an API key which I'm going to show you how to get access to and then it will give us the result back in a way where we can just ingest it in our app so let's actually look at how we do that and the first step here is going to be to create an API key that we can give our code project it could be python JavaScript command line however we want to call this so I'm going to go create API key and I'm just going to go tutorial here and for the project I'm just going to give it access to the travel planner it's going to make that key for me so I can copy that now and now let's open up vs code and let's write a simple script that can actually send a request here and then get the result back in our python code so I just opened a new python file here and the first thing I'm going to do is just load in my API key so I'm going to say API key is equal to and I'm going to paste that right here now in order for this to work we are going to need the request module installed in order to install that you can simply go to your terminal and type pip install requests or pip three install requests and then you should be good to go I imagine most of you already have that installed so I'm going to write this pretty quickly because I'm just trying to demonstrate how this works but we're going to say that we have import requests I'm then going to import my Json let's zoom in a little bit we also want to spell requests correctly now now what we're going to do is have our URL which we'll copy in from the website in 1 second we're then going to create some headers now the headers are what we need to actually put our API key and to say that this is going to be Json Etc so we're going to say content type okay and this can be application SL Json we're then going to have our API key which I believe will be like this and then this is going to be api-key and then we're going to put our key in here using an FST string so I'm just going to say FST string and then API uncore key we're then going to have some data uh and this data is going to be a payload and this is going to be whatever the user types in so we can just say hey where am I going in Thailand and can you give me some recomendations okay so this is just like the prompt that you're going to be giving to the model so now that we have that what we can do is send a request so we're going to say that response is equal to request dopost we're going to send this to the URL again we're going to fill this in in one second we're going to pass our headers equal two headers and we're just going to say that Json is equal to our data and we'll print out what the result of this is in a second but we need to have the URL first so let's go back here and go to our AI projects okay let's go to our travel planner and let's get the information that we need so I'm going to click on query and you see it shows me the URL right here so let's copy that in okay we're going to go here and paste this and notice here that we have this channel token now the channel token is something that we can use to identify ourselves and we're keeping track of something like message history in this case we have an enabled message history so we can just make this whatever we want we can just do something like hello but really we would usually make this unique identifier or something that identifies our specific client user whatever anyways here we go we have the URL as is there anything else that we need let's go back here and have a look we have api-key API key it's kind of showing us a templated request here I believe just checking again that this looks good and what we should do now is just print out the response so what I can do is say print response. text and let's see if this actually works so let's run our code and give this a test all right so we got a result here you can see this a little bit difficult to read but it says hello based on the information I have you're going to puket at this time your flights are booked with this blah blah blah uh and you can see that it's kind of giving me this this whole result and telling me where I can actually go now I'm probably going to blur some of this out because it is a bit of a privacy concern uh but you can see that it is giving me a personalized result based on the information in the vector store database and we can make this better we could improve this code but the point is that we can actually send a request to an API endpoint and now all of a sudden this app we built in just a few minutes is pretty much live and we can use utilize it wherever we want so that's really all I have for you guys here I wanted to show you this really awesome platform that makes it super fast to build out these AI based apps massive thank you to them again for sponsoring this video definitely check them out from the link in the description and if you do end up using this quite a bit obviously you will eventually have a limit in terms of the number of queries that you can send and you can always upgrade your plans you can get access to more queries more tools Etc but even just using the free version is super powerful that's all I was using in this this video with that said I'm going to wrap it up I hope you enjoyed if you did leave a like subscribe and I will see you in the next one [Music]
Info
Channel: Tech With Tim
Views: 13,463
Rating: undefined out of 5
Keywords: tech with tim, how to build ai application, large language models, no code tools, embeddings, long chain python, how to code, python programming, llms, tech, coding, programming, ai, llm, ai agent, rag, rag llm, llm rag, ai programming, llm deployment
Id: C0HwZipOqXI
Channel Id: undefined
Length: 21min 13sec (1273 seconds)
Published: Tue Jun 04 2024
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.