Building a Full-Stack Complex PDF AI chatbot w/ R.A.G (Llama Index)

Captions
All right, hey guys, and welcome back to another video. I know it's been a little while since I last posted; I've been busy with a lot of different projects, as well as a lot of other things going on in my life, but I'm not going to bore you with that. In this video I'm going to go over a recent project I completed for a client: a full-stack AI retrieval-augmented generation (RAG) project. This project really pushed my technical capabilities to the limit. I had to figure out a lot of things, and I'll break down everything in this video: what I did, what I took away from it, and the lessons I learned. So without further ado, let's jump into the computer. Before we do, I just want to say: if you find this video useful in any way, or if it helps you with your projects, please leave a like; it helps with the YouTube algorithm. And if you have suggestions for future videos, please leave a comment down below.

So here we are on the computer. Before we get into the details of how I went about building it and the problems and challenges I faced, I want to explain the scope of the project and then show you the project itself. Essentially, this client wanted to create a chatbot with retrieval-augmented generation over a large number of PDFs. These PDFs were in the medical field and had to do with a medical system. They ran to thousands of pages and contained a lot of unstructured data: the PDFs were very large and full of tables, meaning you couldn't easily chunk the text in the PDF into vectors. That was really the challenge with this project. So: thousands of pages across over 40 PDFs, and the aim was to convert this into a chatbot product the client could use as a proof of concept. Essentially, I'm going to have to
blur out a lot of this, but here is the front end that I designed. I'll probably blur out pretty much all of it, but it's a basic front end using React with both login and registration functionality. What I'll do is log into the account and show you what the product does without revealing too many details, since it's just a proof of concept. Once you log in, it takes you to a page with a simple chat UI window. The basic idea of the project is that you can chat with the bot: you can ask any question from those PDFs, and it will retrieve an answer, run it through GPT-3.5, and come back with a response, giving you a ChatGPT-style interface for talking to those PDFs. I also built out a very simple admin dashboard, since the client wanted to track the average questions asked per user and the average time spent per user, plus the ability to view the details of every chat session that happened on a specific date. So that's a very quick overview of the entire project. Again, I can't share too many details, but hopefully that gave you an idea of what I did; we'll break down the technical aspects here.

So essentially that was the goal. For the AI functionality, the problem I was having, which a lot of people have, is that most PDF parsing into a vector database does not support unstructured data such as tables, pictures, and diagrams, especially pictures and diagrams. The reason is that when you chunk a PDF, the parser looks for easy ways to split the text, such as the white space on a page. If your PDF is structured in a clean format, it can easily do that and convert it into a vector database. But when you have a table, the data is all over the place, and a lot of these parsers really don't have the ability to understand that a value in one cell corresponds to a header somewhere else on the table.
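The whitespace-based chunking problem described above can be shown with a toy sketch. This is not any specific library's chunker, just a minimal illustration of why splitting on white space scatters a table's rows away from its headers:

```python
# Toy illustration of why whitespace-based chunking breaks tables.
# Prose splits cleanly on blank lines, but a table whose header and
# rows are separated by blank lines gets scattered across chunks, so
# an embedded chunk like "Ibuprofen | 1200 mg" no longer carries its
# column headers. (The document text here is made up for illustration.)

def chunk_on_blank_lines(text: str) -> list[str]:
    """Split a document into chunks wherever a blank line appears."""
    return [c.strip() for c in text.split("\n\n") if c.strip()]

doc = (
    "Dosage guidance for adults is summarised below.\n\n"
    "Drug | Max daily dose\n\n"
    "Ibuprofen | 1200 mg\n\n"
    "Paracetamol | 4000 mg\n"
)

chunks = chunk_on_blank_lines(doc)

# The row "Ibuprofen | 1200 mg" lands in its own chunk with no link
# back to the header chunk -- a retriever embedding chunks
# independently cannot tell what "1200 mg" is the maximum of.
row_chunk = next(c for c in chunks if c.startswith("Ibuprofen"))
print(len(chunks))                    # 4
print("Max daily dose" in row_chunk)  # False: header context is lost
```

A smarter parser has to recover the cell-to-header association before chunking, which is exactly what the table-aware tooling discussed next is for.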
That's the hard part with unstructured data. A lot of the project was spent finding a good solution: a framework that allowed converting those PDFs into a proper vector store. If you look into doing retrieval-augmented generation with semi-structured or unstructured data, you'll quickly find people using a library called Unstructured, an open-source library for reading this kind of data. The usual process is to split the documents into tables and text chunks, embed each table's summary alongside the text chunks, keep the full tables in a separate store (the docstore), build a vector database of the actual document text, and then combine them in a multi-vector retriever to answer queries.

I tried that, and it pretty much worked. The problem was that LangChain's multi-vector retriever uses an in-memory store, meaning the parsed table data could not be saved to a persistent database. Every time you start the system, you would need to re-parse the data and load it into memory, which obviously wouldn't work for 40 PDFs and really doesn't fit this use case. So the hard part was finding a way to save all that table data into some sort of persistent database while still having a multi-vector retriever to grab it. That's a lot of technical terms, but essentially the solution I found was a YouTube video about RAG for complex PDFs. They use a library known as LlamaIndex and its UnstructuredElementNodeParser, which lets you take your PDF, convert it into an HTML file, and have the UnstructuredElementNodeParser read through that HTML file and correctly map the tables.
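The multi-vector pattern described above (embed a short table *summary*, but return the *full* table at query time) can be sketched in plain Python. This is a toy, not LangChain's or LlamaIndex's actual API: word overlap stands in for real embeddings, and the class and method names are invented for illustration:

```python
# Toy sketch of the multi-vector retriever pattern: a "vector store"
# holds table SUMMARIES (the thing that gets embedded and searched),
# a separate docstore holds the FULL tables, and retrieval maps the
# best-matching summary back to its full table. All names here are
# illustrative, not a real library's API.

def similarity(a: str, b: str) -> float:
    """Crude stand-in for embedding similarity: word overlap (Jaccard)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

class ToyMultiVectorRetriever:
    def __init__(self):
        self.summary_store = {}  # doc_id -> summary text (searched)
        self.docstore = {}       # doc_id -> full table text (returned)

    def add_table(self, doc_id: str, summary: str, full_table: str):
        self.summary_store[doc_id] = summary
        self.docstore[doc_id] = full_table

    def retrieve(self, query: str) -> str:
        best_id = max(self.summary_store,
                      key=lambda i: similarity(query, self.summary_store[i]))
        return self.docstore[best_id]

retriever = ToyMultiVectorRetriever()
retriever.add_table(
    "t1",
    "table of maximum adult dosages for common analgesics",
    "Drug | Max daily dose\nIbuprofen | 1200 mg\nParacetamol | 4000 mg",
)
retriever.add_table(
    "t2",
    "table of paediatric vaccination schedule by age",
    "Age | Vaccine\n2 months | DTaP\n12 months | MMR",
)

print(retriever.retrieve("maximum adult dose of ibuprofen"))
```

The persistence problem in the video maps onto this sketch directly: here `docstore` is a plain dict living in memory, so it vanishes on restart; the fix is backing that mapping with something durable, which is what the LlamaIndex route enabled.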
Then you can put that into a persistent vector database like Pinecone or whatever you prefer. For anyone who actually wants to implement this themselves, I'd highly recommend that video; it does a really great job, and I'll probably link it in the description. It's called "RAG for Complex PDFs" by AI Makerspace; highly recommend them. They have a Jupyter notebook that goes over everything in the process. That was arguably the hardest part: converting all 40 PDFs following that method, then making sure everything gets uploaded to Pinecone as the vector database and works properly.

Once the AI functionality was taken care of, it was time to do the full-stack portion: the front end and the back end. Here's the tech stack I used. For the front end: React.js and Tailwind CSS, for a very simple UI. For user data and chat context: SQLite, running locally on a Flask server. For the vector database storing everything I generated: Pinecone. For the back end: Flask, handling the connections to the SQLite database, the AI functionality, and the connections to the vector database. The general use cases the client wanted were the ability for users to log in and register; context-aware questions, similar to ChatGPT, so you can ask a follow-up and it knows what you were talking about before; chat sessions; and some tracking of questions asked and time spent on the application.

Here's what actually goes on from a system architecture perspective. The front end is hosted on Netlify. You get a question message and create a chat session, then check whether there's an existing conversation already present. If so, we can retrieve that conversation from the SQLite database to use as
context when we query OpenAI. If not, we prompt the vector database with the question. The vector database is a Pinecone index holding all the chunks generated with LlamaIndex; it retrieves the relevant chunks and returns them. Then we query GPT-3.5 with the context plus the prompt. The reason for using GPT-3.5 instead of GPT-4 was mainly cost savings for the client, though I'd imagine GPT-4 is a lot better. Finally, we send back a response, which arrives as a chat response message on the front end, and the user can continue the cycle over and over with their existing context, or end the session. That data gets logged to the admin portal.

Now, very quickly, I'm going to go over the main lessons I learned from this project and some of my takeaways.

First: production is ten times harder than a development environment. As a junior engineer, this is something you realize early on. Getting something from a testing environment to actual usable deployment is a lot harder, and a lot of small things come up, like SSL certificates, or not being able to send messages back and forth from a different IP address. All this stuff comes at you out of nowhere; it isn't present when you just get the thing running locally, where everything is fine, and then you take it to production and everything breaks. The lesson was that I should be thinking about how an application will actually get deployed as a product while I'm building it. I think that's very important to learn. Some of this might be obvious to those of you who are experienced, but these are things I'm learning along the journey.
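The per-message flow outlined in the architecture walkthrough above (reuse the stored conversation as context if a session exists, otherwise hit the vector database, then call the model) can be sketched as a pure function that builds an OpenAI-style messages list. The session shape and helper names here are assumptions for illustration, not the client project's actual code:

```python
# Sketch of the per-message flow: if the session already has history,
# reuse it so follow-ups resolve ("it", "that"); otherwise ground the
# first question with chunks fetched from the vector store. Returns an
# OpenAI-style "messages" list. All names here are illustrative.

def build_messages(session: dict, question: str, retrieve_chunks) -> list[dict]:
    messages = [{"role": "system",
                 "content": "Answer using only the supplied document context."}]
    if session.get("history"):
        # Follow-up question: prior turns carry the conversational context.
        messages.extend(session["history"])
    else:
        # First question of the session: ground it with retrieved chunks.
        chunks = retrieve_chunks(question)
        messages.append({"role": "system",
                         "content": "Context:\n" + "\n---\n".join(chunks)})
    messages.append({"role": "user", "content": question})
    return messages

def fake_retriever(query: str) -> list[str]:
    """Stand-in for the Pinecone similarity query."""
    return ["Ibuprofen max daily dose: 1200 mg"]

fresh = build_messages({"history": []},
                       "Max ibuprofen dose?", fake_retriever)
followup = build_messages(
    {"history": [{"role": "user", "content": "Max ibuprofen dose?"},
                 {"role": "assistant", "content": "1200 mg per day."}]},
    "And for paracetamol?", fake_retriever)

print(len(fresh), len(followup))  # 3 4
```

The resulting list is what would be handed to the chat-completions call; the Flask endpoint, SQLite session lookup, and actual Pinecone query wrap around this core.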
The second lesson is probably also obvious: think about the end product and work backwards; that will prevent repeated steps. I'll give you a very simple example. Those 40 PDFs in the system need to get updated every time the underlying standards change. That means the whole process of converting them into vectors and storing them in the database needs to be repeated, and there needs to be functionality for that. The client told me this at the end, and there was really no way for me to go back and redo it in a way that would be easily updatable, just because of how vector databases work, and since the data is unstructured it's a whole process. Knowing this earlier, I could have added metadata to the vector database recording which document each vector came from, so we could update just those vectors. Instead, the current situation is that you need to re-process all 40 PDFs even if only one of them changes. That's unfortunate, but it shows you should always think about the end product and work backwards to avoid things like that.

Number three: everything is so new in the world of AI. The video where I got the essential understanding of how to do this with LlamaIndex was posted three months ago, and I had conversations with a lot of people across different Slack channels about how I could go about this, since the LangChain version was not persistent. I found that people had mixed solutions and different things they had put together. Everything in this space is quite new, and a lot of people are just trying to figure it out, the same as you. It's not like regular full-stack development, where someone has done it a thousand times before; you need to go out and find the solutions yourself, and that takes a lot of time as well.
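The metadata fix mentioned in lesson two (tag every vector with its source PDF so one changed document can be replaced without re-ingesting all 40) can be sketched like this. The list-based store and function names are stand-ins for illustration; real vector databases, Pinecone included, expose similar metadata filtering on delete and upsert:

```python
# Sketch of metadata-tagged vectors enabling per-document updates:
# when one PDF changes, only vectors whose metadata names that PDF
# are deleted and re-inserted. The plain-list store and the document
# names are illustrative stand-ins for a real vector database.

store: list[dict] = []

def upsert(chunks: list[str], source: str) -> None:
    """Insert chunks, each tagged with the PDF it came from."""
    for text in chunks:
        store.append({"text": text, "metadata": {"source": source}})

def replace_document(chunks: list[str], source: str) -> None:
    """Delete only the vectors from `source`, then re-insert its chunks."""
    global store
    store = [v for v in store if v["metadata"]["source"] != source]
    upsert(chunks, source)

upsert(["chunk a1", "chunk a2"], "standard_A.pdf")
upsert(["chunk b1"], "standard_B.pdf")

# standard_A.pdf is revised: only its vectors are replaced,
# standard_B.pdf is untouched.
replace_document(["chunk a1 (rev 2)"], "standard_A.pdf")
print(len(store))  # 2
```

Without the `source` tag there is no way to tell which vectors belong to which PDF, which is exactly why the whole corpus had to be re-processed on any change.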
And finally: always talk to the client and get feedback promptly. Your vision for a product is not theirs; it's better to build what they want, not what you think they want. This is again such a simple lesson, but it's a key takeaway. For the back end this stuff doesn't matter much, but when you're building a front end for a client, it's something you really want to keep in mind, and you should go back and forth a lot, because what you think the client wants is probably not what they actually want. They may have a different vision for it, or prefer it another way; there are a hundred different possibilities. It's always good to get feedback incrementally and build up from there, rather than spending ten hours grinding on one thing, getting it perfect in your head, and finding it looks nothing like what the client wants. Key lesson there, and I think it applies not just to development but to every other area: always get feedback incrementally, and build in increments.

So hopefully you found this video useful. If you did, please go ahead and leave a like; again, it helps with the algorithm. I'll continue to document cool little projects that I do, as well as the whole journey of learning all this stuff, so if you want to subscribe to the channel, go ahead and do that, and I'll see you guys in the next one.
Info
Channel: Paragon - AI & Automation
Views: 9,812
Keywords: gpt, ai agency, ai automation solutions, ai for business, chatgpt for business, How to use AI in eCommerce brands, agency automation ideas, onboarding automation tutorial, increase client retention, ai automation agency, liam ottley, ai business automation, easy automations for beginners, agency automations, artificial intelligence
Id: TOeAe8KB68E
Length: 11min 40sec (700 seconds)
Published: Tue Jan 16 2024