BioMistral LLM: Medical QA with RAG Technology Using Qdrant VectorDB | LM Studio | Langchain

Captions
hey guys, in this video we are going to talk about how we can create a medical RAG question-answering application, something like this, where a user can input a query, get the answer, and also go through the chunks that were extracted during retrieval and that helped the large language model arrive at the answer. Let's quickly check the architecture of this application: we have used the large language model BioMistral to synthesize the retrieved chunks, Qdrant as the vector DB, and PubMedBERT base embeddings for the embeddings.

If you go to BioMistral-7B on Hugging Face and read through the abstract, you'll realize that this model has been extensively trained on healthcare and medicine; one line tells you that it uses Mistral as its foundation model and was further pre-trained on PubMed Central. And if we check the PubMedBERT embeddings on Hugging Face, we'll see that this model has been extensively trained on PubMed title-abstract pairs along with similar title pairs, so we are going to use this embedding in our example. Further down are the evaluation results, where it is compared with some popular embeddings like all-MiniLM-L6 and BGE-base-1.5.

Next, we'll talk about the Qdrant vector DB. If we go to their website, qdrant.tech, we'll see that it is the vector database "powering the next generation of AI applications with advanced and high-performance vector similarity search technology." We can use Docker to pull the image and run the container, and that's what we are going to do in this example: we'll have Docker installed and run the Qdrant container. One can also go through their vector database benchmarks, updated in January 2024; these are the benchmarking parameters they have used, and further down we can see that they have compared themselves with Elasticsearch, Redis, Weaviate, and Milvus on different parameters, and on most of them they are doing very well.

Before I go forward, I just wanted to tell you that we are going to do the coding part in a Jupyter notebook for better explanation and clarity, and once the RAG pipeline is completed I'll copy the code and we will create the Streamlit application, so those who are mostly interested in Streamlit application development can skip this part.

Okay, let's jump into the coding part. These are the pip install packages we need for this example, and these are the import statements: we are basically using the Qdrant client, ChatOpenAI, and RetrievalQA from LangChain, and the UnstructuredFileLoader to read the PDF file. We have taken this PDF file; I'll quickly show you what it is about. I went to the PubMed website and searched for neuroscience. Those who don't know about PubMed can ask ChatGPT, our new Google search, because nowadays we prefer asking things on ChatGPT instead of Google, but anyway: PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life science and biomedical topics. I thought it was the best platform to find use cases that would be helpful for the example we are working on. If you search for
neuroscience, you'll get all these results. I chose "Integration of optogenetics with complementary methodologies in systems neuroscience," and once you click on "Free full text" you come to this page; to get the PDF you just have to click on PDF, save it, and come back here.

Once we have the PDF, we load it using the UnstructuredFileLoader, and you can see the content of the page; it's huge, right? Coming to this part of the code, as I have already mentioned, I'm using LM Studio, and you can see that the BioMistral-7B 4-bit GGUF model is already running there. Here is the Qdrant configuration, where we create the Qdrant client, pass the embeddings to the Qdrant object, and also provide the collection name. Next we create the prompt template, making sure the model doesn't hallucinate, and pass the prompt into the prompt template. Then we create the text splitter with a chunk size of 1000 and a chunk overlap of 100, get the chunks, and pass these chunks into Qdrant. The moment this code executes, all the chunks will get converted into embeddings, and these embeddings will get saved into the Qdrant vector DB server.

Now, in the retrieval part, we pass the prompt to create a chain type and then pass it to the retriever, making sure that top k is five and the search type is MMR. Once we have the RetrievalQA object created, we can pass the query. For the query, what I did, because of my limited knowledge of neuroscience (or you could say no knowledge of neuroscience), was go to the PDF, copy the abstract, paste it into ChatGPT, and ask it to provide two questions from the text of no more than 20 words each; then you have the two questions ready. Okay, let's take the first question, paste it here, and let me
run the whole Jupyter notebook. I'll clear out the previous logs from LM Studio. As you can see, it is executing all the cells one by one and pushing the embeddings to the vector DB; this might take some time. The moment it executes this line, it creates the collection in the vector DB, so we can see the collection, the collection information, and the page contents: all of these are the chunks, and every chunk has this page content. If we go back to the Jupyter notebook, we can see that it has started executing.

So think of what we did: as a user, we ask a question; the question is converted into an embedding and sent to the vector DB, where a similarity search happens and returns the top k results, in our case the top five chunks. Then the large language model processes these chunks and gives us the answer; in our case the large language model is running on LM Studio, and we can see it is already streaming out the answer. And now we have the answer: the development of optogenetic techniques is a rapidly evolving field in neuroscience.

Now that we have the RAG pipeline ready, we'll take this code and create a Streamlit application, so let's quickly go through the Streamlit application code. We have four files: app.py, llm_service.py, a readme, and requirements.txt. We have a file uploader that accepts only PDFs; we take the uploaded PDF and call load_pdf, which is defined in llm_service. In load_pdf we create a temp file; the reason for creating a temp file is that the UnstructuredFileLoader is not directly compatible with a Streamlit upload, so we have to write a duplicate file and use its temp file path. We pass the temp file path to the file loader and get the documents out of it, then pass them on to create the knowledge base. The knowledge-base logic creates a RecursiveCharacterTextSplitter, splits the documents into chunks, creates the embeddings from these chunks, and saves them into the Qdrant vector DB.

Then we let the user ask a query, and on clicking the submit button the query goes to get_response, which is again defined in llm_service. In get_response we create the retriever logic and pass the prompt, the large language model, and the retriever, keeping verbose equal to true so that we can see the outputs. Once we get the response, we extract the answer out of it, and to get the source documents, that is, the chunks, we loop through the source docs. Back in app.py, we show the answer directly to the user using st.write, while for the source documents we loop through them inside an st.expander and ask the user to expand it to see the chunks; since we have top k equal to 5, we show five chunks.

Let's quickly test this application. We'll take the questions we got earlier by submitting the abstract from the PDF, so we'll do a question-answering task: "How has optogenetics evolved to mimic natural activity patterns?" We are expecting an answer along with the retrieved chunks, and here's the answer, and these are the chunks that were retrieved. We'll take the second question and submit it here: "What technologies have complemented optogenetics in recent years for physiological understanding?" We got the answer, and it is quite short and straightforward: natural and engineered opsin diversity; and these are the chunks that were retrieved.

That's it, guys, for this video. In the next video I will compare BioMistral with Mistral on similar kinds of use cases, be it pharma or a biomedical dataset, and we'll try to conclude whether BioMistral outperforms Mistral. I hope you liked this video, and if you did, please give it a thumbs up and subscribe to the channel. Thank you guys, have a nice day, bye!
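The split-embed-store-retrieve-answer flow described in the captions can be sketched in a few dozen lines. This is a minimal, self-contained illustration, not the video's actual code: it substitutes a toy bag-of-words embedding and an in-memory store for the real PubMedBERT embeddings, Qdrant collection, and LangChain RetrievalQA chain over BioMistral in LM Studio, and the names `split_text`, `ToyVectorStore`, and `search` are illustrative. The chunk size of 1000, overlap of 100, and top k of 5 match the values mentioned in the video (the demo below uses smaller values so the tiny sample text actually splits).

```python
# Toy sketch of the RAG flow: split -> embed -> store -> retrieve -> (answer).
# A real pipeline would swap in PubMedBERT embeddings, a Qdrant collection,
# and a RetrievalQA chain calling BioMistral served by LM Studio.
import math
from collections import Counter

def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Fixed-size character chunks with overlap, like the splitter in the video."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Stand-in for PubMedBERT: a simple bag-of-words sparse vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    """In-memory stand-in for the Qdrant vector DB collection."""
    def __init__(self):
        self.points = []  # list of (embedding, chunk) pairs

    def add(self, chunks):
        self.points.extend((embed(c), c) for c in chunks)

    def search(self, query: str, k: int = 5):
        # Plain similarity ranking; the video additionally uses MMR re-ranking.
        q = embed(query)
        ranked = sorted(self.points, key=lambda p: cosine(q, p[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

document = (
    "Optogenetics allows control of neurons with light. "
    "Engineered opsins expand the optogenetic toolbox. "
    "Electrophysiology complements optogenetics in systems neuroscience."
)
store = ToyVectorStore()
store.add(split_text(document, chunk_size=60, overlap=10))
top_chunks = store.search("What complements optogenetics?", k=2)
# The real pipeline would now feed top_chunks plus the prompt template to the LLM.
print(top_chunks[0])
```

The overlap in `split_text` serves the same purpose as in the video's splitter: it keeps a sentence that straddles a chunk boundary fully present in at least one chunk, so retrieval doesn't miss it.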
Info
Channel: Ayaansh Roy
Views: 608
Keywords: #BioMistral, #LLM, #LangChain, #Docker, #QdrantVectorDB, #Streamlit, #RAGarchitecture, #NaturalLanguageProcessing, #MedicalQA, #RetrievalAugmentedGeneration, #Bioinformatics, #HealthcareAI, #MachineLearning, #AIResearch, #OpenSource, #PubMed, #LanguageModels, #GenerativeModels, #TextGeneration, #DataScience, #Technology, #HealthTech, #DataAnalysis, #DigitalTransformation
Id: qAsNVjp42vI
Length: 15min 44sec (944 seconds)
Published: Mon Feb 26 2024