RAG With Azure AI Search using Vector / Hybrid / Exhaustive KNN / Semantic Reranker in 15 minutes

Video Statistics and Information

Captions
Hi, this session is about RAG with Azure AI Search, and the difference here is that we will be exploring vector search, hybrid search, exhaustive KNN, and the semantic reranker with Azure AI Search. Let's see this in action.

First of all, we already have Azure AI Search configured, and we have put some documents into it. The index is already there; "biology" is the name of the index, and if you click on it you can see it holds 73 documents. You can search it directly from the portal. We will explore vector search, hybrid search, exhaustive KNN, and semantic reranker search against this index.

Here is what happens when we search. When you put in a user input, it gets converted into an embedding vector. This embedding vector is used to create a vectorized query, we search Azure AI Search with it, and the results are sent on for processing; that is the search part. Next, once the results are returned, we use an orchestrator. Here that is LangChain, where we will see a conversation chain, a prompt template, and a conversational chain with memory. Once the results are in the orchestrator, we pass them to Azure OpenAI to generate the answer.

If you have not watched my video on how to put data into Azure AI Search, I strongly recommend watching it; the link will be in the comments. But let's now dive into how we do the Azure AI Search part.

Before we jump into the code, let's see it in action. Here is a Streamlit application, and it has two modes, question answering and chat. You can select the analysis type, and you can choose whether to use LangChain. If I run a search, it goes off and searches, shows the token counts, and displays the details of where it got the answers from. There is also a conversation history. If you ask, say, about Platyhelminthes, the result comes up and the details show from which source the answer came. The history of the conversation is kept as well; that is the memory part. You can also select hybrid search here, and likewise semantic search, and run the same query through them.

Now let's see this in code; the code will be checked into GitHub and I'll share the link. We have created one class, the custom Azure AI Search, and we configure it with the endpoint, the index name, the admin key, and the semantic configuration name; you can find all of these in the env.sample file. If you look at the index itself, you can see the fields that have been configured, the semantic configuration, and a vector profile.

Inside the class, after initialization, get_results_vector_search is where the vector search happens. First we build a vectorized query, then we search with it. For a pure vector search the search text is None; you pass the vector query, list the fields to return, and configure the number of results to return, which is three here but is configurable, so you can change it to four, five, and so on.

Now, what is the vectorized query? First we get the embedding vector, and then we build the vectorized query by passing the query vector, the number of nearest neighbors k, and the fields to search against, which here is the embedding field name, the field configured in the index to hold the embedding vector. There is also an exhaustive flag, which is set to true for exhaustive KNN. For the embeddings we use SentenceTransformer with the all-MiniLM-L6-v2 model, configured through an environment variable, which produces 384-dimensional vectors. So in summary: we take a user input, get the embedding vector, build the vectorized query, search Azure AI Search, and process the results, which is exactly the flow in the diagram, and it is sketched below.
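Here is a minimal sketch of that flow using the azure-search-documents and sentence-transformers packages. The class and method names (CustomAzureAISearch, get_results_vector_search), the environment variable names, and the select field names are assumptions that approximate the video's code, which is not shown in full here:

```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from sentence_transformers import SentenceTransformer


class CustomAzureAISearch:
    def __init__(self):
        # Endpoint, index name, admin key, and semantic config come from
        # environment variables (mirroring the env.sample mentioned above).
        self.client = SearchClient(
            endpoint=os.environ["AZURE_AI_SEARCH_ENDPOINT"],
            index_name=os.environ["AZURE_AI_SEARCH_INDEX_NAME"],  # e.g. "biology"
            credential=AzureKeyCredential(os.environ["AZURE_AI_SEARCH_ADMIN_KEY"]),
        )
        self.semantic_config = os.environ["AZURE_AI_SEARCH_SEMANTIC_CONFIG_NAME"]
        self.embedding_field = os.environ["EMBEDDING_FIELD_NAME"]  # vector field in the index
        # all-MiniLM-L6-v2 produces 384-dimensional embeddings.
        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def get_embedding(self, text: str) -> list[float]:
        return self.model.encode(text).tolist()

    def get_vectorized_query(self, query: str, k: int = 3, exhaustive: bool = False):
        # exhaustive=True switches from approximate (HNSW) search to
        # exhaustive KNN over every vector in the index.
        return VectorizedQuery(
            vector=self.get_embedding(query),
            k_nearest_neighbors=k,
            fields=self.embedding_field,
            exhaustive=exhaustive,
        )

    def get_results_vector_search(self, query: str, num_results: int = 3):
        # Pure vector search: search_text is None, only the vector query is used.
        results = self.client.search(
            search_text=None,
            vector_queries=[self.get_vectorized_query(query, k=num_results)],
            select=["title", "content", "source"],  # assumed field names
            top=num_results,
        )
        return [(r["content"], r["source"]) for r in results]
```

Passing search_text=None is what makes this a pure vector query; the same client call becomes a hybrid search as soon as a text query is supplied, which is exactly the next variant.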
For the other search types, the differences are small. For the pure vector search we pass only the vector query. For the hybrid search we pass the text query as well, so it is a combination of a vector search and a regular keyword search. For the exhaustive KNN search, the difference is that we create the vectorized query with the exhaustive flag set to true. And for the semantic search, more or less everything is the same, except that we pass the query in as the search text alongside the vector query, set the query type to semantic, request query answers, and take the top results that come back. These variants are sketched below.
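A sketch of how the three variants differ, written as further methods of the hypothetical CustomAzureAISearch class above (the select field names remain assumptions):

```python
    def get_results_hybrid_search(self, query: str, num_results: int = 3):
        # Hybrid = keyword search plus vector search: pass the text query too.
        results = self.client.search(
            search_text=query,
            vector_queries=[self.get_vectorized_query(query, k=num_results)],
            select=["title", "content", "source"],
            top=num_results,
        )
        return [(r["content"], r["source"]) for r in results]

    def get_results_exhaustive_knn(self, query: str, num_results: int = 3):
        # Same as pure vector search, but the vectorized query is built
        # with exhaustive=True.
        results = self.client.search(
            search_text=None,
            vector_queries=[
                self.get_vectorized_query(query, k=num_results, exhaustive=True)
            ],
            select=["title", "content", "source"],
            top=num_results,
        )
        return [(r["content"], r["source"]) for r in results]

    def get_results_semantic_search(self, query: str, num_results: int = 3):
        # Semantic reranker: pass text and vector queries, set the semantic
        # query type and configuration, and request extractive answers.
        results = self.client.search(
            search_text=query,
            vector_queries=[self.get_vectorized_query(query, k=num_results)],
            query_type="semantic",
            semantic_configuration_name=self.semantic_config,
            query_answer="extractive",
            select=["title", "content", "source"],
            top=num_results,
        )
        return [(r["content"], r["source"]) for r in results]
```

The semantic variant is the one that engages Azure AI Search's semantic reranker on top of the initial retrieval, which is why it needs the semantic configuration name from the index.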
Once the search is done, we pass the results into Azure OpenAI. There we create the prompts, generate the answer, and produce the reply from the context that we pass in.

Now let's see how LangChain and everything is bound together. In app.py the flow is like this: you put in a user query, then we get the search results, where each result has a result content and a result source. We join all the result contents to create the context, and then we get the reply from the LLM, passing in that context along with the conversation. That is the path without LangChain.

If you are using LangChain, you instead call get_reply_langchain_streamlit. If you go to its definition, you see that we use a conversation buffer, and we pass it in together with the content and the user input. In get_reply_langchain, which returns the reply from the LLM, we first check the conversation buffer: if it is null, we initialize the conversation. We then get the prompt template, and we also use a callback to get the total number of tokens. In the initialization you can see we use a conversation chain with a conversation window memory of k = 3, which means the memory keeps the last three exchanges.

Once that is done, we get the prompt template. If you go to its definition, the template says: answer the question based on the context below, and if the question cannot be answered using the information provided, answer with "I don't know". It takes a context and a query; we fill in the prompt template by formatting it, and then we invoke the conversation chain with the resulting prompt to get the reply back. This function does most of the work in the LangChain path: it initializes the conversation buffer and gets the reply using the conversation chain and the conversation window memory. A sketch of this is below.
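A minimal sketch of that LangChain path, assuming the langchain, langchain-community, and langchain-openai packages roughly as they stood in early 2024. The function names mirror those mentioned above; the deployment environment variable and API version are assumptions:

```python
import os

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import PromptTemplate
from langchain_community.callbacks import get_openai_callback
from langchain_openai import AzureChatOpenAI

# Prompt template as described above: refuse with "I don't know" when the
# context does not contain the answer.
PROMPT_TEMPLATE = PromptTemplate(
    input_variables=["context", "query"],
    template=(
        "Answer the question based on the context below. If the question "
        "cannot be answered using the information provided, answer with "
        '"I don\'t know".\n\nContext: {context}\n\nQuestion: {query}'
    ),
)


def initialize_conversation() -> ConversationChain:
    # AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are picked up from the
    # environment by default; the deployment env var name is an assumption.
    llm = AzureChatOpenAI(
        azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        openai_api_version="2023-12-01-preview",
    )
    # Window memory with k=3 keeps only the last three exchanges.
    return ConversationChain(llm=llm, memory=ConversationBufferWindowMemory(k=3))


def get_reply_langchain(conversation, context: str, user_input: str):
    # Initialize the conversation buffer on first use.
    if conversation is None:
        conversation = initialize_conversation()
    prompt = PROMPT_TEMPLATE.format(context=context, query=user_input)
    # The callback accumulates token usage for everything run inside it.
    with get_openai_callback() as cb:
        reply = conversation.invoke({"input": prompt})["response"]
    return conversation, reply, cb.total_tokens
```

Because the formatted prompt already embeds the retrieved context, the window memory only has to carry the last three question/answer turns rather than the full retrieved documents.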
We can also show the history at the bottom, and there is the chat version, where we use the Streamlit session state to store the generated replies and the past questions. It is basically the same thing; the only difference is how we show the response. If you see the chat mode in action, it works as if you are chatting: it maintains the history, the number of tokens keeps updating, and the conversation buffer keeps growing, so the whole history is maintained. A sketch of the chat mode follows below.

In summary, through this application we explored RAG with Azure AI Search and Azure OpenAI. We used LangChain, with conversation memory, a conversation chain, and a prompt template, and you can see the places where all of these are used. get_reply_langchain is where you can see it in action, where we initialize the conversation, get the prompt template, and invoke using the conversation buffer, and the conversation initialization is the important function where we set up AzureChatOpenAI, the conversation chain, and the conversation window memory. Hope you liked it. Thank you, bye.
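Finally, a minimal sketch of the chat mode described above, assuming the "past" and "generated" session-state keys mentioned in the video and reusing the hypothetical pieces sketched earlier:

```python
import streamlit as st

# CustomAzureAISearch and get_reply_langchain are the sketches from earlier.
search_client = CustomAzureAISearch()

# Session state survives Streamlit reruns, so the chat history persists.
if "past" not in st.session_state:
    st.session_state["past"] = []        # user questions
if "generated" not in st.session_state:
    st.session_state["generated"] = []   # model replies

user_input = st.chat_input("Ask a question about the biology index")
if user_input:
    docs = search_client.get_results_hybrid_search(user_input)
    context = "\n".join(content for content, _source in docs)
    st.session_state["conversation"], reply, total_tokens = get_reply_langchain(
        st.session_state.get("conversation"), context, user_input
    )
    st.session_state["past"].append(user_input)
    st.session_state["generated"].append(reply)
    st.caption(f"Tokens used by this call: {total_tokens}")

# Replay the stored history so the whole conversation stays on screen.
for question, answer in zip(st.session_state["past"], st.session_state["generated"]):
    st.chat_message("user").write(question)
    st.chat_message("assistant").write(answer)
```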
Info
Channel: AG Academy
Views: 4,814
Id: qJl3IdCKfvE
Length: 14min 56sec (896 seconds)
Published: Tue Jan 09 2024