How to Summarize PDF Using LangChain | OpenAI | Gradio

Video Statistics and Information

Captions
Hello guys, welcome back. As I said before, I will be creating videos on the documentation part of LangChain as well as on some use cases. The first use case is the PDF summarizer. What I will do here is first explain the different pieces you need to know before doing the summarization. First I will show you how you can do the summarization with a single line of Python code, as you can see here, and if you are a UI person, I will also demonstrate how you can do the same thing with the help of Gradio. So, two different pieces together. Let's get started.

Let's go through the code. First things first, we need to install the packages. In the first line I have already installed gradio, openai, pypdf, tiktoken and langchain. If you want to know more about these, I have provided the links here. Then you also need the OpenAI API key. It's always the same thing: we are using OpenAI, meaning that we need to have the API key. You can go to this link, get the API key, and just replace this string here.

Just to give you a high-level idea of what tiktoken is: it is the tokenizer of OpenAI. Let's import tiktoken. Here I'm creating a function that takes a string and an encoding name. The encoding comes from tiktoken.get_encoding with the encoding name, and you can see I am passing this base model and then the string I want the tokens for. The function calls encoding.encode on the string, and the number of tokens is the length of the result. It's just a simple function. What happens if I run this? The encoded result may seem like only numbers. If we pass "tiktoken is great", it converts that into tokens and tells us how many tokens there are: six. That is how tiktoken works behind the scenes. If you want more information, here is the link again; go there and enjoy. Now let's go
and import all the necessary classes. Here I'm importing Gradio, and from LangChain I import OpenAI, PromptTemplate, and the other things like load_summarize_chain, which is the main function that does the magic for us. From the LangChain document loaders we are using PyPDFLoader. I have explained all of these here.

Just to give you the high-level idea again: I have already downloaded this GPT4All technical report. You can just run this wget command here and it will download it. Once it is downloaded, you can see it over here. Then you can click here, copy the path, and pass it here. Let me run this and show you what it is doing. Here there is a loader, PyPDFLoader, and I'm passing the file to it. Then there is docs, which is loader.load_and_split; we need to split the document into different chunks. Here you see that it is divided into three different chunks, and this is the first chunk. You can see it is extracting the information from that particular PDF file.

So this is just a small function. Here you see I'm writing def summarize_pdf and passing the PDF file path. There is a loader, as I said, PyPDFLoader, which takes that path. There is docs, which loads and splits through that loader. Then there is the chain, load_summarize_chain, which takes the LLM and a chain type. There are actually three different chain types in LangChain; you can go through the documentation, and I will cover them in depth later in this LangChain series, but here we just use map_reduce. Then there is the summary: chain.run, which you run on top of the docs, and it will just return the summary.

Let's run this cell. As I said, I already downloaded this particular PDF file. Here is summarize, which calls summarize_pdf, the same function that we just wrote, and we pass the file path. If I run this cell, it will go through the documents of that
particular PDF, all the chunks, and then it will summarize it for us. It is that simple. You see, only this one line of code is doing everything. Yes, of course there is the load-and-split part that creates the chunks, but this is the line that does all the work. As you can see in the summary: "This paper presents GPT4All, a chatbot trained on a large curated dataset of assistant interactions" and so on. That's all, if you are okay with this.

But if you are a UI person, I have implemented the same thing with the help of Gradio. How it works: this is the same function that we just wrote. Now we initialize the Gradio part. Here is the input PDF path: we take Gradio's components and use the Textbox, and the label says to provide the PDF file path. Then there is the output summary, which also comes from gradio.components. There are actually many different components in Gradio, and it's really good. If you don't know what Gradio is, as I said, I have provided the link at the top of this notebook. This notebook will be in my GitHub repo; feel free to use it. Here I am passing the label as "Summary".

How it works: I create an interface with gradio.Interface. It takes the function as an argument, then the inputs and the outputs, we give the title and the description, and then we call .launch. Why am I passing share equals True? Because with share equals True it will provide me a shareable link, so that I can share it with colleagues and they can also run this on their computers. If I run this, you will get the idea. This is the simple interface. Here we can just pass the path of the file. If I go here, I can copy the path, paste it here, and submit. It is going to do the same thing that we did before, but now it looks better. Now it is doing the summarization, and it shows the summary here. That's all, it's that simple, either way you want to go,
with the UI part or with just the normal summarize function. That's all for this video. I hope you liked it. I will be creating more use cases of LangChain in my upcoming videos. Thank you for watching, and see you in the next video.
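
Editor's sketch of the token-counting helper described in the captions. The notebook itself is not shown on this page, so the function name and parameter names below are assumptions; only `tiktoken.get_encoding` and `encoding.encode` are taken from what the speaker describes.

```python
def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Return how many tokens `string` becomes under the named tiktoken encoding."""
    # Imported lazily so this module still loads where tiktoken is not installed.
    import tiktoken

    encoding = tiktoken.get_encoding(encoding_name)  # look up the tokenizer by name
    token_ids = encoding.encode(string)              # text -> list of integer token ids
    return len(token_ids)                            # the token count is the list length
```

A call would look like `num_tokens_from_string("tiktoken is great", "cl100k_base")`. The encoding name `cl100k_base` is an assumption, since the captions only mention a "base" name; the speaker reports six tokens for his example string. tiktoken may need network access the first time an encoding is loaded.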
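
Editor's sketch of the `summarize_pdf` function as the captions describe it: load the PDF, split it into chunks, build a `map_reduce` summarize chain, and run it on the chunks. The import paths match the LangChain releases from around the video's publication date (April 2023) and have moved in later versions. How the LLM is constructed is not shown in the captions, so `OpenAI(temperature=0)` is an assumption.

```python
def summarize_pdf(pdf_file_path: str) -> str:
    """Summarize a PDF with a LangChain map_reduce summarization chain."""
    # Lazy imports: these paths are from 2023-era LangChain and may differ today.
    from langchain.chains.summarize import load_summarize_chain
    from langchain.document_loaders import PyPDFLoader
    from langchain.llms import OpenAI

    llm = OpenAI(temperature=0)         # assumption: LLM settings are not in the captions
    loader = PyPDFLoader(pdf_file_path)
    docs = loader.load_and_split()      # list of Document chunks
    chain = load_summarize_chain(llm, chain_type="map_reduce")
    return chain.run(docs)              # returns the final summary text
```

Running it requires the packages listed at the start of the video and an OpenAI API key in the `OPENAI_API_KEY` environment variable.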
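
Editor's sketch of the idea behind the `map_reduce` chain type mentioned in the captions: summarize every chunk on its own, then summarize the combined partial summaries. This is not LangChain's implementation, only a plain-Python illustration. `first_sentence` is a hypothetical stand-in for an LLM call so the example runs without any API.

```python
from typing import Callable, List


def map_reduce_summarize(chunks: List[str], summarize: Callable[[str], str]) -> str:
    """Summarize each chunk (map), then summarize the joined partial summaries (reduce)."""
    if not chunks:
        return ""
    partial_summaries = [summarize(chunk) for chunk in chunks]  # map step
    return summarize("\n".join(partial_summaries))              # reduce step


def first_sentence(text: str) -> str:
    """Hypothetical stand-in for an LLM: keep only the text up to the first period."""
    head, _, _ = text.partition(".")
    return head.strip() + "."


if __name__ == "__main__":
    chunks = ["Alpha is first. More text.", "Beta is second. Extra."]
    print(map_reduce_summarize(chunks, first_sentence))
```

The benefit of this shape is that each chunk fits in the model's context window on its own, at the cost of one model call per chunk plus one for the final combination.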
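
Editor's sketch of the Gradio part described in the captions: one text box for the PDF path, one text box for the summary, wrapped in `gradio.Interface` and launched with `share=True`. The helper name `build_interface`, the title, and the description text are assumptions; the two labels follow the captions.

```python
def build_interface(summarize_fn):
    """Wrap a path -> summary function in a simple text-in, text-out Gradio UI."""
    # Imported lazily so this module still loads where Gradio is not installed.
    import gradio as gr

    input_pdf_path = gr.components.Textbox(label="Provide the PDF file path")
    output_summary = gr.components.Textbox(label="Summary")
    return gr.Interface(
        fn=summarize_fn,            # called with the text box content on submit
        inputs=input_pdf_path,
        outputs=output_summary,
        title="PDF Summarizer",     # assumption: the real title is not in the captions
        description="Enter the path to a PDF file to get a summary.",  # assumption
    )
```

With a summarizing function in hand, `build_interface(summarize_pdf).launch(share=True)` would start the app; `share=True` asks Gradio for a temporary public link, which is what the speaker uses to share the demo with colleagues.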
Info
Channel: Data Science Basics
Views: 9,238
Keywords: chatgpt, openai chatgpt, openai api, code, chat ai, large language models, llm, what is large language model, chat, langchain, lang chain gradio, langchain demo, langchain tutorial, langchain openai, langchain explained, framework, openai langchain, what is langchain, langchain hugging face, langchain chat gpt, langchain tutorial python, langchain tutorial pdf, llms, chat models, prompt, chain, agents, langchain gradio, langchain use case, pdf, pdf summarizer, how to summarize pdf
Id: iMDBMTFT0ns
Length: 7min 7sec (427 seconds)
Published: Fri Apr 28 2023