Data Streaming with LangChain & FastAPI

Captions
Hi everybody! This was a requested video from a user who wanted to know how to use data streaming with OpenAI, LangChain, and FastAPI. I will show you how to use streaming on its own, which is kind of easy, and then how to use it with FastAPI and a frontend, which is not as easy as I thought it would be, but together we will make it. You will find the link to the code in the video description.

Okay, I'm currently in PyCharm, and as you can see I've got multiple files here. Right now I'm in test.py, which is just a little bit of sample code, and I will show you how it works. First we have to import some classes from LangChain, namely ChatOpenAI and HumanMessage, and also load_dotenv, because we have to load our OpenAI API key. We will also use a StreamingStdOutCallbackHandler in the model. Then we create an instance of ChatOpenAI, pass the callback handler as a parameter, and set temperature to 0 to make it more reliable. Then we just call the chat model and pass in a HumanMessage whose content is "write me a song about sparkling water". Let's run it with python test.py: we make a request, and the model should return a song about sparkling water. Okay, that took some time, but as you can see, this is the response, and it was of course not streamed. If we want streaming, we can just pass in the parameter streaming and set it to True (the default is False), and now the text is generated token by token: we don't have to wait for the model to create the whole text, we get it sent back bit by bit. So that's basically it: to make streaming work with OpenAI alone, you just have to set streaming to True. Now we will take a look at FastAPI.
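For reference, a minimal sketch of what test.py looks like based on the walkthrough, assuming the pre-1.0 LangChain API that was current when the video was published (the exact code is in the linked repository):

```python
# Minimal streaming example, assuming the pre-1.0 LangChain API.
from dotenv import load_dotenv
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

load_dotenv()  # loads OPENAI_API_KEY from a .env file

chat = ChatOpenAI(
    streaming=True,  # default is False; True delivers the text token by token
    callbacks=[StreamingStdOutCallbackHandler()],  # prints each token as it arrives
    temperature=0,  # deterministic output to make it more reliable
)

chat([HumanMessage(content="Write me a song about sparkling water.")])
```

With streaming=False (the default), the call returns only once the full completion is ready; with streaming=True, the StreamingStdOutCallbackHandler prints every token to stdout as it arrives.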
Okay, the code for the application is in main.py. The first thing we have to do is import some classes. This time we import the AsyncIteratorCallbackHandler, not the normal streaming callback handler, because we will actually use it to yield the tokens, and of course we use the ChatOpenAI model and HumanMessage again to send one message to OpenAI. It is a very basic endpoint, but it shows how streaming with FastAPI works in general. We load our API key again, create the app instance by instantiating FastAPI, and add the CORS middleware, because we want to use the frontend in index.html. We also create a Pydantic class to make the endpoint a little more robust; its content field is what we will send as the message to OpenAI.

Now we come to the tricky part. We create an async function, send_message, that takes content as input (matching the content property of the message class) and returns an AsyncIterable of strings; this is very important, because FastAPI needs it to work. Inside it we instantiate the ChatOpenAI model, set streaming to True (also very important), and pass in the AsyncIteratorCallbackHandler as callback. Then we use asyncio to create an asynchronous task, which we have to await later. What we pass there is the model's agenerate function, for asynchronous generation of the content, together with a HumanMessage (from LangChain) carrying the content we received as an attribute of the class. After creating the task we can make use of the callback handler: its aiter method lets us loop over all of the tokens and yield each one. But why do we actually need the handler? We need it to know when there are no more tokens to return: once every token has been yielded, we mark the callback as done, and once the callback is done we can await the asyncio task we created earlier. So that's the function itself.

Now we can create our endpoint. We create an async POST endpoint called stream_chat where we can send in the message. It takes the message as input and passes its content to the send_message function, which returns a generator. We use that generator in combination with FastAPI's StreamingResponse, passing it in as input and setting the media type to text/event-stream. A sketch of the whole file follows below.

Now we can run our web application. We use uvicorn as the web server: uvicorn main:app, where main is the name of the file and app is the name of the instance, and we run it on port 6677. Now our application is up and running, and we can also open the Swagger UI to look at it. But the Swagger UI does not work with streaming responses: as you can see, we generate our response, but we don't get it back bit by bit; it waits until the whole process is done. To see the streaming we can use a different approach with the requests library. Everything is prepared here: we send a message in the correct format (content is the attribute of our Pydantic class), POST it to the URL, pass the data as JSON, and set stream to True. Now we can loop over the chunks bit by bit. Let's try that out: python test_stream.py (a sketch of that client also follows below).
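Here is a sketch of main.py assembled from the walkthrough above. It assumes the pre-1.0 LangChain API; the names Message, send_message, and stream_chat follow the video, and the permissive CORS settings are an assumption for local prototyping (the exact code is in the linked repository):

```python
# Sketch of the streaming FastAPI app described in the walkthrough.
import asyncio
from typing import AsyncIterable

from dotenv import load_dotenv
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
from pydantic import BaseModel

load_dotenv()  # loads OPENAI_API_KEY from a .env file

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # assumption: wide open so index.html can call the API locally
    allow_methods=["*"],
    allow_headers=["*"],
)


class Message(BaseModel):
    content: str


async def send_message(content: str) -> AsyncIterable[str]:
    callback = AsyncIteratorCallbackHandler()
    model = ChatOpenAI(streaming=True, callbacks=[callback])

    # Run generation in the background; tokens arrive via the callback.
    task = asyncio.create_task(
        model.agenerate(messages=[[HumanMessage(content=content)]])
    )
    try:
        # aiter() yields tokens until the callback's done event is set.
        async for token in callback.aiter():
            yield token
    finally:
        callback.done.set()  # defensive: release the iterator even on error
    await task  # surface any exception from the generation task


@app.post("/stream_chat/")
async def stream_chat(message: Message):
    return StreamingResponse(
        send_message(message.content), media_type="text/event-stream"
    )
```

Start it with uvicorn main:app --port 6677. The finally block is a defensive detail: the handler normally sets its own done event when generation ends, but this makes sure the iterator is released even if the task fails or the client disconnects.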
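The requests-based client (test_stream.py in the video) can be sketched like this; the port and endpoint path follow the walkthrough above, and the message text is just an example:

```python
# Sketch of a streaming client for the endpoint above.
import requests

response = requests.post(
    "http://localhost:6677/stream_chat/",
    json={"content": "Write me a song about sparkling water."},
    stream=True,  # don't buffer the whole body; hand over chunks as they arrive
)
response.raise_for_status()

# chunk_size=None yields data as soon as the server sends it
for chunk in response.iter_content(chunk_size=None):
    print(chunk.decode("utf-8"), end="", flush=True)
```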
If you are interested in making this work on a website, let's now take a look at index.html and how it is structured. As you can see, it is just a little bit of HTML, and I will not walk you through it in detail, because it's JavaScript code and my projects are more about Python and LangChain, so I'm just going to show you what it does. There is a button which, when clicked, makes a POST request and sends the value inside the text input field; that is what gets sent to OpenAI. We make this POST request to localhost and the stream_chat endpoint (I can see the port is wrong there), send the body as JSON again, and the content is the value of the text input field. To display the tokens we use the getReader functionality together with a TextDecoder, and then loop over the result: if there are no more tokens, we return from the function; if there are, we decode the token first and check whether it is a stop symbol like a dot, an exclamation mark, or a question mark. If that's the case, we trigger a new line with a <br> tag, and we also append the token to the innerHTML, so we update the DOM for each token. Of course that's not the most robust solution; normally you would use a framework that can handle this in a better way, but for a prototype it's fine. Let's change the port here to 6677 to make it match our API, and now let's have a look at the frontend.

So this is the frontend. We can ask "how are you" to test it, and here you can see the response is streamed. Maybe "write about sparkling water" to get a little more text content. And yeah, that's how streaming works. As you can see, here's a question mark, so we create a new line, and this works pretty nicely. So that's it, that was the project. If you liked the video, feel free to subscribe to my channel and like the video of course. Thank you very much, bye bye!
Info
Channel: Coding Crashcourses
Views: 5,930
Keywords: langchain, openai
Id: Gn54EbU9mRg
Length: 8min 50sec (530 seconds)
Published: Tue Aug 08 2023