On .NET Live - Build your own ChatGPT with .NET and Azure OpenAI

Captions
Welcome, .NET friends — it's another episode of the On .NET Live show. What we do here is strive to empower the community and inform you of what other folks out there are doing with the .NET tech stack. As usual, I'm your host Scott Addie, and I'm joined by Maira Wenzel and Cam Soper. We have a very familiar face on today's show as the guest: Swami. Many of our viewers are already familiar with your presentations from the past, but could you please introduce yourself for folks who are new to the show? — Hey guys, this is Swami. I am very passionate about learning a lot of new things; I call myself "Swami the Learner" — I gave this title to myself, no one else did. I always love to learn new things and speak about whatever I learn, and thanks to all those industry experts from whom I am learning; I really appreciate them for sharing their knowledge. Back to you, Scott. — Yeah, so speaking of always learning new things, my understanding is that today we'll talk about ChatGPT. It seems like I can't go anywhere these days without hearing about ChatGPT — whether it's the local grocery store or picking up the kids from school, someone's talking about it — and my understanding is that today we're going to build something like that in .NET. So where do we want to start this discussion, Swami? I know you had an agenda you wanted to walk through. — Yes, here is the agenda. Could you please flash my screen? It's there — okay, thank you. As I was mentioning, I'm Swami the Learner, still a learner, so apologies if I make any mistakes. You have already boarded an express train, so please fasten your seat belt. First we'll start with a little bit of discussion of the terms — we'll cover this in 5 to 10 minutes, because the terms are already out there, and thanks to ChatGPT, from which I did a straight lift of these paragraphs and pasted a condensed version here. We'll
speak about these terms, and then we'll go through some visual images to understand what we are speaking about and where we will be going from here. Then we'll speak about ELIZA, one of the world's first natural language processing programs — the first AI program of its kind, developed at MIT in the mid-1960s. Then we'll come back to Azure OpenAI: what a Transformer is and what GPT is. Then we'll speak about tokens and why we should learn about them, and we'll try a couple of completions using Azure OpenAI Studio to get our fundamentals strong. After that we'll learn the under-the-hood things, because when we do it in the Studio it abstracts everything away — it takes the question and gives the answer, but we don't know what is happening under the hood. A small program will get us up and running there, and then we will start building our own ChatGPT in .NET 8 — and I'll also speak about some of the beautiful things I learned in .NET 8; I'm very energetic to speak about those points. Back to you, Scott. — Yeah, so I think maybe a good place to start when we're talking about something like ChatGPT is: what category of technology does this fit into? What I mean by that is, as a developer who isn't data-science focused, you often hear the terms artificial intelligence, machine learning, deep learning — the list goes on and on. What's the relationship between all of these categories? — Okay, so if you see this one over here, we'll speak about intelligence. As humans, we learn from different things — our experience, our senses — and based on our previous data, whatever we have seen, that's where we develop intelligence. If we ate some food and got an allergy, we know that we should not eat that
food or related foods; that's the intelligence we are learning. Artificial intelligence is a program which has been fed a lot of data and behaves similarly to a human. The next subsection is machine learning, where we collect a lot of data, select an algorithm, and divide the data 80/20 — that's the standard, not a hard and fast rule, but we'll make it 80/20 — then feed the data to this algorithm and train it. Once it has been trained, we use the remaining 20% of the data to see how it is performing — the model performance — and if everything is good, that's when we push it to production. It's an iterative process: if it is not performing well, or there is an edge case, we need to come back and retrain it. One of the finest things in training these OpenAI ChatGPT models is the sheer scale — a huge amount of data and billions of parameters; just seeing that makes you think, wow. Deep learning is another subsection within that. I have piled up quite a few terms here, so let me quickly go through them. Artificial intelligence refers to the field of computer science where systems behave similarly to human intelligence. Machine learning is a subset of that which involves the development of algorithms: we collect the data, massage or prepare it, select the model, train it (again, 80/20), evaluate it, do the parameter tuning, and then leave it to make predictions. Prediction is the machine learning case; with generative artificial intelligence, instead of predicting, it creates new content. Deep learning is a subset of machine learning, as we see in all these diagrams — artificial intelligence, then machine learning, then deep learning — and deep learning itself has a lot of things within it;
it's a subset of a subset of machine learning, and it contains many neural network architectures. Natural language processing is one area, and generative AI again has a lot of sections, like generative adversarial networks and variational autoencoders — autoencoders, CNNs, a lot of these things come in. These are the minimal terms we need to know if we want to jump into ChatGPT. Back to you, Scott. — Yeah, I think that was a really nice overview of what the different terms mean. To take that a step further, what if we were to talk about a real-world use case of something like machine learning? Personally, I think of possibilities like anomaly detection, maybe on a manufacturing line where you're trying to find defects in products being produced, but I imagine there are numerous other use cases. Is there one that's top of mind for you, Swami? — Oh yes. Pretty recently I went to an engineering college for a discussion on the latest technologies, and the students there showcased a demo of a product they created in Python. In some countries they have water plants where they fill water into cans, so what these people did was take images of the filled water cans, process the images using Python, and then detect dirt, small worms, or any anomaly inside. Those kinds of things are very important: if there is a small insect or some dirt inside the can and we don't catch it, and we sell it and the customers drink it, they will fall sick, or the reputation of the company will be at stake. Machine learning — artificial intelligence — is coming in everywhere nowadays. This is one real-time scenario I was
telling you about — a real project which they developed. Apart from that there are a lot of other things, but this is top of mind, Scott. — Thanks for providing that example. Another thing you mentioned was NLP, natural language processing. I know you had an example of this you wanted to share — was it ELIZA? — Yes, ELIZA. We tend to think that artificial intelligence or NLP started pretty recently, but it goes back to the 1960s. A gentleman at MIT created this program back then — it came out somewhere around 1964 to 1967 — and it is called ELIZA. I got hold of an implementation of this program in the Go language — I couldn't get the original program, but the Go version is pretty fantastic. I tried it, and interacting with that Go program felt almost similar to ChatGPT; I was thinking, wow — we didn't even know that these kinds of programs existed. It is very simple pattern matching — I saw that it's a small array and a bunch of statements in Go — but it kept answering a lot of things until I typed "quit." So, as I was mentioning, thanks to all the industry experts who created this knowledge. Back to Scott. — Yeah, thanks for sharing that example. As I'm watching the chat here, I'm seeing a lot of chatter — people all over the world tuning in — but one of the things I wanted to point out is that you've mentioned other programming languages like Python and Go. When you think of things like AI and ML, you could probably survey a hundred people and most of them wouldn't mention .NET, but we're noticing that starting to change over the last couple of years especially, and today you're going to show us how to build something like this with .NET. Before we get to that, I wanted to ask about this other buzzword that folks are hearing out in the
community, and it's OpenAI. Could you talk to us about what OpenAI is and what relationship Azure has to OpenAI? — Oh yeah, sure — one second, let me open OpenAI. OpenAI started as an open-source effort, and they created a lot of these generative models. After that, they had to support a lot of these projects, and it became a full-time job for many of the programmers who were contributing as part of open source, so they had to convert it into a company and start charging. There are a lot of other companies which invested in this, and I read somewhere that Microsoft was also one of the biggest contributors — from the investment perspective, and from the research perspective as well. As of now, I think Microsoft Azure is the only cloud provider that gives exclusive access to OpenAI, through the Azure OpenAI service. It has many, many models — I don't want to derail the session right now, but you can see there are a lot of models, and when we get deep into the weeds to find out how much data each was trained with and how many parameters it has, it runs from roughly 117 million parameters in the early models up into the billions. Seriously, that much fine-tuning, plus injecting the data and getting it trained. The model we are using right now is a completion model, and it's the latest version of the API. — Now, as we're looking at this list, we see different versions of the GPT model. I assume the larger the number, the newer the model. If you were to explain to the audience what you're going to do today — are you going to be basing this off of GPT-4? — Oh, good question. This is the Azure OpenAI resource which I have created, and the first thing we will play
with this Studio. Inside the Studio we will go through a couple of these things, and then I'll come back to your query on the models. To set the context: with the models we have here, we can create deployments. We can go in and see all the models, and each model has been trained with different content — that's one thing — and with different parameters. If you ask the same query to text-davinci-003 versus text-curie or text-ada, it will give different results — in one of my demos I showcased this live, and it definitely gives different results. That does not mean a model was not trained properly; it is 100% that we have to communicate in the way it understands, and that's where our skill set of prompt engineering comes into play. — I had a real quick question, Swami: how do you know which one of these models you want to use for a specific task? Is there some guidance somewhere, or is it just trial and error? — At first glance we can see that for images we will use this one, for audio-to-text conversion we'll use this one, for converting text to numerical form we use this one, and so on. As for whether to use the base GPT, 3.5, or GPT-4, two things come into the picture. One is the pocket — what you are going to get charged. If I'm able to do the same job with 3.5, I'll stick to 3.5. So one factor is the functionality being provided, and the other is how much I'm going to be charged — and if I'm looking purely from the charge perspective, I might not get good results. Each of those models is specifically designed for one job. Another thing is, if we go deep into those models, the max tokens and capability come into the picture, and the training
data as well. Let's compare with this one itself: if we take GPT-4 versus GPT-4-32k, the 32k model's max tokens are about 32,000. It's like hitting an API endpoint that gives you JSON: assume one endpoint gives you 32,000 bytes and the other model gives you 8,000 bytes. With more tokens you will be able to ask a tougher question, or get a response with a lot more content — 8,000 characters versus 32,000 characters, something like that. All of these things come into the picture. — I saw really quickly, when you were entering Azure AI Studio, there were some descriptions with some models — I don't know if it was in the dropdown, it was like here — but that is not very descriptive. If you hover on "select a model," does it have some explanation of what those models are? — Here we can go deep and find out what each one is, or we can go directly to the Azure OpenAI documentation site. There they list the models and give quite a lot of information; we can get into it and it clearly shows what each one supports. — And the name there matches the one in the dropdown? — Yes, we will find it. — I saw, two or three screens ago — I think it was the opening screen of Azure AI Studio — that it has some descriptions of what you could do and the models to use. I don't know if it was on that first Home tab, where it says things like "chat playground" and "experiment with GPT-3.5 Turbo or GPT-4 models." There you kind of have an overview of some of the things you could do to get started. — Right. To get started, I can go to this one and say,
"create an apple with blue" — I can give some prompt, and pretty quickly it creates apples with those colors. Sometimes it almost matches as if it were a real apple. Whatever colors I put in, it created them; I'd have to fine-tune this one. Any other queries? — One more question: earlier you were talking about a concept called tokens, and I just want to make sure the audience is clear on what we mean by that term. I guess, in the simplest terms, a token could represent a single word in a sentence — is that accurate? — Actually, for tokens there is an algorithm. For English, the calculation is approximately four characters per token — we'll see that in a moment. Before that, let me take a quick minute on these terms: Azure OpenAI is a service, as we just saw, giving access to OpenAI; Transformer is the name of the underlying model architecture; and GPT is Generative Pre-trained Transformer — "generative" means it's creating, it has been pre-trained, and Transformer is the underlying NLP model. There is a tool called the Tokenizer — you can click on it and get into it. Here is a sample text for Scott. As mentioned, a token is approximately four characters for English, so 100 tokens is approximately 75 words, plus or minus. There is an algorithm behind why it breaks text this way. If you look, the letter "a" plus a space is considered one token, but "sample" plus a space — 6 + 1 = 7 characters — is also considered one token. Each different color represents one token, and we will be charged based on the number of tokens: the tokens in the prompt you give to the model plus the tokens in the response it generates. We need to be mindful of that, because that is what we are going to be charged for.
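The rough token arithmetic described here (about four characters per token for English, roughly 75 words per 100 tokens) can be sketched in a few lines of C#. This is only the heuristic from the discussion — it does not replicate the real BPE tokenizer, so use the online Tokenizer tool or a tokenizer library for exact counts:

```csharp
// Heuristic token estimate for English text (~4 characters per token).
// This approximates, but does not replicate, the actual BPE tokenizer.
static int EstimateTokens(string text) =>
    (int)Math.Ceiling(text.Length / 4.0);

// 100 tokens is roughly 75 words, so words ≈ tokens × 0.75.
static int EstimateWords(int tokens) => (int)(tokens * 0.75);

Console.WriteLine(EstimateTokens("This is a sample text for Scott."));
Console.WriteLine(EstimateWords(100)); // roughly 75
```

For billing purposes, remember the estimate must cover both the prompt and the completion, since both are counted.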
— You mentioned cost, and as an independent developer, now all of a sudden I get nervous. How do I know what something like this would cost me to use? — From the costing perspective, it is given in detail in the documentation — what you are going to be charged. The pricing is there in the documentation, so we can get into the pricing page and calculate it: for a given model, what we are charged per thousand tokens. We can get close — if not exact, then within five or ten percent plus or minus. With the detailed pricing, plus the Tokenizer tool, which is pretty handy for finding out token counts, we can come to a fairly accurate conclusion about what we will be charged. — That's really helpful. I looked at the pricing, and it looked like GPT-4 was — was it six cents, I think I saw? Go back to that. — I'm flashing it again. And again, these things are available only in a few regions, so the charges might vary — I'm really not sure about that — but what we are seeing here is that GPT-4 is six cents per thousand tokens. — Got it. And for the price of a Dogecoin you could use GPT-4 — a lot of people talk about this in terms of going without a Starbucks for a day; I chose to take that in a different direction. Cool — we're about halfway through the show, and I'd like to find time to get to your demo, so why don't we start digging into that? — Yes. First I want to set the context: we'll use the Azure playground, then we'll see a small demo in C# to find out a little bit about what's under the hood, and then we'll come back to the main ChatGPT program, which is our main goal for today. So we'll give the prompt and we'll get the answer.
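Since billing is per 1,000 tokens (prompt plus completion), the back-of-the-envelope calculation described here is straightforward arithmetic. The $0.06/1K GPT-4 figure is the one quoted in the discussion; always check it against the current Azure pricing page:

```csharp
// Cost = (prompt tokens + completion tokens) / 1000 × price per 1K tokens.
static decimal EstimateCostUsd(int promptTokens, int completionTokens,
                               decimal pricePerThousandUsd) =>
    (promptTokens + completionTokens) / 1000m * pricePerThousandUsd;

// e.g. a 21-token prompt with a 106-token completion (127 total) at $0.06/1K:
Console.WriteLine(EstimateCostUsd(21, 106, 0.06m)); // 0.00762
```

Pair this with a token estimate (or the Tokenizer tool) and you can budget a workload before ever hitting the API.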
One thing to understand here: if I exclude this and click Generate again, it gives us a new joke; it's not going to give the same joke. That's because one of the parameters driving this is the temperature. If I make the temperature zero, the randomness — the creativity — will be lower, meaning it sticks to the game plan: no matter how many times we hit Generate, it gives the same joke ("an innocent mushroom and an innocent carrot" — at least those two words I remember). If I hit this a second time, it gives the same joke. That's one important point, and with max tokens we can restrict the output, just to be on the safer side. Now, this playground — not everyone will have access to it. Assume we have a project team that has to use this: they might not have access, and giving them the token and the endpoint might be a little risky. That's one motivation for creating our own GPT just for our team and hooking it up with Azure AD, so only our team members can log in and do their prompt engineering or get their queries answered — maybe scientific, or regarding human resources, or regarding coding, or any of those things. And the key thing is we don't need to sweat over anything: Azure has given tons of sample code. You can click on "View code" here, and you have Python, curl, JSON, and C# — and the C# is pretty detailed code. The key goes here, and we'll be able to run it. So this is the first demo, just to set the context. But here we don't know what is happening under the hood, so first we'll look at this simpler example — it's a straight lift from the Microsoft Azure documentation, with a little bit of tweaking I did. One more thing to observe: two features that, somewhere in .NET 6, 7, and 8, they
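A minimal completion call matching this playground setup might look like the following. This is a sketch assuming the Azure.AI.OpenAI 1.x SDK (type and property names can differ between SDK versions), and the endpoint and deployment names are placeholders, not the ones from the actual demo:

```csharp
using Azure;
using Azure.AI.OpenAI;

var client = new OpenAIClient(
    new Uri("https://YOUR-RESOURCE.openai.azure.com/"),            // placeholder endpoint
    new AzureKeyCredential(
        Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!)); // key from env var, not source

var options = new CompletionsOptions
{
    DeploymentName = "my-completions-deployment", // the *deployment* name, not the model name
    Prompts = { "Tell me a joke about vegetables." },
    Temperature = 0f, // 0 = deterministic, "stick to the game plan"; higher = more creative
    MaxTokens = 120,  // cap the completion, just to be on the safer side
};

Completions response = (await client.GetCompletionsAsync(options)).Value;
Console.WriteLine(response.Choices[0].Text);
```

With `Temperature = 0f` the same prompt should produce the same joke on every run, matching the playground behavior described above.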
have introduced. If you look at this entire program, you will not see the word Console anywhere, and we are not even seeing "using static System.Console" — nowhere. What's happening under the hood is this: if we go into the Debug folder and open the generated GlobalUsings file — yes, there — whatever we instruct as part of the project file gets added to the global usings. That is one of the suggested ways to add them, and we can also do aliasing the same way. Here, there are two or three files, but nowhere do I have to add the using static; the suggested way is to use this approach. This is from Mark J. Price's .NET 7 and .NET 8 books — I learned it from there. Let's come back to getting deep into the weeds. One important point: we can always use appsettings, but we never know when we might check it in. The suggested method is to use user secrets: right-click the project and choose "Manage User Secrets," which adds secrets.json, and we put the values there. Another method is to mark appsettings as untracked and create an appsettings.example.json that lists everything the app needs; that way, any of our team members will be able to use it. And for something that is very secret and used in multiple places, create an environment variable — I also covered that in the documentation: for things that are really secret, create environment variables.
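The project-file mechanism being described can be sketched like this — an illustrative .csproj fragment; the generated file lands under obj/ as <ProjectName>.GlobalUsings.g.cs:

```xml
<!-- Global usings declared once in the .csproj. Static="true" lets every
     file call WriteLine(...) without the Console prefix; Alias gives a
     project-wide type alias. -->
<ItemGroup>
  <Using Include="System.Console" Static="true" />
  <Using Include="System.Text.Json.JsonSerializer" Alias="Json" />
</ItemGroup>
```

This keeps the source files free of repetitive using directives while remaining explicit about what the whole project imports.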
Then, nowhere in appsettings.json or .env files can they be traced; they exist only as environment variables. So, this is a simple ConfigurationBuilder, and here is the model. Before printing all these things, I just wanted to show you what's under the hood; then we'll go and create the project from scratch. Give it a minute — the box is slow. Okay, the response object is here; I'll do a Quick Watch on it. This is a raw response — we can see what it is going to give, and inside we'll see the content stream and other things — but our interest is in Value. Choices is an array; at this point in time there is one choice. The usage statistics say how much is being charged: the prompt itself is 21 tokens and the completion is 106, so 127 in total. The ID is tracked, the time is when it got created, and inside choices[0] there is the text and the finish reason, which is a very important thing. Sometimes the finish reason will be "stop," and stop can mean "I'm completing the job and I've completed it." Sometimes something else happens: if you go back here, there is a stop sequence — we can define that whenever it sees "mushroom," it stops. I can put that here — whenever you see "mushroom," stop — and this could be one more reason why it stops. Oh sorry, looks like I forgot to click on that. There — whenever it sees "mushroom," it stops; you saw that it's not going to generate anything after that. So the stop reason could be two things: one is "hey, I've completed my job," and the other is that we have given a stop sequence. And sometimes the length itself comes into the picture: suppose I set max tokens to 20 — it was costing 127 before — now, with max tokens of 20, the finish reason will be "length."
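Those observations — stop sequences, the usage numbers, and the "stop" versus "length" finish reason — can be reproduced with a sketch like the following, again assuming the Azure.AI.OpenAI 1.x SDK, with placeholder endpoint, key, and deployment names:

```csharp
using Azure;
using Azure.AI.OpenAI;

var client = new OpenAIClient(
    new Uri("https://YOUR-RESOURCE.openai.azure.com/"), // placeholder
    new AzureKeyCredential(
        Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!));

var options = new CompletionsOptions
{
    DeploymentName = "my-completions-deployment", // placeholder deployment name
    Prompts = { "Tell me a joke about vegetables." },
    MaxTokens = 20,                 // small cap => finish reason may be "length"
    StopSequences = { "mushroom" }, // generation halts when this text appears
};

Completions result = (await client.GetCompletionsAsync(options)).Value;

// Why did generation end? "stop" = model finished or a stop sequence was hit;
// "length" = MaxTokens was reached first.
Console.WriteLine(result.Choices[0].FinishReason);

// The same usage statistics inspected in the debugger's Quick Watch:
Console.WriteLine($"prompt={result.Usage.PromptTokens} " +
                  $"completion={result.Usage.CompletionTokens} " +
                  $"total={result.Usage.TotalTokens}");
```

Checking the finish reason in code is the programmatic equivalent of the Quick Watch inspection shown in the demo.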
Okay — another important thing we need to understand: when we set n, it creates n outputs, and we are charged for the total output. Whatever this max tokens of 120 is, that's for one response. If I say "generate me 10 responses" — like "create a tweet about the dosa center I'm opening, or a fast food center; create 10 tweets and I'll select the best one" — then the 120 tokens is per tweet. If it creates 10 tweets, that means 120 × 10: 1,200-plus tokens. We need to understand that. And if we set n = 2 — "generate sample count," this is the n — now I'll put a debugger here. I just want to set the context; this is a very important point. I'll show you in the Python program, or somewhere I created it — we'll see that, but let me skip it for now and tell you two things. — And here in the program you used a different input, right? A different input than in the Azure AI Studio, where you were counting the tokens and everything. — I'm sorry, I missed your last sentence. — In the program you're using a different input — it's not the jokes one, it's a different question, right? — Yes, this is the one: "What are the top 10 countries…" — Okay. If anyone has questions, feel free to ask them now in the chat, and when we have a chance we'll ask Swami. — Okay, that is one sample. I just want to show one more sample before we jump into the main program we'll be discussing. This is a little bit different — this is where prompt engineering also comes into the picture: it's not restricted to one prompt, we can have multiple prompts. Let's look at that program quickly. What's happening in this program is the same as the previous one, but in the desired output I'm saying: give me the result as JSON, or give it to me as XML.
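The n parameter and its billing implication can be sketched as follows. ChoicesPerPrompt is the property name in the assumed Azure.AI.OpenAI 1.x SDK (it maps to the API's "n"); verify the name against your SDK version, and the deployment name is a placeholder:

```csharp
using Azure.AI.OpenAI;

var options = new CompletionsOptions
{
    DeploymentName = "my-completions-deployment", // placeholder
    Prompts = { "Write a tweet announcing my new dosa center." },
    MaxTokens = 120,       // applies PER choice...
    ChoicesPerPrompt = 10, // ...so 10 choices can bill up to 1,200 completion tokens
};
// After the call, iterate response.Choices and pick the best candidate.
```

This is why asking for ten tweet candidates at 120 tokens each can cost roughly ten times as much as a single completion.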
There's nothing different here otherwise. Two things I'll highlight — I'd better show them in the Python program, but let me show this — and where this completion fits into the picture: this output gives us a clear direction on where we can use it. Okay, looks like — oh right, that's what I was expecting. I'll get it from here, copy this, come back and put it here, all the way to here, and that's it. One important thing is this: the deployment name. Whatever you have deployed, that name is the one it is asking for. Sometimes different sample codes call it by a different name — they say "model name" or "model deployment name" — but technically, whatever we have deployed in Azure, that is the name we have to give. The endpoint, the deployment name, and the key — those are the key things. Now see this — this is where, as I was mentioning, prompt engineering comes into the picture. If we used the old mainframe model of fixed-length strings, we could ask: "give this output as a fixed-length string." Someone with a contract for it can ask for JSON: "give me the output as JSON." Now I can parse this JSON — a collection of countries, population, and city — map it into a data transfer object, and use it. Or for some legacy services — SOAP, or ASMX XML web services — we can take the XML output and speak with them. Now, coming back to our program: this is the one. There are a couple of things I'll mention here. This is built in .NET 8 as a minimal API. Some of these things I've already covered — aliasing, and this one: the benefit is that I'm using these constants in different places — in these two different endpoints, in the welcome endpoints — but nowhere am I stating where
these constants are coming from, because I've marked them as global usings, so I can use them across the project. That's one tip I wanted to give. Another is aliasing, which we did, and the static usings, which we also discussed. The second important thing is that we are seeing a new file in the .NET 8 template — the .http file. It's based on one of the extensions — this REST client extension — and you can see it's almost similar: it supports exactly what we want. We can create POST, GET — anything and everything — and we'll be able to test it. We don't need any other tool: Visual Studio Code itself has so many tools that you don't need to go outside of it; everything is provided. Similarly, this is a fantastic tool: we can write the code, write the test request right here, debug or send the request, and see the output on the right side. That's another tip. Now let me come back here. There's some technical jargon — they call it "big ball of mud" versus all-in-one architecture. This is like an all-in-one architecture, and I wanted to keep it this way; from here we can move to a layered architecture or other architectures. That's one thing. The second thing is separation of concerns: many authors say that if a file has to change, it should have only one good reason. If it is changing for both A and B, it's not maintaining separation of concerns. There are multiple ways to handle that, and I did a small refactoring: I wrote a simple extension method, and that extension method holds all the configuration — the service container gets configured inside it. So if I ever have to do anything with the service container, I only have to come and modify this one file, nowhere else. That's one point, and
here I'm configuring the request/response pipeline. The difference between a controller-based API and a minimal API is that the minimal API has fewer dependencies — it's as if you're carrying 10 lbs in your backpack with a minimal API, versus maybe 25 lbs with a controller API, so in memory you're lifting a lot less weight. The pipeline is also shorter: we're using just this, and this, and this. I think in my previous discussion, when I did the minimal API session, I went deep into the weeds to showcase the difference in how the pipeline itself runs. So this is from the separation-of-concerns perspective. Now our Program class is pretty simple, and another thing is that we want to wrap the UI into this as well. Technically, by exposing this minimal API we can have different clients: I can write a .NET MAUI mobile application that talks to it, or React, Angular, Blazor Server, ASP.NET Razor Pages, or traditional MVC — anything. One more thing: while creating this extension method I used IServiceCollection; if I had used WebApplicationBuilder, I wouldn't need to write this line — everything would come through the builder. So technically, in one, two, three, four, five lines, the whole Program.cs has been wrapped up — and that too with separation of concerns.
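The "five-line Program.cs" pattern described here can be sketched like this. Names such as AddAppServices and the endpoint-mapping methods are illustrative stand-ins, not the actual names from the demo repository:

```csharp
var builder = WebApplication.CreateBuilder(args);

// One extension method owns all service-container configuration,
// so the container has exactly one place to change.
builder.Services.AddAppServices(builder.Configuration);

var app = builder.Build();
app.MapWelcomeEndpoints();      // illustrative endpoint-mapping extensions
app.MapAzureOpenAIEndpoints();
app.Run();

public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddAppServices(
        this IServiceCollection services, IConfiguration config)
    {
        // Register HttpClient, options, etc. — the only file to touch
        // when the service container changes.
        services.AddHttpClient();
        return services;
    }
}

public static class EndpointExtensions
{
    public static void MapWelcomeEndpoints(this WebApplication app) =>
        app.MapGet("/welcome", () => "Welcome!");

    public static void MapAzureOpenAIEndpoints(this WebApplication app) =>
        app.MapPost("/api/completions", () => Results.Ok());
}
```

Taking IServiceCollection (rather than WebApplicationBuilder) keeps the extension reusable, at the cost of passing configuration explicitly, as noted in the discussion.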
Cs has been wrapped up and that to with separation of consons okayy I wanted to chime in real quick we're at about the 15 minute Mark and I know you had another chat GPT sample you wanted to get to oh that is this is the one this one okay yeah this is the one so if you see this common so all the constants there are no magic strings around anywhere in this one so whatever the magic strings are there it's there here if you see that these are partial classes both are representing the end points so again if you see this this is separation of conses and minimal impact on other files if I have to change the endpoint name all I need to do is come and change it here nowhere in the endpoint if you see that in Azure open a endpoint nowhere we have that so that means even though you go and modify some of the file it's having two things one is separation of concern and the impact of modifying something in this file on the solution or on the project is very low if I change it nothing is going to change only the client if my is using my has to make the change in the end point so these constants I I learned it from Steve Smith he said that put it in the class and wrap all your constants like this and create multiple classes net 8 or maybe Visual Studio or Visual Studio code I don't know who suggested he suggested that this all these two files were in in the same file under constants and different classes but the it suggested I mean the ID itself suggested to have a partial class and move it into separate files so again this is again a uh beautiful way to organize and minimizing the impact of our changes okay yeah Steve Smith has a very popular um session on our YouTube and I'll try to find a link about clean architecture is like one of our most popular videos on the channel channel so um yeah you can check it out um why codies is also like just to have to go to the basics for some of our beginners like he's asking what I'm hearing the minimal API um how does it defer from web API 
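The constants tip above might look something like this sketch, with class names and routes as illustrative placeholders, split across files as partial classes the way the IDE suggested:

```csharp
// Endpoints.Completions.cs — one partial class per file; the route string
// lives only here, so renaming the endpoint touches a single place.
public static partial class Endpoints
{
    public const string Completions = "/api/completions";
}

// Endpoints.Health.cs — the same partial class continued in another file.
public static partial class Endpoints
{
    public const string Health = "/api/health";
}

// Usage at the mapping site — no magic strings in the endpoint code itself:
// app.MapPost(Endpoints.Completions, handler);
// app.MapGet(Endpoints.Health, () => Results.Ok());
```

Because the consuming code only ever references `Endpoints.Completions`, a route rename is a one-file change, which is the low-impact property the speaker is describing.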
So can you cover a little bit of that? Oh yeah, we did this in the previous session in this same series, but I'll definitely cover it real quick. Let me go to the documentation so I can speak to it better. As I was mentioning, there are two things that come into the picture, so I'll go directly to the section where I was showing the differentiation: parameter binding and the HTTP pipeline. One thing is, if you come and put a breakpoint here in Program.cs, just give me a minute, this will answer the query; let's wait until it hits the breakpoint. Okay, now before we even start any of the lines, there are 127 dependencies we have to load into memory. So compare the full-blown controller-based Web API versus the minimal API: just before you call builder.Build(), put the breakpoint at that line and compare both. Beyond whatever you add yourself, what comes out of the box differs: the number of services loaded into the container is higher for controllers. That's the memory perspective. From the HTTP pipeline perspective, if you go and look at these two URLs, and just watch the previous video, it's there. Each of these middlewares takes some time, maybe milliseconds or nanoseconds, whatever, it consumes some time. With a full-blown controller, how many of these middlewares do you add? It will definitely be more than with the minimal API. With the minimal API, as I clearly showed in the previous video, the request comes in, goes directly to the endpoint, and the response is sent. In our case, the request/response pipeline is: Swagger, which is common; exception handling, which is also common; the static files middleware; and then we're just mapping endpoints; the rest isn't there. It's like whether you're eating a plain pizza or adding extra toppings: the request/response pipeline differs in whether it's longer or shorter. So yeah, just to summarize: with minimal APIs you're still building web APIs, but it's a more straightforward way to build them, because you remove all of that scaffolding, all of that structure you had to create in the past, and you can see the code is much cleaner. That's one of the beauties of it, and that's why we call it "minimal": it's a smaller way
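The short pipeline described above can be sketched roughly like this; the exact middleware set and routes are assumptions based on what the speaker lists (Swagger, exception handling, static files, endpoints):

```csharp
// Program.cs — a minimal-API pipeline with only a few middlewares between
// the incoming request and the endpoint delegate.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen(); // requires the Swashbuckle.AspNetCore package

var app = builder.Build();        // the breakpoint-and-compare spot from the demo

app.UseExceptionHandler("/error"); // common to both styles
app.UseSwagger();
app.UseSwaggerUI();
app.UseStaticFiles();              // serves the plain HTML/JS/CSS front end

// The request goes straight from the pipeline to this delegate — no
// controller activation or action-filter machinery in between.
app.MapGet("/api/health", () => Results.Ok("healthy"));

app.Run();
```

Inspecting `builder.Services.Count` right before `builder.Build()` is the comparison the speaker describes: a controller-based template registers noticeably more services than this.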
to build those web APIs. But it's powerful, am I wrong? It's easier to write because you remove all of that middleware ceremony you had to set up before, and you go straight to the routes and the endpoints. Exactly. A quick time check here: we've got about seven minutes remaining, and there's one thing I wanted to dig into; I'm curious, and it sounds like some of our viewers are as well. I know you're using the OpenAI client library that the Azure SDK team ships. Can we look at your .csproj file and just point out which dependency is being used? Oh yeah, sure. That's the glue that's making all this work, and I want to make sure folks understand what's going on behind the scenes, and maybe we could show how you're creating the client object from that library. Yep, sure. And Maira, this will also help them: at this point, here is the request, we received it here, and you can see it goes directly to the endpoint; there are no middlewares between the request and the endpoint, it hits the endpoint and then spits out the response. The shorter the middleware chain, the faster the response. Here we're using Azure.AI.OpenAI; that's the library I'm using. In the repository, we create an OpenAIClient; we pass the endpoint, which comes from the app settings, and the OpenAI key I created as an environment variable so I don't need to save it in app settings, or in a JSON file or a .env file, and worry about whether I checked it in. The model deployment name plays a very crucial role; if we give something else it won't work. Some of the samples call it the model or other things, which I've seen, but here it's the deployment name. This is how we create the client, and then we await client.GetCompletionsAsync with the deployment name and the completions options. In the completions options, if you go and look at the top, you'll see different things. Choices: this is equivalent to n. We can say the same thing in two different ways: I can say "tell me two jokes about innocent people", meaning the n is part of my prompt, or I can say "tell me a joke" and set choices per prompt to 2. In that case it generates two choices and I need to iterate over them; in our case choices is one, so I ask a question and it gives me one response. Echo is another thing to be careful with: it slightly increases the number of tokens, because I ask a query, it gives me a response, and if Echo is set it also spits back the query I asked. So we're going to be charged slightly more, because apart from the response, the query itself is echoed in the output. Another one is the stop sequence, which I spoke about and gave an example of earlier. And another thing is called, from the UI or the Python perspective, "best of". With best of, internally it creates that many candidates: if you say best of 10 with "tell me two jokes", it's going to create 10 times two, 20 jokes, and out of those it takes the best one; but we're going to get charged for all of them. Okay, so this is our repository. Let's see how it works; we have two minutes. I have this code checked in, and I'll share the repository link so it's handy; this is the one where I have all the source code and everything. We've seen the Swagger, and because we have
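A rough sketch of the client code being described, written against the Azure.AI.OpenAI beta package (property and method names here follow the 1.0.0-beta line as I understand it and may shift between beta versions; the endpoint, key variable, and deployment name are placeholders):

```csharp
using Azure;
using Azure.AI.OpenAI;

// Endpoint from configuration; key from an environment variable so it never
// lands in appsettings.json or source control.
var endpoint = new Uri("https://<your-resource>.openai.azure.com/");
var key = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!;

var client = new OpenAIClient(endpoint, new AzureKeyCredential(key));

var options = new CompletionsOptions
{
    Prompts = { "Tell me a joke" },
    ChoicesPerPrompt = 2,       // "n": two candidate completions to iterate over
    Echo = true,                // echoes the prompt back — those tokens are billed too
    GenerationSampleCount = 10, // "best_of": 10 samples generated server-side, best kept
    MaxTokens = 200,
};

// In the beta API the deployment name is passed alongside the options.
Completions completions =
    await client.GetCompletionsAsync("my-deployment-name", options);

foreach (Choice choice in completions.Choices)
    Console.WriteLine(choice.Text);
```

Note how the billing caveats from the transcript map onto properties: `Echo` adds the prompt's tokens to the output, and `GenerationSampleCount` multiplies the generated (and charged) completions even though only the best is returned.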
the static files middleware included here, we see the front end is plain HTML, JavaScript, and CSS, nothing else. If I go to the sources, you can see index.html, Bootstrap coming from the CDN, a CSS file, and a JavaScript file. The CSS file is pretty simple; I'm pulling in a Google font, that's another thing, and the script is a very simple script too: we use fetch from JavaScript, call the completions endpoint passing in the user input, and whatever response comes back is processed here, and we maintain the two pieces of the exchange. So let's see this in action. I'll open it, and in the prompt I type "what is an apple" and click. We'll watch the network tab, and you can see we're hitting the API endpoint we saw in Swagger, with the payload we passed; and because I set echo to true, both the input we gave and the answer that comes back are returned, which means we're charged for those tokens, and since the query comes back as part of the response it's like a double charge for the query. Okay, "what is generative AI", and I'll click on that. So it's just a simple query-and-answer experience. The motivation, the reason we need this, is for someone who doesn't have access but needs to talk to ChatGPT or these completion models. If we hook this up to Azure AD, only the team we grant access to through that enterprise application will be able to log in and use it, whether they're programmers or non-programmers; we build it once and others can use it for programming queries or anything else. Back to you, Scott. Yeah, thank you. Unfortunately we have to wrap things up. Real quick, though, I wanted to point out again that this is the Azure OpenAI client library that Swami's been demonstrating;
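The browser talks to the API with a plain fetch call; to underline the earlier point that any client can consume the minimal API, here is a hypothetical equivalent in C# (the route, payload shape, and local port are assumptions based on the demo, not confirmed values):

```csharp
using System.Net.Http.Json;

// A console client hitting the same completions endpoint the browser uses.
using var http = new HttpClient { BaseAddress = new Uri("https://localhost:7000") };

var response = await http.PostAsJsonAsync(
    "/api/completions",                       // assumed route from the Swagger demo
    new { userInput = "what is an apple" });  // assumed request body shape

response.EnsureSuccessStatusCode();
Console.WriteLine(await response.Content.ReadAsStringAsync());
```

This is the same round trip the network tab shows in the demo, just driven from .NET instead of JavaScript.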
he's using the beta 8 version, which has been out about a month now, so these are kind of hot bits; handle them with oven mitts on. We do expect that library to reach GA in the near future, so what you saw today should be stable soon, and then you can be comfortable using it in your production-grade applications. I do want to thank all of our viewers for tuning in; thank you for your continued loyalty and all of your questions in the chat. As a reminder, you can check out our live stream recordings at dot.net/live, and if you tune in next week we'll be joined by guest Victoria Dolenko, who will talk about building a scheduling system with background processing based on Postgres. So do tune in next week and check that out. Thanks to Swami, and thanks again everyone for tuning in today. Good night, folks, bye bye.
Info
Channel: dotnet
Views: 4,124
Keywords: dotnet, AI, LLM, LargeLanguageModels, ChatGPT, AzureOpenAI, Azure, OpenAI
Id: 3r7bR5ZEJp4
Length: 65min 22sec (3922 seconds)
Published: Tue Oct 31 2023