Build your own LLM chatbot from scratch | End to End Gen AI | End to End LLM | Mistral 7B LLM

Video Statistics and Information

Captions
The world is getting crazier about ChatGPT: people want to explore it and know more about it, and ChatGPT was also one of the most-searched terms on the internet in 2023. But what exactly is ChatGPT? What is its internal architecture, how does it work behind the scenes, how is it able to generate large amounts of text, and how can it help you with blog writing, content creation, and much more? We will get into all of that in this video.

This video is going to serve as a knowledge guide on the concepts behind AI, generative AI, and large language models. We will also explore the options that exist apart from ChatGPT, because most of you know ChatGPT but might not know that it has many alternatives. We will wind up the video with an end-to-end case study around generative AI: we will pick an alternative to ChatGPT, build an LLM-based chatbot, and deploy that chatbot as a Flask application, so stay till the end. Most of the initial topic discussion will be done by me, but for the later half of the video, where we implement the LLM in code and deploy it as a Flask application, we have a guest; I will introduce him shortly.

What exactly is artificial intelligence? AI is, as the name says, intelligence that is achieved artificially. Various areas of research lead to AI: machine learning, natural language processing, deep learning, computer vision, automation, autonomous vehicles, and many more. Generative AI is a part of artificial intelligence; in short, it is the part of AI that helps you generate or create new content, such as text, images, music, or video, that is original and not directly copied from existing data.

Now, talking about large language models: are large language models generative AI, or vice versa? Generative AI is the overall picture, and it has both LLM and non-LLM parts. LLMs have become famous because ChatGPT is an LLM, which is nothing but a large language model. So, in layman's terms, what exactly are LLMs? LLMs, or large language models, are powerful artificial intelligence models trained on vast amounts of text data to understand and generate human-like text. These models use deep learning techniques, such as Transformer architectures, to process and generate text based on the patterns and structures they have learned from the training data.

Now that we know the basics of LLMs, what different kinds of LLMs are in the market? I have only covered a sentence or two on generative AI and LLMs here; if you want a detailed video on them, please let me know in the comment section and I will come up with either a video or an affordable course, most likely a video, because I don't want you to pay for anything. So if you have any questions, let me know in the comments below. Coming to the topic of LLMs, I'm not going to go in depth into the architectures; as I said, if you want that video, let me know and I will make another one.
In short, there are multiple types of LLMs, one of them being ChatGPT. ChatGPT became famous because it created an interface where people could log in, start talking to it, and get responses back. It grew very popular because it was free to use; later on, as people kept using it, OpenAI made changes to their models, introduced GPT-3.5 and then GPT-4, and started charging users a monthly subscription. That's how it became famous, and now, with OpenAI heavily backed by Microsoft, ChatGPT is getting a huge boost. But apart from ChatGPT there are many more LLMs that are easily available and free of cost. One of them is Google Bard, by the Google team, and another is LLaMA, by Meta. Apart from these two, there are various other LLMs such as GPT-3, GPT-3.5, and GPT-4 by OpenAI. That's all about the basic concepts behind AI, generative AI, and large language models.

Let's get started with the use case, which serves as an end-to-end project for you. Anyone interested in generative AI, large language models, or deep learning can explore this project and work on it. In this video I have a friend, a colleague, a community member whose name is Mr. Vasanth P. He is also a YouTuber — his channel link is in the description below — and he works as an AI developer with Zoho; I think most of you know the company. I'll now hand the mic over to him. He is going to demonstrate which LLM he will use, how he will leverage that LLM to create an LLM-based chatbot, and ultimately take the chatbot to the next level by deploying it as a Flask application and showing how he interacts with it. The code will be given to you for sure — I don't want to keep it to myself — but at a cost of 200 likes and 50 comments. So if you have any questions around generative AI, LLMs, chatbots, GPTs, and so on, let me know in the comment section and I will get back to you. 200 likes is all I ask to share the code freely: access it, deploy your own LLM chatbot, enjoy your Flask application, write this project in your resumes. Best of luck to you. Handing over to Vasanth, here we go.

Thank you, Satyajit, for the amazing introduction. In this video we are going to see how to fine-tune Mistral 7B to create a chatbot that behaves like ChatGPT but is not ChatGPT — it is something else entirely. It has given great performance, which I'll show at the end of the video through a Flask application. I will not delve deep into the Flask code itself; I'll show how the calls happen, but I won't go deep into how Flask works — I assume you all know that. With that said, first let me say a few words about Mistral 7B, about the organization, and about how it works under the hood, and then we'll move on to the coding part, for which I have a separate Colab notebook.

This is the website of Mistral AI. They state that it is the best 7B model to date, Apache 2.0 licensed. That is probably an arguable point right now.
The OpenOrca team took this Mistral 7B itself and fine-tuned it on their own data, producing a model that is still Mistral underneath, so the base version might not be the very best model at the moment — but it has still given some great results. In any case, we are not going for the base version; we are going for the instruct-tuned version in this fine-tuning. Looking at the website, the model was released on September 27, and as of that date it was the best: it outperformed every 7B model, and not just 7B models — it also outperformed 13-billion- and 34-billion-parameter models. As written there, it outperforms LLaMA 2 13B and LLaMA 1 34B, and it also approaches Code Llama's performance on code. This model was not purpose-built for coding — it is a general-purpose LLM — yet it still performs very well on coding tasks; Code Llama was built just for coding, and Mistral comes close to it, which is pretty good.

There are some advantages in this architecture. First, it is fast at inference, and the reason is grouped-query attention: an attention mechanism in which a number of query heads are grouped together and share a single key and value head. They have also provided sliding window attention, which, as specified in their paper, lets the model handle very long contexts (they mention support for up to 128k tokens) during fine-tuning or inference. We won't need that here because we don't have that much compute, but if you have such a use case and the compute, you should be able to use Mistral 7B itself for your long documents. They have released it under the Apache 2.0 license, which means it is usable commercially.

And here is the main thing you need to note: the performance comparison. I'll go through this graph in a minute and then we'll move to the Colab notebook. In the chart, orange is Mistral, light green is LLaMA 1 34B, teal is LLaMA 2 13B, and red is LLaMA 2 7B. On MMLU it outperformed all of these. In the knowledge category it is slightly below the 34B model but still matches the 13B model. In reasoning it surpasses all the models, and the same goes for comprehension. On the other side, on AGIEval (an exam-style benchmark) it outperformed literally all the models by a very significant margin, and similarly for math and code. On BBH, the BIG-Bench Hard benchmark, it slightly underperforms the 34B model but is still pretty good. A 7-billion-parameter model outperforming a 34-billion-parameter model around 5x its size, and with significant margins — that is how good Mistral 7B is, and I think we can call it an open-source beast. That is what you need to know about Mistral 7B from the organization known as Mistral AI.

With the introduction done, let's get on with the fine-tuning. Let me open the Colab notebook. The first step of the fine-tuning is to install the required dependencies.
Here I have accelerate, peft, and bitsandbytes, and I am installing Transformers from source: at the time of this project, Hugging Face had not yet released a pip version with Mistral support, so the pip package isn't out and only a source install gives you Mistral support. We also install TRL, the Transformer Reinforcement Learning library, which provides SFTTrainer — that will act as our trainer in this project. And we have AutoGPTQ, which we need because in this video we are fine-tuning a GPTQ-quantized model — only a GPTQ-quantized model will be fine-tuned here — plus Optimum, which is also a requirement when fine-tuning a GPTQ-quantized model. Those are the dependencies required for this project.

Now we'll start with the imports. Before that, just to let you know, it is very important that you create an account on the Hugging Face Hub and then log in by providing the API key. Let me show you how: go to Hugging Face, click your profile, then Settings, then Access Tokens. You might not have any tokens yet, so click "New token", name it something like "mistral-chatbot" (better not to leave a space), and set the role to "write", because without write access you won't be able to push your model to the Hub — so write access is very important. Copy the token, and when you run the login cell a token prompt appears where you can paste it.

Coming to the imports: since everything is installed, we can start importing, after which we can go on to the fine-tuning. The first import is torch, because we are using the PyTorch framework. Then we have the load_dataset function, which helps us load the required dataset, and the Dataset class, which provides the functionality to work with a Hugging Face dataset — whatever we load via load_dataset is a Dataset object, and to harness its functionality we need this class. From PEFT we have LoraConfig, which you need to create your LoRA adapters, and AutoPeftModelForCausalLM, which I'll come back to later. Next is prepare_model_for_kbit_training, which prepares your model for k-bit training — k is an arbitrary value, which here will be 4-bit training — and get_peft_model, which creates your parameter-efficient fine-tuned model from the LoRA config you created. From Transformers we have AutoModelForCausalLM to load the model, AutoTokenizer for the tokenizer, GPTQConfig for specifying the quantized model's configuration, and TrainingArguments for your training arguments — basically your batch size, number of epochs, and so on, which we'll see in a moment. Next we import SFTTrainer from TRL; SFT stands for supervised fine-tuning, which is the type of fine-tuning we are doing here, so the supervised fine-tuning trainer is what we use. Along with that, we also import the os package.
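As a rough sketch — not taken verbatim from the notebook shown in the video — the setup cells described above might look something like this, assuming a standard Colab login flow:

```python
# Install dependencies; Transformers is installed from source because, at the
# time of the video, Mistral support had not yet landed in a pip release.
# !pip install -q accelerate peft bitsandbytes trl auto-gptq optimum
# !pip install -q git+https://github.com/huggingface/transformers

from huggingface_hub import notebook_login
notebook_login()  # paste a token with "write" access so the model can be pushed later

import os
import torch
from datasets import load_dataset, Dataset
from peft import (
    LoraConfig,
    AutoPeftModelForCausalLM,
    prepare_model_for_kbit_training,
    get_peft_model,
)
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    GPTQConfig,
    TrainingArguments,
)
from trl import SFTTrainer
```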
Now, coming to the dataset part: here I'm loading the Alpaca dataset, a famous conversation dataset. Let me show it: if you give an instruction such as "Give three tips for staying healthy", the output would be "Eat a balanced diet" and so on. The key thing to notice is that the Alpaca dataset has its own format; there are three columns that matter to us — instruction, input, and output. Keep that in mind, and now let's come back to the code.

Once you call the load_dataset function with the dataset ID (you can click copy on the dataset page and paste the ID here), we take the train split alone — the dataset usually has several splits, and we only need the training data. Then we convert it to pandas, because it is much easier to work with a pandas DataFrame than with a raw Hugging Face dataset. To limit execution time and speed things up, I have restricted this to 5,000 data points; we won't even use all 5,000, as you'll understand when we come to the training-arguments section. The full dataset has around 52k rows, so if you think you have the compute for 52k rows, go ahead and use them all — if you train on 52k rows you will certainly get a great result.

Now, SFTTrainer expects all the text to be in a single field — a single column — so I create a new column called "text" by applying a function over the instruction, input, and output columns. How do I combine them? Using the Alpaca-style prompt template: first you have the Human marker followed by the human instruction; if there is an input, the input is appended as well; and then you add the Assistant marker, so the model knows the input part is done and from there on it needs to produce the output — that is what it recognizes when it sees the Assistant marker. For example, "Identify the odd one out: Twitter, Instagram, Telegram" has an instruction and an empty input (if an input were present it would be appended too), and the output follows the Assistant marker. Since we are using a causal LM, there is no hard separation between "this is the input" and "this is the output": a causal LM always continues generating from wherever you stop, and here we always stop right after the Assistant marker and one space, so that is where generation is framed to begin. Now that the processing is complete, we need to convert this back into a Hugging Face dataset, because that is the format the trainer requires, which we do with the from_pandas function.
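A minimal sketch of this dataset-preparation step, assuming an Alpaca-style dataset with instruction/input/output columns; the dataset ID, the exact prompt markers, and the helper name build_text() are illustrative assumptions:

```python
from datasets import load_dataset, Dataset

# Assumed dataset ID; the video uses an Alpaca-style instruction dataset
data = load_dataset("tatsu-lab/alpaca", split="train")
data_df = data.to_pandas().head(5000)  # limit rows to keep training fast

def build_text(row):
    """Combine the three columns into the single 'text' field SFTTrainer expects."""
    prompt = f"### Human: {row['instruction']}"   # markers approximate those in the video
    if row["input"]:
        prompt += f" {row['input']}"
    # The model learns to continue after the Assistant marker
    prompt += f" ### Assistant: {row['output']}"
    return prompt

data_df["text"] = data_df.apply(build_text, axis=1)

# Convert back to a Hugging Face Dataset, the format the trainer requires
train_dataset = Dataset.from_pandas(data_df)
```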
The next step is to create the tokenizer. Before that, let me show you the model we are going to use: Mistral 7B Instruct, but the GPTQ-quantized version. It comes from TheBloke — this person always uploads different quantized versions of models (GGUF, GGML, AWQ, GPTQ), and we are using the GPTQ format. From that repo we load the tokenizer — tokenizer.json, tokenizer.model, everything is there, and AutoTokenizer builds the tokenizer from it — and we set the EOS token to be the pad token, which is important: it tells the model that this is the end-of-sequence token and generation can stop there.

Next is the quantization config. We use GPTQConfig with bits set to 4, disable_exllama set to True, and the tokenizer provided. This says, "this is the quantized model's configuration", and based on it we load the model, with device_map set to "auto", which allocates the model's layers to CUDA devices — if you have multiple devices it maps them efficiently across them, but here on Colab we have a single GPU, so everything goes to device 0. Next, model.config.use_cache = False says we are training, not inferencing, and pretraining_tp = 1 replicates the pre-training behaviour — these are configurations that need to be set before training, so keep them in mind. We then enable gradient checkpointing, which lets us checkpoint and reuse the gradients, and call prepare_model_for_kbit_training — since bits is 4, this is effectively "prepare the model for 4-bit training". Now the model is ready for 4-bit training.
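A sketch of this model-loading step under the assumptions above; the exact repo ID is an assumption (TheBloke's GPTQ build of Mistral 7B Instruct is what the video appears to use):

```python
import torch
from peft import prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

# Assumed repo ID for the GPTQ-quantized Mistral 7B Instruct model
model_id = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # use EOS as the padding token

# 4-bit GPTQ config; exllama kernels are disabled for fine-tuning
quantization_config = GPTQConfig(bits=4, disable_exllama=True, tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",  # map layers to the available GPU(s)
)

model.config.use_cache = False    # training, not inference
model.config.pretraining_tp = 1   # replicate pre-training behaviour
model.gradient_checkpointing_enable()

model = prepare_model_for_kbit_training(model)  # now ready for 4-bit training
```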
So you might think, "Vasanth, the model is ready — shall we go on to the training now?" Not yet: as I said, we are using LoRA here, and you might wonder why LoRA is needed. You see, these models are pretty huge — let me show you the sizes. Even 4-bit quantized, this model is about 4.16 GB, and the original Mistral model is around 14 GB, which is very large. To compute and train faster, researchers recently created a method called LoRA, Low-Rank Adaptation of Large Language Models, which states that with small weight-decomposition matrices alone you can fine-tune a model and achieve much the same performance you would get by fine-tuning the full model. A one-liner on how it works: you take a model and create adapters for it. Here are the original weights and here are your LoRA adapters; when your input goes through, the original weights are frozen — they are not trainable, the input just passes through them in the forward pass and there is no weight update to them — while the LoRA adapters are the trainable part, so you are only fitting the adapters you attach. That is how LoRA adapters work. It is a detailed concept, but right now you only need a few aspects of it: just know that with smaller matrices alone you can achieve comparable performance — that is the core of LoRA.

Here are the important LoRA configuration values. We have r = 16, where r stands for the LoRA rank: the smaller the rank, the lower the performance. In the LoRA paper they state that even 8 is sufficient; I have gone for 16, but 8 is also fine, and if you go for 4 it will be faster and use less memory, with a slight degradation in performance — you need to find the trade-off between performance and training compute. lora_alpha = 16 is a scaling value (not a trainable one); it scales the adapter weights, which is why it is there. Then there is the LoRA dropout, which works just like normal dropout, only on the LoRA layers. We set the task type to CAUSAL_LM, because there is another type, SEQ_2_SEQ_LM, and here we only have a causal LM. Next, the target modules are set to q_proj and v_proj: with LoRA you need to provide target modules, which can be your attention layers or your MLP (linear/dense) layers; here we go for the attention layers, specifically the query and value projections, because in the paper they specify that with the query and value layers alone you already get very good performance, and we follow that. That is the LoRA adapter config.

Now that the adapter config is set, it is time to attach those adapters to the base model, which is done by get_peft_model: when you call get_peft_model with the adapter config, it creates the adapters and attaches them to the model — model = get_peft_model(model, peft_config) is how it works.
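A sketch of the LoRA setup as described; the dropout value is an assumption, since the video does not state it:

```python
from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(
    r=16,                      # LoRA rank; 8 is often enough, 4 is faster but slightly weaker
    lora_alpha=16,             # scaling factor for the adapter weights
    lora_dropout=0.05,         # assumed value
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # query and value projections only
)

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # optional: shows how few parameters are actually trainable
```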
Next we have the training arguments. Inside TrainingArguments there are a lot of arguments you can provide; these are the essential parameters, and I'll explain each of them. The output directory is simply your output directory — that doesn't need much explanation. Next is per_device_train_batch_size: it is named that way because, if you had three GPUs and set a per-device batch size of 8, your effective batch size would be 24 (8 per device times 3); here we have only one GPU, so it stays 8. Then gradient_accumulation_steps: if you have worked with PyTorch, you know we usually call the optimizer's step function at every step; gradient accumulation means the step function is not called every step — the gradients are accumulated for a certain number of steps (I have set 1 here, but if you set 8, it waits eight steps, accumulating the gradients, and then updates), and with higher gradient accumulation the training can be slightly faster.

The optimizer is paged_adamw_32bit, a special optimizer created to work with quantized models — if you are fine-tuning a quantized model, you should use the paged AdamW optimizer. The learning rate is set to 2e-4, a commonly used value these days, starting from the LLaMA work, and the learning-rate scheduler type is cosine. The save strategy is "epoch", which wouldn't make a difference here but would if you commented out max_steps. So what about max_steps and logging_steps? logging_steps makes the trainer evaluate and log at the given interval — 100 steps here — so every 100 steps it reports the loss. As for max_steps: you may have noticed there are 5,000 rows but training only covered about 2,000 of them. max_steps restricts training to an arbitrary number of steps, and I set it to 250. How does the calculation work? You look at the effective training batch size, not just the per-device one (again, it makes no difference on a single GPU, but it does on a multi-GPU server): an effective batch size of 8 multiplied by 250 steps is 2,000 examples, which is why only 2,000 rows were consumed. You can also comment out max_steps entirely if you don't want the restriction — it will still work, just take longer. We set fp16 = True, which trains the model in mixed precision. Finally we enable push_to_hub = True, which pushes the adapter model, adapter config, tokenizer, and so on to the Hub, so you can later pull them for inference — which we will do when we build the chatbot. That is all about the training arguments.

Finally we have the trainer. It takes the model parameter (the get_peft_model output is our model here), the training dataset we created, and the peft_config, which is the LoRA config. Just so you know, there are other PEFT methods as well — IA3, prefix tuning, prompt tuning, and more — but we have gone with LoRA because it is currently the best way to fine-tune a model with a parameter-efficient approach. We set dataset_text_field to "text", because, as I said, the trainer expects a single field or column, which we created under the column name "text". We also pass our training arguments, the tokenizer, the max sequence length, and packing = False: packing means that when sequences are shorter than the max length, the next sequence gets packed into the same example, which helps during pre-training, but we don't need that here, so packing is set to False.
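Putting the described settings together, the training setup might look roughly like this; output_dir and max_seq_length are assumptions, the other values are the ones mentioned above:

```python
from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="mistral-finetuned-alpaca",  # assumed directory/repo name
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",              # optimizer suited to quantized models
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    save_strategy="epoch",
    logging_steps=100,
    max_steps=250,                          # 8 * 250 = 2,000 examples; remove to train on all rows
    fp16=True,                              # mixed-precision training
    push_to_hub=True,                       # upload adapters + tokenizer to the Hub
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    args=training_args,
    tokenizer=tokenizer,
    max_seq_length=512,                     # assumed value; adjust to your setup
    packing=False,
)

trainer.train()
```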
Finally, you call trainer.train(), which trains the model — and that is fine-tuning Mistral 7B for you. If you have a custom use case, all you need to do is change the dataset-processing side; the rest of the code remains the same and you can reuse it for any project you want. Just make sure you end up with the dataset in a single field called "text" — that is the aim whatever use case you adapt it for. Here you can see I ran it for only 250 steps, and logging happened at steps 100 and 200 because logging_steps is set to 100; at the end of step 200 the loss is about 1.4, which is actually pretty good for a quantized model. For safety, even though everything is already pushed to the Hub, I also mounted Google Drive and copied the output there — you can do the same if you want, or skip it. I'm not going to show the inference inside the notebook; I have an application, which is a nicer way to do the inference, so let me show the application now.

Let me fire up VS Code. Here we have the Flask application: we import the Flask class from flask, along with render_template, which renders your template (the HTML page), request, to carry a request from the front end to the back end, and jsonify, to return a JSON response — that is why we need jsonify here. The important line to note is `from chat import chatbot`; the rest is standard front-end wiring, and I'm not going into the front end in detail. I'll show you just one thing: in chat.html there is some JavaScript, and there is a send button; when you click the send button, the front end sends a POST request to this back end. Before moving on, I'd also like to show how the result is rendered on the front end: once the POST request is made, the back end returns a response whose "answer" field is what gets displayed — response.answer is read in the JavaScript and shown in the chat window. That is how the pipeline works.
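A minimal sketch of the Flask wiring described here; the route names, the template name, and the form field "msg" are assumptions, and chatbot() is the function defined in chat.py:

```python
from flask import Flask, render_template, request, jsonify
from chat import chatbot  # the inference function defined in chat.py

app = Flask(__name__)

@app.route("/")
def index():
    # Serve the chat page referenced in the video
    return render_template("chat.html")

@app.route("/chat", methods=["POST"])
def chat():
    # The front-end send button posts the typed message to this endpoint
    message = request.form["msg"]        # assumed field name
    answer = chatbot(message)            # run inference
    return jsonify({"answer": answer})   # read as response.answer in the page

if __name__ == "__main__":
    app.run(debug=True)
```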
The key file to note, though, is the chat.py file — the main file for inference. Even if you feel you are not good with HTML and can't go for Flask, you could still write a Streamlit application and call this same module from there; it will work just as well. Here we have AutoPeftModelForCausalLM. Earlier we saw how to create the PEFT model by hand: you create the LoRA config, call AutoModelForCausalLM for the base model, call get_peft_model, and so on. With AutoPeftModelForCausalLM, once you provide the repo it fetches the adapter config, loads the base model via AutoModelForCausalLM itself, and combines the PEFT adapters — so roughly three steps are collapsed into a single call. Next we have GenerationConfig, which constrains how the model generates, and AutoTokenizer, which creates the tokenizer. You saw earlier that the fine-tuned adapters were pushed under my user name as a Mistral-fine-tuned-on-Alpaca repo; you provide that repo ID here and the model is loaded from it. We also set low_cpu_mem_usage to True, which keeps CPU memory usage very low; torch_dtype is set to float16, since we were working in 16-bit precision; and device_map is set to "cuda" so that we can use the GPU.

In the generation config we set top_k to 1, which takes the most probable output, and do_sample = True so that the temperature is actually used. Why is the temperature parameter needed? To control the randomness of the generation: large language models are quite random while generating, but we can control that by providing a low temperature value close to zero — closer to one, the output becomes very creative and diverse. We also have the max_new_tokens parameter, set to 100 here, which allows the model to generate up to 100 new tokens. Finally we set pad_token_id to the tokenizer's EOS token ID, just as we did while fine-tuning. These are all pre-inference requirements — things you need to set up before you go for inference.
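A sketch of what a chat.py module along these lines could look like; the repo ID is a placeholder for the adapters pushed to the Hub, the temperature value and prompt markers are assumptions, and chatbot() is the function whose generation steps are walked through next:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, GenerationConfig

# Placeholder for the fine-tuned adapter repo pushed to the Hub
repo_id = "your-username/mistral-finetuned-alpaca"

model = AutoPeftModelForCausalLM.from_pretrained(
    repo_id,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

generation_config = GenerationConfig(
    do_sample=True,
    top_k=1,                            # take the most probable token
    temperature=0.1,                    # assumed low value to reduce randomness
    max_new_tokens=100,
    pad_token_id=tokenizer.eos_token_id,
)

def chatbot(message: str) -> str:
    """Format the message with the same Alpaca-style template used in fine-tuning,
    generate a continuation, and strip the echoed prompt from the output."""
    prompt = f"### Human: {message} ### Assistant: "
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, generation_config=generation_config)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return text.replace(prompt, "").strip()
```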
To run the inference, the message coming from the form is wrapped in the same Alpaca-style format we fine-tuned with: the Human marker, a space, the message, another space, and then the Assistant marker, and we let the model complete from there. The tokenizer turns this prompt into input IDs, which are returned as CUDA tensors under the name input_ids. Calling model.generate on them produces output token IDs, which we decode again with the tokenizer, skipping any special tokens; we also call replace on the input string with an empty string, because the model reproduces the input prompt at the start of its generation, and we don't want that in the output.

So let's see the output now. Go to the terminal, activate your environment with all the requirements installed, and run flask run. The first time you call it, it takes a while because the model has to be downloaded and then loaded; I have already downloaded it here, so it only loads. Now it is ready, so let me go to Chrome. Here we have the chatbot interface — a simple one. Let me ask: "I dropped my mobile phone in water, what should I do now?" If you send this request it obviously takes some time, probably around three to four seconds. And here it is: it says turn off the phone immediately, remove the phone from the water and dry it thoroughly, use a hair dryer on a low setting to dry the phone, use a desiccant to absorb any remaining moisture, check the phone for any damage, and if the phone is damaged take it to a professional repair — if it is not damaged, once you have dried it you are fine to use it again. That's pretty good. So we have a powerful chatbot running on your local system, and we also saw how to fine-tune a quantized model. That is how to fine-tune Mistral 7B for a chatbot. Thank you all.

That's all about this particular video, guys. I hope you enjoyed it; in case you did, please don't forget to like, share, and subscribe to the channel, and leave a comment in the comment section, because that is most important. See you in the next video. Bye-bye.
Info
Channel: Satyajit Pattnaik
Views: 17,820
Keywords: satyajit pattnaik, data science, machine learning, data analyst, artificial intelligence, Projects, how to create a llm chatbot, satyajit pattnaik llm, satyajit pattnaik ai, satyajit pattnaik gen ai, llm based chatbot, create chatbot using llm, local llm chatbot, mistral 7b, build your own ai chatbot, build your own gen ai chatbot, gen ai chatbot, mistral llm, build your own chatbot using mistral llm, chatgpt alternatives, Build your own LLM chatbot from scratch
Id: yISaV2vp-Fk
Length: 44min 48sec (2688 seconds)
Published: Sun Oct 15 2023