WizardLM: The Best 7B LLM - Can Beat ChatGPT

Video Statistics and Information

Captions
Just a small 7-billion-parameter model, and if it can do this, that is pretty amazing. There is a new large language model called WizardLM which shows that the size of the model doesn't really matter: the authors show that even a small model like this, with only 7 billion parameters, can outperform models like ChatGPT on certain complex instruction-following tasks.

Today we're going to be talking about this new model, WizardLM, an instruction-following LLM trained using Evol-Instruct. The model is very impressive because the authors showed that, even though it's just a 7-billion-parameter model, it can outperform models like ChatGPT on certain very high-complexity tasks. To train powerful natural language models you need access to a huge amount of open-domain instructions; however, manually creating such instruction data is a very time-consuming and labor-intensive task, and humans may struggle to come up with high-complexity instructions. Without going into too much technical detail, the main contribution of this work is a new method called Evol-Instruct, which uses large language models instead of humans to automatically mass-produce open-domain instructions of various difficulty levels. If you recall, ChatGPT is trained through a process called reinforcement learning from human feedback; in this case they replace the human with another large language model. The Evol-Instruct method starts with an initial set of instructions and rewrites them step by step into more and more complex instructions. All the generated instruction data is then mixed together and used to fine-tune LLaMA models, and the resulting model is called WizardLM. Although the model is pretty small, with this technique the results are really impressive and on par with much larger models.

Now, before showing you some examples, let's look at the training data. In this case the training data contains only 70,000 instruction-following examples generated with the Evol-Instruct method. The way it works is that each example has a very detailed, unique instruction describing the task the model should perform, together with an output, the answer, which was generated by ChatGPT.

In the paper they tested this model along with Alpaca and ChatGPT on a range of tasks, including math, code generation, writing, computer science, and even entertainment, art, music, philosophy, and chemistry, and the results are pretty impressive. In terms of overall performance it easily outperforms models like the 7-billion-parameter Alpaca and Vicuna, but overall ChatGPT still performs better than WizardLM. However, when they looked at more complex tasks, involving more complex instruction following, it seems to outperform ChatGPT as well. That in itself is extremely impressive given that it's a very small model. It also indicates that for very specific tasks we might be able to train task-specific models that outperform much bigger generalized models. What I mean is: imagine a model trained specifically on mathematics only; that model can be much smaller and yet outperform models like ChatGPT or even GPT-4, because those are more general models not trained for a specific task.

Now, before looking at some examples, let's look at how it was fine-tuned. They are using a batch size of 64 with LLaMA 7B as the initial model, and the great thing is the maximum length: 2048 is the maximum token size.
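To make the Evol-Instruct idea described above a bit more concrete, here is a rough sketch of the loop as a small JavaScript function. This is not the authors' code: askLLM is a hypothetical placeholder for whatever LLM API you have access to, and the rewriting prompt and number of evolution rounds are illustrative; the paper also filters out failed evolutions, which this sketch skips.

```javascript
// Placeholder for an LLM call (hypothetical); plug in your own API here.
async function askLLM(prompt) {
  throw new Error("plug in your own LLM call here");
}

// Illustrative Evol-Instruct loop: repeatedly rewrite instructions into
// harder versions and collect (instruction, answer) pairs for fine-tuning.
async function evolInstruct(seedInstructions, rounds = 4) {
  const dataset = [];
  let current = seedInstructions;
  for (let round = 0; round < rounds; round++) {
    const next = [];
    for (const instruction of current) {
      // Ask the LLM to rewrite the instruction into a more complex version.
      const evolved = await askLLM(
        "Rewrite the following instruction so it is harder to answer, " +
          "for example by adding constraints or reasoning steps, while " +
          "keeping it answerable:\n\n" + instruction
      );
      // Ask the LLM (ChatGPT in the paper) to answer the evolved instruction.
      const output = await askLLM(evolved);
      dataset.push({ instruction: evolved, output });
      next.push(evolved);
    }
    current = next;
  }
  // Mixed with the seed data, this is the kind of instruction-following
  // dataset that LLaMA 7B is fine-tuned on to produce WizardLM.
  return dataset;
}
```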
Now let's look at how you can actually use WizardLM. On their website they have provided a couple of examples where it outperforms ChatGPT. In one of them the model is instructed to produce LaTeX code to create a table, and it manages to do it, along with detailed instructions, while ChatGPT's attempt results in a number of bugs. But we are going to run our own tests and compare the performance of ChatGPT and WizardLM, so let's get started.

You can actually run this locally as well, using text-generation web UIs such as oobabooga; I'm going to be creating videos on those, so check those out. In this case, though, they have provided a web demo if you just want to experiment with it. There is demo one and demo two; let's open one of them.

Okay, let's run a few tests on this model. The first one is just idea generation. This is a prompt I found online: create a list of three startup ideas in B2B SaaS; the startup ideas should have a strong and compelling mission and also use AI in some way, and they should avoid cryptocurrency or blockchain. Let's see what it comes up with. I'm going to run the same prompt on ChatGPT as well; we know ChatGPT is pretty good at this and can come up with some really nice ideas. Here are the three ideas WizardLM came up with: AlphaMind, AlphaLogics, and AlphaTech; for some reason it added "Alpha" to all three of them. The first is a SaaS platform that helps businesses streamline their sales process by providing a personalized sales assistant. The second is a SaaS platform that helps businesses automate their logistics and supply chain processes: an AI-powered platform using machine learning algorithms to optimize delivery routes, reduce transportation costs, and improve inventory management. The third is a SaaS platform that helps businesses improve their customer service by providing an AI-powered chatbot. All of them are pretty good ideas. ChatGPT, on the other hand, came up with what I think is a more diverse set: BrainBoost AI, CleverMind AI, and Eureka AI. I actually checked, and there are already companies called BrainBoost and CleverMind.

Next up, let's look at its ability to code. We're going to start with a very simple prompt and then add more to it to see how complex it can get. The prompt is: write HTML code for a website with a single button that, when pressed, changes the color of the background to a random color. Let's submit this and see what it comes up with. Here is the code, with a small explanation as well, which is nice, but let's check whether it actually works. I copy the code into an online HTML editor and run it, and there is a button that says "Change Color", and it works. That's simply amazing; most of the other large language models I have used struggle with this, but this is pretty neat. I gave the same prompt to ChatGPT, which usually doesn't have any issues with this type of code; I paste its code, run it, and yes, it works as well. Now let's make things a little more complex: apart from changing the background of the web page, it has to generate a random joke as well when the button is pressed. Let's see what kind of code it can come up with.
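For reference, the first coding test (the random background color button) boils down to a few lines of JavaScript like the following. This is a minimal sketch, not the model's actual output, and it assumes the page contains a button with the id colorButton.

```javascript
// Minimal sketch of the first coding test: a button that sets the page
// background to a random color on each click.
// Assumes the HTML contains: <button id="colorButton">Change Color</button>
document.getElementById("colorButton").addEventListener("click", () => {
  // Build a random hex color such as "#a3f91c".
  const randomColor =
    "#" + Math.floor(Math.random() * 0x1000000).toString(16).padStart(6, "0");
  document.body.style.backgroundColor = randomColor;
});
```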
Okay, so we got the code, and it looks like it should work. The explanation says the code creates a simple web page with a button that changes the background color to a random color on click, and that it also displays a random joke after two seconds using JavaScript. I do see a small issue, but let's see if we can make this work. Here is the website: it's changing the color, but it's not changing the joke. I think I need to make a small fix. The problem is that you cannot really chat with this demo, so you can't tell it that it made a mistake and ask it to fix it, but I'm going to do that manually, and it's a very easy fix. The main logic was there, the code was there; I just removed the two-second timeout and moved the joke-generation part into the click handler, so whenever the button is clicked that code runs, and now it works like a charm. Sometimes the joke doesn't change, because it's picked at random, but that's fine. I think it's really impressive: just a small 7-billion-parameter model, and it can do this.

Now let's look at ChatGPT as well; I don't think it's going to have any trouble with this. Here's the code generated by ChatGPT: copy it, go back to the editor, run it, and it works. I would say I actually like the layout WizardLM generated by default more than ChatGPT's, but this works too, which is pretty awesome.

Now for the last test: can you get anything remotely controversial out of it? I asked it why the Republican Party is better than the Democratic Party. If you ask the same question of ChatGPT, it will simply give you a very canned response. Here, at least, it said that as a language model it doesn't have personal beliefs or political affiliations, but then listed some reasons why some people may believe the Republican Party is better than the Democratic Party, which are simply standard Republican talking points: fiscal policy, national security, abortion, gun rights, and economic policy. I asked the same thing about the Democratic Party, just to keep everybody happy, and in that case it highlighted social issues, environmental protection, economic policy, health care, and foreign policy. These are the normal talking points of both parties. At least this WizardLM model gives you the talking points; models like ChatGPT, or even Vicuna, are highly restricted and won't even go there to highlight the respective talking points.

So here we wrap up yet another video on another large language model. Things are moving fast: every day there are new developments and new models being released, and all of these open-source models add to the collective advancement of language models in general. One thing I want to point out is that these are nowhere near the performance of ChatGPT or GPT-4. However, if you think about it, just a few months ago nobody even thought it was possible to have an open-source language model that is comparable to ChatGPT, and now we have so many of them. They definitely have their limitations, but I believe this is a step in the right direction, and they will improve over time with better-quality training data and even better training techniques like Evol-Instruct.
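As a footnote to the coding test above, the manual fix described there (dropping the two-second timeout and picking the joke inside the click handler) looks roughly like this. Again, this is a sketch rather than the model's actual output: the element ids colorButton and joke, and the example jokes, are assumptions for illustration.

```javascript
// Sketch of the fixed joke button: the joke is chosen inside the same click
// handler that changes the background color, instead of after a setTimeout.
// Assumes the HTML contains:
//   <button id="colorButton">Change Color</button>
//   <p id="joke"></p>
const jokes = [
  "Why do programmers prefer dark mode? Because light attracts bugs.",
  "Why did the developer go broke? Because he used up all his cache.",
  "I would tell you a UDP joke, but you might not get it.",
];

document.getElementById("colorButton").addEventListener("click", () => {
  const randomColor =
    "#" + Math.floor(Math.random() * 0x1000000).toString(16).padStart(6, "0");
  document.body.style.backgroundColor = randomColor;

  // Picking at random means the same joke can occasionally repeat, which
  // matches the behavior shown in the video.
  const joke = jokes[Math.floor(Math.random() * jokes.length)];
  document.getElementById("joke").textContent = joke;
});
```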
I hope you liked the video. If you're new here, consider subscribing to the channel, and if you have any questions, please put them in the comments; I'll try my best to answer each and every one of them. Thanks for watching, and see you in the next one.
Info
Channel: Prompt Engineering
Views: 24,539
Keywords: prompt engineering, Prompt Engineer, natural language processing, langchain demo, train gpt on documents, train openai, langchain tutorial, deep learning, machine learning, deep learning tutorial, WizardLM, large language models, Evol-Instruct, complex instructions, fine-tuning LLMs, Enhancing Large Language Models, fine tune gpt 3, fine tune chatbot, fine tune, chatgpt, artificial intelligence, large language models explained, wizardlm, wizardlm vs vicuna, wizardlm 7b
Id: d7g_pjw6fFU
Length: 12min 12sec (732 seconds)
Published: Sat Apr 29 2023