DeepSeek Coder: AI Writes Code | Free LLM For Code Generation Beats ChatGPT, ChatDev & Code Llama

Captions
Hey everyone, my name is Venelin, and in this video we're going to have a look at DeepSeek Coder. This is an open-source large language model by DeepSeek AI that the authors claim can outperform ChatGPT (GPT-3.5) on coding tasks. We're going to have a look at what the model is, how it was trained, where you can get it, and then we're going to try it out on a couple of coding tasks. Let's get started.

The DeepSeek Coder models were trained on 2 trillion tokens; about 87% of that was code and about 13% was natural language in English and Chinese. The models are provided in different sizes, and as you can see, the authors describe sizes from 1.3 billion parameters up to 33 billion parameters. Another very important thing is that these models have a large context window: you can put about 16K tokens into them. There is also a fine-tuned version of the base model, called DeepSeek Coder Instruct, which was additionally fine-tuned with 2 billion tokens of instruction data. So this is pretty much a summary of what you get with these models, and this is the UI that we're going to try in a bit.

The more important part here is that the performance of this model, at least according to the authors, is amazing compared to other open large language models. Here they're comparing the DeepSeek Coder model to CodeLlama, the 34-billion-parameter model, which was the largest one available, and the claim is that DeepSeek Coder is performing much better; you can see the percentages here, roughly 8 to 10% better performance on different benchmarks. The authors also claim that DeepSeek Coder Instruct performs better than GPT-3.5 Turbo on the HumanEval benchmark. Here is a breakdown of what you get with this model: in red is the DeepSeek Coder model, and you'll see that it is performing vastly better compared even to CodeLlama and the other StarCoder models, etc. So at least it looks to me that, especially on the Python performance part, you'll get much better results compared to the open large language models that we have, and in this video we're going to actually try the Python coding capabilities of this model.

Here is a breakdown of what the model can do, and you can see that the DeepSeek Coder base model is outperforming pretty much everything across the board in this table, performing much better on all of the evaluation examples. Here the instruction-tuned model is compared to GPT-3.5 Turbo and GPT-4, and you see that the 33-billion-parameter model is doing very well compared to the ChatGPT model: it is outperforming it on pretty much everything except for the MBPP benchmark, where it is very close. Here are the results for GPT-4, and as you can see, the model is not that far off on those benchmarks either, so I would put it somewhere between ChatGPT and GPT-4, right in the middle.

There is a GitHub repository associated with the model, and there you can see the same graphic and tables from the report on the official page. They also describe some of the methods they use to generate the training data, and you will see that they apply the same filtering rules as the StarCoder data in order to filter the dataset; this is pretty much the pipeline that they're using to get a higher-quality coding dataset. You can also see that the model is available from Hugging Face, along with an example of how you can load it right within the Transformers library; the models themselves are under the deepseek-ai organization on Hugging Face, and the available models range from 1.3 billion parameters up to 33 billion parameters for the instruct models.
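For reference, here is a minimal sketch of loading one of these checkpoints with the Transformers library, roughly along the lines of the snippet shown on the page; the specific model id (the 1.3B instruct checkpoint) and the generation settings are my own choices for illustration, not necessarily the ones from the video:

```python
# Minimal sketch: loading a DeepSeek Coder checkpoint with Hugging Face Transformers.
# The model id and generation settings are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-instruct"  # smallest instruct variant
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Write a function in Python that calculates the square of the sum of two numbers."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```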
This is the official UI provided by DeepSeek AI for the DeepSeek Coder model, and it is actually running the 33-billion-parameter model on the left. Here I have a Google Colab notebook that is already running, and in it I'm going to execute the code that the model provides for us.

Let's start with something simple: write a function in Python that calculates the square of the sum of two numbers. This is very similar to what we had in previous videos; let's see whether or not the model can do this for us. You'll see that the model is doing the inference very fast. I'm going to copy the code, and you can see that the model is outputting a very nice, very succinct function. Let's check the response with the provided values, and you see that the output is correctly predicted by the model.

For our next example I'm going to copy and paste my prompt: write a function in Python that splits a list into three equal parts and returns a list with a random element from each sublist. This is again very similar to what we had in previous videos, and here you'll see that the model is providing us with the function again, which is great. Let me just paste it in here and try out the function; I'm going to copy the code for this one and run it, just to make it a bit easier. The output looks to be correct. Let's have a look at the function first: it checks whether or not the list has fewer than three numbers, and if that is the case it raises a ValueError. Then the length of each sublist is calculated, which is great, and the comments also help. It splits the list into three equal parts using indexing into the list, then we have a random choice, and the random module is imported right here. So it appears to be working just great. Let's try something here... yeah, in this case it doesn't appear to be working as expected: I was expecting to get something like one, then three, then five, or something like that, since the last element was this one. So this function is not entirely perfect, but maybe my prompting was off as well. A reconstructed sketch of the kind of function the model produced is shown below.
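Here is a sketch of what such a function might look like, reconstructed from the walkthrough above; the function name and the exact split logic are my assumptions, not the model's verbatim output:

```python
import random

def split_and_pick(numbers):
    # Reconstructed sketch, not the model's exact code.
    # Split a list into three roughly equal parts and return one random
    # element from each part.
    if len(numbers) < 3:
        raise ValueError("The list must contain at least three elements")

    part_len = len(numbers) // 3
    parts = [
        numbers[:part_len],
        numbers[part_len:2 * part_len],
        numbers[2 * part_len:],  # the last part absorbs any remainder
    ]
    return [random.choice(part) for part in parts]

print(split_and_pick([1, 2, 3, 4, 5, 6]))  # e.g. [2, 3, 6]
```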
Let's try something a bit more involved. Here is my prompt: write a function that generates an excuse when my girlfriend asks me to do something like go to the mall or to the movies, but I only want to do some deadlifts and go for a walk; the excuses must use the personality of Dwight Schrute from The Office. This is the function that we're going to get, and again you can see that the output is quite fast. Here is the final function; let's check it out. I'm going to paste it in here — it even included some sample code — and execute it to actually generate an excuse: "I'm afraid I can't do that right now, I'm here to do my deadlifts and walks." All right, so it thinks that it is actually Dwight: "I'm sorry Dwight, but I'm not available, I'm here to put on muscle and take a walk." Okay, so it didn't actually get what I was telling it to do, but the code appears to be working just fine: it generated a list of excuses and then used the random.choice function to get a single excuse. So I would say that the code works all right, but the model didn't quite get what we were asking for in the prompt. Just for completeness, here is the exact same prompt given to ChatGPT, and you'll see that ChatGPT actually didn't want to provide any code that generates an excuse; the response was heavily censored. So at least on that point, it looks like the open-source DeepSeek Coder model is giving us much better results than ChatGPT.

Next we are going to try some LeetCode problems, and I'm going to start with something very simple. This is the Two Sum problem, which is very easy, and I can attest to that. I'm going to copy it; the problem is: given an array of integers and an integer target, return indices of the two numbers such that they add up to the target. You may assume that each input has exactly one solution, and you may not use the same element twice. I'm going to give this as the prompt and add that the solution must be in a class Solution, since this is required by the LeetCode examples, and that the method name should be twoSum. Let's see what we get, and keep in mind that these problems are widely available, and their solutions are also widely available on GitHub, so we might just be getting something that the model has completely memorized. Let's copy this, and here is the funny, or easy, part, I would say: we can essentially take the solution and submit it, and you can see that it was accepted. It even beats about 94% of the users with Python 3, since the runtime is very fast; memory is not that efficient for this solution, but you can see that the model is performing very nicely on the runtime part, and the solution was accepted.

Let's continue with a medium problem, Find the Winner of an Array Game: given an array arr of distinct integers and an integer k, a game is played between the first two elements of the array; in each round we compare the two, the larger integer wins and remains at position 0, and the game ends when an integer wins k consecutive rounds. So essentially you are pitting integers against each other; it is guaranteed that there will be a winner, and we have to return the integer which wins the game. I'm going to give the same thing as the prompt, but again add that the solution must have the getWinner method. You can see that it produces the correct function. Let's get the solution, and immediately you'll see that it involves a while loop, so it should perform pretty nicely according to the speed requirements. Again, this solution was accepted, and the runtime performance is amazing as well: better than 94% of all users, and it beats 85% of users on the memory requirements, so this one is very light on memory as well. So this was a medium problem, as you can see, Find the Winner of an Array Game; a sketch of the kind of solution that gets accepted for it is shown below.
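For reference, here is a sketch of a typical accepted approach to this problem, using a single pass over the array; the model's actual answer used a while loop, so this is an illustration of the idea rather than its exact output:

```python
from typing import List

class Solution:
    def getWinner(self, arr: List[int], k: int) -> int:
        # Sketch of a standard approach, not the model's verbatim output.
        current = arr[0]   # current champion
        wins = 0           # consecutive wins of the current champion
        for challenger in arr[1:]:
            if challenger > current:
                current, wins = challenger, 1
            else:
                wins += 1
            if wins == k:
                return current
        # If k exceeds the number of possible rounds, the overall maximum wins.
        return current
```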
So at least up to this point, if the models can solve easy and medium problems, you can actually use these types of models to solve interview problems, and some companies are asking just for easy and medium problems. Of course, hard problems is where the fun begins, I would say. This one is Count of Range Sum: given an integer array nums and two integers lower and upper, return the number of range sums that lie in [lower, upper] inclusive, where a range sum S(i, j) is defined as the sum of the elements in nums between indices i and j inclusive, with i <= j. At least from the description you might notice that this should probably involve some sort of binary search, but of course that alone wouldn't be enough to solve this problem. Let's pass in the prompt, and I'm going to take the solution part; here I want to have the countRangeSum method. Let's see what we get. Again, this is a hard problem, so I wouldn't actually expect this model to solve it; I would be really impressed if it does. Here is the output: "this problem can be solved using a combination of merge sort and a binary indexed tree; the idea is to use the binary indexed tree to count the number of range sums that lie in the given range." Okay, it sounds really interesting. Let me copy this, add it, and submit it, and let's see whether or not we have a solution. And it actually did solve the problem. A couple of hours ago I tried the exact same prompt and the model wasn't able to give us a working solution, but as you can see, this one is even solving a hard problem on LeetCode. I believe this is a relatively new problem, though I'm not so sure about that, and the solution is pretty involved, I would say. Let's check the output: in this solution we first calculate the prefix sums of the array, so this makes sense; then we sort the prefix sums and remove duplicates, which again makes sense. We use a binary indexed tree to count the number of elements that are less than or equal to a given number, since we want the low and high bounds: for each prefix sum we find the range of values that fall between lower and upper, we use the BIT to count the number of elements within that range, and finally we update the BIT with the current prefix sum. So it's a pretty involved use of a binary indexed tree, and as you can see, it is also using the bisect functions to do some binary searching on parts of the array. It appears that the model is performing very well on these types of tasks, and I'm pretty impressed with that.

In my previous video we've seen how we can use the AutoGen library to generate a SaaS landing page, and here I'm going to take the same prompt that I used there and paste it in. So we have: build a landing page for a SaaS that creates chatbots using ChatGPT from a set of documents; think of a name for the product and a short tagline; add a form and collect user emails for when the product is ready; add an admin page that shows a table of the emails; use Flask for the app, Tailwind CSS for styling, and SQLite3 as a database. Here is the outline of the Flask application that we get, and it also provides us with an index page and an admin page, probably for the list of emails. Let's see what we get for the index page: you see that we are actually using sqlite3.connect, it creates an emails database for us, and it also renders the admin page. Then, in the main function, it connects to the database, creates the emails table, commits the transaction, and starts the app. So everything looks to be all right at first glance; I'm not sure about the templates. This is a very basic example: it doesn't include any styling or validation to make your application look professional and secure. A minimal sketch of this kind of app is shown below.
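As an illustration, here is a minimal sketch of this kind of Flask + SQLite3 email-collection app, reconstructed from the walkthrough above; the product name, route names, and inline HTML are my assumptions, and the model's version used separate template files and Tailwind CSS for styling, which are omitted here:

```python
# Minimal sketch of a Flask + SQLite3 email-collection app.
# Product name, routes, and inline HTML are assumptions, not the model's exact output.
import sqlite3
from flask import Flask, redirect, render_template_string, request

app = Flask(__name__)
DB = "emails.db"

def init_db():
    # Create the emails table if it does not exist yet.
    with sqlite3.connect(DB) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS emails (email TEXT NOT NULL)")

@app.route("/", methods=["GET", "POST"])
def index():
    # Landing page with a simple signup form.
    if request.method == "POST":
        with sqlite3.connect(DB) as conn:
            conn.execute("INSERT INTO emails (email) VALUES (?)",
                         (request.form["email"],))
        return redirect("/")
    return render_template_string(
        "<h1>DocChat</h1><p>Turn your documents into a chatbot.</p>"
        '<form method="post"><input name="email" type="email" required>'
        "<button type='submit'>Notify me</button></form>"
    )

@app.route("/admin")
def admin():
    # Admin page that shows a table of all collected emails.
    with sqlite3.connect(DB) as conn:
        rows = conn.execute("SELECT email FROM emails").fetchall()
    return render_template_string(
        "<table>{% for row in rows %}<tr><td>{{ row[0] }}</td></tr>{% endfor %}</table>",
        rows=rows,
    )

if __name__ == "__main__":
    init_db()
    app.run(debug=True)
```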
All right, the first thing that I don't see here is the part where I asked it to think of a name for the product and a short tagline; that is not included in this template. This is the application running as a Flask app, and you'll notice that it is actually running. I'm going to use my email from before, and if we go to the admin page, you'll see that the email has been stored right here. So the Flask app, even though it is simple, appears to be working just fine.

Let's see if DeepSeek Coder can create a very simple Flappy Bird game. I'm going to input my prompt here: create a very simple Flappy Bird game in Pygame; it should display the current score and end the game when the bird is below the screen or hits an obstacle. This is the output of the model: you see that it adds the correct imports, it initializes the Pygame framework, and then we have the width and the height, probably of the screen, some bird speed and some gravity as constants, and it loads the bird image. I'm not so sure about that, and I wouldn't say it's great, since we probably don't have any images, so: remove the loading of the images from the game. Okay, now it is using some shapes instead, which is much better, so this should work just fine now. Note that the input, or the instruction, is actually very well defined, and the response is pretty good compared to what you might get with other models: the bird is red and the obstacles are green. I took the code from the output, and this is it. I'm going to run the app now, and you'll notice that the bird is actually not falling down; the game is moving, but the bird isn't falling. So here is the prompt that I'm going to use: there is also no scoreboard on the screen. It's going to fix this for us, at least it says it will; okay, it adds the obstacles and then increases the score. So we can try the fixed version, and here is the result: you'll see that we have the scoreboard, the bird is actually falling down, and you need to press the keys. The obstacles are a bit far apart from each other, but otherwise everything appears to be working. Let's try to hit an obstacle — yeah, and the game ends.

Honestly, I'm pretty impressed with DeepSeek Coder's performance. The model's inference speed is very good, at least on their web page, and the model is also available from the Hugging Face repository; you can download various models from 1.3 billion to 33 billion parameters. We've tried the model on a variety of tasks, including LeetCode easy to hard problems, and it was able to solve all three of those problems on the first try, which was pretty amazing. Then we used the model to create a Flask web app, which it did: it was pretty simple, but still, it worked on the first try. Then we used the model to create a simple Flappy Bird clone, and it did that on the first try as well; of course it wasn't perfect, but the feedback we provided vastly improved the game, in my opinion. If you want to see more of this model, and maybe how we can use it in some agent library like LangChain or AutoGen, let me know down in the comments below. Please like, share, and subscribe, and also join the Discord channel that I'm going to link in the description of this video, and I'll see you in the next one. Bye!
Info
Channel: Venelin Valkov
Views: 4,375
Keywords: Machine Learning, Artificial Intelligence, Data Science, Deep Learning
Id: TWcZOrXFEqY
Length: 23min 42sec (1422 seconds)
Published: Sun Nov 05 2023