Claude 3 Sonnet vs Opus Coding Challenge

Video Statistics and Information

Captions
In this video, we're going to test Claude 3 Sonnet versus Claude 3 Opus, two flavors of the new Claude 3 released by Anthropic. Both models should perform fairly well in coding, but Claude 3 Opus should definitely take the crown. However, keep in mind that Claude 3 Opus is four times the cost of Claude 3 Sonnet and is also half the speed, meaning you will wait longer for the generation to happen. For our testing environment, we have the testing playground here in MindStudio, where our team tests all the different models as soon as they come out. Here in the workflows panel, we only have a start block and a chat endpoint, meaning this is pretty much a chatbot using the latest model in each of these flows. Now let's click on publish, open up the testing playground, and test it by coding a basic snake game. You can see I ran this test already in the past, but let's do it together here. Let's select Claude 3 Sonnet first. Our prompt is very simple: "Code a Python snake game. I want something copy-paste and ready to go." We will paste the code into a Replit instance and see if it works. This test is ideal for non-coders, meaning that I really need this code to work or this video will not look as impressive. Let's click on submit, and here you go. You can see that the speed of Claude 3 Sonnet is quite good. It actually matches GPT-3.5 in almost all our testing, so it should be ideal for very fast generations. It looks like it's done, and it's even telling us how to run this code, but all we need to do now is copy it and upload it to our Replit instance. Let's paste it in and click on run. It looks like the Python game is loading, and this is definitely doing something. And there you go, it is moving. I'm now using my arrows, so let's try to get that food. There we go. We don't have a counting system for our points or anything like that, but the game is working, and my prompt was quite literally just "code a Python snake game." This is incredibly impressive, especially given that Claude 3 Sonnet is cheaper than Claude 2.1,
and Claude 2.1 was never the best at coding. Now let's test it with Claude 3 Opus. Let's go back to the testing playground and open a new thread for Claude 3 Opus, and let's try the exact same prompt: "Code a Python snake game. I want something copy-paste and ready to go." Let's submit. One thing you'll notice out of the box is that Claude 3 Opus is significantly slower than Claude 3 Sonnet. That being said, this is not necessarily a drawback as long as the final output is better. Let's see how it does. There we go, this is the final code, and we see that "this code creates a basic snake game using Pygame, with the game window set to a size of 800 by 600 pixels," so it just gives us a little bit of detail about what it has done. Let's scroll back up, click on copy, and enter it inside our Replit instance. Let's replace the old code with the new code and then run it again. Here is our output, and as you can see, it also works fine. The snake gets longer as soon as you eat some fruit, which seems to be a mechanic that the other game didn't perform as well at, and we can just keep playing our snake game. We also have a game over message and a message that says "press Q to quit or C to play again." Let's press C, and it does indeed work. We can now play again. Very impressive results from both models, but again, remember that Claude 3 Sonnet is significantly cheaper and significantly faster, and you should be able to use it for many more code generations at the same cost. Going back to the testing playground, we can see that both models operated with a temperature of 0.49 and a maximum response size of 4,096 tokens. Clicking into the debugger section, we can also see that Claude 3 Opus had an average speed of 21.875 tokens per second, while Claude 3 Sonnet had a speed of 56.68 tokens per second, which is very impressive and much faster even than GPT-3.5 Turbo. We highly suggest you give Claude 3 Sonnet a try, and just in case it doesn't work, you can give Claude 3 Opus a try, which seems to be even better than GPT-4 at the vast majority of use cases.
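To put the debugger figures quoted above in perspective, here is a minimal arithmetic sketch. It reads the captioned speeds as 21.875 and 56.68 tokens per second and assumes a steady streaming rate with no startup latency, which is a simplification and not something the video measures. The function name `seconds_to_generate` is made up for illustration.

```python
# Throughput figures as read from the video's captions (tokens per second).
OPUS_TPS = 21.875
SONNET_TPS = 56.68
MAX_TOKENS = 4096  # maximum response size reported for both models


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Estimate the time to stream `tokens` at a constant rate (assumption)."""
    return tokens / tokens_per_second


speed_ratio = SONNET_TPS / OPUS_TPS
opus_seconds = seconds_to_generate(MAX_TOKENS, OPUS_TPS)
sonnet_seconds = seconds_to_generate(MAX_TOKENS, SONNET_TPS)

print(f"Sonnet speed relative to Opus: {speed_ratio:.2f}x")
print(f"Full {MAX_TOKENS}-token response, Opus: {opus_seconds:.0f} s")
print(f"Full {MAX_TOKENS}-token response, Sonnet: {sonnet_seconds:.0f} s")
```

Under these assumptions, Sonnet comes out roughly 2.6 times faster, so a maximum-length response would take a little over three minutes on Opus versus a little over one minute on Sonnet.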
Both models are also incredibly good copywriters. They can help with analysis, and they can also help, as we just saw, with very cool coding challenges. Happy building, and you can try both models right now on the MindStudio platform.
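For readers curious about the growth mechanic mentioned in the captions (the snake gets longer when it eats fruit), the following is a minimal sketch of that logic in plain Python. It is not the code either model produced in the video, which is never shown in full. The grid size and the names `place_food` and `step` are hypothetical, and rendering and key handling (done with Pygame in the video) are left out.

```python
import random
from collections import deque

GRID_W, GRID_H = 20, 15  # hypothetical grid size, not taken from the video


def place_food(snake, rng):
    """Pick a random cell that the snake does not occupy."""
    free = [(x, y) for x in range(GRID_W) for y in range(GRID_H)
            if (x, y) not in snake]
    return rng.choice(free)


def step(snake, direction, food, rng):
    """Advance the game by one tick and return (snake, food, alive)."""
    head_x, head_y = snake[0]
    dx, dy = direction
    new_head = (head_x + dx, head_y + dy)
    hit_wall = not (0 <= new_head[0] < GRID_W and 0 <= new_head[1] < GRID_H)
    # Simplification: moving onto any body cell, including the tail, ends the game.
    if hit_wall or new_head in snake:
        return snake, food, False
    snake.appendleft(new_head)
    if new_head == food:
        food = place_food(snake, rng)  # tail is kept, so the snake grows by one
    else:
        snake.pop()  # ordinary move: drop the tail so the length stays the same
    return snake, food, True


rng = random.Random(0)
snake = deque([(5, 5), (4, 5), (3, 5)])  # head first
food = (6, 5)
snake, food, alive = step(snake, (1, 0), food, rng)
print(len(snake), alive)  # prints "4 True": the snake ate the food and grew
```

A score counter, which the narrator notes was missing from the first game, could be added by incrementing a number in the same branch that places new food.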
Info
Channel: MindStudio
Views: 2,534
Id: RGNH35kpeRo
Length: 4min 55sec (295 seconds)
Published: Thu Mar 07 2024