Why & When You Should Use Claude 3 Over ChatGPT

Video Statistics and Information

Captions
Claude 3: another large language model that claims to beat GPT-4 in benchmarks according to Anthropic, but also in practice according to a lot of the internet. I really wanted to take my time before releasing this video, because the question today is: should you drop GPT-4 for Claude 3? The short answer is that it depends on what you're doing, but probably. It lacks most of ChatGPT's features, but the foundational model is really good for certain use cases. I haven't really left my apartment since release; I tested this in every way I could conceive, and here's what I learned.

Claude 3 by Anthropic: the GPT-4 killer? First things first, I think it makes sense that a lot of people compare this to GPT-4. It is the king in the category of large language models, and it has held that spot ever since release for a reason: it's just really damn good. Although a lot of alternatives like open-source models or Gemini have come out, none of them have really dethroned GPT-4 in terms of usability and consumer preference. But I think this might have changed now.

So here's the plan. First I'll give you a quick rundown of everything you need to know, really the key points for you as a user: what specs does this have, and what matters in terms of usability. Then I want to dive right into use cases, because what I did is try this on all the ways I use large language models on a day-to-day basis. There are plenty of niche use cases, fancy workflows, and specific automations I have, but those are not the day-to-day ones. Things like content creation assistance or idea generation are what I use these tools for all the time, and that's what I care about, so that's what we'll be looking at today. I'll give my honest take on whether I'll be using this over GPT-4 or not, and if yes, why. But before we even talk about the specs, let me show you a site where you can actually use it for free.
If you head over to chat.lmsys.org, you can go to Direct Chat and pick Claude 3 Opus, their new flagship model. They released multiple models, and you can check out all the details in the announcement, but this video is not going to be a summary of that blog post, even though it contains a lot of great information, like "yay, it wins all the benchmarks." Fantastic, we know that; so do many other models, and in practice they're not better. As a power user I've kind of stopped even looking at that. Winning on all benchmarks is great; let's move on. What matters to me is retrieval, pricing, speed, and the quality of the outputs.

Basically, it's priced at $20 a month, but this website lets you use it for free. Sometimes it's a bit overloaded, but hey, it's free, so you can go ahead and test it out. If you go to the Arena (side by side), you can compare it to GPT-4: run a prompt and you get both outputs, GPT-4 included, all for free, which is kind of wild. They have VC funding and they basically want to create a leaderboard for chatbots, which they're successfully doing; this is one of the best ways to evaluate different models. It only updates every two to three weeks, so the leaderboard doesn't reflect Claude 3 yet. But for me, as somebody sitting in Europe, it's a way to use this without a VPN, which brings up the next point: Claude 3 is not available in Europe, and the best model, Opus, is gated behind a $20 paywall.

Those are some of the most important points for a user, besides the fact that it has a 200k context window. If you're using GPT-4 today inside ChatGPT, you have a 32k context window, but it retrieves information from within it wonderfully. If you use the 128k context window of the GPT-4 API, it's not so perfect anymore; sometimes information in the middle just gets lost. As you might know, this gets tested with a benchmark called "needle in a haystack," where a little line is hidden inside a very long document that maxes out the context, and the model is then prompted to retrieve that piece of information. This graph really matters because it visualizes how well the model retrieves the hidden information. In other words, we have a very large context window that actually works, in a model that is extremely powerful. That looks very promising across all dimensions.
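This kind of retrieval test is easy to reproduce yourself. Here's a minimal sketch of how a needle-in-a-haystack check could be wired up; `ask_model` is just a hypothetical stand-in for whichever chat API you're evaluating, and the filler text and pass/fail check are deliberate simplifications of what the published benchmark actually does:

```python
# Minimal needle-in-a-haystack sketch: bury one fact in a long document,
# then check whether the model can retrieve it at different depths.

def build_haystack(filler_line: str, needle: str, total_lines: int, needle_position: float) -> str:
    """Return a long document with the needle hidden at a relative depth (0.0-1.0)."""
    lines = [filler_line] * total_lines
    lines[int(needle_position * (total_lines - 1))] = needle
    return "\n".join(lines)

def run_test(ask_model, needle_fact: str, question: str, total_lines: int = 5000) -> dict:
    results = {}
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):  # where in the context the needle sits
        doc = build_haystack(
            filler_line="The quick brown fox jumps over the lazy dog.",
            needle=needle_fact,
            total_lines=total_lines,
            needle_position=depth,
        )
        answer = ask_model(f"{doc}\n\nQuestion: {question}\nAnswer with the exact fact.")
        results[depth] = needle_fact.lower() in answer.lower()  # crude pass/fail check
    return results
```

You would then run this once per model and compare how often each one finds the needle at each depth; the real benchmark does the same thing across many context lengths and depths to produce the heatmap shown in the announcement.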
This is the interface: it's nice and intuitive. You have your history down here, you can start new chats, attach PDFs or images. I do have to say, if you compare this web interface to ChatGPT, it lacks pretty much everything ChatGPT has outside of text generation: there's no code interpreter, no image generation, no voice input or output, no plugins (aka actions), no custom instructions, and you can't edit messages you sent previously. But the core of this product is the answers it gives, so let's talk about that. How does it do? Really well. On many super basic prompts, like "write me an essay" or "research this topic," it performs pretty much equally to GPT-4. By the way, everything I'm about to say here is purely subjective; this is the perspective of a power user who spends pretty much all his time experimenting with these tools and then teaching other people what he finds. But I have to say, at the base level it just seemed identical.

Then, if you go a little deeper and start expanding the context — and if you've been watching this channel, you know that the more context you provide in the prompt, the more you can expect from the output: it will be custom-tailored and more relevant — you're going to get incredible results. I want to start with one use case that really blew me away, so I'll show you this little conversation I had with it, because this one really impressed me. The prompt is super basic. I like to do this a lot, and it's how I teach it in the course and in previous YouTube videos: you can use very simple prompts if you pair them with custom instructions. I have my own set of custom instructions that I've crafted for myself over time down here, and then I just include this simple prompt. But as a third piece of context, I include a screenshot of my 12 most recent YouTube videos. So it takes the context from my custom instructions and from the image, which is very rich in data: view counts, titles, all the thumbnails. Then all I practically need is a simple prompt like this.

And here's the deal: I agree with most of this result. These are fantastic video ideas, all of them. It proposes various shows, and when I look at them I just have a feeling of "yes" — and again, this is more of a feeling than anything else, but then picking videos is more of a feeling than anything else. You can look at data to inform the decision, but at the end of the day it's "yep, this makes sense, I want to create this." In this case my gut tells me these are all incredibly spot-on. Look at this: a ChatGPT memory series, diving into how the model builds up context and memory during a conversation and demonstrating multi-step interactions — absolutely. That might not be the final packaging for the video, but it's a great concept I would like to do. Hands-on tutorials on prompt engineering — I have a whole library of those videos; you can check out a playlist on the channel. Comparison of ChatGPT with other large language models — that's what we're doing right now. AI tools of the week — that's my Friday show. All of these are relevant.

But if you run the same thing inside ChatGPT — the only difference being that the instructions sit inside the custom instructions, as I usually have them — and I look at those ideas, they're all okay, but maybe two or three of them are something I would actually want to create. I mean, it makes sense: AI ethics and governance, content tracing the history of AI — that might be interesting, but it's not what we do on this channel. We're focused on what's happening today and what you can use today, not on the history of AI. These are all relevant topics, but they're not relevant to me, and I did provide a lot of context: I gave it the 12 videos I just created. If somebody showed me that this was their YouTube channel and asked what kind of videos they should keep creating, I don't think I would recommend reviewing AI startup pitches or making content about the history of AI. Again, it's all fantastic stuff, but it's just not what I do; it's not the context I provided. The custom instructions clearly say we focus on generative AI, specifically ChatGPT and related technologies. Why does it give me recommendations like this? I don't know, but there's a reason I kind of gave up on some of these use cases:
the results were just never that good. Claude nailed this; these are great. So that's one use case.

But if you go deeper, the one thing I really found is that it's just so good at taking in images. It genuinely feels different when you work with images, and if you want a quantitative way of expressing that, you can look at the vision benchmarks and how Opus outperforms GPT-4V. The best way I can describe it is that GPT-4 feels like a large language model and a vision model plugged into each other and made to work together. That's fine, but Claude, just like Gemini, simply performs differently because it's multimodal from the ground up. And that is literally the case: if you use GPT-4 vision through the API rather than through ChatGPT, they're two different API endpoints. So from a practical point of view this really blew me away, and all the other image use cases were better too. When I compared them on complex images — for example, this one I found on Reddit — Claude described it perfectly, not a single mistake as far as I can tell, while ChatGPT went ahead and said the left snowman is wearing a green hat with a red band and a small holly decoration (okay, fair enough) and a blue scarf. My man, the left snowman is not wearing a blue scarf. It's a minor thing, so fair enough, who cares? Well, if you use this stuff for work and inside your automations, you do care. You're not going to be looking over the model's shoulder on every generation; you just want it to work. So from everything I've seen so far, when you're working with images, Claude clearly wins, and this is the area I tested extensively, because I love prompting with images. It's the simplest way of putting in a lot of context. When I just want to get something done, I don't spend 90 minutes engineering the prompt; I do that for tasks that repeat. For a quick one-time prompt, I throw an image at it, and that's the context I provide, along with some custom instructions and maybe two or three extra sentences. I use these models to keep me efficient: I want to be fast on my feet, I want an assistant, a coworker that works together with me, and for that I use images a lot — and Claude is just better at that.
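For reference, sending an image to Claude through the API looks roughly like this. This is only a minimal sketch using the Anthropic Python SDK: the file name is made up, and the model ID is the Opus identifier at the time of recording, so double-check both against the current docs.

```python
import base64
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

# Load a local screenshot (hypothetical file name) and base64-encode it,
# since the Messages API takes images as base64 content blocks.
with open("recent_videos_screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-opus-20240229",  # Opus model ID as of March 2024
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_b64}},
            {"type": "text",
             "text": "Describe this image in detail, including any text you can read."},
        ],
    }],
)

print(message.content[0].text)
```

The point is that the image and the text prompt travel in the same message as peer content blocks, which is what "multimodal from the ground up" means in practice here.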
But not to get stuck on this point, let's move on. Here's another use case that is very important to me. If you've been following the channel, you know we have a free newsletter that comes with a massive ChatGPT resource, and my personal favorite part of it is the prompt generators: for 10 different professions you get 10 prompt generators, and you customize them for yourself. We also sell a bigger product where we pre-generated a thousand of them, so there's no work left to do. This is one of them, and what it basically does is generate fairly universal prompt formulas based on the custom instructions at the bottom. This particular one is for a growth hacker, and I tested it rigorously in both models. I have a lot of experience with this prompt; I keep using it over and over in different variations, based on the custom instructions, to find new use cases for AI. It's really my favorite answer when people ask how I find new things for ChatGPT to do: run this prompt, customize the custom instructions at the bottom, and it spits out what you can do today, because that's how the prompt is designed.

I ran this many times, and I found that it performs equally well in both. This is a prompt where I've seen the output hundreds of times, so I feel I can be quite objective: I don't care whether I use Claude 3 or GPT-4 here; both work super well. One note is that the GPT-4 output is limited: when I run it in GPT-4 it gives me around 22 prompts, depending on their length, because the output is capped. It's no big deal — I just prompt "continue" or press the button that keeps generating — but Claude has a larger output limit, which is nice to have.

Here's an interesting point where Claude actually does differentiate itself, though. I have a workflow where I take one of these prompts and improve it based on the specific context I'm using it in. This is the result of that workflow: a prompt that is a bit more fleshed out. This would be the ChatGPT generation, and this the Claude generation. I'm not going to go into all the details — that's not the point of this video — but I do prefer this version: it's more detailed, more actionable, and it preserves the variables more effectively, which is what I want based on my input. I found this to be consistent across multiple prompts and multiple prompt-generation workflows. So my conclusion is that if you're using a large language model for prompt engineering, Claude is actually significantly better. Good to know, right?

Then I tested it for image prompt generation. We also covered this on the channel, and I gave you the prompt for it: you can create incredible photorealistic images because all I do at the end of the prompt is say "a cat with a hat," and it fleshes that out with rich detail, which then lets you customize it easily. It turns out there's no difference whatsoever between the ChatGPT version and the Claude 3 version — as you can see here, the first one is ChatGPT, the second is Claude, essentially the same thing. So there it doesn't matter, but for generating prompts for large language models, I did find that it matters. Now, this might depend on your workflow and your prompting, but I'm just trying to compare apples to apples. I've been developing some of these prompts for quite a while, and I've been surprised by how many perform better in Claude right off the bat.

Not to hype it up too much, though: there were a few use cases where it completely failed. For example, here's another prompt I found on Reddit, very simple: "Sam has 50 books in his room. He reads 5 of them. How many books are left in his room?" Claude 3 seems to think the answer is 45, but he just read the books, he didn't remove them — they're still in the room, so it should be 50. ChatGPT got this right on the first try. Beyond that, I ran a bunch of other tests, like creating palindromes or code generation, and honestly, apples to apples, it feels too early for me to have an opinion. Both failed at palindromes, and code generation really depends on what you're generating and what package you're working in, so I don't have a firm view on that yet.
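If you want to run this kind of apples-to-apples comparison yourself outside the arena, a rough side-by-side harness could look like the sketch below. The model IDs are assumptions that may have changed since recording, and both SDKs expect their API keys in the environment.

```python
# Rough sketch: send the same prompt to GPT-4 and Claude 3 Opus and print both answers.
from openai import OpenAI
import anthropic

openai_client = OpenAI()          # reads OPENAI_API_KEY
anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

def ask_gpt4(prompt: str) -> str:
    resp = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    msg = anthropic_client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

prompt = "Sam has 50 books in his room. He reads 5 of them. How many books are left in his room?"
print("GPT-4:  ", ask_gpt4(prompt))
print("Claude: ", ask_claude(prompt))
```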
Again, Claude does win on all the benchmarks, but at this point benchmarks can't be trusted too much, because the entire industry knows models are evaluated on them. And while Anthropic does claim that none of these benchmark questions are included in the training data, the training data is not public, so what am I missing — how do we know that's true? A year ago every single LLM struggled to generate a Snake game, and now every small release can do it, because the labs know people will try it: YouTubers will go ahead, generate Snake games, and say "oh, it can do it, it's a really good model, you should use it."

But I want to share one more thing with you: a specific prompt from another creator — fantastic work on this one, I really have to give it to him. Synaptic Labs with Professor Synapse killed it. It's a really effective way of enhancing your ChatGPT experience, especially if you don't really know what you're doing with prompting: it asks you clarifying questions and spawns a specific character to help you. Obviously, if you do it manually, set up exactly the persona you want, and craft the prompt the way you need, that will probably be more effective, but it's a great starting point. I tested the prompt in here — this is essentially the Professor Synapse prompt — and I was beyond surprised, I might even say shocked, that it didn't work. Because Claude is really... correct: it's positioned as an ethical AI that will do no harm, and that is Anthropic's main thing. They're looking to scale this for enterprises; that's their main selling point. If you look at how the paper is structured and how they argue everything, I don't think they're trying to build a consumer product here. Look at the potential uses they list — task automation, R&D, strategy — and their whole ethos is "we make safe AI." OpenAI might be accelerationist, but Anthropic makes the safe one, the one you can rely on. The downside is that a lot of things won't work. A lot of persona modeling, where you tell the model to act as a certain persona, won't work, because they instruct it not to accept role-playing at all, ever. They do this to prevent jailbreaking — it's March 2024 and there are still ways to jailbreak GPT-4, maybe not fully, but you can do a lot — and with Claude none of that flies; they're super strict on this front. So this prompt just doesn't work at all, and that's a big limitation, because a lot of prompts I use rely on persona modeling as a base. The custom instructions I teach on this channel — there are many videos on creating custom instructions, and on building GPTs — all still work, because I don't take the role-playing approach. We created the AI Advantage approach, where 24 building blocks represent different aspects of the persona, and it never directly says "you are XYZ"; it's more like "here's my profession, here's my goal, here are my language preferences," and much more. That works universally across all LLMs; persona prompts don't work in Claude.

To round it out, I should mention creative writing. This one is really hard to judge, because it's super subjective, and I need more time to form a solid opinion I'm ready to share. My initial take is that for content creation it's very similar to GPT-4, maybe even a bit worse. That's just my intuition, because when you create and plan content with GPT-4, it acts more as a director and takes more responsibility, at least in my workflows, whereas Claude just gives you the text, and it's not exceptional.
I personally have very high standards for my content, so I would never use the scripts the AI generates for me. But as mentioned, I do love it for ideation, and there Claude just excels: it's going to be my go-to for brainstorming and idea generation from here on, and that's huge. Same for prompt improvement; it's simply better there.

So what is my initial conclusion after using this on the things that actually matter to me, the use cases I run day-to-day? I'm bookmarking this tab and making sure the bookmark sits right next to ChatGPT, because from here on out I'll use both. For all the use cases I haven't tried yet, I'll test both, because it really does seem better at certain things. And any time I want to use an image as context, I'll default to Claude from now on — and honestly, that happens a lot; I use images in my prompts all the time.

So there you go, I hope all of that was helpful. All I'll add, as somebody who's been following OpenAI extremely closely, is that in typical OpenAI fashion it's probably a matter of days, not weeks, until they release their next big thing, because Claude 3, as of today, is going to compete away a lot of OpenAI users. It's just too good, and I don't think I'm the only one who will arrive at that conclusion. I'm super curious to hear what you think, though — leave a comment below. What's your take on Claude? Are you using it more than GPT? And what I care most about: for which use cases do you prefer GPT or Claude? Let's turn the comment section into a place to brainstorm all of this. Other than that, if you made it all the way here, here's a video you'd probably love, because it shows you how to build a GPT from scratch with my super prompt, where all it takes is a few words of input. What I'll do now is actually get some sleep — I've been comparing these for way too long today. I'll see you soon.
Info
Channel: The AI Advantage
Views: 86,679
Keywords: theaiadvantage, aiadvantage, chatgpt, ai, chatbot, advantage, artificial intelligence, gpt-4, openai, ai advantage, igor
Id: CEI4e2SQnzo
Length: 16min 59sec (1019 seconds)
Published: Wed Mar 06 2024