Master Claude 3 Haiku - The Crash Course!

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Okay. So a few weeks back, I did a video about the announcement of the Claude three family of models And at that stage, they basically released the sonnet model and the Opus model. And we had a look at the quality and we could see that there were very clearly, up there in the league of GPT 4, et cetera. the one model that wasn't out at that time was Haiku. And I made the comment that actually Haiku might be the most interesting one. Based on the fact that if you look at the stats in here, the Haiku model is very strong. Yet we know, that the cost of the Haiku model, is actually, extremely cheap. So when we have a look at the, the cost of the Haiku model in here, we can see that the input tokens. Our 25 cents per million tokens and output tokens are only a $1.25 per million tokens. Now remember also that this model is a fully multimodal model. This is not just text only. we can put images into this. just like we can with GPT-4 Vision. So when we compare it pricing-wise to GPT-4 Vision. There you're looking at 40x of the Haiku price for input tokens. so $10 for the 1 million input tokens and $30 for 1 million output tokens as opposed to 25 cents for input tokens and a $1.25 for this. Now at the time, I commented that this could really be the winner model is that, as much as Opus is really impressive and powerful, it's very expensive to use. if you've got a lot of output tokens there. But the Haiku model can get, very close to similar sorts of results for a lot of this stuff. And be so much cheaper than what's out there. Well, Now the Haiku model itself has now been out for a week or two itself, and people have been testing it out. I've been testing it out a lot. and, I've been really amazed just how good this model is. and when you compare it on the sort of, performance versus price versus speed, I really think this is the sort of winner model at the moment, for doing the vast majority of tasks. People are always trying to come up with tricks for these models or benchmarks and stuff like that for these models. that are often just nothing like how people are actually using these models in the real world. So one of the benchmarks that I do think is really interesting is the, LMSYS chat bot arena in here and just overnight they've gone and released this, updated, benchmark of this. And perhaps not surprisingly, we can see that, Claude 3 Opus now is actually the new king at the top of this. But look at this. If we come down tied for six place is Claude 3 Haiku. and this is why I want it to make a whole video about using Haiku, using some of anthropics prompting styles. I already talked about the Metaprompt in a previous video, but in this one, we're going to look at, some of the tricks for doing multimodal prompting, some of the tricks for generating exemplars, generating things out of these models And I'm going to focus almost solely on the Claude 3 Haiku model, which you can see, is even beating some versions of, GPT-4. in here, which is pretty amazing, that this small model and this model that can do multimodal stuff can be so much cheaper and yet up there with, quite a lot of these models out there. So in the previous video, I talked a little bit about, the prompt engineering and stuff like that. You should definitely come in read this as really a lot of useful information in here. A lot of things, talking about how the model is set up for doing different things. One of the things that I'll point out in code is this whole idea of wrapping exemplars in, these sort of, XML tags in here. and that just adding in some things like that, suddenly the quality of outputs that you're getting back, goes up quite a lot. So this is something that they talk about in here. They talk about using the XML tags. they talk about using examples and stuff like that. All of these things really amplify the results that you're going to get out of this model. so today what I'm going to focus on a lot is just some of the examples from the anthropic cookbook, I've played around with some of them. gone through and change some of them. There are a whole bunch of different ones in here. I'm not going to cover the function calling in this one today. Perhaps I'll do a separate video Just for that in the near future. but I really want to go through some of the stuff that you can do with images. Some of the stuff that you can do with multimodal stuff in here. All right, so let's jump into the code and have a look at getting started with the Claude Haiku model. So in this notebook, I'm going to basically be going through a bunch of things from the Anthropic Claude Cookbook, and also I've added some things in there to show you, a few different things about these models, in general, but also specifically about the Haiku model. now in the cookbook, they generally just use Opus for everything. so I will point out that a lot of the examples I'm going to be showing, if you compare it to the cookbook version, the cookbook version is actually using, the big Opus model and I'm actually going to be using the small Haiku model in here. Okay, so I've got some just standard imports here. I'm bringing in some LangChain stuff at the end. I thought I'd put a little bit of LangChain stuff in there. and I'm bringing in, the main one obviously is bringing in the Anthropic SDK here. Alright, so I've got it set up so I can use my Anthropic key. with Colab Secrets, and I'm cloning in the Anthropic Cookbook in here so that I can get access to all their images and stuff like that. All right, first off, we've got just basic text prompting here. So it's basically pretty standard. First off, you will, import and set up a client from the Anthropic Library. And then you'll use that to make your calls. Now, generally you'll be basically passing in, messages. so it conforms to the, open AI chat standard, ML chat that, Hugging Face uses and stuff like this, where you basically have a role of a user or assistant. You can have a system. And then you've got your content, which is actually the prompt that you're putting in here. just starting out, here you'll see that, okay, I'm basically just going to make a call. Pass in the prompt, write me some, write me the lyrics to a made up Taylor Swift song called Filled Out Space. and sure enough, we'll bring it back and every time, just so that you can see the model, I'm going to print out the model as well. So you can see this one is using Claude Haiku. And then we can see the lyrics to our made up song. Taylor Swift's song in here, and it's certainly, conforming to standards, for verse, chorus, verse, chorus, bridge, and that So this is the kind of standard thing that you would use for any sort of, normal prompt. Now you would add in, you can add in a system prompt in there as well, to guide it. That's something that you can, use. And especially if you happen to be basically passing in the content for the user as something, that an actual user is using, but you don't want them to actually see your sort of prompt guidelines or something. You would put those in the system prompt in here, but this is fundamentally how you're going to do most of your, standard text prompting. next up, one of the big things that people want to do is to get JSON out So getting JSON out is actually quite simple, on this model. and again, here I'm using the Haiku model, I'm not using the Opus model. So in the, in their examples, they tend to show this, these kind of things with the Opus model. And so I guess in theory, The Opus model should conform to things like JSON, better than this model, but I find that this model actually does a really good job of this. You can see here, this is one of their examples of give me a JSON dict with names, of famous athletes and their sports. And sure enough, it's basically returned back, here's a JSON dictionary with names of famous athletes and their respective sports. and we can, extract that out, quite easily just by, looking for where the JSON. dictionary starts and ends, but we can also do things like where, you can give it a sort of start to something. So this is going back to more of a completion idea of where here you can see that we're giving it the user with the same kind of thing. And then we're basically giving it, the here is the JSON bit, and we're starting with this. This one opening bracket here and we're letting it generate the rest of it. So then it's generating just this, so this is interesting. This wouldn't work on some of the models like, Gemini, for example, in that, in those ones, you have to go user assistant, user assistant, going through it. So this is cool that it works like this, straight out of the box here, for doing this. And then we can basically parse that and that bracket back in manually, and then find the end, and then just parse that out, as JSON again, for this. So this is like the thing that you, if you want to get stuff back in JSON is definitely, useful for this. So this gives you a pretty simple way to basically work with JSON and get things out, in JSON that it seems to be, quite reliable, The other cool thing that you can do too is that you can actually take ideas from things like the instructor library, and, if you don't get valid JSON back, just ping the model again and, you can even pass what it gave back and point out what was wrong about it or something like that if you want to. the cool thing with Haiku is because it's so cheap, Because it's so fast, if it does make a mistake, it's quite easy basically just to get it to, try again, and you'll probably get something back, next up, I want to look at one of the interesting things around the Claude models in general, and this is all to do with, the ideas of, XML. So if we look at, if we look at one of the examples from the cookbook, you'll see that one of the things that they often do is that they will wrap things in XML. So it's like you're flagging out, to the model via the prompt, hey, this bit is the answer bit, right? And you'll see that it's wrapped in this sort of XML, and the thing that I find interesting about this is that I, at first I thought, okay, these are only going to work for question, answer, a number of sort of key words, but I found that to not be the case. So one of the things I was working on was related to LinkedIn examples. And I literally just put LinkedIn example. Close, LinkedIn example, and it worked fantastically well. So this idea of wrapping things in XML is definitely a key thing that you want to work on and use in your various prompts. So one of the best ways to do that is through exemplars. So remember exemplars are the examples of things that we want to basically show the model that this is what the, output should look like. So we can give it different examples for this. Now, this is where, combining the models can be really useful. So when you're designing a prompt, you can play around with, using this exact kind of prompt on an Opus model to make exemplars for using on a Haiku model in here. So I'm going to show you this prompt here. It's basically saying, I want to create a few shot learning prompt, that has eight exemplars. The prompt will be asking the model to reason over grade school math questions. So this is GSM 8K, if you're, if you know what that is. please generate eight examples I can use and wrap them individually in XML format. include thinking step by step for each one. this is Haiku's examples. it does quite a nice job here of the, output. here it's basically, giving me these eight examples out. it goes through and wraps each of these. And I think just looking through it before, most of them seem to be correct. You would obviously want to go and check that they're all correct if you're actually going to use these as examples for a main prompt here. But you might find also that the, that the way this does the thinking step by step is actually quite different. different than the Opus model. So you'll see down here that if we do it with the Opus model, and I'm doing the exact same prompt, the prompt will be asking model to reason over grade school math questions, please generate eight examples, and wrap them individually in XML format, including thinking step by step for each one. This does it in a different way. So this basically comes up with example rather than that numbering the examples, which is, which you could, say is quite good. I, because you then basically shuffle these around, you could make 50 of them, randomly sample them out to see which ones are getting the best results, that kind of thing. and then you'll see that it also does, question and then reasoning, and then answer. So we've got this question, reasoning, answer. sort of pattern going on here. Now the cool thing is you can make them with Opus, and then use them with Haiku. that seems to work very well for a lot of different tasks, So in one of the previous videos, I talked about, the whole metaprompt thing. but you can make your own, mini metaprompts You get the model to generate out what would be, good language for prompting it, but at the same time to generate out, a number of few shot examples. Now, in this case, I've gone for eight. Doesn't mean I have to use eight later on, but it gives me, some, some choices to, to come up with here. and it's something that certainly, you could generate 20 of them and go through and, And anytime that you saw, that, actually this is not how I wanted the example to be, you would change it, yourself. Very quickly, you can get a set of, few shot examples that you can combine into your Haiku prompt. And remember, because the Haiku model is so much cheaper than the Opus model and then other models out there, it's not a big deal at all to include 20 examples, in here, and then have it, go to the actual, prompt question or whatever it is. that can be really useful for, doing a whole bunch of these things to basically guide the model through by putting these exemplars into the, prompt, the final prompt that you're going to use in here. this is definitely a technique that you want to play around with and try out. \ So let's have a look next at some of the, multimodal, sort of stuff. So here we're looking at images. So these are taken from the cookbook. It's basically showing you, how to, load an image. So we've got a JPEG image here. we can basically encode it to base64. and then, pass that through. And you'll see that, what we can do here is we can have, role user, and then content can now be a list where we've got the text like we would normally would, but then we've got also, before that, we've got an image that we're passing in here. So this basically allows us to pass in an image and text for that image at the same time. Often you'll want to extend the number of tokens out because you're, counting the image and stuff as well here. And you can see again, I've gone back to the Claude 3 Haiku model here, not the, Opus. And sure enough, we're getting, a pretty good, sonnet out here. Now, if you compare this to their, Opus one, you will find that, yes, they're, Maybe the description of the image can sometimes be better, there, but often you will find, that the Haiku one does, very well with this. And in fact, I'm going to show you an example in a sec where the Haiku model actually did better, surprisingly, than the Opus model. next up, doing the same thing with a URL. So you can base encode the image. But, do basic 64 encoding for the image, or you can actually pass it in a URL. if we pass in the URL, you can see here we are basically, encoding that as well. Passing that in. And then doing the same sort of thing as before. And you can see here we've described this image in two sentences. This image depicts Inca ruins at Machu Picchu, a UNESCO heritage site. That's one sentence there. The ruins are set against a stunning backdrop of lush rugged mountains showcasing the impressive architectural achievements of the Inca civilization, the second sentence finishing there. So this is definitely, a nice use of this very cheap model, right? remember that as we're doing this, if we were to do the same thing with GPT 4 vision, The input tokens are going to cost us 40 times as much. So getting those images in, doing all that stuff is costing you 40 times as much as what this Haiku model is actually costing in here. So that's a key thing in here. Next up, transcribing handwriting. I don't know how it would do with my handwriting. I don't have very neat handwriting. but this handwriting, it seems to do very well. And again, we're using the, Haiku model. The cheap model here. Now, it doesn't get it perfectly. I think the Opus one maybe does do better. for me, I would say that's U6L4. levels of cellular organization. Here it's basically changed it to W6. 4 levels of cellular organization, but it's got most of the things. Enough that we could, use this for RAG, we could use this for fact extraction, So there's probably a lot of interesting ideas there, where you can take something, where you've got a set of notes, or someone's got a set of notes, and then, convert those into something that you could actually extract out nice summaries, and nice, structured notes, for this as well. Alright, next up, counting. So this one actually, in the, notebook, they use the, Opus model. And they ask it to count the number of dogs. Now, these models are notoriously not good for counting. You're actually better to use things like YOLO models and stuff like that for counting. if you're building an app that really the main thing is to count something in the image, probably better to not use one of these models. that said, these are getting better and better at this. Like I talked about, just for the previous example of transcribing, words, again, OCR models should be doing better with text, than these models, but it is amazing that these models are getting such good results while they're not, strictly an OCR model, in the same way as traditional models. All right, now count the dogs. Okay, so I count, nine dogs in here. In the, Opus version on their GitHub, it basically says ten dogs. And it said then that, that, okay, the model got it wrong, and so that you need to prompt it in a certain way. And now I totally agree with the idea of you needing to prompt it. the funny thing is, with the, Haiku version without any fancy prompting , it's, it's straight away got that there are nine dogs in this image without the fancy sort of prompting, here, just how many dogs are in this picture. When we did the fancy, prompting, it actually got the wrong answers, for doing this. And so I basically just needed to play around with the prompt a little bit. Now it could have just been, random luck that I got the wrong answer or something, but playing around with the sort of, prompt. a more in depth prompt, I was able then to get it back, back from getting the wrong answer with this one to actually getting the right answer. So getting back to this sort of, answer equals nine. It's a really nice example here of wrapping the final answer in an XML, which makes it very easy to pass out. So this thinking, we might not want to show that to an end user. so this, this is where going through their prompts can be really, interesting to look at, how did they get that thinking, the thinking tags, and they actually talk about tags in here and answer tags, in here. that's something that you can use for your own prompts as well as you go through this. Alright, next up, one of the examples I think is, very impressive, is that here we've basically got an org chart. So it's basically doing the task of, OCR on the org chart, and then also at the same time looking at the org chart to work out who is related to who in here. so you can see that the, the prompt is pretty simple. Turn this org chart into JSON indicating who reports to who, only output JSON, nothing else. here we get President John Smith. I, we've got the VP of marketing reports to the president, VP of sales, reports to the president, VP of production reports to the president. we've getting got the managers. I, so we've got manager Ellis Johnson reporting to marketing. Alice Johnson reporting to marketing. We should also have Tim Moore reporting to marketing. And we do. and you can see that, okay, it's gone through and worked out, those, in a pretty nice way. So that, that one is cool to look at along the same kind of lines, I did one here where, take this profit and loss, sort of statement, go through and, and I just literally just said P&L statement in here, so you could definitely work on this prompt to get it to be better, than what I've got here, indicating the state of the company, only output JSON, nothing else, sure enough, it's able to extract out these things, and, we could, We could go through and pass this, quite, quite well. It's able to work out when things should be negative, which is also interesting looking at this, that, this negative, 240. in there is going to be negative when it converts it to JSON. So for me, that's done a pretty nice job, of that. Now you could, work on these and, get it, much better here if you wanted So one of the big things I would say is that, if you prompt this model in the same way that you would prompt GPT 4, you will often get, okay results out of it. You're not going to get really bad results out of it, but if you put in the effort to just look at some of their prompting examples, look at the cookbook, go through some of these, you can actually get a lot better results, out of doing this. So have a play with this. I put in some, some simple examples with langchain at the end as well, just taken from the langchain site. Overall, I would say, hopefully you can see that, this Haiku model really is One of the best models out there at the moment, based on this whole sort of trade off of performance, cost and inference time here, the model's very quick, it's super cheap, and yet it's still getting, very high quality results, often, very close to on par with, GPT 4 results, and, like I talked about earlier, you, We saw in the LLM leaderboard just how well this is doing, compared to other, models when people are voting on it blind. so definitely, this is something that you want to check out, and start using in your apps today. Finally, as a just a quick teaser for one of the new videos, is that the Haiku model also works very well for agents and for doing things with agents. So in a future video, I'll show you actually running, CrewAI with the Anthropic, Haiku model, setting up that model and just using that to basically, run a crew, delegate to various agents, each agent using this model rather than using the OpenAI models. So look out for that, coming soon . As always, if you've got any questions or comments, please put them, in the comments below. If you found the video useful, please click like and subscribe, and I will talk to you in the next video. Bye for now.
Info
Channel: Sam Witteveen
Views: 19,590
Rating: undefined out of 5
Keywords: anthropic, claude, claude 3, ai, sonnet, haiku, opus, openai, multimodal, Claude Haiku, Claude Opus, claude 3 opus, claude 3 api, gpt4 turbo, generative ai, claude 2, claude ai, claude pro, chatgpt plus, ai writing, gemini pro 1.5, first hand demo, google gemini era, anthropic cookbook, OpenAI cookbook, prompt library, metaprompt, prompt engineering, anthropic ai, prompt engineering tutorial, claude ai vs chatgpt, claude 3 haiku
Id: GPfbPEYSckM
Channel Id: undefined
Length: 23min 23sec (1403 seconds)
Published: Wed Mar 27 2024
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.