Okay. So a few weeks back, I did a video
about the announcement of the Claude three family of models And at that
stage, they basically released the sonnet model and the Opus model. And we had a look at the quality and we
could see that there were very clearly, up there in the league of GPT 4, et cetera. the one model that wasn't
out at that time was Haiku. And I made the comment that actually
Haiku might be the most interesting one. Based on the fact that if you
look at the stats in here, the Haiku model is very strong. Yet we know, that the cost of the Haiku
model, is actually, extremely cheap. So when we have a look at the, the
cost of the Haiku model in here, we can see that the input tokens. Our 25 cents per million tokens
and output tokens are only a $1.25 per million tokens. Now remember also that this model
is a fully multimodal model. This is not just text only. we can put images into this. just like we can with GPT-4 Vision. So when we compare it
pricing-wise to GPT-4 Vision. There you're looking at 40x of
the Haiku price for input tokens. so $10 for the 1 million input
tokens and $30 for 1 million output tokens as opposed to 25 cents for
input tokens and a $1.25 for this. Now at the time, I commented that this
could really be the winner model is that, as much as Opus is really impressive and
powerful, it's very expensive to use. if you've got a lot of
output tokens there. But the Haiku model can get,
very close to similar sorts of results for a lot of this stuff. And be so much cheaper
than what's out there. Well, Now the Haiku model itself has
now been out for a week or two itself, and people have been testing it out. I've been testing it out a lot. and, I've been really amazed
just how good this model is. and when you compare it on the sort
of, performance versus price versus speed, I really think this is the
sort of winner model at the moment, for doing the vast majority of tasks. People are always trying to come up with
tricks for these models or benchmarks and stuff like that for these models. that are often just nothing like
how people are actually using these models in the real world. So one of the benchmarks that I
do think is really interesting is the, LMSYS chat bot arena in here and just overnight they've gone and
released this, updated, benchmark of this. And perhaps not surprisingly, we can
see that, Claude 3 Opus now is actually the new king at the top of this. But look at this. If we come down tied for
six place is Claude 3 Haiku. and this is why I want it to make a
whole video about using Haiku, using some of anthropics prompting styles. I already talked about the Metaprompt in
a previous video, but in this one, we're going to look at, some of the tricks
for doing multimodal prompting, some of the tricks for generating exemplars,
generating things out of these models And I'm going to focus almost
solely on the Claude 3 Haiku model, which you can see, is even
beating some versions of, GPT-4. in here, which is pretty amazing, that this
small model and this model that can do multimodal stuff can be so much
cheaper and yet up there with, quite a lot of these models out there. So in the previous video, I talked
a little bit about, the prompt engineering and stuff like that. You should definitely come in
read this as really a lot of useful information in here. A lot of things, talking
about how the model is set up for doing different things. One of the things that I'll
point out in code is this whole idea of wrapping exemplars in,
these sort of, XML tags in here. and that just adding in some things like
that, suddenly the quality of outputs that you're getting back, goes up quite a lot. So this is something that
they talk about in here. They talk about using the XML tags. they talk about using
examples and stuff like that. All of these things really
amplify the results that you're going to get out of this model. so today what I'm going to focus on
a lot is just some of the examples from the anthropic cookbook, I've
played around with some of them. gone through and change some of them. There are a whole bunch
of different ones in here. I'm not going to cover the
function calling in this one today. Perhaps I'll do a separate video
Just for that in the near future. but I really want to go through some of
the stuff that you can do with images. Some of the stuff that you can
do with multimodal stuff in here. All right, so let's jump into the
code and have a look at getting started with the Claude Haiku model. So in this notebook, I'm going to
basically be going through a bunch of things from the Anthropic Claude Cookbook,
and also I've added some things in there to show you, a few different things
about these models, in general, but also specifically about the Haiku model. now in the cookbook, they generally
just use Opus for everything. so I will point out that a lot of the
examples I'm going to be showing, if you compare it to the cookbook version, the
cookbook version is actually using, the big Opus model and I'm actually going to
be using the small Haiku model in here. Okay, so I've got some
just standard imports here. I'm bringing in some
LangChain stuff at the end. I thought I'd put a little bit
of LangChain stuff in there. and I'm bringing in, the main
one obviously is bringing in the Anthropic SDK here. Alright, so I've got it set up
so I can use my Anthropic key. with Colab Secrets, and I'm cloning
in the Anthropic Cookbook in here so that I can get access to all
their images and stuff like that. All right, first off, we've got
just basic text prompting here. So it's basically pretty standard. First off, you will, import and set up
a client from the Anthropic Library. And then you'll use
that to make your calls. Now, generally you'll be
basically passing in, messages. so it conforms to the, open AI chat
standard, ML chat that, Hugging Face uses and stuff like this, where you basically
have a role of a user or assistant. You can have a system. And then you've got your content,
which is actually the prompt that you're putting in here. just starting out, here you'll
see that, okay, I'm basically just going to make a call. Pass in the prompt, write me some,
write me the lyrics to a made up Taylor Swift song called Filled Out Space. and sure enough, we'll bring it
back and every time, just so that you can see the model, I'm going
to print out the model as well. So you can see this one
is using Claude Haiku. And then we can see the
lyrics to our made up song. Taylor Swift's song in here, and it's
certainly, conforming to standards, for verse, chorus, verse, chorus,
bridge, and that So this is the kind of standard thing that you would
use for any sort of, normal prompt. Now you would add in, you
can add in a system prompt in there as well, to guide it. That's something that you can, use. And especially if you happen to be
basically passing in the content for the user as something, that
an actual user is using, but you don't want them to actually see your
sort of prompt guidelines or something. You would put those in the system prompt
in here, but this is fundamentally how you're going to do most of
your, standard text prompting. next up, one of the big things that
people want to do is to get JSON out So getting JSON out is actually
quite simple, on this model. and again, here I'm using the Haiku
model, I'm not using the Opus model. So in the, in their examples,
they tend to show this, these kind of things with the Opus model. And so I guess in theory, The Opus
model should conform to things like JSON, better than this model, but
I find that this model actually does a really good job of this. You can see here, this is one
of their examples of give me a JSON dict with names, of famous
athletes and their sports. And sure enough, it's basically
returned back, here's a JSON dictionary with names of famous
athletes and their respective sports. and we can, extract that out, quite easily
just by, looking for where the JSON. dictionary starts and ends, but we
can also do things like where, you can give it a sort of start to something. So this is going back to more of a
completion idea of where here you can see that we're giving it the
user with the same kind of thing. And then we're basically giving
it, the here is the JSON bit, and we're starting with this. This one opening bracket here and we're
letting it generate the rest of it. So then it's generating just
this, so this is interesting. This wouldn't work on some of the models
like, Gemini, for example, in that, in those ones, you have to go user assistant,
user assistant, going through it. So this is cool that it works
like this, straight out of the box here, for doing this. And then we can basically parse that
and that bracket back in manually, and then find the end, and then just parse
that out, as JSON again, for this. So this is like the thing that you,
if you want to get stuff back in JSON is definitely, useful for this. So this gives you a pretty simple way to
basically work with JSON and get things out, in JSON that it seems to be, quite
reliable, The other cool thing that you can do too is that you can actually take
ideas from things like the instructor library, and, if you don't get valid
JSON back, just ping the model again and, you can even pass what it gave back
and point out what was wrong about it or something like that if you want to. the cool thing with Haiku is because
it's so cheap, Because it's so fast, if it does make a mistake, it's quite
easy basically just to get it to, try again, and you'll probably get something
back, next up, I want to look at one of the interesting things around the
Claude models in general, and this is all to do with, the ideas of, XML. So if we look at, if we look at one of
the examples from the cookbook, you'll see that one of the things that they often
do is that they will wrap things in XML. So it's like you're flagging out,
to the model via the prompt, hey, this bit is the answer bit, right? And you'll see that it's wrapped in
this sort of XML, and the thing that I find interesting about this is that
I, at first I thought, okay, these are only going to work for question,
answer, a number of sort of key words, but I found that to not be the case. So one of the things I was working
on was related to LinkedIn examples. And I literally just put LinkedIn example. Close, LinkedIn example, and
it worked fantastically well. So this idea of wrapping things in XML is
definitely a key thing that you want to work on and use in your various prompts. So one of the best ways to
do that is through exemplars. So remember exemplars are the
examples of things that we want to basically show the model that this
is what the, output should look like. So we can give it different
examples for this. Now, this is where, combining
the models can be really useful. So when you're designing a prompt, you can
play around with, using this exact kind of prompt on an Opus model to make exemplars
for using on a Haiku model in here. So I'm going to show you this prompt here. It's basically saying, I want
to create a few shot learning prompt, that has eight exemplars. The prompt will be asking the model to
reason over grade school math questions. So this is GSM 8K, if you're,
if you know what that is. please generate eight examples I can use
and wrap them individually in XML format. include thinking step
by step for each one. this is Haiku's examples. it does quite a nice
job here of the, output. here it's basically, giving
me these eight examples out. it goes through and wraps each of these. And I think just looking through it
before, most of them seem to be correct. You would obviously want to go and
check that they're all correct if you're actually going to use these
as examples for a main prompt here. But you might find also that the, that
the way this does the thinking step by step is actually quite different. different than the Opus model. So you'll see down here that if we do
it with the Opus model, and I'm doing the exact same prompt, the prompt
will be asking model to reason over grade school math questions, please
generate eight examples, and wrap them individually in XML format, including
thinking step by step for each one. This does it in a different way. So this basically comes up with
example rather than that numbering the examples, which is, which
you could, say is quite good. I, because you then basically shuffle
these around, you could make 50 of them, randomly sample them out
to see which ones are getting the best results, that kind of thing. and then you'll see that it
also does, question and then reasoning, and then answer. So we've got this question,
reasoning, answer. sort of pattern going on here. Now the cool thing is you can make them
with Opus, and then use them with Haiku. that seems to work very well for a
lot of different tasks, So in one of the previous videos, I talked
about, the whole metaprompt thing. but you can make your own, mini
metaprompts You get the model to generate out what would be, good
language for prompting it, but at the same time to generate out,
a number of few shot examples. Now, in this case, I've gone for eight. Doesn't mean I have to use eight
later on, but it gives me, some, some choices to, to come up with here. and it's something that certainly,
you could generate 20 of them and go through and, And anytime that
you saw, that, actually this is not how I wanted the example to
be, you would change it, yourself. Very quickly, you can get a set
of, few shot examples that you can combine into your Haiku prompt. And remember, because the Haiku model is
so much cheaper than the Opus model and then other models out there, it's not a
big deal at all to include 20 examples, in here, and then have it, go to the
actual, prompt question or whatever it is. that can be really useful for, doing a
whole bunch of these things to basically guide the model through by putting these
exemplars into the, prompt, the final prompt that you're going to use in here. this is definitely a technique that you
want to play around with and try out. \ So let's have a look next at some
of the, multimodal, sort of stuff. So here we're looking at images. So these are taken from the cookbook. It's basically showing
you, how to, load an image. So we've got a JPEG image here. we can basically encode it to base64. and then, pass that through. And you'll see that, what we can do
here is we can have, role user, and then content can now be a list where we've got
the text like we would normally would, but then we've got also, before that, we've
got an image that we're passing in here. So this basically allows us to
pass in an image and text for that image at the same time. Often you'll want to extend the number
of tokens out because you're, counting the image and stuff as well here. And you can see again, I've
gone back to the Claude 3 Haiku model here, not the, Opus. And sure enough, we're getting,
a pretty good, sonnet out here. Now, if you compare this to their,
Opus one, you will find that, yes, they're, Maybe the description of the
image can sometimes be better, there, but often you will find, that the
Haiku one does, very well with this. And in fact, I'm going to show
you an example in a sec where the Haiku model actually did better,
surprisingly, than the Opus model. next up, doing the same thing with a URL. So you can base encode the image. But, do basic 64 encoding for the image,
or you can actually pass it in a URL. if we pass in the URL, you can see here
we are basically, encoding that as well. Passing that in. And then doing the same
sort of thing as before. And you can see here we've described
this image in two sentences. This image depicts Inca ruins at
Machu Picchu, a UNESCO heritage site. That's one sentence there. The ruins are set against a stunning
backdrop of lush rugged mountains showcasing the impressive architectural
achievements of the Inca civilization, the second sentence finishing there. So this is definitely, a nice use
of this very cheap model, right? remember that as we're doing this,
if we were to do the same thing with GPT 4 vision, The input tokens are
going to cost us 40 times as much. So getting those images in, doing
all that stuff is costing you 40 times as much as what this Haiku
model is actually costing in here. So that's a key thing in here. Next up, transcribing handwriting. I don't know how it would
do with my handwriting. I don't have very neat handwriting. but this handwriting, it
seems to do very well. And again, we're using the, Haiku model. The cheap model here. Now, it doesn't get it perfectly. I think the Opus one maybe does do better. for me, I would say that's U6L4. levels of cellular organization. Here it's basically changed it to W6. 4 levels of cellular organization,
but it's got most of the things. Enough that we could, use this for
RAG, we could use this for fact extraction, So there's probably a lot
of interesting ideas there, where you can take something, where you've got
a set of notes, or someone's got a set of notes, and then, convert those
into something that you could actually extract out nice summaries, and nice,
structured notes, for this as well. Alright, next up, counting. So this one actually, in the,
notebook, they use the, Opus model. And they ask it to count
the number of dogs. Now, these models are notoriously
not good for counting. You're actually better to use
things like YOLO models and stuff like that for counting. if you're building an app that
really the main thing is to count something in the image, probably
better to not use one of these models. that said, these are getting
better and better at this. Like I talked about, just for the previous
example of transcribing, words, again, OCR models should be doing better with text,
than these models, but it is amazing that these models are getting such good results
while they're not, strictly an OCR model, in the same way as traditional models. All right, now count the dogs. Okay, so I count, nine dogs in here. In the, Opus version on their
GitHub, it basically says ten dogs. And it said then that, that, okay, the
model got it wrong, and so that you need to prompt it in a certain way. And now I totally agree with the
idea of you needing to prompt it. the funny thing is, with the, Haiku
version without any fancy prompting , it's, it's straight away got that there
are nine dogs in this image without the fancy sort of prompting, here,
just how many dogs are in this picture. When we did the fancy,
prompting, it actually got the wrong answers, for doing this. And so I basically just needed to play
around with the prompt a little bit. Now it could have just been,
random luck that I got the wrong answer or something, but playing
around with the sort of, prompt. a more in depth prompt, I was able
then to get it back, back from getting the wrong answer with this one to
actually getting the right answer. So getting back to this
sort of, answer equals nine. It's a really nice example here of
wrapping the final answer in an XML, which makes it very easy to pass out. So this thinking, we might not
want to show that to an end user. so this, this is where going
through their prompts can be really, interesting to look at, how did they
get that thinking, the thinking tags, and they actually talk about tags
in here and answer tags, in here. that's something that you can
use for your own prompts as well as you go through this. Alright, next up, one of the examples
I think is, very impressive, is that here we've basically got an org chart. So it's basically doing the task of, OCR
on the org chart, and then also at the same time looking at the org chart to
work out who is related to who in here. so you can see that the,
the prompt is pretty simple. Turn this org chart into JSON
indicating who reports to who, only output JSON, nothing else. here we get President John Smith. I, we've got the VP of marketing
reports to the president, VP of sales, reports to the president, VP of
production reports to the president. we've getting got the managers. I, so we've got manager Ellis
Johnson reporting to marketing. Alice Johnson reporting to marketing. We should also have Tim
Moore reporting to marketing. And we do. and you can see that, okay, it's
gone through and worked out, those, in a pretty nice way. So that, that one is cool to look at
along the same kind of lines, I did one here where, take this profit and loss,
sort of statement, go through and, and I just literally just said P&L statement in
here, so you could definitely work on this prompt to get it to be better, than what
I've got here, indicating the state of the company, only output JSON, nothing else,
sure enough, it's able to extract out these things, and, we could, We could go
through and pass this, quite, quite well. It's able to work out when things should
be negative, which is also interesting looking at this, that, this negative, 240. in there is going to be negative
when it converts it to JSON. So for me, that's done a
pretty nice job, of that. Now you could, work on these and, get
it, much better here if you wanted So one of the big things I would say is
that, if you prompt this model in the same way that you would prompt GPT 4, you
will often get, okay results out of it. You're not going to get really bad results
out of it, but if you put in the effort to just look at some of their prompting
examples, look at the cookbook, go through some of these, you can actually get a
lot better results, out of doing this. So have a play with this. I put in some, some simple examples
with langchain at the end as well, just taken from the langchain site. Overall, I would say, hopefully you can
see that, this Haiku model really is One of the best models out there at the
moment, based on this whole sort of trade off of performance, cost and inference
time here, the model's very quick, it's super cheap, and yet it's still getting,
very high quality results, often, very close to on par with, GPT 4 results, and,
like I talked about earlier, you, We saw in the LLM leaderboard just how well
this is doing, compared to other, models when people are voting on it blind. so definitely, this is something
that you want to check out, and start using in your apps today. Finally, as a just a quick teaser for
one of the new videos, is that the Haiku model also works very well for
agents and for doing things with agents. So in a future video, I'll show you
actually running, CrewAI with the Anthropic, Haiku model, setting up that
model and just using that to basically, run a crew, delegate to various
agents, each agent using this model rather than using the OpenAI models. So look out for that, coming soon .
As always, if you've got any
questions or comments, please put them, in the comments below. If you found the video useful, please
click like and subscribe, and I will talk to you in the next video. Bye for now.