Another massive release just happened: Midjourney
V5 just came out. And I already found some tricks to get better results. But why would you even want
to use this? And once you use it, what do you need to look out for? We're going to be discussing
all that, plus how to use GPT-4 to get superior Midjourney V5 results today. So let's get into it!
So first up, I need to tell you that only paying Midjourney members have access to this. But if you're a paying subscriber, starting at $10 a month, it's quite easy. You just go into Discord, type /settings, and now you can change your model. This is the most comfortable way to interact with it. Otherwise, you need to include "--v 5" at the end of your prompts.
But why would you want to use it? Well, simply put, the biggest reason to use V5 over V4 right now is the desire for more realistic images. Specifically, humans. They finally fixed the hand situation; sometimes you still get an extra finger, but it's not a catastrophe every single time anymore. Also, the photorealistic qualities of this thing just went through the roof. These pictures look as if they were taken with a camera, like they're 95% there, with great skin texture, proper hands, and even details like teeth now being properly represented, because they trained this new model to do exactly that. You could start using this thing to replace human models right now.
But the more realistic settings that are baked into the system are not everything here.
Overall, you can view V5 as more of a pro model than V4 was. You need to go more into detail
to get better results. So as opposed to GPT-4, which got smarter and a little easier to prompt, this moved in the other direction. You can get better results, but your prompts need to be better. And here's one of the biggest changes: no more prompting with keywords only. At least, they don't recommend it. You should use natural language, just like I'm speaking to you right now. Just like you would communicate with a human. If you just take a step back and think about this for a second, you'll realize that natural language includes so much detail and so much direction that obviously, the model is going to be able to craft better results from it than from a bare list of keywords. All those little words in between actually do a lot to communicate what we want to communicate. So no more keywords only. But once I show you the
prompt that I crafted in GPT-4, you'll still be able to use keywords, because GPT-4 is going to do the natural language processing for you. But more on that soon. Another change that I personally really enjoy is that images are essentially upscaled already. So no more waiting twice if you want to upscale something. So
if it just generates a cat with a hat right here, right away, you should be able to see how
realistic these are by default without me including anything like photorealistic or
aperture. All I asked for was a cat with a hat, and I got photos of a cat with a hat. So yeah, these photorealistic capabilities are just baked in now. And, as I mentioned, the upscaling is super fast. So, in real time, I'll just click U1, meaning upscale the first image, and without
even finishing my sentence, the thing is done. This looks way too real! If you were to
show this to a normal person, I think this passes as a photo, as long as a person is not
consciously looking for AI-generated features. With my photography background, I would say the
depth of field gives it away a little bit. These whiskers should have been sharp because the
ones on the left and the eyes are sharp too, and they're in between those two focal planes.
But that's just me nerding out. Now let's move on to the interesting part, and that is how
to get better results by using GPT-4 for this. So, what I did here is I prepared a little prompt
for you guys. What you want to do is go to GPT-4; the outputs for this are slightly better than with GPT-3.5, because we're kind of pushing the model here with the amount of detail we request. You will be able to simply copy-paste this from the description of this video,
but essentially, it makes GPT-4 act as a Stable Diffusion photography prompt generator. And then,
we specify the input and the outputs. So, we input the visual description, and it outputs a detailed
paragraph that I can copy into my diffusion model. We also tell it to include a variety of photography-related terminology, including a description of the exact Zeiss lens. So, I specified a brand
here because that's my favorite look, and that's also a lot of the equipment we use around here, because I just love that sharp look. It looks very professional, very premium, very luxurious, and
this is what I want for my images too. But you can switch this up to any brand. You could even say
iPhone lens to imitate photos a normal person would take in their everyday life. Either way, the
prompt continues, and most importantly, a detailed description of the volumetric lighting. So, if you
didn't know, volumetric lighting is a term that I borrowed from my background in videography and
cinematography, and it's what some people refer to as God rays: essentially, a diffused beam of light in the background of the image. Again, you can swap this for a different type of lighting, but I really enjoy it, because in combination with the Zeiss lens, it makes the image look epic and luxurious. And then we end with: now,
write a clear and concise natural language visual description of the following. The natural language part here is very important because, as I said, the model doesn't want keywords anymore.
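Just so you can picture it, here's a rough sketch of what my prompt could look like; grab the exact wording from the description, this is only an approximation: "Act as a Stable Diffusion photography prompt generator. I will give you a visual description as input, and you will output a detailed paragraph that I can copy into my diffusion model. Include a variety of photography-related terminology, including a description of the exact Zeiss lens and a detailed description of the volumetric lighting. Now, write a clear and concise natural language visual description of the following:" Every piece of that maps to what we just walked through.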
But the cool thing is, I could go ahead and use keywords to describe this now, because I
have a prompt generator. So let's do this. And, my idea here is, let's generate some prompts that
recreate items in my room. For example, there's the ship inside of a bottle right on my table. So
let's try and recreate the epic version of this inside of Midjourney V5. Okay, so all I'll do here at the end is type "a ship inside of a bottle" and hit Enter. That's simple. And what it does
is create a detailed description of my scene, including a specific lens. And I'll squeeze
in this little tip: the lens here has become increasingly important, as these outputs are
very realistic. Just like in photography, one of the main things that influences that look is
your lens. So, there never has been a better time for you to learn a little bit about photography.
And although I would love to sit down for a few hours and give you a basic lecture, because I'm still so passionate about all this camera stuff, I'll make it simple for you. And I included two
resources in the description that will give you a basic overview of lenses. Okay, so look over here.
Sigma is one of the main manufacturers of lenses, and they have this fantastic page which shows off different focal lengths with example shots. Focal length is essentially how wide your shot is. That's all you need to know. And whenever you want a specific look, you can simply look at this page, pick the millimeter number right there, and switch that out inside of your prompt.
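As a quick illustration (these snippets are made up, just to show the swap): if the prompt you got says "shot on a 35mm lens" and you want a tighter, more zoomed-in framing, you'd simply change it to "shot on a 100mm lens" and leave the rest of the prompt untouched.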
Now, if you want to get even more advanced, you could pick specific lenses. And a good
way to do this is by going to the B&H Photo Video website (bhphotovideo.com). When you go to Photography and then Lenses, you can pick from all the different brands. So just pick SLR lenses, because there's the most variety there. And then we'll do something most people won't do: sort the list from high to low. We want the most expensive ones, right? And right here, you already see that it actually shows the Zeiss Otus lenses, which GPT-4 picked by default. I mean, how smart is that? And the Otus lenses are very well known for portrait photography. By many people, they
are considered the holy grail of portrait photography because they're super expensive,
super sharp, and GPT-4 already uses them. But you could just look at this bundle, and here in the description, you have all the lenses in the bundle. So I could just copy this 100 millimeter. And again, here I would just check: "Ah, 100 millimeters, really zoomed in? Yes, that's kind of what I want with my ship inside of a bottle." And once I copy my prompt into Midjourney again, I'll just say /imagine as per usual. I could just go ahead and copy in the name of this 100-millimeter lens, and that's the only thing I'm going to change about my prompt. Okay, and now I'm gonna hit Enter. So now it should be less wide
and more focused on the details, which is exactly what I want with something as detailed as a ship inside of a bottle.
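So the final command in Discord ends up looking roughly like this; I've shortened GPT-4's paragraph here, and the lens name is the only thing I actually edited: "/imagine A majestic sailing ship preserved inside a glass bottle [...rest of GPT-4's description...], shot on a Zeiss Otus 100mm f/1.4, with soft volumetric lighting streaming in from behind". You can still add "--v 5" at the end if you haven't switched your default model in the settings.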
Now, all we have to do is wait for a minute and look at the results. Will you look at that? I mean, that's pretty impressive. That looks so realistic. This could easily be a real photo. Now, granted, I could have included a lot of
specific details to make it look just like this, but honestly, I just wanted to see what
Midjourney plus GPT-4 would come up with. Okay, and to round this out, I just want to do
one more because I know in V4, this prompt always generated an image of something very cartoonish
looking. I'll just say: "AI YouTuber at his desk." And funnily enough, it actually uses the lens brand that I typically have on this camera, Zeiss, though I usually use the 40 millimeter. Oh, it really went hard on this description, so let's see what we get here. And I'll just switch
this to the 40 millimeter and hit Enter! Oh wow, look at this. Okay, granted, there are many details which are messed up. Like, did Midjourney develop some camera prototype that I've never seen before? But look at this guy sitting at his desk. And the hands are not good, but they're also not terrible. They used to be just terrible, right? Okay, so that's
it. If you create some amazing images yourself, I want you to join our Discord server and share
them with the community. Many others and I post our creations there and discuss what is going on in the AI space on a daily basis. And if you want to get even better at Midjourney,
you should check out this video because it will teach you the ultimate hack on how to get the most
out of it with just one keyword. See you there!