LLAMA 3 vs Stable Diffusion 3 vs DALL-E 3 - Prompts and Images

Captions
[Music] Right, so in this video we'll be taking a look at Stable Diffusion 3 and what it can do, and also at Llama 3 and what it can do. This is going to be amazing; some of the stuff that these models can produce is pretty awesome. As usual, the workflows will be available for members in the membership area. What I want to do is compare SD3 with Meta AI. Meta's Llama is an amazingly powerful new model; it can also do images, and we're going to be using it both for prompting and for images. I'm working in ComfyUI and, like I say, the ComfyUI workflows will be available for members as usual.
For this video I want to ask you guys a question about working with AI, with artificial intelligence. Are you working with artificial intelligence yet? Do you plan to work with AI? Is it part of your job yet, even if you're just using an app? What sort of stuff are you using it for? I'm interested in that, and I'm also interested in what sort of issues you're having: is it working well, how is it coming along? I'll be discussing some of the things that I've encountered over the last six months or so, some of the problems I've had, and what sort of things we're doing with AI to change things up.
So we've got a workflow with SD3 as the model. I'm going to be putting in different prompts, which were generated inside Groq using Llama; I asked it to produce a bunch of sample prompts. With this one we have a futuristic skyscape: neon, skyscrapers, robots, holograms, rain; futuristic, futurism, sci-fi. For composition, the cityscape should be the central focus, with neon lights and holograms illuminating the dark sky, and robots and skyscrapers in the background. We're going for a 16:9 aspect ratio, which is obviously very relevant given that we're looking at a YouTube video. That looks awesome. Let's see what Meta actually produced.
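Editor's sketch: the prompt read out above follows a loose pattern of subject, keywords, style, composition and aspect ratio. The snippet below shows one way such a prompt could be assembled as plain text. The `ImagePrompt` class, its field names and the output layout are assumptions made for illustration; they are not the format Llama produced in the video, and an aspect ratio written into the text is only a hint that a given image model may ignore.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImagePrompt:
    """Hypothetical container for the prompt fields mentioned in the captions."""

    subject: str
    keywords: List[str] = field(default_factory=list)
    style: List[str] = field(default_factory=list)
    composition: str = ""
    aspect_ratio: str = ""

    def render(self) -> str:
        """Join the non-empty fields into a single prompt string."""
        parts = [self.subject]
        if self.keywords:
            parts.append("Keywords: " + ", ".join(self.keywords))
        if self.style:
            parts.append("Style: " + ", ".join(self.style))
        if self.composition:
            parts.append("Composition: " + self.composition)
        if self.aspect_ratio:
            parts.append("Aspect ratio: " + self.aspect_ratio)
        return ". ".join(parts) + "."


# Values taken from the prompt as read out in the captions.
cityscape = ImagePrompt(
    subject="Futuristic skyscape",
    keywords=["neon", "skyscrapers", "robots", "holograms", "rain"],
    style=["futuristic", "futurism", "sci-fi"],
    composition=(
        "the cityscape should be the central focus, with neon lights and "
        "holograms illuminating the dark sky, and robots and skyscrapers "
        "in the background"
    ),
    aspect_ratio="16:9",
)

if __name__ == "__main__":
    print(cityscape.render())
```

Keeping the fields separate makes it easy to send the same subject and keywords to two services while changing only the style or the aspect ratio, which is the kind of side-by-side comparison done in the video.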
So here we have Meta, with exactly the same prompt, and it's produced the cityscape. We've got some skyscrapers, we've got a futuristic feeling to it, we've got neon lights, so it's got most of the stuff that I got there. I don't know exactly what it's doing in terms of the prompting, because I prompted it using a specific command that says "create a prompt", and it seems to have formatted the prompt in its own way. Let's take a look at this one. This one looks very good; I like the composition, looking down on the city block. All of them have this stupid sign here. Let's take a look at the robot. This is pretty fierce: you've got the huge robot in the middle there, so it's got a sense of imagination. It's got the Iron Man beam, which all robots need to have, that sort of killer laser to destroy anything that upsets them. We've got lots of cars and little people there. And let's see the final one. This one is similar to the other one, but the landscape has a slightly different composition. So overall, very impressed.
Let's take a look at a previous one that I did. This one is a mystical dragon, in landscape format. Obviously it can't do landscape format; it's always going to be one-to-one, square. We wanted ancient ruins, treasure, scales, wings, fire. We've got the fire going on in random places, and I would say we've got nice wings and very nice scales, with a lot of detail. Obviously, with this company, with Meta, they do a lot of very good stuff (OK, that's not good), they do a lot of very good stuff in terms of open-source models, and this is just another example of a really nice open-source model. We can see just one image at the moment. I'm not sure why; maybe it's teething problems. This is another one here for the dragon. The dragons look really good, but they have a comic-book feel. The original prompt was a
Mystical Dragon, keywords: ancient ruins, treasure, scales, wings, fire, with an epic mood, excitement, a fantasy-inspired style, and a composition. It's given us something that looks a little bit comic-booky, not that epic, but the detail is fantastic. With SD3, what we got was this epic-looking image. I felt this was more artistic looking, with everything engulfed in fire. We've got the treasure chest there, we've got the dragon as the central theme, and we've got the ancient ruins in the background. So I felt this followed the prompt really well, and it produced a more artistic rendering. But I didn't give it an exact style, like painting or photorealistic, so I think we could maybe do that, maybe say "style: photorealistic, fantasy-inspired". OK, that created this image here. It's not that different, but I would say aesthetically this one is maybe slightly better than the previous one.
So with Meta we've got this here. This is looking good, I would say just a little bit more photorealistic. It was already fairly photorealistic, but we can see here, I love the way the dragon is just resting on top of the chest, sort of protecting it. You can see the ancient ruins in the back there. This one as well just looks fantastic, with very interesting detail in the rocks at the front. And another very good image there. So you get four images to choose from, and as long as everything's working fine you can actually get quite good-looking images in all four of them. This is very different from Google Gemini. This is good; Google Gemini is just, well, I think the less I say the better.
Now we've changed the subject again. There was another prompt on mermaids, but I'm sure I would have had problems with mermaids, because sometimes the subject matter can be a little bit provocative. So here we've got a steampunk airship: gears, propellers, steam, clouds, wings; industrial tone, adventure mood, steampunk style; and the airship should be
the central focus, with gears and propellers visible, and clouds and steam rising from the background. So we've got the clouds and the steam rising from the background. It's not the most convincing steampunk airship that I've ever seen. However, let's just zoom in a little bit. I like the fact that it's giving me this airship in landscape format. It's correctly formatted, and there's nothing that looks weird about the format; it looks like it's created a landscape image without any unusual stretching, which is part of the problem you sometimes get with Stable Diffusion XL.
With Meta we've got these four images. They all look fantastic, but the images definitely have a crop going on here, and the crop is taking out some of the important detail, or some useful detail, in the image. This one looks really nice, and I think this one here was probably the most satisfactory one; the crop is not quite as awkward. It would be nice if this could have been in 16:9 format. I think maybe I'll try to tell it "16:9 format", because I did tell it "landscape format" in the past and that doesn't seem to have done anything useful. So let's tell it "16:9 image format".
We did another one inside Stable Diffusion 3, and we've got a really nice image, maybe with a bit of a lack of contrast, but we did ask for steam and clouds and it's given us plenty of steam and clouds. Aesthetically I don't think this works quite as well, but I like the composition a little bit better, the way this is designed, the overall shape. However, let's see if we've got landscape format. We've got four images, all of them square, and I don't think it's capable of doing landscape, no matter how you ask. Maybe there's some sort of secret code somewhere, some secret instruction you can give it to get it to do landscape. I mean, there's a big Animate
button here, which obviously looks good, but it does seem to take quite a long time. So all of these look good, but I think the composition being limited to square limits the utility of the images quite a bit.
So we've got a new prompt. This one is (that's looking beautiful) a ghostly forest: trees, mist, fog, lanterns, spirits; spooky, ominous, Gothic; and the forest should be central, et cetera, with lanterns and other stuff visible. Wow, that looks fantastic. That is so cool. This is what you want in a landscape image: something where there's a little bit of space, something that you could maybe work with. If you want to do compositions with images, sometimes you want there to be a little bit of space either at the side or at the center. It's given us this kind of mysterious-looking area in the center here, and the lanterns are looking good as well.
And inside this one here (let me just bring it up) we've got a slightly more comic-book-looking feel. I'm getting this theme that this one tends to produce images that are slightly less realistic. Maybe the word doesn't come to me; I wouldn't say hackneyed, but there's a certain thing that you imagine when you hear a description, and this one seems to be like the comic-book or game version of it. If you were designing a game, maybe you would design a spooky forest looking like this. It looks fantastic, lots of color, but it's not quite spooky, is it? Maybe with this one, where we've got these little spirits poking out of the hollows in the tree, that looks pretty creative, and I like the look of that. So both of them are good, but I think the flexibility of SD3 is great. I think this one here could do with a bit of flexibility
with a little bit of ability to do landscape and portrait as well. But we've only just looked at landscape. Inside Stable Diffusion you can do different aspect ratios. Let's see, all of these aspect ratios; let's try 21:9 and see what that can do. It is pretty fast for an API. I think we've lost some resolution here, a little bit of detail. It looks awesome, but there's not quite as much detail as there was before, and it's still about 1 megapixel. Certainly with Stable Cascade I can get things that are 4 megapixels; I can get 9 megapixels quite easily, sometimes pretty good-looking images at 9 megapixels. This one is still limited to 1 megapixel. There is an upscaler, but the upscaler is about 25 cents for each use, which is quite a bit.
OK, so I set up a different set of prompts as well, for something a little bit more romantic, and this one is a ballroom scene. I've also put it at 1:1, which is probably the aspect ratio at which this produces the best-quality images. At 1:1 the image quality is awesome. There's no way that I can find of changing the number of steps that you can run; I guess that's something that Stability controls over the API. And, oh, look at this. This is Meta. Meta is awesome. Let's let this download. That's looking very good. I would say maybe SD3 is a little bit more photorealistic, but this is looking good, and this one looks almost as good as the one from Stable Diffusion. That looks really, really cool. I'm not sure exactly how we get rid of the little sign there; that seems to be a permanent part of how it handles things. This one looks cool as well, so they're both very good.
Maybe we should do one portrait, just one sort of face portrait. I've asked Meta AI for a face portrait of a detective who wears a deerstalker cap. Let's see if it can do an image based on this: yes, create the
image. Let me just copy this for Stability. So I don't have to give it the special command; I can just say yes, create the image. Awesome, it's very quick, and the Stability AI API is also very quick. As you can see, I think the quality of both of these is very good. I think the quality does suffer a little bit when you go to the very wide images, when you go to 21:9, but I suspect some images at that level probably work very well.
Let's look at the prompt and see why we got that weird image. We've got: Victorian detective, deerstalker cap, magnifying glass, mustache, piercing eyes, topcoat. A close-up portrait of a detective's face with a deerstalker cap prominent on his head, a magnifying glass held up to his eye, and a thoughtful expression. His mustache is neatly trimmed and his piercing eyes seem to be scrutinizing every detail. He wears a topcoat, giving off an air of professionalism and authority. OK, so I think the topcoat was unnecessary, because it just creates confusion as to whether it's a face portrait or not, so that wasn't very good. But I can see now why the magnifying glass is held up like that; it's held up inside the flat cap, so it's very strange.
Sometimes Stable Diffusion, and also Stable Cascade, struggles with things like magnifying glasses (let's try again). It struggles with things like pipes; if you try to tell it someone is smoking a pipe, it sometimes produces absurd results. With this one it's a little bit better. Again, we don't have the deerstalker. I don't know exactly how they define a deerstalker, but that doesn't look like a deerstalker to me. But we've got the magnifying glass being held by the hand there. It looks like an unfinished painting, which is a really nice style. And this one is a bit more realistic, a little bit more useful. I would have liked a little bit more detail
in the background. Let's go to Meta and see what Meta did. Whoo, this is awesome. You can see it's a bit more photorealistic this time, and the deerstalker hat seems a bit more realistic. The hands and fingers look good as well. Let's go and check this out. This one is pretty damn good; not quite as high resolution as Stable Cascade, but pretty damn good. It's holding the pen in the right way as well; that's pretty amazing. And what about this one? Again, it's holding the pen in exactly the right way. It's an odd-looking pen, and I think that's a deerstalker hat; I think it's actually got the hat right as well. So that's pretty awesome. I think this is a very good AI.
Now, I said I'd talk a bit about some of the problems I've had with AI. I use AI quite a lot, and this is ChatGPT with DALL-E 3. You can see it produces amazing-quality images, and in some cases the image quality is better even than what we've seen today. For instance, with these mechanical mice, it produces awesome-looking organisms, and it just has a fantastic ability to style images and to create a lot of detail. This is an area where some other models, generally speaking, do struggle. But I've had problems, and I was thinking I could possibly start moving to Stable Diffusion 3 rather than using this model, because I find that sometimes, particularly when dealing with 16:9 aspect ratios, it doesn't give me exactly the images I want. I've been using it quite a lot over the last six months. It's about $20 a month, and I've used it less and less over time; part of the problem has been that it hasn't really produced quality images of the sort that I need.
This is a conversation I had with ChatGPT and DALL-E 3 quite a long time ago. What I was trying to do was create a fairy in a sort of whimsical environment. So it would be something like this, where the fairy is flying along what looks like a runway, so you've got
the little lights, like the landing strips of a runway, and the fairy is supposed to be gliding along like it's taking off. If you look on the right there, you can see some more recent examples of what I managed to get using this particular model. DALL-E 3 became available six months ago. What I found was that there were problems in the way that it rendered images. With this image here, the fairy looks fantastic, the background looks awesome, we've got the landscape 16:9 format, but we haven't got the wings in the right orientation. The wings are always behind the fairy, even when the fairy is facing away from us. This was a consistent problem; I could not get it to give the fairy wings in the right orientation.
I had a conversation, which I sometimes used to do with this particular software. The conversation went on for quite some time, and I managed to get gradual improvement. You can imagine how much effort it took to prompt for all of what we're seeing here; it took a long time and a huge number of iterations. In those days you could get two images produced at the same time, so you could choose one of two images, and occasionally you got something that looked right. This one here is a square format (I wanted landscape), and it looks broadly like what I was looking for. With this one, the problem was probably that the fairy was a little bit too far away. We've got a sort of funny-looking foot, the face doesn't look too good, and the trees were just way too huge; I wanted something that looked a lot more cozy. So I had a very long conversation, and you can do this with ChatGPT. Sometimes it puts things in. You can have portrait, you can have landscape, and it quite often confuses portrait and landscape. Overall the quality of the
images was awesome, but in order to get what I wanted I had to prompt, and keep on prompting, until eventually I was getting the kinds of images that did what I wanted them to do. I wanted to get a fairy as if flying on a runway, I wanted that to be the exact image that I got, and I didn't want huge, ugly trees in the background. It was a very specific thing I was looking for. Maybe around the third day of prompting I got something that started to look like this, which looks pretty good to me. It looks photorealistic. We no longer have a fairy, we've got a flight attendant, but that was a decision that I made. Everything's looking good, and some of these images I shared over at Instagram. The problem I was having quite a lot of the time was that the faces looked weird and the hands looked a little bit odd, so I ended up with a result that was like a compromise.
Now, you need to understand that before running all of those prompts, what I was getting was this type of fairy. This is the kind of thing you can get inside Bing Chat. It's not very good; it's very formulaic, and the faces look weird. You really need to understand how the software works. Almost all the time the faces or the fingers look weird, so you need to prompt it so that it looks good, especially if you're not someone who's got very good drawing skills and can maybe alter it yourself. Some people just use it as inspiration, whereas for me, I wanted it to be the final product. So I continued working with it, and I'll show you the results that I got in the end.
And here we are. This is one of the images that I created which I felt did the trick. It captured the mood that I wanted, and it's got the fairy coming in. We've managed to eliminate problems with the face by giving it glasses, and we've reduced problems with the
hand by giving it a rose to hold, which just fixes a whole lot of problems. The atmosphere looks fantastic, and we've got the fairy kind of large in the screen, because I've said she's coming in to land and she's going to hand someone a rose. So I worked with it a lot in order to get this type of image. There was also some text, and the results were OK in terms of the text; we can work with that inside Photoshop or something, so it's not a huge problem. The problem is the amount of time it takes to get something good; it can be a huge amount of time.
The biggest problem is that, unfortunately, ChatGPT has a tendency to lose some of these conversations. Some of the conversations have ended up missing, and I've contacted OpenAI a number of times saying, hey, what happened to the missing conversations? They can't tell me; they haven't actually responded. So that was when I was considering whether I should move to SD3. I'm having these problems with missing results and not getting any support from OpenAI, and it's definitely tilted me in the direction of wanting to do these things on my own machine. You can download the images, but I don't want to download every single image; a lot of them are not going to be usable. But I don't want to lose the conversation either. It does seem the conversations just disappear from time to time, and from what I can tell about a third of the conversations I had seem to be either missing or impossible to find.
That's a real problem, because you spend quite a lot of time telling the software exactly what you want, and at the end of the day you can say to the software: look, can you just create something whimsical and new based on what we've discussed so far? And then: yes, create that image. And it's going to use a conversation that lasted maybe a day, maybe two days, to create something new. OK, so it wasn't able to
create the image on this occasion (it looks like it's still creating something), but most of the time you can tell it to create something based on a conversation. The conversation could have gone on for hours, and it remembers that, so these conversations end up being an important part of your contribution to what the software does and what it produces. So usually you can create something, but on this occasion it seems to be struggling, and it's telling me: look, you can create an image using something like DALL-E by OpenAI. So I don't know what's happened there. Maybe it's this particular long conversation. I did ask it to produce something based on what we had produced before, but to give it a futuristic feel. You can see some of the images that it created here, these fantastic-looking fairy-tale images with very futuristic drones and who knows what else going on. But on this occasion it kind of failed.
Normally that's not a huge problem; you can come back the next day. But not being able to find the conversations themselves is a real showstopper, because you can't do anything with them and you end up losing out. You can't use the logic again. You haven't got access to the flow, the logic, the development that you followed when you were developing a final image; you can't reuse that. Not being able to find the images, and not having any assistance from OpenAI, has been a real problem. So I'm looking, if at all possible, to move over to Stable Diffusion 3, because with the images, everything is stored locally. I've got access to the prompts and I can reuse them; that is a lot safer. I think with this particular workflow, if you use it maybe once a month or once a week, it's fine, but if you're using it ten times a day to create images or to do creative stuff, it ends up being a little bit hazardous. That's how I'd describe it. So that's the problem I've got, and I think SD3
with its improved features, its improved ability to do landscape and portrait images, might just turn out to be the ideal solution for the time being. Maybe once the OpenAI service has improved I'll take another look at coming back here. Maybe sometime when Sora becomes available I might restart the account. [Music] [Music]
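Editor's sketch: the captions mention 1:1, 16:9 and 21:9 outputs that all stay at roughly 1 megapixel. The arithmetic behind that trade-off can be illustrated as follows: for a fixed pixel budget, a wider ratio means more width and less height. The function name `dims_for_ratio`, the 1,000,000-pixel target and the rounding to multiples of 64 are assumptions chosen for illustration; the sizes an actual service returns may differ.

```python
import math


def dims_for_ratio(ratio_w: int, ratio_h: int,
                   megapixels: float = 1.0, multiple: int = 64) -> tuple:
    """Return (width, height) close to the pixel budget for a given ratio.

    Assumption: dimensions are snapped to the nearest multiple of 64,
    a common convention for diffusion models, not a documented rule here.
    """
    target_pixels = megapixels * 1_000_000
    # Solve width * height = target_pixels with width / height = ratio_w / ratio_h.
    height = math.sqrt(target_pixels * ratio_h / ratio_w)
    width = height * ratio_w / ratio_h

    def snap(value: float) -> int:
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for w, h in [(1, 1), (16, 9), (21, 9)]:
        width, height = dims_for_ratio(w, h)
        print(f"{w}:{h} -> {width}x{height} ({width * height / 1e6:.2f} MP)")
```

Under these assumptions the 21:9 image is only 640 pixels tall, compared with 1024 for the square one, which is consistent with the impression in the video that the very wide images show less detail.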
Info
Channel: Pixovert
Views: 2,389
Id: T-9PhcP5mVE
Length: 26min 52sec (1612 seconds)
Published: Mon Apr 22 2024