OpenDalle v 1.1 is Insane! How does it Compare to DALLE3?

Video Statistics and Information

Captions
I've been hearing a lot about OpenDalle and how people say it's better than SDXL, so I decided to do a comparison for myself. For quite a few months I've been looking at different models and versions, including standard and turbo ones.

I wanted to show you that OpenDalle version 1.1 is also hosted over here on Hugging Face. What I like about this particular layout is that it shows you the prompts that were used for the example images generated with this model. If we scroll down, we can see exactly what the developer claims about this particular model. What I think is interesting is they say it's "proudly", now, we'll say proudly, "starting a notch above SDXL", but they also say, "while DALL-E 3 is still the big cheese". I've been a DALL-E 3 user for quite a long time, so I think there's not really anyone out there that's better suited for doing this review than me. No, I'm sure there are. But anyway: "DALL-E 3 is still the big cheese, we're hot on its heels." So they claim to be close to DALL-E 3. I think you guys will be pleasantly surprised where this model shines.

If we scroll back down here, they also talk about being very loyal when it comes to sticking to your prompt: "The soul of OpenDalle: sticking to your prompts like glue. Version 1.1 takes your words and turns them into visual masterpieces that are just what you pictured, maybe even better." Well, let's take a look.

All right, so starting things off, I'm going to start with prompt loyalty. How loyal is the OpenDalle model when it comes to prompt adherence? I think it's only fair that I compare this to DALL-E 3, because after all, the model creator did name-drop DALL-E 3, and for God's sake, "Dalle" is in the name of the model. So yes, I'm going to compare it to DALL-E 3 when it comes to prompt adherence. It was one of their biggest claims, so let's try it out.

I decided to use this prompt right here. It says: "A photo depicting an Nvidia GPU as the centerpiece on a spacious desk. The room's atmosphere is enhanced by glowing RGB lights that transition between different colors. The GPU's surface showcases every detail, from connectors to branding. In the background, there's a monitor displaying a wallpaper with vibrant colors."

First up is DALL-E 3. I plugged in that prompt and this is what I got: a beautiful Nvidia GPU sitting on a desk, and behind it a monitor with a wallpaper and vibrant colors. I got exactly what I wanted. The same can't be said for the result that I got running it through Fooocus with the OpenDalle model. Here's that image right here. We can zoom in, but I don't see a GPU anywhere in here. I guess you could say that's a GPU, but I don't know. Anyway, I did a batch of six images in this test and not one of them had a GPU in it, which kind of bummed me out.

What I found was that the longer the prompt is, the more difficult it is for, I guess you could say, Stable Diffusion in this case to remember it. I don't know if it boils down to the specific model, so I can't really pin the blame on the model itself. I do notice this with other models as well: using Stable Diffusion, it can't really remember all the details. Usually, the shorter the prompt is, while still being detailed, the better things tend to work out for Stable Diffusion. The point being that this is a prompt that came from DALL-E 3. Basically, what this is proving is that it can't take a DALL-E 3 prompt and create the same image, because DALL-E 3 can remember longer, more detailed prompts. Stable Diffusion isn't quite there yet. It doesn't matter what model you use, it really doesn't; at this point it can't remember everything. It might have to do with VRAM too, but I have 11 gigs on this 2080 Ti that I'm using, so I don't think that's what it is. I think it's just Stable Diffusion in general. So I'll run this and shorten it up to just this part right here, then we'll see how it does.

Because I kind of already knew that the longer, more detailed prompt with more words was going to basically flop with the OpenDalle model, I decided to shorten it to just this, boring as it is: "a photo depicting an Nvidia GPU as the centerpiece on a spacious desk". This is, of course, DALL-E 3. Let's take a quick look at the two results that it gave me. Right off the bat you can see a beautiful Nvidia GPU sitting here on a nice, clean desk, unlike my own. Well, mine's not very bad. But this is what I got here in DALL-E 3.

Now let's take a look over here in Fooocus. We have one here with a GPU, very similar results. This one doesn't have a GPU, but it does have the Nvidia logo. We do have a nice, cute little Nvidia GPU on a desk. Again, another one with, well, if you want to say a wallpaper with an Nvidia GPU counts, there you go. And finally another one with no GPU, but we do have the logo, and another one without a GPU. So it's okay, it's doing pretty good. Again, not quite as adherent as DALL-E 3, so shorter prompts definitely adhere a little bit better. Overall, not too bad.

However, when it comes to realistic portraits, this is where I was pleasantly surprised. I was amazed how well this model performed when it came to realistic portraits. Now, this can all come down to the prompt that you use and the details you put in the prompt, but look at this image here. Let's scroll down and I'll click on it for a little bit bigger image. You can see that the skin tones are good and everything looks really nice. I did tone down the guidance scale a little bit, because if I had it up at the recommended seven or eight, the skin looked more shiny; it looked plastic and kind of fake. I'll show you the example here as I scroll down. You can see that this image here is at a guidance scale of 7, and you can see how much more shiny it is. It kind of starts looking fake, more rubbery, so to say. And then on the right here is another one at guidance scale 2, which looks much more realistic. The prompt used for this particular test was "a well-lit studio mid-shot photo of a redheaded woman with long straight hair, wistful smile, freckles, hazel eyes, dark gradient background".

One important note that I wanted to mention: when I tried to make any image using this model in a landscape resolution, they all tended to come out more pixelated and noisy. You can make it out pretty well in this image. You can see that there are a lot of artifacts in the eyes, the hair is very noisy and pixelated on the outside, and the finer details seem to diminish. Again, that's just something I've noticed with this particular model. I've created many images at this resolution using all kinds of different models, both turbo and standard, and haven't really had this issue, so I'm not sure exactly what it is. Maybe the developer will see this and maybe he'll rectify it.

Okay, moving along. Finally, last but not least: hands and fingers. Man, I was shocked when I ran this test. OpenDalle version 1.1 blew me away when I ran this prompt: "a photo of an Asian man holding up one hand". Every single image in the batch of six came out with all five fingers, and for the most part they looked really decent. Here you can see there are a couple that look a little oversized, and this one kind of looked a little weird, but overall I was very impressed. Now, did I get lucky? I think so, because I ran another batch and some of them had three fingers and four fingers. But for the most part, 90% of the images came out looking good, so nine out of 10 for hands and fingers on OpenDalle version 1.1.

I think OpenDalle definitely shines when it comes to creating illustrations like these here that I found on Reddit. However, while I do see potential in this model, and it's definitely there, it's becoming so difficult for me to go back to using models that require 60 steps to create images when I've been spoiled with these awesome, super amazing models that only take 6 to 10 steps, plus they create fantastic results. Don't take my word for it. Be sure to give OpenDalle version 1.1 a try for yourself. Put it to the test. Does it measure up to its claims and hype? You be the judge.

Hey guys, if you want to check out more of my model studies, be sure to jump over to the Mind Renders wiki page at wiki.mindrenders.com. I'll be sure to add that link in the description below so you guys can just click through. Just click on "Model Studies" and then you can see all of the different studies that I've run with different types of models. Here's an example of some sneakers inspired by comic heroes; as you can see, there is the Iron Man one.

I want to thank you guys so much for watching today's video. If you like the content that I create, be sure to click that thumbs up. If you haven't subscribed, consider subscribing, and I will see you guys very soon.
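
The guidance-scale comparison described in the captions (the same portrait prompt at 7 and at 2) could be reproduced outside the UI used in the video roughly as follows. This is only a minimal sketch, assuming the Hugging Face `diffusers` library, a CUDA GPU, and the repo id `dataautogpt3/OpenDalleV1.1` (an assumption to verify on the model page). The `Run` class, the helper names, the seed, and the output filenames are hypothetical; the 60 steps is taken from the closing remarks, and the prompt is the one quoted in the captions with punctuation added.

```python
from dataclasses import dataclass

# Assumption: Hugging Face repo id for OpenDalle v1.1; check the model page.
MODEL_ID = "dataautogpt3/OpenDalleV1.1"

# Portrait prompt quoted in the captions (punctuation added for readability).
PORTRAIT_PROMPT = (
    "a well-lit studio mid-shot photo of a redheaded woman with long straight "
    "hair, wistful smile, freckles, hazel eyes, dark gradient background"
)


@dataclass(frozen=True)
class Run:
    """Settings for one image (hypothetical helper, not from the video)."""
    prompt: str
    guidance_scale: float
    steps: int
    seed: int

    @property
    def filename(self) -> str:
        # e.g. portrait_cfg7_seed1234.png
        return f"portrait_cfg{self.guidance_scale:g}_seed{self.seed}.png"


def build_runs(prompt, guidance_scales=(7.0, 2.0), steps=60, seed=1234):
    """One run per guidance scale, sharing a seed so only the scale differs."""
    return [Run(prompt, float(g), steps, seed) for g in guidance_scales]


def generate(runs, model_id=MODEL_ID):
    """Render each run with diffusers; downloads the model on first use."""
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    for run in runs:
        # Re-seed per run so both images start from the same noise.
        generator = torch.Generator(device="cuda").manual_seed(run.seed)
        image = pipe(
            prompt=run.prompt,
            guidance_scale=run.guidance_scale,
            num_inference_steps=run.steps,
            generator=generator,
        ).images[0]
        image.save(run.filename)
```

Calling `generate(build_runs(PORTRAIT_PROMPT))` would write one image per guidance scale. Holding the seed fixed is a design choice made here so that any difference in skin shine can be attributed to the guidance scale alone; the video does not say whether the two portraits shared a seed.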
Info
Channel: Mind Renders
Views: 2,090
Id: HUY_DklN8L4
Length: 9min 5sec (545 seconds)
Published: Thu Dec 28 2023