4 Realistic Models In 10 Minutes – Stable Diffusion (Automatic 1111 | Tutorial)

Video Statistics and Information

Stable Diffusion has so many checkpoints that it can be a challenge to find the newest gems amongst a sea of boob- and booty-enhancing models. Now, while those aren't a bad thing, this is YouTube, so let's explore four realistic checkpoints so you can spend less time reading and more time creating. But like the video, and give it to me bite-sized.

So our first checkpoint is Photon, published by Photographer, and this model promises photorealistic and visually appealing images, effortlessly. I love the thought of effortlessly generating photorealism, especially if the results are good, and looking at the example images, they look pretty decent considering this checkpoint is on version one. Looking at the description, the model encourages a simple sentence for the prompt, and I notice he has some comma-separated styling at the end, but they also note that negative embeddings should be avoided.

So I tried generating an image using the example image generation data to see what kind of outcome we would get, and we get a very similar and impressive image. The clothing is looking somewhat plastic and on the brink of tearing at the slightest bit of resistance, and there's this lack of material consistency, with the arms looking more like a stitched fabric while distressed areas look more like a plastic puffer jacket. But the hair is quite erratic and captures the light quite nicely, alongside having those individual strands and the detail required to see them. The skin has a nice texture even at a distance, and the overall anatomy looks fantastic, with nothing sticking out as odd or out of place. But the lighting is somewhat harsh, making it tricky to see the detail, as there's a heavy contrast between the dark shadows and the light on the skin.

Now, I think that most of these issues are just styling problems, so I ran another test using the simple prompt "a woman standing in a field", and I got this image, which represents my prompt nicely, with excellent results. It's simply a woman standing in the field
looking like she's having a day out with a trench coat, and there's a good amount of content in the image from that small description. Everything looks fine except for the eyes, which seem to be a bit wonky and lacking the same level of detail given to other areas.

Lastly, I wanted to generate my own custom image to really push this checkpoint into areas outside of its comfort zone, so I generated an image of a woman standing in the lift. The image came out well, with the detail we would expect, but with wonky hands and the background closer to a fitting room than an actual lift. My environment pieces look fantastic, but there's no alien spaceship sticking out of the building. But we do get these very convincing street-level shots, which have so much variety you could mistake them for a photo if not for the bad logos on the building.

The next checkpoint is epiCRealism, published by epinikion, which promises realism, so I'm expecting that much at least. It encourages the use of simple prompts, with no need to use hieroglyphic keywords like "masterpiece" or "best quality", which may or may not actually do anything. They also specified that you can add "Asian" or "Chinese" to the negative prompts if you're looking for ethnicities other than Asian, so perhaps this model was trained on a majority-Asian dataset. Now, something which sticks out to me is that, despite promising good results with simple prompts, the example images have complex prompts, and even the negative prompts are expansive. I'm worried that my simple image generation will turn out poorly, but we'll see.

Running our first test, I've taken the best image from their collection and attempted my own generation using their data, which uses an embedding in the negative prompt called badhandv4, and it uses an upscaler which I had to install from some dodgy-looking website, so hopefully no one's mining Bitcoin while I'm asleep. But we get an image similar to the one we copied, and the realism is actually on par with a photograph I'd expect
to see on Instagram. I'm not sure why, but there's something about this, from the detail on the hair to the erratic strands to the face, which is expressive and well lit. The clothing looks convincing, with variations in the creases and wrinkles, although the mouth and eyes have small artifacts when zooming in. But besides this, there's not much going wrong.

My next test is to do a simple prompt, and I wanted to test this with both a simple prompt and a simple negative prompt, keeping only that negative embedding we have installed, to see why we're using so many prompts. Something I noticed while generating an image with the simple prompt "a woman standing in the field" is that they are all very distant shots with the subject facing away from the camera. I tried using only the negative embedding badhandv4, and this gave me a closer shot, so maybe the negative prompts were messing up the camera angles.

My second image came out fantastic even without all of those negative prompts, and we get this lady standing in the lift in a corporate setting, as described, with all of that realistic detail we expect. I also like how the shirt has different layers of translucency near the folds and where the blob is located.

Finally, for my environment images, I removed the badhandv4 embedding, as there shouldn't be any hands in our images, and the results are pretty bad, to say the least. One bright side is that the UFO was captured this time, but the downside is that the building looks really bad, like a sketch rather than a photo. I tried doing one with a negative prompt added in, and the result was much better, but still leaving much to be desired. Now, our previous photos have better environments than this, so I'm not saying it can't be done, and to prove this I generated another image using the example photo in the amusement park. This result was infinitely better, proving that this checkpoint can produce great environment pieces.

Our next checkpoint is Juggernaut, published by KandooAI, which sounds strong, so my
expectations are already high. Looking at the example images, you have a good variety of fantasy portraits and stylized pieces, which I'll try out, as it seems this may be a good multi-purpose checkpoint.

Starting off with this discount Iron Man and the 4x-UltraSharp upscaler, you get a very similar result to the reference image, and the whole picture has this beautiful lighting, looking like a shot from a movie. The eyes look great, the hair looks nicely textured, and even the metal armor has dents, rust and other materials, giving it some nice variety. I can't find anything out of place in this image, except the ear looks slightly further back than it should be, but that's me cherry-picking.

Next, trying out a simple prompt, we get this woman standing in the field, and oh my God, look at those toes. I am so sorry to any feet lovers in the audience, avert your eyes. But outside of the feet, everything else looks pretty good: the hands aren't too bad, the clothes look good, we have a nice design, and nothing on the face looks particularly bad except for one eye being slightly wonkier than the other. The field also looks quite nice, so no issues there.

Next, generating our own custom character, we get a brilliant result, minus the black blazer, but everything else looks to be in place. The hands are wonky, but the surroundings are quite nice, and nothing outside of the hand and left eye seems particularly odd. We have that translucency in the shirt, so you can see some of the skin tones coming through, and the black mass tights seem to be rendering nicely. We also aren't in any kind of lift, but it looks like a corporate setting at the very least.

Next, generating an environment on its own, I've noticed that some of these checkpoints seem to struggle with buildings, especially glass buildings, and I'm not happy with the results I got on this checkpoint, as our buildings look slightly warped. But there are other environment pieces provided as examples which look fantastic, so it can be done using this checkpoint. But
next, let's try out those stylized pieces to see how they turn out, starting with this awesome painting of a boat. Now, ours turned out worse using the same generation data, giving us multiple boats in the same art style instead of one. But this checkpoint can do different styles of art, including a fantasy-style castle in the sky and this oil-style painting of some sunflowers, which all turned out as expected, with some slight variations between the examples and my own images.

Lastly, we have the similitude, published by who saw what's this, which is a mouthful, and promises realistic character portraits in a variety of genres, but can do a multitude of other things, except for anime, sorry weebs. The example images have no seeds and a severe lack of data, which is already a bad sign, so I tried to fill in the missing gaps with a portrait resolution and using the Hires. fix with 4x-UltraSharp, which will hopefully draw out some of that detail and mimic the portrait of the example images. But no matter what we generate, it won't look exactly the same, because without the seeds we're rolling a dice with infinite sides to land on. But let's take it out for a spin regardless. Also worth noting is that we have two embeddings installed, called Nega sketch 2 and neg Anime.

The result is a v-listed looking character which borders on 3D and has some anatomy issues around the lips and eyes. The clothing seems a bit odd, but perhaps on a different seed the result might be better. The result isn't too far off what was advertised, but without the generation data we'll just have to make an educated guess going forward.

Moving on to a more simple character, we get what we expected, which is a woman standing in the field, and the results are pretty good. I don't see any anatomy issues: the hands are fine, the clothing is fine, and the environment looks good. This is a far better result than what we got before, so I'm feeling a lot more optimistic now.

Now, trying out our custom character, it turned out surprisingly well
considering our first image. No problems with the overall anatomy, except the hands struggled in this image, with multiple fingers trying their hardest to look human. The clothes are fine and accurate to the prompt, alongside the environment looking more like a traditional lift compared to the other images. I've also noticed all of the characters seem to be having a stroke, with one side of their face looking wonkier than the other.

But I hope this character's accuracy carries over to our environment piece, because one of the trickiest parts of Stable Diffusion is getting the checkpoint to obey your prompt. Unfortunately it didn't, as our environment pieces don't have the spaceship or burning building, but they capture the building's details a bit better, giving us this aerial view as opposed to a ground view. There are floating posts, and the trees look fine, but it does seem to struggle exclusively with the buildings, and there are examples of much better quality environments, proving that this checkpoint can do it well.

But to wrap things up, I think my overall choice out of all of these would be Photon, as it gave us the most consistent set of results, including a really great environment piece where the buildings look good and there was a high amount of detail with minimal artifacts or prompting required. But let me know your favorites, consider supporting over on Patreon using the link in the description, and of course subscribe. This is Bitesized Genius, and I hope you enjoyed.
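The comparison routine described in the video (the same simple prompt run against each checkpoint, an optional negative embedding, an optional Hires. fix upscaler, and a seed that stays fixed between runs) can be sketched as a small script. This is a minimal sketch under stated assumptions, not the presenter's workflow: the field names are assumed to match the AUTOMATIC1111 web UI's txt2img API and should be checked against your own install, the checkpoint names are hypothetical placeholders, and the seed value is arbitrary. The script only builds request bodies and does not send anything.

```python
import json
from typing import Optional


def build_txt2img_payload(
    prompt: str,
    negative_prompt: str = "",
    seed: int = -1,
    checkpoint: Optional[str] = None,
    hires_upscaler: Optional[str] = None,
) -> dict:
    """Build one txt2img request body (field names are an assumption)."""
    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        # -1 requests a random seed; a fixed value keeps reruns comparable.
        "seed": seed,
    }
    if checkpoint is not None:
        # Ask the server to switch checkpoints for this request.
        payload["override_settings"] = {"sd_model_checkpoint": checkpoint}
    if hires_upscaler is not None:
        # Turn on Hires. fix and name the upscaler to use.
        payload["enable_hr"] = True
        payload["hr_upscaler"] = hires_upscaler
    return payload


def build_test_matrix(checkpoints, prompts, seed):
    """Pair every checkpoint with every prompt, all sharing one seed."""
    return [
        build_txt2img_payload(prompt, seed=seed, checkpoint=checkpoint)
        for checkpoint in checkpoints
        for prompt in prompts
    ]


if __name__ == "__main__":
    # Placeholder names: replace with the checkpoint titles your UI reports.
    checkpoints = ["checkpoint_a", "checkpoint_b"]
    # The first prompt is quoted in the video; the second is a paraphrase.
    prompts = ["a woman standing in a field", "a woman standing in a lift"]
    for job in build_test_matrix(checkpoints, prompts, seed=1234):
        print(json.dumps(job))
```

Holding the seed constant is what makes a side-by-side comparison meaningful. With the seed left at -1, every run is a fresh roll, which is the problem noted for the example images that were published without seeds.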
Channel: Bitesized Genius
Views: 5,020
Keywords: BitesizeGenius, Stable Diffusion, Checkpoint, Model, LoRa, ControlNet, Prompts, AI, AI Art, AI Video, Stable Diffusion AI, Stable Diffusion Tutorial, Digital Art, Artificial Intelligence, Local image generation, Stable diffusion web UI, Machine vision, Computer graphics software, Realistic image generation, Local video generation, Beginner's guide, Tutorial, Midjourney, AUTOMATIC1111, automatic1111 install, stable diffusion ai, Control Net
Id: NCsw0OCYlas
Length: 10min 55sec (655 seconds)
Published: Mon Jul 24 2023