Exploring an AI’s Imagination (Stable Diffusion and MidJourney)

Captions
There are now multiple AI models that can depict, through imagery, anything that you can dream of. You can either let the AI come up with its own styles and ideas or, through prompting, get as specific as you want, and you can get quite specific. Now, not everything has to be fantastical. We can, for example, just depict a hotel lobby. You might notice that some of the depictions have a sort of drawing or rendering feel to them. This seems to be the default artistic style, so to speak, for the model that I'm using here, which is Midjourney, but through some clever prompting we can elicit some other stylings. For example, maybe we want more of a cartoon style. We could also go for more of a sketch style. Getting more realistic seems to take a bit more effort from this model, but it's still somewhat doable with keywords that influence the model to try to generate imagery that matches higher-quality renderings. So, for example, here we have an initial prompt and then some keywords to indicate a higher-quality image. I wish I could say I've got this down to a science, but I actually wound up just writing a program to append some keywords to the end of the prompt for me, and then trial-and-erroring until I got imagery that I was happy with.

Beyond adding more detail as a sort of styling through keywords, you can also influence things like colors. For example, maybe you want a black-and-white themed hotel lobby, or maybe you want blue and gold, or you want a hotel lobby but the floor is lava.

All of the images shown so far have been created with Midjourney, which is a Discord-based text-to-image generation application. Midjourney produces the highest-quality images that I've seen so far from text-to-image AIs, but it struggles in one area in particular, and that's realistic-looking images. If you're looking to make imagery that's artistic, conceptual, or rendering-like, there is no question that Midjourney is probably exactly what you're looking for. But what if you're after realistic-looking
images? What I mean by that is images that don't look like renders and instead look like actual images of real life, even if maybe they look like they're photoshopped or something like that, but realistic, real-life-looking photos. For this, there's actually an entirely open-source and free model that you can just go and download, called Stable Diffusion. There are also options to pay to run the model in the cloud. The one that I would recommend you start with is dreamstudio.ai, if that's what you're looking for. So if you don't want to fuss with making the model run locally or figuring out how to run it in the cloud for yourself, I would check out dreamstudio.ai. That's the official cloud application for Stable Diffusion. But if you do have a local GPU with about 10 gigabytes or more of VRAM, or you just want to run it on CPU and RAM, locally or in the cloud, just a little bit slower, then you can actually just download the entire open-source model from Hugging Face and run it for free. I will be putting links to all of these things in the description, so definitely check out the description if you're looking for anything that I'm bringing up here.

So let's compare these two. Imagine you want an image of Darth Vader go-karting. With Midjourney, the imagery that you'd get might be something like this, or this. Now don't get me wrong, I love this imagery from Midjourney. These are awesome. But sometimes we want something that looks real and not conceptual. From Stable Diffusion we get, well, Darth Vader actually in a go-kart, in realistic-ish looking imagery. And those images are still very unlikely. So what if we return to the hotel lobby example, but with the blue and gold theme again? The results that we see here are far more like realistic and plausible-looking actual photos, as opposed to design, concept, or render-like imagery.

So what do we do with all of this? Both Stable Diffusion and Midjourney are in what I'd call active
development. They're both constantly getting better. Plus, they're simply not alone here. There are quite a few other text-to-image models in development right now, and likely more going into the future. This is truly just the beginning. Also, while the resolution of something like Stable Diffusion is 512 by 512, you can combine this with upscalers, which is exactly what's being done with the Midjourney images that I've shared here. There's not yet an upscaler specifically associated with Stable Diffusion, but there are quite a few upscalers in the wild at this point, and I am sure many more to come as these models become more widely used.

With both Midjourney and Stable Diffusion, I can see them being viable right now. Things like environments, characters, costumes, and all that kind of stuff seem to be really well done by Midjourney, for example, but Midjourney isn't the first, and it's certainly not the last, text-to-image model to be artistic in this way. Next, I think Stable Diffusion-like models are a very clear and obvious incoming replacement for stock photos and imagery. Stable Diffusion is less artistic compared to Midjourney, but it can make some pretty impressive stock-photo imagery, as well as other more realistic-looking photos of whatever you want, which is where Midjourney, at least for now, suffers a bit.

So, for example, say you want a conference room. Here's one. Here's another one. And another. We can make essentially infinite unique images of whatever we want just by changing a random seed. There are other things that we can tweak too, but that's really the only thing that you have to change. But these images don't have to be so boring. For example, what about a Porsche 911? We could get even more specific. What if we want a Porsche 911 in the mountains? Maybe another. You get the idea. Maybe instead we want a Porsche 911 in the jungle. You got it. How about a Porsche 911 on the beach? Okay, maybe everything is just too easy. What about a Porsche 911 surfing?
All right, that was a little harder, but surprisingly okay. I'm not sure what I expected with that prompt. So how about a Porsche 911 in your living room? A bear scuba diving? A robot climbing a tree? I think you get the idea, but we're not done yet.

One of the other interesting things that I just sort of stumbled upon while playing around, and that I've honestly not seen shown anywhere else with these models, is how well they can mesh things together. One example of this is using a prompt like "A as B". We can use people to illustrate this. The first one that I discovered was an attempt to combine Elon Musk and Bill Gates, with a prompt that was simply "Elon Musk as Bill Gates", for an iconic "Elon Gates", which I promptly shared with the world and likely disturbed many. But it doesn't stop there. Another example is "Elon Musk as Mark Zuckerberg". These are just so interesting to me, with how the features of the two are blended together pretty darn well. I can see both of them in this human that has been generated. Imagine trying to do that in Photoshop and really capture the essence of these two individuals all in one. It would be a hard task, I think, whereas this just figures it out, and that's really cool to me. To make these, I usually just generate a handful, about 25 at a time, with random seeds, and then I find the ones that look good. It honestly doesn't take that long to do.

So how about another example: Brad Pitt and John Oliver? Again, here are actual photos. This is Brad Pitt and John Oliver combined. Again, I can't help but be impressed with how well this model can mix the two, and you can just see them both in there. You see it: they're there, but also combined, too. I think that is so cool. Now, I don't quite know what to do with this capability, but there it is. You don't have to combine people, either. You could combine other things, like "Porsche" and "truck", and now you have some Porsche trucks.

Similarly to a
recent video I made with BLOOM, a massively large language model that's just been released for free to the world, these are models that have historically been accessible only to a very small group of people, and now they're just plain generally available, either for a very low price, like Midjourney, or outright free, as in the case of Stable Diffusion. I think that we're going to be learning a whole lot more about what we can do with models like these over time, and I couldn't be more excited to see how this really does change the world. I know it sounds super cliche to say something like that, but these are really, really powerful and creative technologies, and people barely understand how to leverage them, even to the point that we've shown so far. Also, I'm struggling just to keep up with the proliferation of these models. And like I said with BLOOM, most people in the world don't yet even have an idea that this technology exists, so we're really just seeing a very tiny, and honestly biased, group of people, mostly developers, who are tinkering with these models. But that won't be the case for long, especially with entities like Midjourney: despite being paid, it is a very cheap way for someone with absolutely no programming experience at all to generate images through an application. If you are curious about the BLOOM model that I've mentioned a few times here, I'll also put a link to that video in the description, along with all the other links.

At any rate, I think it's really important to reflect that three or four years ago, image generation models were producing basically 50 by 50 resolution images of things like human head shots, where a model could just make more human head shots. You couldn't specify what you were looking for. Now we're at 512 by 512 imagery of basically anything that you want or can think of. It doesn't even have to have existed before. So I'm finding it really hard
to even imagine where we will be in just a year from now. What about five years? Ten years? If this progress continues at this pace, it's going to be insane. Yet again, I find myself saying what a time to be alive and playing with this sort of technology: neural networks, deep learning, all of that. If you're interested in learning more about neural networks, you can check out the Neural Networks from Scratch book by myself and Daniel Kukieła by visiting nnfs.io. And like I said, links to all the things that I've mentioned here are in the description.

I highly encourage you to go check out these two models. If you can pay for Midjourney, do it. I think it's a good price. If you can't afford Midjourney, that's fine. I think there is a free trial, so you can still just use your free trial and then be done. Play with these models. It's just really cool to play and tinker around, and the more people we get playing around and learning about these models, the more we can learn together about what we can actually do with them. But like I said, I think both Midjourney and Stable Diffusion are honestly commercially viable right now. If you're a game developer and you're trying to design some character or some boss or some environment, this is a really cool way to get concepts and ideas. And if you're looking for stock photos and imagery, Stable Diffusion can do some really impressive stuff, especially if you're looking for more typical stock photos. But if you want something unbelievably specific, Stable Diffusion can do that for you too, right now. And again, it's free. It's crazy to me.

With all these things, you do want to be careful. I know Midjourney does have a license, and I want to say you can use it commercially up to a business earning something like a million dollars. Anyway, you'll have to read the actual licensing agreement to
find out if you can use it commercially or what you have to pay to use it commercially. But like I said, I think no matter what, Midjourney won't be the last paid one, and at least for now I think it's cheap enough to where it's good, because you don't need to be a programmer at all to use it. For Stable Diffusion, you can also just use dreamstudio.ai and pay a little bit that way. Again, you don't need any programming knowledge or expertise at all. But if you do have programming knowledge and expertise, you can run it locally and do all kinds of cool things and just play around. For example, like I was saying with the meshing, I tend to generate about 25 images at a time. It's a little cost-prohibitive to do that on dreamstudio.ai, but locally, who cares?

So anyways, really cool stuff. I can't believe what's coming out right now, including the stuff that's free. But also Midjourney, despite being paid, is a really well-done app. It's so fun. I've spent so many hours just making stupid images on Midjourney that probably no one will ever see, but it's just so fun and it's very addicting. And then with Stable Diffusion, like I said, I think we're still going to have stock photo websites and stuff, but I think they will be AI-based, like Stable Diffusion is. So it's really exciting to see where we're headed and to start thinking about where we will be. So anyways, that's all for now. Until next time.
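
The workflow described in the captions (appending quality keywords to a prompt, writing "A as B" prompts, and generating about 25 images at a time with random seeds) could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the program used in the video: the function names and the keyword list are hypothetical, and the actual call to a text-to-image model is left out.

```python
import random

# Hypothetical quality keywords; the video does not list the ones actually used.
QUALITY_KEYWORDS = ("highly detailed", "sharp focus", "photorealistic")


def with_keywords(prompt, keywords=QUALITY_KEYWORDS):
    """Append comma-separated style keywords to the end of a base prompt."""
    return ", ".join([prompt.strip(), *keywords])


def blend(a, b):
    """Build an "A as B" prompt, for example "Porsche as truck"."""
    return f"{a} as {b}"


def make_jobs(prompt, count=25, rng=None):
    """Pair one prompt with `count` distinct random seeds."""
    rng = rng or random.Random()
    seeds = set()
    while len(seeds) < count:
        seeds.add(rng.randrange(2**32))
    return [(prompt, seed) for seed in sorted(seeds)]


if __name__ == "__main__":
    prompt = with_keywords(blend("Porsche", "truck"))
    for text, seed in make_jobs(prompt, count=3, rng=random.Random(0)):
        # Each (prompt, seed) pair would be handed to whatever
        # text-to-image backend is in use; that call is omitted here.
        print(seed, text)
```

Keeping the seed next to each prompt means any image that looks good can be regenerated later from the same pair, assuming the backend is deterministic for a fixed seed and settings.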
Info
Channel: sentdex
Views: 112,997
Keywords: python, programming
Id: 2R0kGTuYmVI
Length: 14min 19sec (859 seconds)
Published: Sat Sep 03 2022