AI art is one of the hottest topics in AI discussion in general, and there are questions to be asked here. Is it possible to get high-quality AI image generation for free, or is this future exclusively locked behind paid services? I've decided to compare two of the best examples in each category, Stable Diffusion versus Midjourney, to find the answer.
First, I'd like to briefly highlight the main differences between the two. Currently, Stable Diffusion is an open-source text-to-image generator freely available to anyone. It supports thousands of custom Stable Diffusion models, each tailored to a specific style. It provides extremely flexible customization and has a dedicated community that expands its possibilities daily. However, it's also hard to run for an inexperienced user, and it takes quite a bit of learning to get the hang of it.
The Midjourney AI image generator is different in almost every aspect. It's not open source; in fact, it hides its secrets pretty well. And using Midjourney is only possible with a heftily priced subscription. To give you some context on Midjourney's pricing: the basic plan is almost as expensive as Netflix's standard plan, but it still comes with restrictions on high-speed generation. Comparing Stable Diffusion versus Midjourney, the latter is less customizable and only has a couple of models, but they are extensive, and the results Midjourney creates are of very high quality. It's also a hell of a lot more beginner-friendly, to the point where all you need to use Midjourney is a Discord account. It's also important to note that the Midjourney Discord bot requires a constant internet connection, while Stable Diffusion can be run through a cloud server or locally. Granted, running it locally does require quite a strong PC, unless you're willing to wait minutes for generations to finish.
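To give you an idea of what "running it yourself" actually involves, here's a minimal sketch using the open-source diffusers library. The checkpoint name and settings are my own illustrative picks, not an official setup:

```python
# Minimal sketch of running Stable Diffusion locally with Hugging Face's
# diffusers library. The checkpoint is a widely used SD 1.5 release;
# treat it as an illustrative choice, not the only option.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # halves memory use; assumes a CUDA GPU
).to("cuda")  # on CPU, drop torch_dtype and expect generations to take minutes

image = pipe("a castle on a cliff at sunset, oil painting").images[0]
image.save("castle.png")
```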
Going beyond the general differences, the way both tools are trained has certain similarities. Stable Diffusion is very straightforward in its approach: it learns how to generate images by destroying them. The idea is to add layers of noise over and over again until there's barely anything left of the original image, so that the AI can then attempt to reverse the process and recreate the original from just a few scraps of data.
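To make that concrete, the "destroying" half of the process can be sketched in a few lines. This is a toy illustration of the idea, not Stable Diffusion's actual training code:

```python
import torch

def add_noise(image, t, num_steps=1000):
    # Toy forward-diffusion step: blend the image with Gaussian noise.
    # t = 0 means untouched; t = num_steps means almost pure noise.
    # Real noise schedules are more careful; this linear ramp is only
    # for illustration.
    alpha = 1.0 - t / num_steps          # fraction of the original that survives
    noise = torch.randn_like(image)      # fresh Gaussian noise
    return alpha**0.5 * image + (1 - alpha)**0.5 * noise

# During training, the model sees the noised image and the timestep t,
# and learns to predict the noise that was added -- which is exactly the
# "reverse the process" step described above.
```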
The original Stable Diffusion model is trained on a massive dataset containing all kinds of art pieces. However, the fine-tuned models are more popular in the community. These models are trained on a narrower dataset and will generally reproduce their chosen style quite closely. For instance, a standard model will perform worse when generating a picture in an anime style, while a model trained exclusively on anime-style pictures will have no trouble with the task.
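In practice, those community fine-tunes load just like the base model. Here's a rough sketch with diffusers; the checkpoint filename is a made-up stand-in:

```python
# Community fine-tunes are often shared as single .safetensors checkpoint
# files. diffusers can load those directly; the filename below is a
# hypothetical stand-in for whatever anime-style model you downloaded.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file("anime-style-finetune.safetensors")
image = pipe("portrait of a knight, anime style").images[0]
image.save("knight.png")
```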
Furthermore, training only on pictures from a certain artist can replicate their work with a certain degree of accuracy, which opens a whole other legal can of worms. But I'll touch on that in a minute.
It's time to talk a little bit more about the Midjourney AI mystery. It's a mystery because Midjourney is closed source, of course, so all we can do is make an educated guess. Such a guess would suggest that Midjourney combines the same approach as Stable Diffusion with a large language model, or LLM. A model like this is typically trained on a massive dataset of text and images; in this case, one such dataset is Microsoft's Common Objects in Context (COCO) dataset. This allows it to learn the relationship between text and images and to generate text descriptions of images. Afterwards, Midjourney would just have to use the Stable Diffusion method to fine-tune the relationship between the generated text and the images. As a result, Midjourney has a great understanding of what kind of output is required based on the text prompt.
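Since this is all guesswork anyway, here's a small sketch of what "learning the relationship between text and images" looks like with an openly available model, CLIP, standing in for whatever Midjourney really uses:

```python
# Purely speculative illustration: CLIP is an open model trained to match
# text with images -- the kind of text-image pairing guessed at above.
# It stands in for the idea only; it is not Midjourney's actual stack.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # any local image
captions = ["a dog on a beach", "a city skyline at night"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)  # which caption fits best
print(dict(zip(captions, probs[0].tolist())))
```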
And now let's address the elephant in the room: where do the images used for training come from? Well, most of them come from LAION-5B, a dataset with nearly 6 billion images, including photographs, renders of 3D models, and more, each with a text description. Naturally, there's a creator behind each of these pictures. And even more naturally, nobody was credited during the AI training. Even though Midjourney stopped using LAION-5B after its second version, it still faced a class-action copyright infringement lawsuit this year. While one could argue that a similar fate could await Stable Diffusion, it isn't under the same scrutiny, since it's free and thus doesn't profit from the copyrighted material. That said, Stability AI, the company behind Stable Diffusion, states that any image created with the tool can be used commercially, and that's where users can be held responsible, depending on their local copyright laws.
Now that we understand how these AI art generators are trained and how both tools operate, let's take a closer look at the models I've mentioned. Starting with Stable Diffusion: there are many versions of the default model of the same name. Stable Diffusion is trained on a broad dataset, so it does a good job in almost every style, though it's not as detailed or nuanced as Midjourney.
As I've mentioned a bit earlier, Stable Diffusion's strength comes from the community-built fine-tuned models. There are websites hosting thousands of those models online, and some really creative members of the community have even managed to completely transform a video into a pretty epic animation using Stable Diffusion. Midjourney is not that customizable at all and relies on a single,
constantly updated model. However, thanks to its more in-depth training, Midjourney produces images of better quality, which come much closer to the prompt compared to Stable Diffusion. I have never felt the need to specify my prompt in much detail or use a negative prompt to clarify what I don't want to see in my generated pictures. With Stable Diffusion, on the other hand, a negative prompt is pretty much mandatory to avoid generating nightmare-fuel imagery.
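For the curious, here's roughly what that looks like with diffusers; the negative prompt below is a typical community-style example, not an official recommendation:

```python
# Sketch of using a negative prompt with Stable Diffusion via diffusers.
# The negative prompt lists things the model should steer away from.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="portrait photo of a woman in a forest, soft light",
    negative_prompt="extra fingers, deformed hands, blurry, low quality",
).images[0]
image.save("portrait.png")
```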
Speaking of nightmares: Midjourney has a strictly enforced ban on any explicit imagery. Naturally, the open-source Stable Diffusion has no such restrictions. And yes, there are even specific models designed to create heavily not-safe-for-work imagery. One more thing I'd like to cover is the copyright aspect
of AI art generators. Do you think it's possible to copyright AI art? Think about that for a moment. All right, time's up. If you said yes, then you're wrong. But don't worry, the folks who said no are also wrong. In reality, it depends. As of August 2023, AI-generated art can't be copyrighted in the US, since copyright law only protects works created by human beings. In a recent case, a US court ruled that a work of art created by AI without human input cannot be copyrighted because it lacks human authorship. But you might have noticed a small loophole there: "without any human input." Yeah, that's right. If a human artist uses AI to generate images and then modifies or arranges those images creatively, the resulting work may be subject to copyright as an original work of art by a human artist.
At the end of the day, what's the main takeaway here? The thing to keep in mind is that Stable Diffusion is free and flexible but requires more technical insight to use. Midjourney, on the other hand, is easier to use, is trained more meticulously, and provides better results on average. I still believe that the open-source approach creates far more fertile soil for nurturing this technology, but hey, only time will tell. Which AI image generator would you use? Or maybe you're already using one of them? Let me know in the comments below this video. And if you enjoyed it, make sure to leave a like and subscribe. I'm on the warpath to create even more interesting videos, so be sure to join me in this battle.