Dall-E 3 vs Midjourney vs Stable Diffusion XL comparison. Which is the best AI image gen tool?

Generative AI is improving at an extraordinary rate, and it has become really difficult to keep pace with all the innovations in the AI industry. In this video we're going to do a head-to-head comparison between the three top-performing AI image generation tools as of October 2023: DALL-E 3, Midjourney, and Stable Diffusion XL. To easily figure out which tool is the best, we're going to focus on well-known weak points for generative AI: human hands, text, and repetitive patterns with a non-obvious structure, such as piano keys.

Our test will be exclusively focused on the actual quality of the output, but it must be said that your choice of a particular tool will of course come down to other factors as well. While both Stable Diffusion XL and the recently launched DALL-E 3 are free (the latter through Bing Image Creator), Midjourney requires a paid subscription. Also, out of these three tools, only Stable Diffusion is open source and can be run locally on the user's hardware, which makes it an ideal choice for those who are focused on privacy. Having said this, we're only going to care about the best results in this video, so without further ado, let's proceed with the actual tests.

For the first test, we'll ask our generative AI assistants to create pictures of a group of software developers painting a mural. What we're actually interested in here is the ability of these tools to correctly depict the shape, and even the correct number, of fingers when creating human hands.

The recently launched DALL-E 3 is currently available for free through the Microsoft Bing Image Creator tool. As was somewhat to be expected, DALL-E 3 produced a group of stereotypical hipster millennials in jeans, with brushes in their hands and smiles on their faces. The images look decent from afar, but when you zoom in you can clearly see the errors and inconsistencies. Not only do the people depicted have deformed hands, but their faces are twisted into shapes better suited to low-budget horror movies.

As if aware of its limitations, Midjourney initially produced zoomed-out cartoon drawings, so we had to do a bit of prompting. The final result, although in line with what we had asked for, still suffered from the problem of distorted hands and faces.

For testing Stable Diffusion XL we used a tool called Fooocus (spelled with three o's), which requires a simple installation process and has a very clean and simple graphical user interface. If you're interested in using Stable Diffusion locally on your PC, you can find a useful guide for installing Fooocus in the video description. Stable Diffusion seems to have struggled with the notion of a mural, which, by the way, is a large image applied directly to a wall or ceiling. Only one of the generated pictures actually depicts people who are painting a wall, and these two men are seemingly holding hands and look less interested in art and more interested in one another. As expected by now, the hands and faces are not great.

For the second test, let's see what we get when we ask for a cat astronaut playing the piano. None of the three generative AI tools managed to get it right when it comes to what the keys on an actual piano look like, specifically the fact that the smaller black keys are arranged in a repeating pattern of groups of two and three across the keyboard. However, Stable Diffusion performed the worst, leaving out the astronaut part almost entirely. In only one of the four generated pictures did the cat display some accessories that could be associated with traveling into space.

In order to see how well these three AI tools perform with tasks that involve generating text, let's ask them to depict an underwater tea party and include a "Happy Birthday" banner. DALL-E 3 got the text right for just one of the three images it generated. However, in the picture with the correct text, we couldn't help but notice some very strange artifacts: a tentacled snail, a rather awkward symbiosis, and this unknown and unexplainable object. These weird artifacts illustrate that the current generation of AI tools is still prone to hallucinations, both textual and visual.

Not only did the Midjourney results fail to include the required text banner, but the overall quality of the pictures was inferior to DALL-E 3. The quality of the images generated by Stable Diffusion was even poorer, and the request for a text banner was ignored completely.

Based on the tests we've carried out so far, DALL-E 3 seems to be the winner. If you need to quickly generate an image without doing too much prompting, Bing Image Creator might be your best bet. It's free and produces great results, but it does have daily limits. And speaking of prompting, DALL-E 3 might very well bring the end of prompts. We've used Bing Image Creator in this video, but the DALL-E 3 model is also available in Bing Chat, where you can use subsequent commands to adjust the initial results. However, we've noticed the quality of the text degrade with each new iteration, so you might need to spend some time tweaking your instructions in order to produce the best possible results.

Ultimately, your choice of tool comes down to personal circumstances: whether you're willing to pay a monthly subscription, the number of images you need to generate and how quickly you need them, and also how concerned you are about privacy and keeping your data local. However, we do hope that this video has been useful to you. If so, please like and consider subscribing for more AI-related content.

Channel: Taming AI
Views: 2,232
Keywords: dall-e 3, dalle 3, midjourney, stable diffusion, image generation
Id: pNgn6Ygk9B8
Length: 6min 51sec (411 seconds)
Published: Sun Oct 15 2023