Kasucast #19 - Stable Diffusion: Worldbuilding an IP with 3D and AI ft. MVDream | Part 1

Video Statistics and Information

Captions
Hello and welcome back. I was working on another video prior to and instead of this one, but MVDream, a text-to-3D implementation, was released quite recently, and it fit right into what I was working on anyway: using AI tools along with 3D techniques to rapidly worldbuild an intellectual property, or IP. In the first half I'll be going over the MVDream repository, a multi-view diffusion model created by ByteDance research. If you didn't know, ByteDance is the parent company of TikTok, so the content should be quite good; the paper also has one researcher from the University of California, San Diego. There are two repositories for MVDream. The first one is the vanilla MVDream, which generates a four-perspective view of the prompt; you can think of this portion as perspective-aware text-to-2D reference sheet generation. The second repository is MVDream-threestudio, which can be thought of as a text-to-3D model: it generates a neural radiance field (NeRF) which can later be converted to a mesh. In the second half, I'll be using the 3D models that I generated to construct an interior living space for the main character of a web novel I wrote. Without further ado, let's get started.

The portion of MVDream that I was really interested in was the text-to-3D capability, but it's important to first see the underlying foundation of the 2D version. As mentioned before, the repository for the 2D implementation can be found here under the bytedance organization. If you look at the paper or the repository, the developers mention that they fine-tuned their MVDream model from the Stable Diffusion 2.1 base model at 256x256 resolution, but with four images per sample; more on this later. The reason I felt MVDream was pretty promising and warranted a video/deep dive is that the authors correctly identified the issues with generating multiple views or reference sheets of an object, namely the multi-face Janus problem and content drift. The Janus problem is where an object has view inconsistency: you can see that this horse has a head in the back and a head in the front, so it's kind of an ouroboros situation. It's named after the Roman god Janus, who is often depicted with two faces. Fun fact: Janus is the god that represents the transition between one state and another, such as the past and the future, or in this case the anterior and posterior views. The second issue, content drift, is a situation in which the prompt and the outputs start to diverge in their understanding or meld together. In figure 1 here, the prompt asks for a photo of a plate of fried chicken and waffles; you have fried chicken at the top, but it slowly melts together and becomes a waffle, which is what you don't want.

The paper is pretty thorough, and the authors even go over the dataset preparation in the appendix, right after the citations. We can see that they're using renders of 3D objects from the public Objaverse dataset. Before we jump into the methodology of how the authors solve the two problems mentioned before, namely the multi-face Janus problem and content drift, I want to give some context on the training data. To put it simply, each sample is four monocular 2D RGB images of the same object, each taken from a different camera angle. The initial elevation and camera field of view are randomized within these ranges: elevation from 0 to 30 degrees and camera field of view from 15 to 60 degrees. However, there are some extra camera parameters here, and I wanted to point that out because this information is also passed in and initialized in the MVDream network as an embedding. Using the parameters selected from these ranges, they render 32 perspective viewpoints of a specific object, so 360 degrees divided by 32 gives an incremental angle of 11.25 degrees, or pi/16 radians. The authors also mention that they start rendering from the front view, directly facing you, where the azimuth is 0 degrees. If you don't know what azimuth is, imagine that you're standing in the center of a large sundial: when you rotate your body to face another number on the dial, the amount of rotation is the azimuth angle. Say I'm standing in the center facing the number 12 and then rotate to face the number 3; looking at the hands of a clock or sundial, you can see that I moved 90 degrees in that direction. Given the initial view, whatever it may be, say 60 degrees, the remaining three views are selected orthogonally to it, rotated by 90 degrees each time. So a training set for one object with an initial azimuth of 45 degrees would also have views at 135, 225, and 315 degrees.
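To make that camera sampling concrete, here is a minimal Python sketch of how such a view selection could look. This is my own illustration of the ranges described above, not the repository's actual rendering code:

import numpy as np

rng = np.random.default_rng(0)

# Randomized per object, within the ranges quoted from the appendix
elevation_deg = rng.uniform(0, 30)   # elevation in [0, 30] degrees
fov_deg = rng.uniform(15, 60)        # camera field of view in [15, 60] degrees

# 32 evenly spaced azimuths around the object: 360 / 32 = 11.25 degrees apart
azimuths = np.arange(32) * (360.0 / 32)

# A training sample uses four orthogonal views (90 degrees apart) from a random start
start = rng.choice(azimuths)
four_views = (start + np.array([0.0, 90.0, 180.0, 270.0])) % 360.0

print(elevation_deg, fov_deg, four_views)  # e.g. a start of 45 gives 45, 135, 225, 315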
Now that we have some context behind the training data, we can better understand how the multi-view diffusion model is distinct from other text-to-3D models. In figure 3, the authors point out two key implementations. The first one is the addition of four perspective views as training data; to account for this change in input data format, the authors changed the existing 2D self-attention layer to a 3D self-attention layer. If we go down here, this is where they say that. What this basically means is that when processing an object with multiple views, each perspective can consider information from the other perspectives to generate a more consistent and cohesive latent space. They also say that after making this change, they were able to generate rather consistent images even when the view gap is very large, which is very promising. The implementation of 3D self-attention is in the Python file named attention.py.
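Before looking at the file itself, here is a minimal sketch of the general reshaping trick behind 3D (multi-view) self-attention: the view dimension is folded into the token dimension so attention can span all views at once. This is my own simplified illustration, not the repository's actual code:

import torch
import torch.nn as nn

class MultiViewSelfAttention(nn.Module):
    # Simplified stand-in: one attention layer shared across all views of an object.
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor, num_frames: int) -> torch.Tensor:
        # x: (batch * num_frames, tokens, dim) -- latent tokens for each rendered view
        bf, t, d = x.shape
        b = bf // num_frames
        # Fold the view axis into the token axis so every view attends to every other view
        x = x.reshape(b, num_frames * t, d)
        out, _ = self.attn(x, x, x)
        # Restore the original (batch * views, tokens, dim) layout
        return out.reshape(bf, t, d)

x = torch.randn(4 * 4, 256, 320)             # 4 objects x 4 views, 256 tokens, dim 320
y = MultiViewSelfAttention(320)(x, num_frames=4)
print(y.shape)                               # torch.Size([16, 256, 320])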
That file lives under the modules directory, and in it you can see that there is a BasicTransformerBlock3D class that extends BasicTransformerBlock, so we can take a look at that. In the base class, you'll notice that there isn't anything denoting the number of perspective views, but if you go to the 3D block that extends it, you can see that there is a parameter, or argument, called num_frames. Basically, num_frames is where the number of different perspective-view renders is specified. You can also see that it has a residual connection, as attention one and attention two are added to each other. If we go look at figure 2, what I just went over was the change of self-attention from 2D to 3D; the second implementation is adding camera embeddings for each view, and you can see here that the camera embedding is added to the timestep embedding. If you don't know what a timestep or time embedding means, it's just the current step of the diffusion process encoded as a tensor; when doing diffusion generation you're always setting the step count anyway. In the code, the final embedding is put together in openaimodel.py under the diffusion modules directory. If you go to the MultiViewUNetModel class, you can see the initialization of the time embedding, self.time_embed, and the camera embedding as well. If we jump down to the forward pass of the same class, you can see that it starts off with the embedding being the time embedding; further down, a self.label_emb embedding is added to it. I'm not exactly sure what this is, but I believe it's only added if it exists. Finally, the camera embedding is added to the final embedding right here, so whatever the paper said was happening is definitely happening in the code. Just for reference, regarding the camera embedding, I assume from the dataset preparation in the appendix that it's the camera parameters mentioned in the paper, but I'm not 100% sure, so I dug a little deeper. There is a camera_utils Python file which does seem to take in the parameters mentioned before, such as elevation and azimuth; it normalizes the camera matrix and gives you a final camera-to-world matrix via create_camera_to_world_matrix. So there we go.
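Here is a minimal sketch of that conditioning path: build a camera-to-world matrix from elevation and azimuth, flatten it, run it through a small MLP, and add the result to the timestep embedding. This is my own schematic of the idea described above, not the repository's exact implementation, and the helper names and conventions are hypothetical:

import math
import torch
import torch.nn as nn

def camera_to_world(elevation_deg: float, azimuth_deg: float, radius: float = 1.0) -> torch.Tensor:
    # Hypothetical helper: place a camera on a sphere and point it at the origin.
    el, az = math.radians(elevation_deg), math.radians(azimuth_deg)
    eye = torch.tensor([radius * math.cos(el) * math.cos(az),
                        radius * math.cos(el) * math.sin(az),
                        radius * math.sin(el)])
    forward = -eye / eye.norm()                                    # look at the origin
    right = torch.linalg.cross(forward, torch.tensor([0.0, 0.0, 1.0]))
    right = right / right.norm()
    up = torch.linalg.cross(right, forward)
    c2w = torch.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, -forward, eye
    return c2w

time_embed = nn.Sequential(nn.Linear(320, 1280), nn.SiLU(), nn.Linear(1280, 1280))
camera_embed = nn.Sequential(nn.Linear(16, 1280), nn.SiLU(), nn.Linear(1280, 1280))

t_emb = torch.randn(4, 320)                             # stand-in for sinusoidal timestep features
cams = torch.stack([camera_to_world(15, az).flatten()   # four orthogonal views of one object
                    for az in (45, 135, 225, 315)])
emb = time_embed(t_emb) + camera_embed(cams)            # camera embedding added to time embedding
print(emb.shape)                                        # torch.Size([4, 1280])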
Now that the theory is out of the way, let's get to utilizing this repository. I'll be using Ubuntu on Windows Subsystem for Linux 2 to guarantee the least amount of compatibility issues; I have tried both the 20.04 and 22.04 versions, and I'm using 20.04 here. The installation for this part is pretty straightforward: just git clone the repository. I've already done so, so you can see I have the MVDream directory; let's go there. Once you're in, you can create a virtual Python environment with python -m venv venv. I've already done so, so I won't repeat it; you can ls and see that I already have it. To activate your virtual environment, type source venv/bin/activate (or whatever your directory is named), and you can see that my virtual environment is now active. All that's left is to install the requirements, and you should be good to go. Once all your dependencies are installed, you can use it either by running the command from the README or by starting up a gradio frontend server.

Let's use the gradio server just to see what we're doing visually. The script has finished running, and it tells us it's serving on localhost port 7860, so let's head there. Once you're on port 7860, you can see there is a simple UI to use, with some examples pre-loaded, so let's select a few and see what to expect: an astronaut riding a horse, an Earth, a cute cat; this one doesn't look that great, but this one is actually surprisingly interesting — you get a ship and it's pretty good. One thing I want to note is that I think the first row of images is the coarse version and the second row is the refined version. I also noticed that it removes the ground plane: the pedestal here is removed, this one didn't have one, and this one had the pedestal removed as well. I'm not exactly sure what the difference is, because I looked in the code and the two passes seemed pretty similar.

Now let's try it for ourselves. I'm going to keep elevation and azimuth at their defaults, and I'm not going to use a negative prompt because it doesn't seem necessary. This is one I tried already: a sci-fi glass panel, a pretty interesting design. I didn't specify a color, so I can add something like "black color scheme" and see what happens. Right now I have this very flat image; I could probably do some perspective warping in Photoshop and get something usable as a texture in 3D. After changing to a black color scheme, I have something like this. You can see that the view is preserved, if that makes sense: here you could call it the anterior or posterior view depending on how you think of it, and from the back you get a flipped image, so it seems pretty consistent with what I'm working with. Next, let's try a different prompt and make a more complex prop: a Japanese kitsune mask, one of those fox masks you've probably seen at matsuri festivals. We get something like this; it doesn't look that great, but it does seem consistent and in the vein of Japanese culture. What if I want to generate a uniform that goes with it? I'll put in a shrine maiden uniform prompt; we have a figure in the first row, but it gets removed and becomes this fashion statement right here. Next, let's try a human, so I'll prompt for just a young man. Not bad — actually pretty good for fast concepting, and you could probably use this for reference image or reference sheet generation. Lastly, I want to try one more thing: a sleek, futuristic sci-fi vehicle. We get something interesting; the coarse version looked pretty strange, but the refined one looks pretty nice. There's a lot you can do here.

Now that we understand this interstitial method of generating contextually aware perspective views from a text prompt, we can get into the really exciting part, which is the text-to-3D section.
The theory for the text-to-3D part is glossed over fairly quickly in the paper, in section 3.2. At a high level, what seems to be happening is that the 2D MVDream model shown previously is first used to generate 2D reference images and then used in tandem with an adjusted score distillation sampling (SDS) method, which was first introduced in the DreamFusion paper by Poole et al. In addition, they say they're using the camera parameters from before instead of direction-annotated prompts; I assume this means that where DreamFusion prompts with "front view", "side view", "posterior view", and so on, MVDream passes the camera parameters in directly. Anyway, the four perspective-view outputs from the 2D MVDream model, along with the camera parameters, are processed by guidance modules to create a neural radiance field, or NeRF: if you have multiple different viewpoints, you can infer the final 3D scene. As for what a NeRF is, it's a neural network model that learns how light rays interact with surfaces in a volumetric representation. This is different from a 3D object, which is a geometric representation; if you have a NeRF, you can get a 3D object using various transformation techniques. In this case, if we go down a little to section 4.2, the authors mention that they used a multi-resolution hash grid to transform the implicit volume of the NeRF into a 3D .obj file. Before I finish up on the paper: in the same section, the authors mention that they polled users on how well the MVDream model performed, and on average 78% of users preferred MVDream over the alternatives, which gives us an idea of how promising this model actually is. However, why speculate? Let's try it out for ourselves.
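Before moving on to the setup, here is a schematic sketch of the score distillation idea described above: render the current NeRF, add noise, ask the frozen multi-view diffusion model to predict that noise, and push the renders toward what the model expects. This is my own simplified illustration of SDS, not threestudio's implementation; diffusion_model, renderer, and text_embedding are hypothetical stand-ins:

import torch
import torch.nn.functional as F

def sds_loss(diffusion_model, renderer, text_embedding, cameras, alphas_cumprod):
    # Render the current NeRF from the sampled camera viewpoints
    x0 = renderer(cameras)                                  # (num_views, C, H, W)

    # Pick a random diffusion timestep and add the matching amount of noise
    t = torch.randint(20, 980, (x0.shape[0],), device=x0.device)
    noise = torch.randn_like(x0)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * noise

    # The frozen multi-view diffusion model predicts the noise, conditioned on the
    # prompt embedding and the camera parameters
    with torch.no_grad():
        noise_pred = diffusion_model(x_t, t, text_embedding, cameras)

    # The SDS gradient is w(t) * (noise_pred - noise); the detached "target" trick
    # below turns that gradient into a plain MSE loss on the rendered images
    w = 1.0 - a_t
    grad = w * (noise_pred - noise)
    target = (x0 - grad).detach()
    return 0.5 * F.mse_loss(x0, target, reduction="sum") / x0.shape[0]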
This time, the installation of the MVDream-threestudio repository is a bit more convoluted. Once again, I'm using Ubuntu on Windows Subsystem for Linux 2. You can start by git cloning the repository: copy the HTTPS URL and clone it. I've already done so; if you ls, you can see I've cloned it in here, so just cd into it. Once you're in, you have to do one more git clone: if you scroll all the way down the README to where it says install MVDream, there's a list of commands, but you don't need the whole block — just copy the first line. Let's try it here: it says it already exists, and if we go into extern you can see everything is there. The reason I'm not running the pip install -e step here is that I'll be doing that in the Dockerfile. Speaking of the Dockerfile and the docker-compose file, that is what I'm going to use to set up the environment. If you scroll down to the installation markdown file, it gives you some steps: you can either do the vanilla install or use Docker, and Docker is what I ended up going with.

If you don't have Docker Engine installed for Windows Subsystem for Linux, you'll have to install it. I'm using Ubuntu, and if you want to follow along, click this second link right here; it takes you to the Docker documentation. First, start from "install using the apt repository": copy that, come back to your terminal, go up one level, and paste it in. I already have the repository added, which is why it looks like that. Then run sudo apt-get update, and once that's done, the next step is to run the sudo apt-get install command for docker-ce and the related packages: copy it, come back, and run it. I've already installed it, so you can see it's not installing anything new.

Going back to the installation markdown file: we've installed Docker Engine. You can create a docker group if you want; you don't have to, and if it's too much of a pain you can skip it. Next it says to install the NVIDIA Container Toolkit, but we can do step four first because we're already in the terminal: it tells you to enable systemd, and the linked page shows how. The reason you want this is to enable systemctl commands on Windows Subsystem for Linux. To do that, create the file shown there: I'll be using vim, so type vim and the path, and it opens up. I've already added this in, so go ahead and add the same thing written here, and once you have that, quit out with :wq. Next, restart your Ubuntu system for good measure — just close it and reopen it. I'm not going to, because mine is working; all you have to do to check is run docker run hello-world, and yes: "Hello from Docker! This message shows that your installation appears to be working correctly." So Docker is working in our Ubuntu on Windows right now.

Next up, remember the step we skipped or rearranged, step three: we have to install the NVIDIA Container Toolkit because we want to enable CUDA inside our Docker containers. Open that page up, and same thing — if you want to install with apt, this is what I suggest: copy it, go back to Ubuntu, paste it in (sure, overwrite, why not), and it's done. Next we install the NVIDIA Container Toolkit packages: copy, paste, done. Then we need to configure it, so there are two commands to run: copy the first one, run it, done; it then recommends a restart, so grab the restart command, paste it, done. Let's check again with docker run hello-world — great, nothing's broken.

Now we have to edit some of the files inside the docker directory, namely docker-compose.yaml and the Dockerfile. These are changes I made on my own because I was unable to do certain things with the container that I wanted, such as making sure that I'm not re-downloading the model files every time the code runs; I wanted to create a bind mount so those volumes store the downloaded models. I'm going to use Visual Studio Code to edit my Dockerfile and docker-compose files. If you don't know how to get to the location of Ubuntu's filesystem on your computer, go to your terminal and type explorer.exe . — this opens the current directory in a File Explorer window. Copy the path from the address bar, open Visual Studio Code, create a new window, open folder, paste in where I was before, and there we go. What I said I wanted to change are the Docker files: docker-compose.yaml and the Dockerfile.
First off, you'll notice there's a part on lines 14 and 15 with the environment variables TORCH_CUDA_ARCH_LIST and TCNN_CUDA_ARCHITECTURES. I've set mine to 8.6 and 86; this is for the RTX 3000 series. If you'd like to understand what values to put here, visit this website. I said I'm using a GeForce RTX 3090, so I go down to the fourth bar, open it up, and you can see on the left side that a 3000-series card is 8.6 and a 4000-series card is 8.9. The secondary changes I made to the Dockerfile run from line 64 down to 87: I was running into some permission errors, and I also wanted to make sure the cache of the model download was being saved. I made the corresponding changes in the docker-compose file, adding the ../huggingface cache as the bind mount. I've made my Docker files available on my own repository: go to kasukanra's repositories and then to mvdream_docker; the Dockerfile and the compose file are both there, so you can copy the same settings I'm using and just account for the proper GPU on your machine. Here I have 8.9 because I actually have three workstations, and one of them, dedicated to training machine learning models, has an RTX 4090 in it.

We're back in Ubuntu now; everything should be downloaded and configured correctly, and all we have to do is build our Docker image. I'm going to clear everything, and before anything else I'm going to open a tmux session. This lets you have two windows open, and you'll see why; if you're not comfortable with this, you can just open a second Ubuntu window, no shame in that, but I'll be using tmux: tmux new -s followed by your session name, which I'll call mvdream. If you want a split-pane window to work with, the command is Ctrl+B and then %, which brings up a prompt in the bottom left; I want a horizontal split, so I press H, and then I have my second pane on the right. To move between panes, press Ctrl+B and then the left or right arrow — very easy. In both panes I move to the docker directory. First I need to build the image, which is pretty simple: type docker compose build and it starts building. This takes a very long time; it's fast for me because it's all cached already, but otherwise you'll most likely have to wait quite a while, maybe 30 minutes at least, to download all the dependencies and build the image. Once that's done, you can start the container with docker compose up. However, once the container is running, how do you get into it? That's why I opened a second pane on the right. To switch over, remember Ctrl+B and the right arrow. To enter the running container, make sure you're also in the same docker directory where docker compose was run, so cd docker; there are my Dockerfile and compose file. Type docker compose exec, which gets you into the existing container, then the container's name, which is threestudio, and then bash — and now I'm in.
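As a quick sanity check once you're inside the container (my own habit, not a step from the video), you can confirm that PyTorch actually sees the GPU before kicking off a long run:

import torch

# Should print True and the name of your card, e.g. an RTX 3090 or 4090
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no GPU visible")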
If you have a discerning eye and look inside this directory, you'll see that there's actually a gradio server file included, so I'm going to start it up and show you why I don't use it: python gradio_app.py launch. This is the command given on the threestudio website — here is the threestudio repository, which is what MVDream-threestudio is built on; I'm just here because there's more information. They start the gradio server with python gradio_app.py launch; however, since we're running in a Docker container, we have to add the --listen flag, which means the port will also be exposed on 0.0.0.0. Let's run that. Now it's available, so let me go to port 7860, and you can see we have this threestudio text-to-3D web demo. If we go down and select a model, there's SJC, but interestingly enough there isn't an MVDream config here, and the SJC config does not resemble the config that MVDream provides. Let's jump into Visual Studio Code: if we go to the configs directory and then to mvdream-sd21.yaml, this is what the configuration file should look like — it has the hash grid we mentioned from the paper. So I don't think the gradio GUI is the way to go here, and I definitely did not use it; next I'll show you how I used it from the command line.

One note of warning while we're here: if you run into out-of-memory (OOM) issues with your VRAM, you can come into your config file, depending on which one you're using. I cannot use the shading one because it needs a more powerful GPU; if you have a 24 GB VRAM GPU, you'll probably just use the non-shading one. If it's not working or running out of memory, search for "rescale" in the config. Here it's set at 0.5; if that doesn't work, you can go lower, like 0.2. I've gone as low as 0.15, because my GPU had trouble creating NeRFs for some prompts but was fine on others — I wasn't sure why, but once I lowered the rescale value to around 0.15, everything worked fine.

As we saw, the gradio server doesn't have the necessary config, so we'll be using the command line to generate the neural radiance field. The commands are executed in two stages: first you generate the NeRF with this command here, and then, once your NeRF is created, you use it to export a mesh. I've already lowered my rescale value since I encountered out-of-VRAM issues, and I'm also electing to use the vanilla mvdream-sd21.yaml config file.
That's as opposed to the shading version, which requires an A100 GPU in most cases. As I said before, this is the first stage of the command. You might notice that there isn't a negative prompt; that's because there is already a negative prompt in the config file. The default number of training steps is 10,000 — let me look that up — yes, trainer max_steps is 10,000 — and this takes around 30 to 45 minutes, so let me switch over to my other workstation, which also runs Ubuntu. On the very first run it will download all the models necessary — I think it's only one model — but that's only done once, and it will be cached because I set the docker-compose file and the Dockerfile to store it in the Hugging Face cache; you can see "loading model from cache file" here. Excellent, so we'll just let this run. You can see the 4090, and it starts running down here; this is the number of steps, and it stops at 10,000, so you can probably expect iteration speeds around this range on a 4090, just for sanity's sake.

Once the script is done running, you can check the results in the outputs directory. However, you don't want to check from inside the container; you want to check from outside it, because the two are mapped with a bind mount. If that doesn't make sense to you, don't worry — it's just Docker things. Since I'm in tmux, let me create a new window pane, and you can see that I'm not in the Docker container right now; I'm outside, in MVDream-threestudio. cd into outputs, then into mvdream, and you can see I have this many runs because I was playing around with a lot of different prompts. Once again, if you want to find where this is in your File Explorer, run explorer.exe . and it opens the file explorer so you can see what's going on. Let's look for the mask — I'm not going to use this one, it was just for demo purposes, so I'll delete it; I have an older version, so click on that. You'll have a checkpoint here called the last checkpoint, which will be used for the mesh export later. First, though, let's check what we have in the save directory: click on that — you won't have this export folder yet, because that's for the mesh — and in this folder, at the very end, there's a video, so let's click on it. This is the video of the NeRF. Not bad, right?

Now that we have our neural radiance field, we want to convert it to a mesh. The export mesh commands are found in the threestudio project repository, under "export meshes"; you have a slew of options to choose from, but simple is best, and you can just run this one right here. There are a few gotchas. First, make sure you input the folder paths correctly — it gets pretty complicated because you'll be using an absolute path. Lastly, you'll have to add one more thing after system.exporter_type=mesh-exporter: if you go to the documentation, or better, the installation markdown file, there's a warning that the current Dockerfile will cause errors when using the OpenGL-based rasterizer, so you need to set the exporter's context_type=cuda when exporting your mesh. This is an Obsidian file I created for this project; it has all the commands I need, and I also write down notes in it. So let's look at the export script for the mesh.
It's pretty nasty, but it's not too complicated. Remember, all you have to do is find the absolute path and make sure you point at the config's parsed.yaml — that's the first step — then make sure you're resuming from the last .ckpt file, and lastly you need the exporter context_type=cuda setting. Let me copy that and head over to my other workstation. Here we are; let's move back to the container (Ctrl+B, up arrow) and paste in what we had in the Obsidian file. You can see it's doing its thing: it's using xatlas to perform UV unwrapping, which basically means projecting the texture onto a UV map, then it exports the textures, and then it's done — it says the exported asset was saved to outputs. It's in the save directory, which is where the video was as well, so we can go back; remember the video is all the way down here, and now we have this export folder. Inside it we have an .obj file, which is our 3D object, a material (.mtl) file, and the texture.

Let's open this object in a 3D package. I'll be using Blender because it's free, so you can follow along. In Blender, go to File, Import, Wavefront (.obj); this is where my files are, so go to save, then export, click on the .obj, and it comes in. If you take a look at the mesh quality, it's not superb; however, if you shade smooth it, it looks a little better. Let's rotate it so it's facing us and move it up — actually I prefer it like that — and set the origin to geometry. Then we can turn on the render settings; use something you're comfortable with — I'm going to use Cycles with the GPU. Let's take a look: this is what it looks like. Let's change the environment, make the background completely black, and then I'll add a light, but first I'll add a plane. Just do a one-point lighting setup, then maybe turn up the intensity of the light — 500 is too strong, 250, I'll just go with 150, I guess. You can see that it's pretty good, actually; it reminds me of a voxel mesh. This could also be a great starting mesh that you could DynaMesh in ZBrush if you want. Let's look from the back — sure, as a placeholder mesh this could be very, very useful.
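If you'd rather script those Blender steps (import the OBJ, shade smooth, set the origin, switch to Cycles on the GPU), here is a minimal sketch using Blender's Python API. This is my own convenience snippet rather than something from the video; the file path is a placeholder, and the import operator differs on older Blender versions (bpy.ops.import_scene.obj before 3.2):

import bpy

# Import the exported OBJ (adjust the path to your own outputs/save/export folder)
bpy.ops.wm.obj_import(filepath="/path/to/outputs/save/export/model.obj")

obj = bpy.context.selected_objects[0]
bpy.context.view_layer.objects.active = obj

# Shade smooth and set the origin to the geometry's center
bpy.ops.object.shade_smooth()
bpy.ops.object.origin_set(type='ORIGIN_GEOMETRY', center='MEDIAN')

# Render with Cycles on the GPU
scene = bpy.context.scene
scene.render.engine = 'CYCLES'
scene.cycles.device = 'GPU'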
As an aside, if you're not satisfied with the mesh quality, you can try a second-stage refinement. In section 4.2 of the paper, the authors say that Magic3D performs relatively better by using DMTet for second-stage refinement. If we look at the GitHub issues for the MVDream-threestudio repository, there's someone asking about recommended export settings, and one of the maintainers replied that their experiments mainly focused on NeRF generation, that you can follow these steps, and that you can also use a second-stage refinement — echoing the sentiment in the paper. If you're interested in running one, follow what the maintainer said: click through to threestudio and there should be a refinement portion in the markdown. You can think of the coarse-stage NeRF as the first step that we did; the second stage would be, instead of exporting the mesh, running a second-stage refinement as mentioned in the paper — it's Magic3D-style, using the magic3d-refine-sd.yaml file. If you're curious about running this, just copy the command and make sure all your file paths are consistent.

Of course, I was curious, and I did try a second-stage refinement on the original kitsune mask. If we head over to save and look at the video, we get something like this: the design has completely changed. It gave me a completely different object, because it refined on the same prompt but with a different model and configuration. After seeing that, I chose not to export it as a mesh, because I didn't feel it was any better.

You can see that MVDream performs pretty well for concept sheet generation or a rough 3D object blockout. I ran MVDream-threestudio on a few more varied prompts, and here are some of those objects. Go to render mode: here we have a Gundam, which is actually pretty nice looking — if you zoom in it's all broken, but from far away it's not bad. Then we have a chibi vampire; it's nice from far away, and the silhouette blockout is the most important part of design, so this is great. Next we have some clothes: this is the shrine maiden outfit from the 2D generation — I just ran the same prompt in 3D — and it's interesting; you could definitely do a lot with this if you wanted to. I have some other props as well, such as this desk, a feathered plumage that could be a quill or something, and a bone knife. If you think about it, especially if you have no 3D experience, doing a 3D blockout will take you some time; even if you're very experienced, it could be 15 minutes or at least 40 minutes, so this is very quick generation, assuming you have a powerful GPU. Lastly, we have this bed; it's pretty big, so maybe I'll just scale it down.

Since we have 3D objects, we can do some cool things now, such as rendering them out and using the renders as ControlNet images to direct our brainstorming process. There's one in particular I want to experiment with, which is this four-wheeled futuristic vehicle I got out of MVDream. Let me set up the lighting: turn off this global area light and turn on the area light for the car. I think I already have a camera for this — it's this one; no, not that one, this one. One thing to note is that I've set up two scenes with different camera resolutions. This one is for a vertical aspect ratio, set to dimensions that SDXL likes, 896 x 1152. If I want to switch to a more horizontal aspect ratio, I've already set up a scene for that too: go up to the scenes dropdown, double-click on that scene, and here I am; I just have to turn off some of the objects, so back in the view layer I turn everything off and enable just the vehicle. There we go, and we can go ahead and render this out.
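If you want to set those per-scene render resolutions from a script rather than by hand, here is a small sketch with Blender's Python API; the scene names are hypothetical, and the resolutions are the SDXL-friendly ones used in this project (896 x 1152 and 1536 x 640):

import bpy

# Hypothetical scene names; use whatever your vertical / horizontal scenes are called
presets = {
    "Scene_Vertical": (896, 1152),    # portrait, SDXL-friendly
    "Scene_Horizontal": (1536, 640),  # wide 21:9-style, SDXL-friendly
}

for name, (width, height) in presets.items():
    scene = bpy.data.scenes.get(name)
    if scene is None:
        continue  # skip scenes that don't exist in this .blend file
    scene.render.resolution_x = width
    scene.render.resolution_y = height
    scene.render.resolution_percentage = 100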
Here I am in my ComfyUI setup. I'm using Searge-SDXL, the latest version, 4.2, just for reference. Let me load in the workflow I'm going to be using. I've set up my prompt already, and I'm using the SDXL model that I trained in my previous video. The prompt is: an elegant, futuristic four-wheel vehicle, silver metal finish, intricate design, and I've changed the aspect ratio preset to 21:9, so it's 1536 x 640. Now all I have to do is come down to the ControlNet / revision source windows and replace the input, so let me drag and drop my render in — there it is — and let's try it with Canny for now. This is the first generation; it's not perfect, but it is looking pretty interesting, so maybe we have to change some of the ControlNet settings. I turned the Canny strength down to around 0.55 or 0.6 and turned down the end percentage so it wasn't controlling too much of the design, and I got this, which looks pretty nice; I'll generate one more for good measure. Here I am in XnView MP, my image browser, and we can take a look at some of the unique designs I received. These are all very cool: I like this one, it's a dune buggy sort of design; this one is nice; this one kind of looks like a tractor or lawnmower; a snowmobile; kind of a NASCAR thing going on; Formula 1 right here; like a Porsche; and here we go — this is the one I think would be very nice for a product render or something like that.

Now I'm going to start showing how and where MVDream and its 3D objects can help in the art process. You might have noticed a trend already with the objects I was generating: my ultimate goal is to design an apartment, room, or living space for the main character of an imaginary story. I did some preliminary generations for the character, and these are what I ended up with. I really like this futuristic but simple look, and there's one I liked in particular — yeah, this one — so I think I'm going to base the character design around this version, but we'll see how the room design goes first.

Regarding the background of the story and the character: I wrote a web novel a long time ago that I didn't really share with anyone except some friends; for reference, that was maybe seven years or so ago. The setting is a post-apocalyptic world where countries are cut off from each other because of mutated radiation; there was a World War III, and because of the fallout there is miasma. The story follows a young woman who is an informant in the underground, or dark side, of the new world. If you've read Toaru Majutsu no Index, it's pretty similar to the underworld battles there, like GROUP and SCHOOL; I really resonated with those characters, and I don't often see stories that focus on the anti-hero, or people who do corrupt things for corrupt reasons just because they can, so I wanted to write a story along those lines.

Let's get back to the objective, which is interior room design for this character. I generated some potential room setups for her, and these are the ones that really resonated with me, so let me bring them up. It's very standard sci-fi; I wasn't sure if I wanted a visible view, but one thing I like is this wide curved display. The chair is not very futuristic — it's still basically a gaming chair. This is the keyframe I think I'm going to base the room around: it's very minimalist and clean, which I really enjoy, and I think it fits the main character's personality, but I generated a few more just to see what was out there. You can see that diffusion models aren't very imaginative with the chair: if you type in "chair", it just gives you a chair; it can't give you anything really insane looking unless you specifically prompt for it.
I did end up with some other chair designs that I can show here, and there are two in particular that I really enjoyed. The first one is very cool; it reminds me of a motorcycle helmet, or the full-dive gear they put on in Sword Art Online. There's another one I like as well; this one feels more like you're sitting in a cockpit or a racing car, reclining backwards. I really like this design too, but I don't think I can explore it in the concepting phase because it's just too complex to model.

Before I go any further, I want to first generate a floor texture. If we look at the floor in all of these images, you can see it's a shiny, slightly reflective floor. I'd also like to add some texture to it by making it marble, so let's head into ComfyUI to set that up. Here I am in ComfyUI, and I've input my prompt: it's pretty standard, my trigger word plus "marble floor texture". All I'm looking for is a square image, 1024 x 1024, a 1K image, and I'll upscale it right after. Let's see what it gives me. Here's the first texture — it looks very nice already. We'll generate a few more: the second one is a little bolder, and the third is a more muted version. Now let's upscale these and then turn them into the other maps — given a base color map, I'll show how to turn it into a normal map, height map, and so on. This is a very simple upscale workflow: all I need is a load image node and a load upscale model node (I'm using UltraMix Balanced), then a save image node; queue the prompt, and there you have it, the 4K image, and you can see it looks very high resolution.

Let's take this into the next piece of software, Adobe Substance 3D Sampler, which used to be known as Substance Alchemist. This is paid software; if you don't have it, you could use something else called DeepBump, but that doesn't give you a roughness map, which is why I prefer Sampler. It's pretty straightforward to use, and it uses AI to convert a color map into all the other maps you need. Click create new, drag your image over, pick "image to material (AI powered)", and import. Once it's loaded, you usually don't have to do much, but if you want to see what your 2D texture maps look like, click on the 2D view and you can inspect the normal, roughness, metallic, height, and so on. If you don't know how to read these maps, it's pretty simple: in the roughness map, darker values mean less rough and brighter values mean more rough; less rough means shinier, and marble is generally kind of shiny. So in the image-to-material layer stack, I'll adjust the roughness: invert it so it's a little darker, then raise it up a little, so it's a bit shinier. If I want to make any other big changes, I could make the normal map stronger.
If I change the large-detail slider, you can see it happening right there; medium, micro — maybe a little micro, a little on large — and pump up the ambient occlusion a bit. That looks pretty nice. Once you're happy with what you have, go to the share button — that's how you export your textures — click export as; you can go 4K if you want, but since this is just for visualization, 2K is fine for me. I'll call this marble_01 and export.

Here I am back in Blender. I already have all my objects from MVDream in here, remember, and we want to create the marble floor of the room. Let's create a plane, scale it up, and hit Ctrl+A to apply all the transforms. Then I'll pull out a second pane on the right side and switch it to the shader editor. To create a new material, click new, then create a fake user by clicking the shield icon so the material isn't lost when it's unused; call it marble_floor_01. Next, you need the Node Wrangler add-on enabled: it ships with Blender, so just search for it in the add-ons and make sure the checkbox is on. Once you have that, use the shortcut Ctrl+Shift+T, which brings up a file window; browse to where your textures were exported — mine are under miscellaneous/textures/marble_01 — press A to select all of them, and click "Principled Texture Setup". It wires up everything for you, including the displacement. Let's go to rendered view and see what it looks like — my lights are off, so I'll turn them on. That looks pretty good. Maybe I won't use the lights for now and I'll just use scene lights and scene world — turn that off just to see what's going on. Some of the details might be a little strong, so I'll lower the scale of the displacement; that's pretty low now and should be fine. I think this is a good start.

Now that I have a preliminary floor blocked in, I want to open the reference image. Usually you'd use something like PureRef, but you can also view it in Blender: drag up another pane, switch it to the image editor, and you can see I've already loaded the image in, so I'll just redo the process — I think it was this one. Now I have it here and can compare. Maybe the marble floor is not exactly what I expected; perhaps I should add more of a metallic flair to it. So I'm going to generate another, metallic texture using the same method, with my own stylized, fine-tuned model. I got metal textures like these, and I'll do the same thing in Substance 3D Sampler to convert them into maps I can actually use. The conversion is finished, and I'll show what some of the metal textures look like: this is the first one, then metal two, and lastly metal three, which is tinted a little more toward blue; however, I can always change the tint using the base color and some kind of hue-shift node, so that's not a problem. Here's my PureRef file with the image copied over — I was using the image editor, but I might need that space later, so I usually keep my reference over here. You can see that the focal point is this desk and probably also this chair, and I will have a character sitting in it.
Let me bring up the objects I have right now, which are right here. I can quickly select the chair and deselect everything else, pull this collection outside, and turn everything else off, so now I have just my chair. I can decide to either work on the table or block in the walls. I'm also not so sold on the design of this chair — I think I already pointed out the chair that I really liked. Before I block in the walls and the ceiling, I'm going to show another technique. There is software on the market right now that can turn a 2D image into a 3D object; it's developed by a company called Common Sense Machines. The methodology is closed source, and they do offer a free option, but I would say the Maker or Creative Pro tiers will give you higher quality on your mesh. Still, for the purpose of showing that there's another option for getting 3D objects from 2D images, you can try this. It takes you to the Cube app, and I've already generated the chair that I wanted, but I'll show how to use it. It's very simple: click on upload image — you can either generate through Discord if you want (I've never tried that), or drag and drop or upload an image right here — so let's find another chair that I liked. My image has been uploaded, and if you click on it, it starts a preview to show you what it will look like; you can use the newest version (1.1) if you want. Generate preview mesh gives you a very rudimentary mesh after a while, and if you like what you have, you can click refine mesh to get the final mesh. Once you've finished refining, you can download it over here — click on obj for the refined mesh (I've already done it, so I don't have to) — and it gives you a zip file; unzip it and import the object. I brought in the chair and put it in front of the MVDream chair, and let's go to rendered view just to see what we're working with. What I could do — I'm not sure if I will — is take apart some parts of these meshes and, if you can imagine it, photobash the two together, but we'll see how that goes. First I want to build out more of the scene, so I think I'll go with the walls next.

I'm going to stop the video here, but don't worry, I plan on making this worldbuilding with 3D and AI a multi-part series. In this video I went over the MVDream theory, how to install it, and some basic blue-sky concepting techniques with 3D models and textures. If I recorded more of the 3D process, it would take more time to release this video, and the runtime would also be two, three, or four hours, so I decided to release just the first part in case people wanted to explore MVDream but were a little intimidated by the setup. Thanks again for watching, and thanks to all the patrons. See you next video. Bye.
Info
Channel: kasukanra
Views: 1,001
Keywords: ai art, stable diffusion, digital art, machine learning, midjourney, photoshop, character art, concept art, character design, textual inversion, automatic1111, webui, how to make ai art, kasukanra, workflow, dreambooth, style, dreambooth style, training dreambooth for style, anime art, artstyle, controlnet, LoRA, digital sculpting, training LoRA for style, convolutional layer, vtuber, SDXL, SDXL1.0, style training, blender
Id: hIsvjfU5xO8
Length: 66min 45sec (4005 seconds)
Published: Tue Oct 17 2023