Stable Diffusion IC-Light: Preserve details and colors with frequency separation and color match

Video Statistics and Information

Captions
Hello everyone, and welcome to another episode of Stable Diffusion for Professional Creatives. Today, we'll be taking a look at the new version of my workflow for relighting and preserving details and colors for product photography. We'll see how this workflow supports a pipeline that starts from this image on the left, which is badly shot on a phone, then gets rendered in a 3D environment and studio scene, and how that render can be used as a base for generating wonderful campaign shots as well as animations, as we can see here on the right, all the while relighting the subject, preserving details, and preserving colors. This is probably the most production-ready workflow that I've ever produced and released, and I think I can say that because I am actually starting productions with some brands.

A ton of you asked under my previous video if I could implement some things, and each and every one of you had different requests. The way I usually work is by providing a bare-bones workflow, something that you can expand on and make your own. But since a lot of you requested different things, like IPAdapters, additional Segment Anything groups, or color matching tools, I implemented all of that. You will find this workflow in the description below, and this time you will also find all of the assets I'll be using in the full pipeline, from photography to 3D to generated images.

Why do I usually only provide a bare-bones workflow? Because if I don't, you get a monstrosity such as this. It's still usable as a one-click solution, but it's a lot less understandable than the previous workflow, so I have added a ton of notes telling you which groups can be disabled and what each group does. As a quick overview: the center here is the same as the previous workflow; I added an IPAdapter group here that handles either the entire re-background or the regeneration of parts of the set, based on a Segment Anything mask that acts as an attention mask for the IPAdapter; then we have a group that blends multiple Segment Anything masks together over here; then a very quick and easy SUPIR upscaler group here; and, by popular demand, not one, not two, but three different color matching options for preserving the colors of the original subject.

Now, let's go through the whole workflow and what has changed. If you don't care about the walkthrough, you can just jump ahead using the timestamps and see the whole pipeline, starting from a single badly shot iPhone picture and going all the way to a campaign-ready picture.

Starting from the top left, you have your "Start from Existing Product Image" group. This is the same as in the previous workflow: it sources and resizes the image upon which we are going to do all of the work. Then we have a models group. Since some of you asked for a Perturbed-Attention Guidance (PAG) node, I've added that as well. In the workflow you download, it's disabled by default, but you can re-enable it. The only thing you have to be wary of is changing the CFG accordingly in the KSamplers, because PAG influences CFG values.

Then we have our "Generate Still Life Product Shot" group, which is disabled by default because we don't want to generate random products. We want to start with existing products, but in case you don't have any product shots, you can use this group to generate a new product.
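Conceptually, that last group is just a text-to-image generation. If you want to prototype it outside of ComfyUI, here is a minimal sketch using the diffusers library; the checkpoint and the prompt are placeholders of my own, not necessarily what the workflow loads in its models group.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder checkpoint: the workflow's models group may load a different SD 1.5 model.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="still life product photography of a perfume bottle, studio lighting, seamless backdrop",
    negative_prompt="blurry, low quality, watermark",
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("generated_product.png")
```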
Then, going down, we have a background generator group. This group generates a new background that is either close to the one we already have or very far from it. Some of you asked for a way to integrate an already existing background. That would be easy: you just have to blend the original image, segmented with the SAM, on top of the existing background. But I didn't want to implement it in this workflow, because it can easily be done in Photoshop, and I felt it would clutter things up here given how easy it is to do any other way.

The new additions to the workflow are the IPAdapter for background or selective parts group and the SAM for IPAdapter attention mask group. These two groups influence the background generator. They can be used together, or the SAM group can be disabled. If you disable the SAM group, the IPAdapter group works on the whole image. If you enable it, the IPAdapter group influences only certain parts of the image, like in this case, where the background stays basically the same but I'm influencing the wood branch to turn it into wool. So you can influence only the background, only parts of an image, or all of the image.

In the center, we have the "Segment Anything" group, which takes care of segmenting out the original subject; that's the same as in the previous workflow. But we also have an optional "Multi-SAM Mask Composite" group, useful for those of you who are using multiple SAMs and need a composite mask of everything that is being segmented.

Then we have our "Blend Original Subject on Top of Background" group, which takes care of blending the original subject on top of our re-background image.

Then we move on to our "Relight" group, which I've rearranged into two parts: one for the actual relighting and one for the "Mask Alternatives" group. Inside the "Mask Alternatives" group, you have all the ways in which you can create masks for relighting, each with an explanation at the bottom. You can start from an existing mask that you load into the workflow, you can draw a mask yourself with "Open in Mask Editor" here, or you can use the "Color to Mask" node to drive the lighting based on the lighter parts of an image. This works essentially like an HDRI of sorts (a very loose term, but it's the closest analogy): it uses the lighter parts of the re-background image as light sources for the relighting.

Then we have a SUPIR upscaler group. By default, it's set to a 1.2x rescale factor, but you can set your own. It's bypassed by default because it takes a long time to calculate everything, so for demonstration purposes while I record this video, I disabled it. Inside, you can find an "Upscale Original Image and SAM" group. That's because, as I was telling some of you in the comments, if you upscale the relit image, you'll also need upscaled SAM masks and an upscaled original image to properly use the frequency separation technique that we're going to see in a second. You can substitute the SUPIR group with any upscaler workflow you like, as long as you understand what's going on here: it upscales both the relit image and all the masks and original images we source from for the frequency separation technique.
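As a rough illustration of what the "Multi-SAM Mask Composite" and "Blend Original Subject on Top of Background" groups do conceptually, here is a minimal sketch assuming numpy and Pillow. The file names are hypothetical stand-ins for the corresponding node outputs, and the actual nodes may handle feathering and edges differently.

```python
import numpy as np
from PIL import Image

# Hypothetical file names; in the workflow these come from the corresponding nodes.
original_img = Image.open("original_image.png").convert("RGB")
background_img = Image.open("generated_background.png").convert("RGB").resize(original_img.size)

def load_mask(path, size):
    """Load a SAM mask as a float array in [0, 1], resized to the working resolution."""
    m = Image.open(path).convert("L").resize(size, Image.Resampling.LANCZOS)
    return np.asarray(m, dtype=np.float32) / 255.0

# Composite several SAM masks into one: the union of everything that was segmented.
masks = [load_mask(p, original_img.size) for p in ["sam_bottle.png", "sam_cap.png"]]
composite_mask = np.clip(np.max(np.stack(masks), axis=0), 0.0, 1.0)
Image.fromarray((composite_mask * 255).astype(np.uint8)).save("sam_subject_mask.png")

# Blend the original subject on top of the regenerated background.
original = np.asarray(original_img, dtype=np.float32)
background = np.asarray(background_img, dtype=np.float32)
alpha = composite_mask[..., None]  # HxWx1 so it broadcasts over the RGB channels
blended = original * alpha + background * (1.0 - alpha)
Image.fromarray(blended.astype(np.uint8)).save("subject_on_new_background.png")
```

Taking the per-pixel maximum of the masks is one simple way to build the union; the composite mask is then reused as the alpha for the blend.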
Then we have the revolutionary core of the workflow: the optional "Preserve Details (Such as Words) from the Original Image" group. Apart from having a very catchy name, this group splits the relit image and the original image into a low-frequency layer and a high-frequency layer. The low-frequency layer carries color and lighting, and the high-frequency layer carries details. Since we want the details of the subject from the original image, the details of everything that is not the subject from the relit image, and the lighting of the relit image, what we do here is create a new high-frequency layer that blends the high-frequency layer of the original subject with the high-frequency layer of everything but the subject in the relit image.

And this was the endpoint of the previous version of this workflow. But since some of you told me you had issues with the coloring of the subject, I'm providing you with three different ways of color matching it. Why three? Because there are many ways of tackling a problem, and I wanted to show you that. There's no single best solution; there are just many different ways you can try to solve things, and this is an example of that kind of mental approach.

On the left, we have a color matching option employing a Color Match node. In the center, we have a color matching option employing an Image Blending Mode node set to hue. On the right, we have a color matching option that also uses a Color Match node, but instead of the Segment Anything mask for the subject, it uses a light map. Options 1 and 2 are the more stable versions of color matching. Option 3 is experimental and interesting for those who want to see weird results and experiment with things in their life.

So now that we've seen the workflow, let's get into the pipeline and see how we can get from this picture to these pictures over here. The starting point is a badly shot picture of a product. As I mentioned in the previous video, starting from a good picture is better, but I wanted to show you that good, commercially viable pictures are obtainable even from a bad photo.

Starting from this picture, I wanted to insert it into a 3D environment and recreate a set much like in photography, only virtual, in terms of lighting, set styling, and props. To do this, you just need to select the subject, get rid of everything but the subject, and create a bit of transparency in the glass parts of the image (that is, if you have a transparent subject). Once that was done, I recreated a set in a 3D environment. I am using Blender, but you can use anything else: Cinema 4D, Houdini, and so on. It all depends on what you need to do. For a Vellum animation, for example, I would use Houdini; for a particle animation, I'd probably use Cinema 4D or Houdini. In my case, I kept things simple, because I just needed our subject here, our backdrop here, and a prop here. I set the scene, and you can find all of the elements and the scene file in the description below, so you can download them and recreate this scene yourself.

The only thing I had to tinker with was getting the perspective right in relation to the camera angle of our subject. Once that was done, I just set some lighting: a key light from the left, a fill light from the right, and a bit more fill from the top. Then I rendered our frame, like so.
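Before we head back into ComfyUI, here is what the frequency-separation blend described above boils down to, as a minimal sketch assuming numpy, Pillow, and a Gaussian blur as the low-pass filter; the actual group may use a different split, and the file names are hypothetical and carried over from the previous sketch.

```python
import numpy as np
from PIL import Image, ImageFilter

def split_frequencies(img, radius=8):
    """Split an image into a low-frequency layer (a blur: color and lighting)
    and a high-frequency layer (image minus blur: fine detail)."""
    arr = np.asarray(img, dtype=np.float32)
    low = np.asarray(img.filter(ImageFilter.GaussianBlur(radius)), dtype=np.float32)
    return low, arr - low

# Hypothetical file names; all three must share the same resolution,
# which is why the upscale group also upscales the masks and the original image.
original = Image.open("original_image.png").convert("RGB")
relit = Image.open("relit.png").convert("RGB")
subject = np.asarray(Image.open("sam_subject_mask.png").convert("L"), dtype=np.float32)[..., None] / 255.0

low_orig, high_orig = split_frequencies(original)
low_relit, high_relit = split_frequencies(relit)

# Keep the relit image's color and lighting (its low-frequency layer), but rebuild
# the high-frequency layer: subject detail from the original image, everything
# else from the relit image.
high_mixed = high_orig * subject + high_relit * (1.0 - subject)
result = np.clip(low_relit + high_mixed, 0, 255).astype(np.uint8)
Image.fromarray(result).save("relit_with_preserved_details.png")
```

The blur radius controls what counts as "detail": a larger radius pushes more of the texture into the high-frequency layer.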
Now, as you can see from the render, this 3D-only approach would not work well on its own, because we have a 2D image acting inside a 3D environment. But since we can now relight inside of Stable Diffusion while preserving details and colors, we can use this render as a source image inside of Stable Diffusion and decide whether to go far from it or stay close to it while relighting it and preserving details and colors.

Going back inside of Comfy, we are now starting from our render. My render's aspect ratio is 16:9 because, while doing this, I'm also working on the thumbnail for this video, so I'm resizing it to a 16:9 aspect ratio, 1920x1080. Since this workflow is a one-click workflow, let's change things up a bit from this example. Let's say we want to go for "studio light advertising photography of a perfume on an ice sculpture." We're going to copy that and paste it inside of the CLIP Text Encode field for the relighting as well. Let's make sure that we have the right prompt for the Segment Anything: in this case it's a bottle, so it's going to say "a bottle." Then let's use an ice reference for the IPAdapter: I'm going to drop an image of some ice into the IPAdapter Load Image node. Let's say I just want to influence the branch, to turn it into ice. Everything is ready, so I'm just going to hit "Queue Prompt" and speed things up so you can see that there's no trick to it.

And there you go: here we have our results. Let's start with what was the original output of the previous version of this workflow. The ice sculpture is wonderful and retains the same structure as the original render, while allowing us to recreate something different from what we rendered, all without having to care about shaders, for example. But, as you pointed out in the previous video, the color of the subject is not quite the same.

Now let's take a look at our three different color matching options. The one on the left is based on a Color Match node, the one in the center is based on an Image Blending Mode node set to hue, and the one on the right uses a Color Match node as well, but driven by a mask of everything that is being lit instead of the Segment Anything mask for the subject. Choosing between these three is a matter of personal preference. Speaking as a photographer, exact color matching is only required for e-commerce and catalog pictures; for advertising and editorials, exact color matching matters less than adherence to mood and feel. That's why I haven't set any of these three options to the maximum amount of color matching. But if you want to do that, you can tinker with the blend percentage values here, here, and here to get results closer to, or further away from, the original color.

What's going on in these subgroups, before the image is previewed, is that we are recalculating the low-frequency layer as well. Since the low-frequency layer carries color and lighting, we have to recalculate it. If we zoom in on these tiny preview images, which are previews of the low-frequency layer, we can see that in some of them the original color is well preserved, while in others it's less so. In this hue option (option 2 for color matching), the original color is not well preserved. It's a matter both of tinkering with the values here, mostly the blend percentage, and of the scene you are trying to recreate. In this case, options 1 and 3 work better, but in other cases option 2 might work better. That's why I left all three of them in for you.
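To make the color matching idea concrete, here is a rough sketch in the spirit of option 1: a Reinhard-style mean and standard deviation transfer in Lab space, restricted to the subject mask, with a blend amount playing the same role as the blend percentage values mentioned above. It assumes OpenCV and numpy, reuses the hypothetical file names from the earlier sketches, and is not the algorithm the Color Match node actually implements.

```python
import cv2
import numpy as np

def color_match_subject(relit_path, original_path, mask_path, blend=0.8):
    """Shift the relit subject's per-channel mean/std in Lab space toward the
    original subject's, then blend the correction back in by `blend`."""
    relit = cv2.imread(relit_path).astype(np.float32)
    original = cv2.imread(original_path).astype(np.float32)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
    m = mask > 0.5  # subject pixels used to measure the statistics

    relit_lab = cv2.cvtColor(relit / 255.0, cv2.COLOR_BGR2Lab)
    orig_lab = cv2.cvtColor(original / 255.0, cv2.COLOR_BGR2Lab)

    matched = relit_lab.copy()
    for c in range(3):
        r_mu, r_std = relit_lab[..., c][m].mean(), relit_lab[..., c][m].std() + 1e-6
        o_mu, o_std = orig_lab[..., c][m].mean(), orig_lab[..., c][m].std() + 1e-6
        matched[..., c] = (relit_lab[..., c] - r_mu) / r_std * o_std + o_mu

    matched_bgr = np.clip(cv2.cvtColor(matched, cv2.COLOR_Lab2BGR) * 255.0, 0, 255)
    # Apply the correction only on the subject, softened by the mask and blend amount.
    w = (mask * blend)[..., None]
    return (relit * (1.0 - w) + matched_bgr * w).astype(np.uint8)

out = color_match_subject("relit_with_preserved_details.png",
                          "subject_on_new_background.png",
                          "sam_subject_mask.png")
cv2.imwrite("relit_color_matched.png", out)
```

Option 2's hue blend mode would instead swap only the hue channel, and option 3 would drive the mask from a light map rather than from the Segment Anything output.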
But what's that you say: "I don't want any ice, I want an actual branch like I had in the render"? Well, good thing we have a VAE Encode node over here. This VAE Encode node sources its image from the "Start from Existing Product Image" Load node. That means that if we substitute the latent in the KSampler for the re-background, so we're no longer using an empty latent image, lower the denoise to, say, 0.6, change the prompt to "on a wood branch," change the prompt for the relighting to "wood branch" as well, disable the IPAdapter group because we don't want ice anymore, and hit "Queue Prompt," we get a good-looking branch right away and some good color matching on the subject. We can also lower the denoise even further to keep the exact same background as the starting image.

Now, if you bring in the original image from the start of the video, where the wood was turned into wool based on a wool reference, and bring in some logos (let's say this was a collaboration between Maison Margiela for the perfume and Marni), this, to me, looks like a good campaign. The lighting is consistent, the branding is consistent, the details are consistent, and the colors are consistent. You can see how this can be a very powerful workflow for real-life use cases.

I was teasing you about animations at the beginning, so let's take a look at that. If we open Blender and scroll through the timeline, you can see that I've added a very simple animation to the branch. This is all still theoretical, because I've only had time to test the initial stages of it, and I hope to develop it further. But if we render the frames of the animation, or every fourth frame like in this case, and then pass them through our workflow, you can see that the lighting is consistent enough, the materials are consistent enough, and the details are consistent enough to use these frames as a base in an AnimateDiff workflow. In some cases, like where the fabric is stirring and moving in front of our subject, we might have some detailing issues, but remember that this is a single frame, and you won't see it as clearly in a full animation. So, theoretically, it would be possible to create an animation out of this workflow.

The only limitation is that you're working with a 2D representation of the subject inside a 3D space, so the camera and the subject must always keep the same perspective; they have to be locked in. If you want to do camera movements, for example, you can't just move around the subject, because there's no subject to the left or right of it; it's just a 2D plane. The subject always has to be perpendicular to the camera. If we had a good way to create 3D objects starting from a single picture (and we are not quite there yet with Stable Diffusion), we would solve that as well.

All in all, I am very excited about the possibilities of this workflow. I have been receiving a tremendous amount of support and interest in it, and I have seen some of your creations as well. I think this is a great step forward in what we can achieve production-wise with Stable Diffusion.

That's all for today. I hope you had some fun and can create new stuff with this version of the workflow. If you liked this video, leave a like and subscribe. My name is Andrea Baioni. You can find me on Instagram at Risunobushi or on the web at andreabaioni.com. I'll be seeing you, as always, next week.
Info
Channel: Andrea Baioni
Views: 5,749
Keywords: generative ai, stable diffusion, comfyui, civitai, text2image, txt2img, img2img, image2image, image generation, artificial intelligence, ai, generative artificial intelligence, sd, tutorial, risunobushi, risunobushi_ai, risunobushi ai, stable diffusion for professional creatives, comfy-ui, andrea baioni, stable diffusion experimental, sdxl, ic-light, ic light, relighting, Photo to 3D scene to Stable Diffusion to Advertising Pipeline (Relight, Blender, 3d, advertising, product photography
Id: _1YfjczBuxQ
Length: 16min 55sec (1015 seconds)
Published: Mon May 27 2024