Hello everyone, and welcome to another episode
of Stable Diffusion for Professional Creatives. Today, we'll be taking a look at the new
version of my workflow for relighting and preserving details and colors for product
photography. We'll see how we can use this workflow to build a pipeline that starts from this image
on the left, which is badly shot on a phone, then gets rendered in a 3D environment and studio
scene, and how that render can be used as a base for generating wonderful campaign shots as well
as animations, as we can see here on the right, all the while relighting the subject, preserving
details, and preserving colors. This workflow is probably the most production-ready workflow
that I've ever produced and released, and I think I can say that because I am
actually starting productions with some brands. A ton of you asked under my previous video if I
could implement some stuff, and each and every one of you had different requests. The way I usually
work is by providing a bare-bones workflow, something that you can expand on and make your
own. But since a lot of you requested different things like IPAdapters or other segment
anything groups or color matching tools, I implemented all of that. You will be able to
find this workflow in the description below, and this time, you will be able to find
all of the assets I'll be using in the full pipeline from photography to
3D to generated images as well. Why do I usually only provide a bare-bones
workflow? Because if I do not, you get a monstrosity such as this. Now, this is
still usable as a one-click solution, but it's a lot less understandable than the
previous workflow. So I have added a ton of notes telling you which groups can be disabled
and what each group does. But as a quick overview, this center here is the same as the previous
workflow, and I added an IPAdapter group here that takes care of either the entire re-background or of regenerating parts of the set, based on a segment anything mask that acts as an attention mask for the IPAdapter group. Then we have a group that takes care of blending multiple segment
anything masks together over here. Then we have a very quick and easy SUPIR upscaler group
here. And by popular demand, not one, not two, but three different color matching options for
preserving the colors of the original subjects. Now, let's go through all of the
workflow and what has changed. If you don't care about going through all
of the workflow, you can just jump ahead. You have the timestamps for it and can
see the whole pipeline starting from a single badly shot iPhone picture and going
all the way to a campaign-ready picture. Starting from the top left, you have your
"Start from Existing Product Image" group. This is the same as the previous workflow and
it sources and resizes an image upon which we are going to do all of the work. Then we have
a models group. Since some of you asked for a Perturbed-Attention Guidance (PAG) node, I've added that as well. In the workflow you download, it's going to be disabled by default, but you can re-enable it. The only thing you have to be wary of is adjusting the CFG accordingly in the KSamplers, because PAG adds its own guidance on top of CFG, so the CFG value usually needs to come down a bit. Then we have our "Generate Still Life Product
Shot" group, which is disabled by default because we don't want to generate random products. We
want to start with existing products, but in case you don't have any product shots, you
can use this one to generate a new product. Then, going down, we have a background generator
group. This group generates a new background that is either close to the one that we already have
or very far from it. Some of you asked for a way to integrate an already existing background.
That would be easy—you just have to blend the original image segmented with the SAM on
top of an already existing background. But I didn't want to implement it in this workflow
because that can be easily done in Photoshop, and adding it here would have cluttered the workflow for something that's easy to do elsewhere. The new additions to the workflow are the
IPAdapter for background or selective parts group and the SAM for IPAdapter attention mask
groups. These two groups influence the background generator. They can be used together, or the SAM group can be disabled. If you disable the SAM group, the IPAdapter works on the whole image; if you enable it, the IPAdapter only influences the masked parts of the image. In this case, for example, the background stays basically the same, but I'm influencing the wood branch to turn it into wool. So you can influence only the background, only parts of the image, or the whole image.
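If you're wondering what that attention mask actually is, here's a rough idea of the preparation step in plain Python, outside of ComfyUI. This is just a conceptual sketch: the file names and values are made up, and the real IPAdapter nodes apply the mask during sampling rather than on pixels.

```python
# Conceptual sketch only: preparing a SAM-style segmentation as an attention mask.
# File names and values are illustrative assumptions; in the workflow this is wired
# from the SAM group into the IPAdapter group's attention mask input.
import numpy as np
from PIL import Image, ImageFilter

seg = Image.open("sam_branch_mask.png").convert("L")     # hypothetical SAM output
binary = (np.asarray(seg) > 127).astype(np.uint8) * 255  # 255 = area the IPAdapter may influence

# Feather the edges a little so the influenced region blends into its surroundings.
mask = Image.fromarray(binary).filter(ImageFilter.GaussianBlur(4))
mask.save("ipadapter_attention_mask.png")
```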
In the center, we have the "Segment Anything" group that takes care of segmenting out the original subject, and that's the same
as the previous workflow. But we also have an optional "Multi-SAM Mask Composite"
group, useful for those of you who are using multiple SAMs and need a composite mask
of everything that is being segmented. Then we have our "Blend Original Subject on Top of Background" group, which takes care of blending the original subject on top of our re-background image.
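For those who like to see what the nodes are doing, both groups boil down to very simple image math: the composite is the union of the individual SAM masks, and the blend is an alpha composite of the original subject over the regenerated background. A minimal sketch, with placeholder file names:

```python
# Minimal sketch of the two steps, outside ComfyUI; file names are placeholders.
import numpy as np
from PIL import Image

# 1) Multi-SAM mask composite: the union of several subject masks.
masks = [np.asarray(Image.open(p).convert("L"), dtype=np.float32) / 255.0
         for p in ["sam_bottle.png", "sam_cap.png"]]
composite = np.clip(np.maximum.reduce(masks), 0.0, 1.0)

# 2) Blend the original subject on top of the re-background image.
original = np.asarray(Image.open("original.png").convert("RGB"), dtype=np.float32)
rebackground = np.asarray(Image.open("rebackground.png").convert("RGB"), dtype=np.float32)
alpha = composite[..., None]                      # H x W x 1, where 1 = subject
blended = alpha * original + (1.0 - alpha) * rebackground

Image.fromarray(blended.astype(np.uint8)).save("subject_over_rebackground.png")
```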
Then, we move on to our "Relight" group, which I've rearranged into two parts: one for the actual relighting and one for the "Mask Alternatives"
group. Inside the "Mask Alternatives" group, you have all the ways with which you can create
masks for relighting. Each one has an explanation at the bottom. You can either start from an
existing mask that you load into the workflow, or you can draw a mask yourself via the "Open in Mask Editor" option here, or you can use the "Color to Mask" node to drive the lighting based on the lighter parts of an image. That last one works a bit like an HDRI (a very loose comparison, but the closest one I can think of), in that it uses all of the lighter parts of the re-background image as light sources for the relighting.
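To make the "Color to Mask" idea a bit more concrete, here's roughly what such a light map amounts to: take the luminance of the re-background image and keep only the brighter values. The threshold below is an arbitrary assumption, not what the node uses.

```python
# Rough approximation of a "lighter parts as light sources" mask; threshold is arbitrary.
import numpy as np
from PIL import Image

img = np.asarray(Image.open("rebackground.png").convert("RGB"), dtype=np.float32) / 255.0
# Rec. 709 luma as a stand-in for perceived brightness.
luma = 0.2126 * img[..., 0] + 0.7152 * img[..., 1] + 0.0722 * img[..., 2]

light_map = np.clip((luma - 0.6) / 0.4, 0.0, 1.0)   # keep only the brighter regions, with a soft ramp
Image.fromarray((light_map * 255).astype(np.uint8)).save("relight_mask.png")
```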
Then we have a SUPIR upscaler group. By default, it's set to 1.2x, but you can set your own rescale factor. It's bypassed by default because it takes a long time to run, so I've kept it disabled for demonstration purposes while I record this video. Inside, you can find an "Upscale Original Image and SAM" group. And that's
because, as I was telling you in the comments, if you upscale the relit image, you'll also
need upscaled SAM masks and the original image to properly use the frequency separation
technique that we're going to see in a second. You can substitute the SUPIR upscaler workflow with any upscaler workflow you like, as long as you understand what's going on here: we are upscaling both the relit image and all of the masks and original images we source from for the frequency separation technique. Then we have the revolutionary core of the
workflow: the optional "Preserve Details (Such as Words) from the Original Image"
group. Apart from having a very catchy name, this group splits the relit image and the
original image into a low-frequency and a high-frequency layer. The low-frequency layer
takes care of color and lighting, and the high-frequency layer takes care of details.
Since we want the subject's details from the original image, the details of everything else from the relit image, and the lighting of the relit image, what we want to do here is create a new high-frequency layer: the high-frequency layer of the original subject blended with the high-frequency layer of everything but the subject from the relit image.
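If you want to see the idea outside of the node graph, here's a minimal frequency separation sketch. The blur radius and file names are placeholders, and it assumes the relit image, the original image, and the subject mask are all at the same resolution, which is exactly why the upscaler group also upscales the masks and the original image.

```python
# Minimal frequency separation sketch; blur radius and file names are illustrative.
import numpy as np
from PIL import Image, ImageFilter

def split_frequencies(img, radius=8):
    """Low frequency = blurred copy (color/light); high frequency = residual detail."""
    low = np.asarray(img.filter(ImageFilter.GaussianBlur(radius)), dtype=np.float32)
    high = np.asarray(img, dtype=np.float32) - low
    return low, high

original = Image.open("original_upscaled.png").convert("RGB")
relit = Image.open("relit_upscaled.png").convert("RGB")
# Subject mask must match the upscaled resolution (1 = subject, 0 = everything else).
subject = np.asarray(Image.open("sam_subject_upscaled.png").convert("L"),
                     dtype=np.float32)[..., None] / 255.0

_, high_original = split_frequencies(original)
low_relit, high_relit = split_frequencies(relit)

# New high-frequency layer: subject details from the original image,
# everything else from the relit image.
high_mix = subject * high_original + (1.0 - subject) * high_relit

# Recombine with the relit low frequencies so the lighting comes from the relit image.
result = np.clip(low_relit + high_mix, 0, 255).astype(np.uint8)
Image.fromarray(result).save("relit_with_preserved_details.png")
```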
This was the endpoint of the previous version of this workflow. But since some of you told me you had issues with the coloring of the subject,
I'm providing you with three different ways of color matching the subject. Why three? Well,
because there are so many ways of tackling a problem, and I wanted to show you that. There's no
best solution to it. There are just many different ways you can try and solve stuff. And this is
an example of this kind of mental approach. On the left, we have a color matching option
employing a color match node. In the center, we have a color matching option employing an
image blending mode node set on hue. On the right, we have a color matching option that is also based on a color match node, but instead of using the segment anything mask for the subject, it uses a light map. Options 1 and 2 are the more stable versions of color matching. Option 3 is experimental and interesting for those who want to see weird results and like to experiment. So now that we've seen the workflow, let's get
into the pipeline and see how we can get from this picture to these pictures over here. The
starting point is a badly shot picture of a product. As I mentioned in the previous video,
starting from a good picture is better, but I wanted to show you that good, commercially viable
pictures are obtainable even from a bad photo. What I wanted to do starting from this picture was
to insert it into a 3D environment and recreate a set much like in photography, only virtual, in terms of lighting, set styling, and props. To do this, you just need to select the subject,
get rid of everything but the subject, and create a bit of transparency in the glass parts of the
image. That is, if you have a transparent subject. Once that was done, I recreated a set in a 3D
environment. Now, I am using Blender, but you can use anything else—Cinema 4D, Houdini, etc. It
all depends on what you need to do. For a Vellum animation, for example, I would use Houdini. For
particle animations, I'd probably use Cinema 4D or Houdini. But in my case, I kept things
simple because I just needed our subject here, our backdrop here, and a prop here. I set the
scene, and you can find all of the elements and the scene in the description below. You can
download and recreate this scene yourself. The only thing I had to tinker with was matching the camera perspective to the perspective our subject was originally shot with. Once that was done, I just set some lighting: a key light from the left, a fill light from the right, and a bit more fill from the top. Then I rendered our frame.
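If you prefer to script the lighting instead of placing it by hand, a few lines of bpy will do it. The positions, rotations, and energies below are rough guesses for illustration, not the exact values from the scene you can download.

```python
# Rough bpy sketch of the three-light setup; positions and energies are guesses,
# not the exact values used in the downloadable scene.
import bpy

def add_area_light(name, location, rotation, energy, size=2.0):
    light_data = bpy.data.lights.new(name=name, type='AREA')
    light_data.energy = energy
    light_data.size = size
    light_obj = bpy.data.objects.new(name=name, object_data=light_data)
    light_obj.location = location
    light_obj.rotation_euler = rotation
    bpy.context.collection.objects.link(light_obj)
    return light_obj

add_area_light("Key_Left",   (-2.5, -1.5, 1.8), (1.1, 0.0, -0.9), energy=600)  # key light from the left
add_area_light("Fill_Right", ( 2.5, -1.5, 1.2), (1.2, 0.0,  0.9), energy=250)  # softer fill from the right
add_area_light("Fill_Top",   ( 0.0, -0.5, 3.0), (0.0, 0.0,  0.0), energy=150)  # a bit more fill from the top
```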
Now, as you can see from the render here, a 3D-only approach would not work well, because we have
a 2D image acting inside a 3D environment. But since we can now relight inside of Stable Diffusion while preserving details and colors, we can use this render as a source image and decide how close to it or how far from it we want to go while relighting. Going back inside of Comfy, we are now starting
from our render. My render aspect ratio is 16:9 because, while doing this, I'm also working on the
thumbnail for this video. So I'm resizing it to a 16:9 aspect ratio—1920x1080. Since this workflow
is a one-click workflow, let's change things up a bit from this example. Let's say we want to go for
"studio light advertising photography of a perfume on an ice sculpture." We're going to copy that,
put that inside of the CLIP Text Encode field for the relighting as well. Let's make sure that we
have the right prompt for the Segment Anything—in this case, it's a bottle. So it's going to say
a bottle. Then let's use an ice reference for the IPAdapter. I'm going to drop an image of some
ice in the IPAdapter Load Image node. Let's say I just want to influence the branch to turn
into ice. Everything is ready, so I'm just going to hit "Queue prompt" and speed things
up so you can see that there's no trick to it. And there you go—here we have our results.
Let's start with what was the original output of the previous version of this workflow. The
ice sculpture is wonderful and retains the same structure from the original render
while allowing us to recreate something different from what we rendered, without having to worry about shaders, for example. But as you pointed out in the previous video, the
color of the subject is not quite the same. Now let's take a look at our three different color
matching options. The one on the left is based on a color match node, the one in the center is
based on an image blending mode node set on hue, and the one on the right uses a color match node
as well but is set on a mask for everything that is being lit instead of the segment anything
mask for the subject. Choosing between these three is a matter of personal preference. Speaking as a photographer, exact color matching is only required for e-commerce and catalog pictures. For advertising and editorials, exact color matching matters less than adherence to mood and feel. That's why I haven't set any of these three options
to maximum blending of color matching. But if you want to do that, you can tinker
with the blend percentage values here, here, and here to get results closer or
further away from the original color. What's going on in these subgroups before the image is previewed is that we are also recalculating the low-frequency layer. Since the low-frequency layer takes care of color and lighting, it has to be recomputed after any color correction.
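As a very rough stand-in for what option 1 does, here's the idea in plain Python: match the subject's color statistics back to the original (a naive per-channel mean and standard deviation match, not the actual Color Match node), blend it in partially, and then recompute the low-frequency layer from the corrected result. File names and the blend value are assumptions.

```python
# Simplified stand-in for option 1: naive per-channel statistics matching inside the
# subject mask, followed by recomputing the low-frequency layer. Not the actual nodes.
import numpy as np
from PIL import Image, ImageFilter

relit = np.asarray(Image.open("relit_with_details.png").convert("RGB"), dtype=np.float32)
original = np.asarray(Image.open("original_upscaled.png").convert("RGB"), dtype=np.float32)
subject = np.asarray(Image.open("sam_subject_upscaled.png").convert("L"),
                     dtype=np.float32)[..., None] / 255.0

# Match mean/std of the relit subject to the original subject, channel by channel.
corrected = relit.copy()
for c in range(3):
    src = relit[..., c][subject[..., 0] > 0.5]
    ref = original[..., c][subject[..., 0] > 0.5]
    corrected[..., c] = (relit[..., c] - src.mean()) / (src.std() + 1e-6) * ref.std() + ref.mean()

# Blend the correction in only where the subject is (like the blend percentage values).
blend = 0.75  # assumption: a partial blend, as discussed above
matched = subject * (blend * corrected + (1 - blend) * relit) + (1 - subject) * relit

# The low-frequency layer has to be recomputed from the color-matched result.
matched_img = Image.fromarray(np.clip(matched, 0, 255).astype(np.uint8))
low_freq = matched_img.filter(ImageFilter.GaussianBlur(8))
low_freq.save("low_frequency_preview.png")
```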
If we zoom in on these tiny previews of the low-frequency layer, we can see that in some of them the original color is well preserved, while in others it's less so. In this hue option (option 2 for color matching), the original color is not well preserved. It's a matter both of tinkering with values here,
mostly in the blend percentage value range, and of the scenes you are trying to recreate.
In this case, options 1 and 3 work better, but in other cases, option 2 might work better.
That's why I left all three of them for you. But what if you say, "I don't want any ice, I
want an actual branch like I did in the render"? Well, good thing we have a VAE encode node over
here. This VAE encode node sources the image from the "Start from Existing Product Image Load"
node. That means if we substitute the latent in the KSampler for the re-background and
don't use an empty latent image anymore, lower the denoise to say 0.6, change the prompt to "on a
wood branch," change the prompt for relighting as well to "wood branch," disable the IPAdapter group
because we don't want ice anymore, and hit "Queue prompt," we get a good-looking branch right away
and some good color matching for the subject. We can also lower the denoise even more to keep the
same exact background from the starting image.
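Outside of Comfy, this is just plain image-to-image with the render as the init image, where the denoise value maps to the strength parameter. Here's a hedged sketch with the diffusers library; the checkpoint and prompt are placeholders, and it only covers the re-background step, not the relighting or the detail preservation.

```python
# Hedged diffusers sketch of the same idea: img2img from the render, where a denoise
# of 0.6 corresponds to strength=0.6. Model id and prompt are placeholders; the actual
# workflow does this inside ComfyUI with the VAE Encode node feeding the KSampler.
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5",      # placeholder: use your own checkpoint
    torch_dtype=torch.float16,
).to("cuda")

init = load_image("render_1920x1080.png")  # the Blender render as the starting image
out = pipe(
    prompt="studio light advertising photography of a perfume on a wood branch",
    image=init,
    strength=0.6,        # lower = closer to the render, higher = more freedom
).images[0]
out.save("rebackground_from_render.png")
```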
Now, if you bring in the original image from the start of the video, where the wood was turned into wool with the IPAdapter reference, and bring in
some logos—let's say this was a collaboration between Maison Margiela for the perfume
and Marni—this, to me, looks like a good campaign. The lighting is consistent, the branding
is consistent, the details are consistent, and the colors are consistent. You can see how this can be
a very powerful workflow for real-life use cases. I was teasing you about animations as well
in the beginning. So let's take a look at that. If we open Blender and scroll through
the timeline, you can see that I've added a very simple animation to the branch. This is all
still theoretical, because I've only had time to test the initial stages of it, and I hope to develop it further. But if we render the frames of the animation, or every fourth frame like in this
case, and then pass them through our workflow, you can see how the lighting is consistent enough,
the materials are consistent enough, and the details are consistent enough to use these as a
base in an AnimateDiff workflow. In some cases, like where the fabric is moving and passing in
front of our subject, we might have some detailing issues, but remember that this is one single
frame, and you won't see it as clearly in a full animation. So theoretically, it would be possible
to create an animation out of this workflow. The only limitation is that you're working with
a 2D representation of the subject inside a 3D space, so the camera and the subject must always
have the same perspective. They have to be locked in. If you want to do camera movements, for
example, you can't just go around the subject because there's no subject to the left or right of
it—it's just a 2D plane. The subject would always have to be perpendicular to the camera. If we had
a good way to create 3D objects starting from just one picture—and we are not quite there yet with
Stable Diffusion—we would solve that as well. All in all, I am very excited about
the possibilities of this workflow. I have been receiving a tremendous amount
of support and interest in this workflow, and I have seen some of your creations
as well. I think this is a great step forward in what we can achieve
production-wise with Stable Diffusion. That's all for today. I hope you had
some fun and can create new stuff with this version of the workflow. If you
liked this video, leave a like and subscribe. My name is Andrea Baioni. You
can find me on Instagram at Risunobushi or on the web at andreabaioni.com. I'll
be seeing you, as always, next week.