Advanced UpScaling in ComfyUI

Captions

Hi. A slightly different ComfyUI video this time, slightly more insane. A lot of my videos come out of hitting a problem and looking for a way around it, and this problem was a particularly knotty one. There are lots of upscalers around and I don't really like any of them, and I certainly don't like what they do to illustrated-type images, because they're very focused on photographs and I'm very unfocused on photographs. So there you go. I really don't like what the upscalers do: they sharpen images to hell and the upscaling isn't very good either. You know, there's Ultimate, Penultimate and Super Ultimate and whatever, but they mostly do a pretty horrible job, oversharpened or oversmoothed, and sometimes they manage both at the same time, which is quite a feat.

So what I've done is build a workflow that does a much better upscale for a start, and not only that, you can adjust it a lot along the way without redoing it all. The Ultimate upscaler could take half an hour or more to churn through a 4,000-odd-pixel image, which is rather a long time. The last one I did with this wasn't very large, but it was a 4,000-pixel image and it did it in 23 minutes from scratch, which isn't too bad really.

So we'll start at the beginning here and work our way across. I should point out a couple of things first. Although it looks very complicated, it's actually made up of repeated units: there's one there, one there and another one there, so three repeated units. You see this shape here; that's just the same shape again and again. So though it looks complicated, it isn't necessarily that complicated, and these units in turn are made up of three duplicated workflows as well. So this junk over here, which is the upscaler, looks very impressive, but it is actually quite simple in principle; slightly more complicated in practice, as I found.
Anyway, we'll start at the beginning: how you make your image. This lot over here will adjust to whatever image size you put in. This is my input image here. The workflow is intended to work with image-to-image, not text-to-image; if you want to use text-to-image you need to put an empty image in here for it to start with, because all of the mathematics behind it works off this box here. These are the numbers that feed into all of the rest of the workflow.

The way this image is being made is just a bit of fun, really. This is IPAdapter, Matteo's wonderful invention, which is styling the image. I'm not going to go into this now; I shall do a separate video on it at some point, once I've discovered it all. Essentially, what this does is slap four images together to make this grid, which is then put through the adapter, and this produces influences that go into the first generation.

This section that makes the image is very boring; it's just image-to-image. There's absolutely nothing in there that is new or clever; it's a completely standard image-to-image setup. The only difference is that this image is being resized to a very specific size to go into the flow here. For this process to work, the image sides need to be divisible by eight. It might work with divisible by four, I haven't tried it, but I think divisible by eight. And obviously it can't be below 768; I don't think it'll really work very well.

So this setup here has churned out this image, which is a Baroque fantasy sort of image, and the next thing to do is to upscale that, which we do up here. So, first upscale. This upscale, which is again a completely standard image-to-image, has an image resize, and the image is made 1.5 times bigger. So 768, that's 11-something-or-other, so this is just a bit bigger than 1024. Then we come to the interesting bit. This image here is
cropped into nine pieces altogether, in three rows. Three horizontal strips are taken across this image, and each horizontal strip is broken into three this way. So this here is the top row: if you check the image on the left here, that's the top left, that's the middle bit and that's the right-hand bit. So it steps across, and they all overlap. Can you see that they overlap?

We're going up twice in scale, so we're doubling in size. Let's take one of these apart. This crops out our section, in this case the top-left section. It upscales with a model, 2x, so the image is now twice the size it was, so it's 2,300-something, I don't know; anyway, it's a couple of thousand pixels across. And it's rendered with a denoise of 0.35 and eight steps. It's rendered just that section at a higher resolution, so you see it's gone very frilly up there. And it's done that to each one in turn, so it's taking a strip out of here and resizing it in three sections that overlap. When that has produced the images, which are here, they are composited together with this node, with a mask, automatically. So that's one of these sections.

The next thing is: how did that mask happen? As you see, there's a mask maker here, and here we get into the mathematics. I'm sorry, there is mathematics. Here we have a bunch of math nodes. You don't really need to understand how they work. What they do is start from the original image size, which is here, getting each side; I'll go over here and show you that it's coming down from our original image. So the numbers of this, the 768 by 768, are going down this pipeline and then being processed through this. There are two strands of processing because it needs to handle non-square images, so it'll do images of any proportion provided the sides are divisible by eight. All you need to know, really, is that this produces the numbers that go into all of the crops
here. You see the numbers are color-coded, so this turquoise relates to these numbers. This number here, which is going in, is 576, and you see that comes out there, 576, so that's telling it to make a 576 crop. The other one is also 576, because we're square, and that's telling it to make the other side 576. So it's doing the calculations from the original image and putting the numbers into the image crop.

I've made the widgets of the image crop into inputs. You can do that: you right-click on here and you see "convert width to input", "convert y to input" and so on. You can convert any of these widgets, any of the sliders, to inputs, and I have done that there because I didn't want to be adjusting the numbers by hand for every single proportion. So I automated it, essentially.

Those numbers are used up here as well to make a mask. You really don't need to know how it does it, but what this does, essentially, is start from an empty black image, composite a black shape onto a white one, then blur it and turn it back into a mask. We feed that back in here and that does the join. The join is pretty good. Sometimes you can spot it, but it's quite hard to spot, actually. I think it'll be about here, so that tower might be a bit doubled. Not really; it's usually very good. I can't see any join at all. Sometimes with the edges of frilly architecture you can see joins, but, as I'll come to later, with this method that can be corrected quite easily.

Actually, there's a good example here. I rather like my [unclear] going frilly, but I might not want my clouds to go frilly, in which case I can change the denoise just on this one. You see here it's 0.35; if I took that to 0.25 I don't think it would do funny clouds. And if at the end I wanted to change that, it will only reprocess that one; it won't reprocess the whole lot, which saves just a huge amount of
time if you want to update your image. Okay, so that's one of these modules explained, and the others are exactly the same. There's no difference in them at all, and this math unit here feeds the numbers into everything, so if I put a different-shaped image in here all these numbers will automatically adjust.

I shall put the workflow up so you can fiddle with it. I warn you, fiddling with it can produce strange results, but I'll go over the things you can fiddle with: this number here, this number here and this number here. We'll just make an image there quickly so we can see the mask. There's the mask; as you see, it does it very quickly. So that's my mask, and there's the soft edge. These numbers here decide how big the soft edge is. If I change this to 300, this to 50 and this to 75, and click on there, the mask changes shape. So you might want to do that. And also, as I did that, it updated the entire image in that time, which is pretty good. We'll change that back, otherwise I'll forget.

For those who are interested, I'll explain the math nodes. The 400 here decides the size of the black shape you see on the mask. It's taking this empty image, resizing it, compositing it back over the original image and inverting it. So it's making a black square and then compositing it over a white square; that's all it's doing. This empty image is an automatic batch, and it makes just a black image; as you see, the color is set to zero, which means black. And the mask is not blurred until the very end, so if we look at it before the blur, there it is.

So these three sections are all the same, and the numbers from here and the mask are all piped across with reroutes. You can see the mask coming in here from the mask maker, and these are the numbers coming up from the math box
down there, and so forth. So we just rinse and repeat.

Finally, when we've got these done, you see we've got our strips of imagery. We now have three overlapping rectangles. If we go back to our original image, these are horizontal strips: a top one, a bottom one and a middle strip. And you can see it's rendered them here into three strips: the top one, middle one and bottom one. So though it looks hugely complicated, it's not really.

Then we do the same thing here. There is another mask builder that makes a horizontal mask from the same numbers; see, here are the numbers being piped in. This composites the three strips together, and this is our mask from that mask maker. This mask maker is a little bit more complicated, but the same in principle, because I leave the vertical one as standard and then adjust for width. You could do it the other way around, but it's just easier to adjust for width. It will do portrait or anything you want, really. That takes these three strips and joins them up into this upscaled image, and then as a final step we do an upscale with a model, which sharpens it a little bit.

Oh, you can see a join there. Excellent, I was hoping to find a join. You can see a couple of little artifacts: you see the edge here, and there's a ghost of a bit of a dome there. Quite honestly, I would hardly bother; it's a Photoshop fix. It's quite hard to find defects, really. There's a little tiny one there. But with this method you can fix that so easily. All these images have been saved, you see, so all you do is go back and get that one and it'll drop in perfectly. And that's true of any of the little glitches. Most of them I really wouldn't bother with, but if you're a real mad perfectionist then there you go, go at it. Here's another
advantage of this method, of course: if I felt my building here was not clean enough in its detail, and we'll have a look at how it was, then I can send it round again by chopping the right-sized bit out of there and putting it back in here to go through this lot again. So you can see the improvement. And then in Photoshop, I don't bother making specific masks; you can drop the two together and it'll be a perfect fit. I felt this building was a little bit too undecided, with sort of random, well, you see how it is. When it's finished it looks pretty good; it had too much random AI stuff in it, so I sent it around again. And you can do that with any part of the image, obviously. That is especially useful if you've got small figures and so on: you can cut them out and send them round again.

Okay, I shall do a few comparisons at the end. Oh, I actually missed one node, which is very handy; I thought I'd finished, but I haven't. This is extremely important: this turns the groups on and off. So we turn everything off, and this is a great strength of this process: you don't have to run the whole lot at once. If I turn on the IPAdapter and I turn on the main process, I can make my initial image. Drop back; all this has been turned off, so I can make my initial image here without running the whole lot, and indeed I can run this in stages.

So let's put something berserk in here just for fun; it'll show how this module works. We'll put in something silly, a weird fractal. Look at that. You think the clouds were funky before; I think they'll be more funky now. What I'm running here is this section to make our initial image. I mean, this is just upscaling; we need first of all something that's worth upscaling. So we queue that prompt. So here we
are back, and that's what it's done, which is rather nice, actually. Not as funky as I expected. Do we want it more funky? Yes, of course we want it more funky. So if I up the denoise it'll go more extreme: 0.75. As that took 40 seconds to do, you can afford to play in this area. You can try out lots of variations and lots of different seeds and so forth until you get an image that you like before you send it on to the rest. I'll show the next stage, just to show how you can control every stage and don't have to run through the whole lot repeatedly. So there we go: more architectural madness.

The same is true for the refine. We can turn the first refine pass on. We don't need any of the math for these first two stages, so we don't need to turn the math on until we start the upscale. And here we can control how much it's changed. This is an image-to-image using the image you just made down there; it's a straight image-to-image with an ordinary upsize. Rescale image: we're going one and a half. So here as well you can afford to do this as many times as you want, and you can make it change the image a lot.

All of these units are running with the same LoRAs and the same prompt. Here's the prompt being piped about: you see the positive and negative prompts are being piped across from here, so these are the original prompts. The conditioning is being plumbed around, which brings out another possibility: you could put an adapted prompt into any of these sections if you wish. So if you were perhaps having trouble with the clouds and wanted to take out some of the architectural stuff, you could just make a prompt to go in here. You don't necessarily have to have the same prompt running throughout; it's not really required. So we run that and see what 0.55 will be:
too much, I think; it'll change it around too much, but maybe not. We'll queue that. So there you go: big change. With this method I can put this huge change in, and at only a minute and 20 seconds you can afford to play around with this and get something you like. Because we increased the denoise, if I bring the other one up you can see it's quite a difference.

Okay, I think that has covered everything. I'll show some of the things I've made with it, and I'm going to do some comparisons. So here we are: examples. I'm afraid I had some mad Baroque moments, and these strange shapes are caused by IPAdapter and putting fractals in. You see the resolution and detail are pretty good; this is a huge image.

Here's another one. This is putting Tio in again. He's got a haircut, hasn't he? More Baroque madness, but I rather like the way it's upscaled, because it does the illustrated stuff a great deal better than any other way of upscaling. You can sort of see this on a Venetian ceiling, couldn't you? And you see the architecture is very clean and good.

This is actually the one that started it off, because I wanted this quite big, which wasn't so easy. And of course it has lots of little people. There's a rather tall chap there that needs an edit, but for the most part the people are about right for the size of image.

That's another one. Any image with a small figure is always a challenge, and as you see she is pretty good. There have been no edits on this. I would finesse the hands, but generally we're pretty good. And a miracle here: it's even got the numbers right on the clock. Look at that. Well, almost: two threes. But I reckon the AI deserves points for getting that right.

The first ones were all square, and then I did some landscape ones, with more fractals in the background causing these swoopy clouds and stuff. You can
see a few joins; there's a little glitch there, but generally the figure is quite good and it's pretty nice overall.

Here's something completely different. I did think it would fail on faces, because they're quite hard to upscale, certainly in tiles, but as you see she's pretty nice all the way in.

Here we go: you see those fractals going crazy up here. Once again, I won't keep zooming in, it gets boring, but we've got very nice rational architecture, and all the statues and bits and bobs like that look very nice indeed. And again we have no glitches where it went together at all. You can't find the join; there's no join at all to be found.

Another street scene. It was a street scene that really started me off. You see here again we have very good resolution; it's even almost managed the number plate there. Some of it's a bit uncanny valley, but pretty good generally.

And here we go with this one, where you saw me refining the building itself, and as you see it all fits in absolutely perfectly.

So this is an Ultimate upscale. It's 6,000 pixels across and it took nearly an hour to do. It's pretty good when we zoom in. However, mine took 15 minutes, about a quarter of the time, and it looks like this. You can all write to me and say you prefer the Ultimate upscaler one, but I think there's no comparison: this method is many times better than the Ultimate upscale, and not only that, four times faster. But there you go.

As usual, I will put the workflow into the words underneath and you can try it yourself and tell me I'm wrong. Okay, I hope that was interesting and informative. Thank you very much for watching.
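
Editor's sketch: the size arithmetic and soft-edged join described in the walkthrough can be illustrated in a few lines of plain Python. This is not the author's workflow or any ComfyUI API. The function names are hypothetical, the even spacing of the three crops is an assumption (the video only states the 768 input, the 1.5x first pass, the 576 crop size and the 2x tile upscale), and a linear ramp stands in for the blurred mask.

```python
def snap_to_multiple(value, multiple=8):
    """Round down to a multiple of 8 (the walkthrough asks for sides divisible by 8)."""
    return (value // multiple) * multiple


def tile_boxes(width, height, rows=3, cols=3):
    """Return (x, y, w, h) crop boxes for a grid of overlapping tiles.

    Assumption: each tile is half the image in each direction and the tiles
    are evenly spaced, so neighbouring tiles overlap by half a tile.
    """
    tile_w, tile_h = width // 2, height // 2
    step_x = (width - tile_w) // (cols - 1)
    step_y = (height - tile_h) // (rows - 1)
    return [
        (col * step_x, row * step_y, tile_w, tile_h)
        for row in range(rows)
        for col in range(cols)
    ]


def feather_weights(overlap):
    """Linear ramp strictly between 0 and 1; a stand-in for the blurred mask edge."""
    return [(i + 1) / (overlap + 1) for i in range(overlap)]


def blend_row(left, right, overlap):
    """Join two 1-D pixel rows whose last/first `overlap` samples cover the same area."""
    if overlap <= 0:
        return left + right
    weights = feather_weights(overlap)
    seam = [
        (1 - w) * a + w * b
        for w, a, b in zip(weights, left[-overlap:], right[:overlap])
    ]
    return left[:-overlap] + seam + right[overlap:]


base = snap_to_multiple(768)         # input side used in the video
first_pass = int(base * 1.5)         # side after the 1.5x refine pass
boxes = tile_boxes(first_pass, first_pass)
final_side = first_pass * 2          # every tile is upscaled 2x before the join

print(first_pass, final_side)
print(boxes[0], boxes[4], boxes[8])  # top-left, centre and bottom-right crops
print(blend_row([10, 10, 10, 10], [20, 20, 20, 20], overlap=2))
```

With a 768 input this gives a 1152 first pass and 576-pixel crops, matching the 576 read off the crop node in the video. Because each tile is computed independently, changing the denoise for one tile only requires re-rendering and re-blending that tile, which is the time saving the video emphasises.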

Info

Channel: Rob Adams

Views: 1,383

Id: HStp7u682mE

Length: 22min 32sec (1352 seconds)

Published: Sat Apr 20 2024