Sitting at your desk looking at your monitor during the day, there’s really nothing out of the ordinary. But if the sun sets, or I just cut the lights, and the room darkens, you get this characteristic glow from the monitor, contributing to that feeling of an overwhelmingly bright light in the darkness.

This is called a bloom effect, and at its heart, it can be both a simple and surprisingly complex effect to implement. Do it right, and it looks subtle and sells that idea of an overwhelmingly bright light, but do it wrong, and the scene ends up looking a bit wonky, kinda straight out of the mid-2000s.

Let’s talk about how it used to be done, way back in the day. There’s really no official “correct” way to do this, whatever works, but the premise basically works like this. You’ve got a copy of your main framebuffer, and you’ll apply some basic operations. You threshold the image, meaning you choose the bright parts to bloom out. You downsample it, mostly for performance reasons. And you apply a bunch of blur to it, that’s how you get that bleed-over effect.

Once you’ve done all that, the blurred copy and the original can be combined in some handwavy way that looks nice to end up with a bloomed out image. That’s the theory anyway. It’s not really set in stone, this is kinda an artistic effect, so in reality you’re going to just go with whatever works and looks nice.

Let’s talk about thresholding an image first. So what most people do is take the image, and do some sort of luminance cutoff. The reasoning behind this was simple. A lot of stuff only ran with 8 bits per channel. Nowadays modern hardware is fast enough to support floating point buffers wherever you need them, but back in the dark ages, you often went with 8-bit channels. Xbox 360 had a 10-bit float format, but I don’t think PS3 had an equivalent.

So yeah, you take your image and there’s a bunch of ways people did this.
You could contrast the heck out of your image, basically preserving the bright areas and dropping off the rest. You can also just set a luminance threshold, so calculate luminance per-pixel and then drop everything below a certain threshold off towards 0.

The general idea is that you need to isolate some part of the image for blooming, not all of it, because we’re going to be additively blending this bad boy back on the screen later, and without dropping off some parts, it’ll look like a bit of a disaster.

You can use the full size main colour target directly, but working with a smaller version of that is faster and cheaper. So it’s normal to take your main colour target, and downsample that to a fraction of the size. Every time you halve your resolution in each dimension, the number of pixels processed gets cut by 75%, which is pretty insane savings. The tradeoff here is that, as you progressively downsample, you risk shimmering artefacts due to the lack of resolution.

Then there’s a blur pass, so you’ll take your main colour, which has probably been downsampled, then you’ll blur it somehow. Now, a dead simple way is good ol’ box blur, nothing beats that. The basic idea is to convolve the image with an equal weight kernel, that might look something like this. And that basically amounts to, if you were to zoom in and look at individual pixels, for any given pixel in the input image, you’d sample the nearby pixels and then average them all together to produce the final pixel in the output image.

Go and do that a few zillion times for every pixel in the image, and you’ll get a slightly blurry image. This has some obvious limitations. For starters, to get a bigger radius, you can either skip pixels, which just kinda looks bad, you can see the artefacts there pretty clearly. Or you can expand the kernel, which looks better but comes at the steep cost of N^2 samples per pixel for a kernel N pixels wide. Bad options all around.
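
To make that cost a bit more concrete, here’s a rough sketch of a brute-force box blur in plain Python. The function name `box_blur`, the list-of-lists greyscale image, and the clamp-to-edge behaviour are all just assumptions for illustration; on a GPU this would be a shader, not nested loops.

```python
def box_blur(image, radius):
    """Naive box blur: average every pixel in a (2*radius+1)^2 window."""
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    samples_per_pixel = (2 * radius + 1) ** 2
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp to the edge so border pixels still get a full window
                    # (an assumption; other edge modes work too).
                    sy = min(max(y + dy, 0), height - 1)
                    sx = min(max(x + dx, 0), width - 1)
                    total += image[sy][sx]
            # Equal weights: every sample counts the same.
            output[y][x] = total / samples_per_pixel
    return output


# A single bright pixel in the middle of a dark 5x5 image.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 9.0

blurred = box_blur(image, radius=1)
```

With `radius=1` that’s 9 samples per output pixel, with `radius=4` it’s already 81, which is the N^2 growth in action.
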
Luckily, the box filter is what’s referred to as a separable filter in image processing, meaning that the 2D version can be expressed as two 1-dimensional filters. That is, it can be done along the horizontal axis, or the x-axis, and then the output of that can be run through another filter on the vertical or y-axis.

So I can start with the source image, and we can start by applying a box filter along the x-axis. Notice the blurring in the image is strictly horizontal. Once we’ve finished doing that, we can apply a second box filter along the y-axis, giving us the box filter in 2N samples per pixel rather than N^2.

An arguably higher quality blur is to do a Gaussian, which is just changing the weights around in the kernel. Instead of equal weights, you can bias the weight towards the centre pixel. The kernel basically looks like this, kind of a hill with a raised centre and a gentle falloff towards the sides. Contrast that to a box filter, which is exactly what it says, it’s a box. This is separable, just like the box blur, so it’s easy to get going. There’s other kernels here too, but I feel like the important takeaway here is the general idea that you wanna make a blurry version of the scene.

And now that you’ve got your blurred out copy of the scene, you blend it back over the screen as a post effect, generally an additive pass. And that’s it, that’s all it is.

Problem is, doing it this way is part of the reason why the games in the mid-2000s looked so bloom heavy. They relied on using things like thresholds to bloom out. If you’ve got an 8-bit per channel buffer, that means that the maximum value is 255. If you choose to, say, use luminance to threshold what’s bright and what isn’t, i.e. what blooms and what doesn’t, that approach makes zero distinction between just a white object, and a glowing hot one. The end result being, it’s super hard to get under control.
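
Going back to the separable blur for a second, here’s a minimal sketch of the two-pass idea with Gaussian weights, again in plain Python. The names (`gaussian_weights`, `blur_1d`, `separable_blur`), the `sigma` value and the clamp-to-edge handling are assumptions picked for illustration, not anything from a particular engine.

```python
import math


def gaussian_weights(radius, sigma):
    """1D Gaussian kernel, normalised so the weights sum to 1."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma))
           for i in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def blur_1d(image, weights, horizontal):
    """Convolve along one axis only, clamping samples at the image edge."""
    height, width = len(image), len(image[0])
    radius = len(weights) // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for k, weight in enumerate(weights):
                offset = k - radius
                if horizontal:
                    sx, sy = min(max(x + offset, 0), width - 1), y
                else:
                    sx, sy = x, min(max(y + offset, 0), height - 1)
                total += weight * image[sy][sx]
            output[y][x] = total
    return output


def separable_blur(image, weights):
    """Horizontal pass first, then a vertical pass over its output."""
    horizontal_pass = blur_1d(image, weights, horizontal=True)
    return blur_1d(horizontal_pass, weights, horizontal=False)


# 5-wide kernel: 2 * 5 = 10 samples per pixel instead of 5 * 5 = 25.
weights = gaussian_weights(radius=2, sigma=1.0)

# One bright pixel in the middle of a dark 7x7 image.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0

blurred = separable_blur(image, weights)
```

Blurring a single bright pixel like this is a handy sanity check: the result should be the 2D kernel itself, which for a separable filter is just the product of the two 1D weights at each position. Swap in equal weights and you get the separable box blur instead.
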
Back in the day, we developers thought it looked awesome and soooo next-gen, and it kinda did, so how about shut up. But let’s look at how they’ve improved the approach in some of the major engines since those bright, bloomy days.

So, in general, any current generation implementation of bloom is going to be trying to accomplish a few things.

High quality, nuanced: We’re long past the days where bloom is new and exciting, so bloom implementations want to give that feeling of an overwhelmingly bright light, without beating you over the head with it.

Performant: This should be obvious, it should be fast, realllll fast.

Kawase’s Blur Filter

Let’s talk about new approaches, but before that, way back at GDC 2003, this was I think before I even started in the industry, there was a presentation by Masaki Kawase, Kawacy? Uh, whatever, it was a very clever way to use GPU hardware to get a really nice blur for cheap.

The idea is, take the main image, take 4 bilinear samples, and use those to downsample progressively to a smaller resolution, in the talks I think they mentioned ¼ resolution. After that, you’ll ping-pong between 2 textures, each time taking 4 samples with an increasing offset. By iterating this over and over again, you can get a really good approximation to a large Gaussian blur, for really few texture samples. This is important, because this algorithm serves as the backbone of pretty much all the modern approaches.

Let’s start by taking a look at Unity’s bloom implementation. In 2014, the folks at Sledgehammer Games, the ones behind the latest (at the time) iteration of Call of Duty, outlined how their bloom implementation worked. Their technique was an evolution of the Kawase filter we just outlined. The idea is as follows:

You take your framebuffer, and there’s no threshold step. With HDR commonplace now, this isn’t needed.
Now you downsample it to create basically a mip chain. The default way is to take a bilinear sample right in the middle of every 4 texels, and the Kawase method showed you could take 4 taps instead of 1. They decided, screw it, we’re taking 13 taps.

And it’s really neat what they do here. So if you’re looking at this group of pixels, they start with 9 samples laid out in a 3x3 grid: the upper left 4 will form a group, the upper right 4 will form another, and so on. Lastly, there’ll be one more group of 4 samples in the middle, which brings the total to 13 and puts a large bias towards those centre pixels. The idea being that this should help reduce pulsing artefacts. These are all combined with a simple equation, with half the weight going to the centre group, and the other 4 groups splitting the rest equally.

One extra trick they added here is that on the very first downsample, they take the Karis average based on luminance, which weights each sample down by its luminance so that a few very bright pixels don’t dominate and flicker.

Next, they progressively upsample, starting with the smallest downsampled image. Each step, they apply a tent filter to the downsampled version and also sum in the previous mip. I think. This is based on the slides.

So once you do that for a while, you’ll end up with a fully downsampled and then re-upsampled image, which you then lerp with the main image to get a bloomed out scene, and overall it looks pretty good. Notice that the scene isn’t too bloomed out like in shots from earlier in the video, the bloom is more subtle and comes off more as a glow rather than a weird haze.

So Unreal’s approach seems to be mostly an extension on those same older techniques. They go over the general approach way back in their SIGGRAPH 2012 presentation on the technology behind the “Unreal Engine 4 Elemental Demo”.

From what I can gather, they take the main image, the framebuffer, and from that they progressively downsample a fixed number of times using a plain Kawase blur.
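
For reference, a “plain” Kawase pass of the kind described earlier (4 bilinear samples, with the offset growing each iteration) might be sketched like this in plain Python. Treat the details as assumptions: the diagonal sample layout, the `iteration + 0.5` offset, the clamp-to-edge sampling and all the function names are my reading of how the filter is commonly described, not code from any engine. It also stays at one resolution to keep things short, so the downsampling is left out.

```python
import math


def bilinear_sample(image, fx, fy):
    """Bilinear lookup with pixel centres at integer coordinates, clamped at edges."""
    height, width = len(image), len(image[0])
    fx = min(max(fx, 0.0), width - 1.0)
    fy = min(max(fy, 0.0), height - 1.0)
    x0, y0 = int(math.floor(fx)), int(math.floor(fy))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    tx, ty = fx - x0, fy - y0
    top = image[y0][x0] * (1.0 - tx) + image[y0][x1] * tx
    bottom = image[y1][x0] * (1.0 - tx) + image[y1][x1] * tx
    return top * (1.0 - ty) + bottom * ty


def kawase_pass(source, iteration):
    """One pass: average 4 diagonal bilinear taps at a growing offset."""
    height, width = len(source), len(source[0])
    # The half-texel part makes each bilinear tap average 4 neighbouring pixels.
    offset = iteration + 0.5
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            output[y][x] = 0.25 * (
                bilinear_sample(source, x - offset, y - offset)
                + bilinear_sample(source, x + offset, y - offset)
                + bilinear_sample(source, x - offset, y + offset)
                + bilinear_sample(source, x + offset, y + offset)
            )
    return output


def kawase_blur(image, iterations):
    """Feed each pass into the next (the GPU version ping-pongs two textures)."""
    current = image
    for i in range(iterations):
        current = kawase_pass(current, i)
    return current


# One bright pixel in the middle of a dark 15x15 image.
image = [[0.0] * 15 for _ in range(15)]
image[7][7] = 16.0

first_pass = kawase_pass(image, 0)
blurred = kawase_blur(image, iterations=3)
```

Each pass only costs 4 texture reads per pixel, but because every read is a bilinear one, it’s really pulling in 16 pixels’ worth of data, which is where the “really nice blur for cheap” comes from.
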
They then do a similar upscaling process as Unity, but from the sounds of it, they do a small separable Gaussian blur at each level, which is artist configurable. So you can modify the size of the blur at each level to your taste. They also add some knobs and dials for doing things like applying tints to each level, giving you that extra level of artistic control.

So I could, for example, specifically choose to tint an individual layer of the bloom chain, while leaving the rest as varying shades of grey, or something to that effect. I haven’t tried profiling it, my guess is that it’s slightly more expensive than Unity’s approach but can give higher quality and more flexible results. This is based on no actual research on my part.

So while most of these approaches are just tweaks and evolutions on Kawase’s approach from way back, the newer versions of Unreal Engine come with a fancy new convolution bloom, which is really interesting.

If you have a signal, let’s represent it with a seemingly random function here, then you can use something called the Fourier transform to decompose that function into a series of sinusoids, a series of sines and cosines of varying amplitudes and frequencies.

If you take an image and imagine each pixel as a data point of a 2D signal, of a 2-dimensional function, then we can use a 2D Fourier transform to convert this to frequency space, and then we can use the inverse Fourier transform to bring this back to our original image.

The whole theory and background behind the Fourier transform deserves a long video by itself, which I clearly didn’t do. There are some phenomenal resources on YouTube, which I’ve got linked in the description. For our purposes, I think it’s probably sufficient to know that this transformation exists and what it buys you, in terms of functionality.

Once you’ve converted this to the frequency domain, there’s some really cool manipulations you can do.
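
To make the forward and inverse transform a bit more concrete, here’s a minimal 1D sketch in plain Python. It uses a brute-force discrete Fourier transform rather than a real FFT, and the function names and the made-up test signal are assumptions for illustration. The 2D version used on images is the same idea applied along rows and then columns.

```python
import cmath
import math


def dft(signal):
    """Brute-force discrete Fourier transform (an FFT gives the same result, faster)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform: back from frequency space to the original samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# A made-up signal: a constant offset plus two cosines of different
# frequency and amplitude.
n = 16
signal = [
    1.0
    + 2.0 * math.cos(2 * math.pi * 1 * t / n)
    + 0.5 * math.cos(2 * math.pi * 3 * t / n)
    for t in range(n)
]

spectrum = dft(signal)

# Magnitude per frequency bin; the energy shows up only at the
# frequencies we put in (and their mirrored bins).
amplitudes = [abs(c) / n for c in spectrum]

# Round trip: the inverse transform recovers the original samples.
recovered = [c.real for c in inverse_dft(spectrum)]
```

Looking at `amplitudes`, the only bins that aren’t (near) zero are the constant term and the two cosine frequencies, which is the decomposition into sinusoids in a nutshell.
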
A high pass filter is one that preserves high frequency information while removing low frequency information, and that’s as simple as zeroing out the frequency-domain image near the origin before doing the inverse FFT. A low pass filter is the exact opposite, preserving low frequency information while discarding high, which is the inverse of what we just did, and that gives us a really blurry image.

In fact, I can actually perform a Gaussian blur in a similar way, by multiplying by the Fourier transform of a Gaussian filter, which lets me perform a Gaussian blur for pretty much free. I mean, free after the FFT, which wasn’t even slightly free.

That kind of leads us to an interesting property of the frequency domain: multiplying 2 images together in the frequency domain is equivalent to convolving them in the spatial domain (what you’d call the time domain for an audio signal).

This is where Unreal’s new convolution bloom is so neat and different. You can create custom bloom kernel shapes, so here for example I can create a star-like object, which I’ve done just by drawing some 2D SDFs. I’ve also added a bit of Gaussian blur on top of that. Now, we can transform that kernel that I just made into frequency space via the FFT, and multiply it with our Fourier-transformed image. Then when we take the inverse of that using the inverse FFT, that ends up being our blurry bloom buffer. When we combine that with the framebuffer, it can give us this neat streaking effect that would be a lot more difficult to do with conventional methods but trivial using this approach.

The unfortunate downside is that doing an FFT is a heavy operation, but I imagine it’s one that we’ll probably see more often as graphics hardware undoubtedly gets more and more powerful. There was a time when doing simple loops was unfathomable, so a few more iterations on hardware and this will be a small blip on the profiler.

Until next time, cheers.
