The first thing to notice about JPEG is that it's not actually a file format, although everyone talks about JPEG files. JPEG is actually a compression method, much like the codec you would use in a video, and we actually use the JPEG File Interchange Format, or JFIF, as the wrapper that holds that compressed data.
So what happened was the Joint Photographic Experts Group, which is what JPEG stands for, came along and created this incredibly complex specification of how you should compress image data. It's very long, with lots of different options, and what that means in practice is that you couldn't possibly hope to implement all of them: progressive JPEG, sequential JPEG, different color spaces. And so no one did. Someone came along and said, "How about this JFIF format?", and everyone went, "Actually, that's much easier," and now everyone just uses that. More recently, the Exif format, which has been championed by the photographic industry and camera makers, has kind of joined with JFIF, so you'll have either Exif files or JFIF files, or both in the same file.
"But they still have the .jpeg-"
They all have .jpeg, and so really, when we're talking about a JPEG file, we're actually talking about a JFIF file most of the time; we just don't make that distinction.
JPEG compression works in a very clever way. First of all, it depends on the fact that we don't see color quite as well as we do grayscale, which is something we touched upon in our video on the Bayer filter. "Two greens for every blue and red." "And that's because our eyes are more sensitive to green than they are to blue and red." It also deals with the fact that we don't see high-frequency changes in image intensity very well either. So we can get rid of some of that high-frequency information.
So bits of the image that change intensity very, very quickly we can kind of blur out, and those things will go away. We won't really see a difference, certainly not if we're not zoomed right in looking at individual pixels.
To start with, we'll talk just about the color aspects of JPEG. I have an input image here. What we want to do is shrink it down as small as possible for storage, and then be able to recover as much as possible on the way out. So what we first do is change the color space. We transform it into the YCbCr color space, which is what we spoke about in our little video on color spaces. What we're trying to do with YCbCr is separate the luminosity of an image, so the intensity of each pixel, from the actual color. After we've converted to YCbCr, we downsample, which essentially reduces the amount of color in our image, and that lets us save quite a lot of space without actually seeing any difference in the image quality. We then apply a discrete cosine transform, which is a fairly complicated mathematical technique that hopefully I can explain in a slightly easier-to-understand way. And then we quantize it, which is the actual lossy part of JPEG compression. Then we encode it, and that's our file.
(Brady) What does lossy mean?
So, some file formats that we encounter, like BMP and PNG, are lossless, so essentially it's equivalent to putting them in a zip file. You might use LZ compression or something more complicated, but generally speaking, you take the image data and you compress it in such a way that when you uncompress it on the other side, it's exactly the same. I believe it's Professor Brailsford who did a video on LZ compression. In JPEG, the compression is almost always lossy.
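The lossless/lossy distinction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the sample values are made up, `zlib` stands in for "put it in a zip file" (it implements DEFLATE, the LZ77-based method that PNG also uses), and the hypothetical `quantize` helper is a crude stand-in for discarding information, not JPEG's actual quantizer.

```python
import zlib

# Hypothetical 8-bit grayscale samples standing in for image data.
pixels = bytes([12, 13, 13, 14, 200, 201, 199, 200] * 32)

# Lossless: compress, decompress, and get back exactly the same bytes.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)


def quantize(data: bytes, step: int = 8) -> bytes:
    """Round every sample to the nearest multiple of `step` (lossy).

    An illustrative stand-in only; real JPEG quantizes DCT coefficients.
    """
    return bytes(min(255, round(v / step) * step) for v in data)


# Lossy: the rounded samples are close to the originals, but the
# original values can no longer be recovered from them.
lossy = quantize(pixels)
max_error = max(abs(a - b) for a, b in zip(pixels, lossy))

print("lossless round trip identical:", restored == pixels)
print("lossy version identical:", lossy == pixels)
print("largest per-sample error after rounding:", max_error)
```

The point of the sketch is the trade: the lossy version throws away small differences between neighbouring samples, which is what later gives an encoder more room to compress.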
You aren't guaranteed the same image when you output it as you put in. However, it will be very, very close most of the time, and the advantage of lossy compression is that you get a huge amount more compression for your money.
JPEG allows you to use basically any color space you want. You could use RGB, you could use YCbCr, or you could use CIE. And because it's totally impractical to program every single possible color space into your own JPEG coder or decoder, most people just follow the JFIF standard, which is just YCbCr, very occasionally RGB. So we're going to assume, for the rest of this video, that we're talking about JFIF, which is essentially a small subset of the JPEG standard.
So we take our image, which is in RGB, and we convert it into YCbCr. What that does is separate out the luminance and chrominance components. As we talked about in our other video, luminance represents essentially the brightness of the image, and it's a grayscale component, and Cb and Cr represent the blueness and the redness of the image. But both of these values, after conversion in the JFIF standard, fall into the range of 0 to 255, so the amount of data that YCbCr holds is exactly the same as 0-to-255 RGB.
One of the nice things about YCbCr is that the human eye doesn't really see chrominance very well. We certainly see it at a much lower resolution than we see changes in intensity. So, just like with TV encoding, we can massively downsample the amount of Cb and Cr in the image, and most humans, unless you're right up against the pixels, won't notice a difference.
So, to use a demonstration: this is a flower picture that I took, and this picture on the right has had the chrominance component downsampled by a factor of 10 in both directions, so 100 overall. There's 100 times less color in this picture than there is in this one, and to my eye they look almost exactly the same.
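As a rough illustration of the RGB to YCbCr step described above, here is a minimal sketch. The coefficients are the ones the JFIF specification gives for 8-bit samples; the function names `clamp8` and `rgb_to_ycbcr` are made up for this example, and rounding to the nearest integer is an assumption about how an encoder might store the result.

```python
def clamp8(x: float) -> int:
    """Round to the nearest integer and clamp into the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


# Grays carry no color: Cb and Cr sit at the midpoint, 128,
# and only Y changes with brightness.
for pixel in [(0, 0, 0), (100, 100, 100), (255, 255, 255),
              (255, 0, 0), (0, 0, 255)]:
    print(pixel, "->", rgb_to_ycbcr(*pixel))
```

Note that all three outputs stay within 0 to 255, which matches the point above that YCbCr holds the same amount of data as 8-bit RGB until the chroma is downsampled.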
And that's because my eye only sees the grayscale and a little bit of color. If you zoom in on the pixels, right on the edge of some of these petals, you can see slight discrepancies where the color and the gray don't match up. But at a normal level of zoom, the level of zoom of your computer monitor, or the screen you're looking at, or a photograph, you're never going to see the difference. And we've managed to save a huge amount of space by getting rid of a huge amount of color information.
Once we decide to transform to YCbCr, we have to decide how much downsampling we can get away with. In general, it's very common to downsample the color by a factor of 2 in both directions, so essentially you have 4 times less color: for every 4 Y pixels, you only have 1 Cb and 1 Cr pixel. You might also downsample by a factor of 2 only in the vertical direction and keep the horizontal, depending on how much space you want to save. In general, downsampling by that much, you won't see much of a change in the image, so you can get away with quite a lot.
Downsampling is sometimes tied to the quality of the JPEG that you output. In some software you will say, "I want a quality of 85," and it will decide how much downsampling that means and how much compression it does in the later stages. In general, most software will downsample by two in both directions, so four times less color. But you might find that if you choose the highest quality in software such as Photoshop, it won't downsample at all, and the color will have the same resolution as the grayscale.
So once we've taken an RGB image, converted it into YCbCr, and done whatever downsampling we think is necessary, or that we can get away with, that's when we pass this information on to the DCT, the discrete cosine transform, which is right at the core of how JPEG compression works. But that's for another video.
Subtitled by a terry
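The factor-of-2 chroma downsampling described in the transcript can be sketched like this. It is an illustration only: averaging each 2x2 block is one common way to downsample and nearest-neighbour is the simplest way to scale back up, but real encoders and decoders vary. All function names and the sample plane are hypothetical.

```python
def subsample_2x2(plane: list[list[int]]) -> list[list[int]]:
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    h, w = len(plane), len(plane[0])
    return [
        [
            # +2 rounds to nearest instead of always rounding down.
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def upsample_2x2(plane: list[list[int]]) -> list[list[int]]:
    """Nearest-neighbour upsampling back to full resolution."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def sample_count(width: int, height: int, fx: int, fy: int) -> int:
    """Samples stored for full-resolution Y plus Cb and Cr reduced by fx, fy."""
    cw, ch = -(-width // fx), -(-height // fy)  # ceiling division
    return width * height + 2 * cw * ch


# Hypothetical 4x4 Cb plane.
cb = [
    [100, 102, 50, 50],
    [104, 102, 50, 50],
    [0, 0, 200, 204],
    [0, 0, 208, 212],
]
small = subsample_2x2(cb)
print(small)
print(upsample_2x2(small))

# Storage for a 640x480 image: no downsampling vs. a factor of 2 both ways.
print(sample_count(640, 480, 1, 1), sample_count(640, 480, 2, 2))
```

With a factor of 2 in both directions the chroma planes shrink to a quarter of their size, so the total sample count is halved before the DCT and quantization stages do anything at all.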
Misleading title. But argh! They left the most interesting part to the next video! Understanding how the discrete cosine transform works is essential to understanding what operations you can safely perform on a jpeg file without suffering a massive quality drop.
Do I look l̷iķe I ͏kno͝w w̸hat̕ a j̵p̧e͏g is? I̡͢ j͞u͘s̀t̨ w̷̢͜͜a̷̢n͏̛͘͜t̴͢͟ ̀͘͡͞t̴҉̡͘ǫ͜͏̵f̸̢̛̬̼̕͜g̵̡͟͏͖̮̗h͎̬́͢͝t̞̝͍͇͢͜z̶̶͎̝̯̗͉b̸̛͍̟̦͍̻̪̰̝͢͜͞ͅź̸̢̠͔̮̼̱̺̳̪̠͓̘͎͘̕
And MP3 isn't a file format either, you're thinking of MPEG-1 layer III.
Good luck with that.
Of all the places to find people complaining about fussing over the difference between a file format and an encoding, /r/programming is the last place I expected. We're programmers; specificity and technical accuracy are part of our job. Knowing precise technical information is vital to quickly and accurately conveying complex ideas without confusion.
Huh. Interchangeable, indeed.
This is good information but who the fuck really cares. It's like the argument on the correct way to pronounce GIF.
Which, btw, you pronounce GIF with a hard G otherwise it's a peanut butter brand.
Why don't we talk about something actually interesting, like the Huffman codes used in jpeg.
needsmorejpeg.jpeg