The Problem with JPEG - Computerphile

Video Statistics and Information

Captions
It perhaps isn't clear from the JPEG video when JPEG isn't a good idea. A lot of people say "oh, you should never use JPEG for scientific images" or something like that, because it's lossy compression and you're going to lose quality. That is true, but in a sense it also isn't, because the lossy compression is applied over very, very small image blocks. You won't get coherence between one block and the next, but it'll look pretty good, and for most imaging that's okay. Obviously lots of people swear by shooting in raw, and, you know, good luck to them. JPEG uses up a lot less space, so for most practical purposes a JPEG image is fine.
One time where JPEG images are not fine is text. Most people will have spotted JPEG artefacts, that is, speckly bits of image around text, and maybe not quite understood why they're there, apart from being a side effect of JPEG compression. Specifically, it's a side effect of JPEG compression on text, because text violates our assumption that high-frequency information doesn't contribute a lot to the image.
So this is a small 8x8 image that I've come up with to illustrate this. It is, in a sense, text: the Computerphile C with its little triangular brackets. It's 8x8, so it's not the highest resolution, but it serves our purpose quite well. One thing that this image has that our last image of the flower didn't have is sharp changes in intensity. This C has a sharp step down into the background, and that is not something JPEG handles very well at all. If we look at the luminosity block for this, we get this: our C represented as luminosity values from 0 to 255. These are our background values of about 48, this is our C here, and our brackets here. Each value represents the greyscale intensity of the corresponding pixel in our 8x8 image.
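As a rough illustration of what "sharp changes in intensity" means in numbers, here is a minimal Python sketch. The glyph pattern, the stroke value of 200 and the smooth gradient block are all assumptions made up for the example; only the background value of 48 comes from the talk, and this is not the actual block shown in the video.

```python
# Hypothetical 8x8 "C" glyph: '#' marks a stroke pixel, '.' marks background.
PATTERN = [
    "........",
    "..####..",
    ".#......",
    ".#......",
    ".#......",
    ".#......",
    "..####..",
    "........",
]
BACKGROUND = 48   # background luminosity mentioned in the talk
STROKE = 200      # assumed stroke luminosity, chosen only for illustration

# Luminosity blocks with values in 0..255, one number per pixel.
text_block = [[STROKE if ch == "#" else BACKGROUND for ch in row] for row in PATTERN]
smooth_block = [[100 + 4 * x + 4 * y for y in range(8)] for x in range(8)]  # gentle gradient


def max_step(block):
    """Largest absolute difference between horizontally or vertically adjacent pixels."""
    n = len(block)
    horizontal = [abs(block[x][y] - block[x][y + 1]) for x in range(n) for y in range(n - 1)]
    vertical = [abs(block[x][y] - block[x + 1][y]) for x in range(n - 1) for y in range(n)]
    return max(horizontal + vertical)


for row in text_block:
    print(" ".join(f"{p:3d}" for p in row))
print("largest step, text block:  ", max_step(text_block))
print("largest step, smooth block:", max_step(smooth_block))
```

With these made-up values the text block jumps by 152 between neighbouring pixels, while the gradient never changes by more than 4. That kind of step is what needs high-frequency cosine waves to represent.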
Now, if we were encoding this in JPEG, we would shift all these values and calculate our DCT coefficients. Then we would get rid of the high-frequency ones and encode the rest. In doing so, we massively compress the image at what we assume to be a pretty reasonable quality. But that isn't true in this case. If we look at the DCT coefficients, you can see that our assumption that the big ones are always in the top left, so that the low frequencies contribute more to the image, is hugely violated. This particular coefficient, for example, only contributes 0.8. That was, I think, a value of 200 or something in our last video. Down here we have big, big coefficients: 30, 67.5, 53, -53, all on these really high-frequency cosine waves.
So if we look at our logo's coefficients next to our DCT basis patterns, we can see that what we've essentially got is a lot of this one here. This C has a lot of this particular one contributing, which you can kind of see, because there is a kind of C shape in it. It's perhaps hard to grasp the exact contribution that each one will have, because these coefficients are essentially arbitrary numbers, but the point is that this image is the sum of lots of these high-frequency sections and a lot less of these low-frequency ones.
So when we do our standard quantization, we're going to divide all of these numbers by huge amounts and set most of them to 0. That's going to be a big problem, because when we recreate the image on the other side, we're going to find that what was vital in creating this image is now gone, and we're not going to get it back. And in fact that's exactly what you do see. If we show the actual output here, we can see that our C is kind of visible but is being completely dwarfed by all this random noise that's been added around the edge of our text. This is exactly what happens in normal compression of text using JPEG.
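The steps described here (shift, DCT, quantize, decode) can be sketched in plain Python. This is a simplified model under stated assumptions, not a real JPEG codec: it handles a single greyscale block, skips entropy coding, and uses the standard example luminance quantization table from the JPEG specification as a stand-in for "about 50 percent quality". The glyph, the stroke value of 200, the smooth gradient block and the `cutoff` used to label a coefficient "high frequency" are all hypothetical choices for illustration.

```python
import math

N = 8

# Example luminance quantization table from the JPEG specification (Annex K).
Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# BASIS[k][n] is the k-th orthonormal DCT-II cosine wave sampled at pixel n.
BASIS = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
          * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def dct2(block):
    """Forward 2D DCT of an 8x8 block; low frequencies end up in the top left."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * block[x][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2D DCT of an 8x8 coefficient block."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * coeffs[u][v]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def round_trip(pixels, table=Q50):
    """Level-shift, DCT, quantize, dequantize, inverse DCT, clamp back to 0..255."""
    coeffs = dct2([[p - 128 for p in row] for row in pixels])
    quantized = [[round(coeffs[u][v] / table[u][v]) for v in range(N)] for u in range(N)]
    dequantized = [[quantized[u][v] * table[u][v] for v in range(N)] for u in range(N)]
    decoded = [[min(255, max(0, round(val + 128))) for val in row]
               for row in idct2(dequantized)]
    return coeffs, decoded


def high_freq_share(coeffs, cutoff=4):
    """Fraction of non-DC energy in coefficients with u + v >= cutoff (arbitrary cutoff)."""
    ac = sum(coeffs[u][v] ** 2 for u in range(N) for v in range(N) if u + v > 0)
    high = sum(coeffs[u][v] ** 2 for u in range(N) for v in range(N) if u + v >= cutoff)
    return high / ac if ac else 0.0


def mean_abs_error(a, b):
    """Average absolute per-pixel difference between two 8x8 blocks."""
    return sum(abs(a[x][y] - b[x][y]) for x in range(N) for y in range(N)) / (N * N)


# Hypothetical test blocks: a one-pixel-wide "C" on a background of 48, and a gentle gradient.
PATTERN = ["........", "..####..", ".#......", ".#......",
           ".#......", ".#......", "..####..", "........"]
text_block = [[200 if ch == "#" else 48 for ch in row] for row in PATTERN]
smooth_block = [[100 + 4 * x + 4 * y for y in range(N)] for x in range(N)]

text_coeffs, text_decoded = round_trip(text_block)
smooth_coeffs, smooth_decoded = round_trip(smooth_block)
text_share, smooth_share = high_freq_share(text_coeffs), high_freq_share(smooth_coeffs)
text_err = mean_abs_error(text_block, text_decoded)
smooth_err = mean_abs_error(smooth_block, smooth_decoded)

print(f"text:   high-frequency share {text_share:.3f}, mean abs error {text_err:.2f}")
print(f"smooth: high-frequency share {smooth_share:.3f}, mean abs error {smooth_err:.2f}")
```

Running it should show the text-like block carrying far more of its energy in high-frequency coefficients, and coming back with a much larger average pixel error, than the smooth block. The exact figures depend on the made-up pixel values and will not match the numbers quoted in the video.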
Essentially we assumed, just like with normal nature photographs, that we could get rid of the high-frequency information, and we couldn't. That was a bad idea, and so we've got all this stuff that we shouldn't have. If we compare the decoded block with the original, we can see that this value was 48 and it's now 66. A lot of these values have changed by quite a large amount. In our last video, I think the average error between the old and the new values was something like 3: on average they changed by about 3, up or down. Here it's about 11, over triple the average error we were getting in our pixels. And because it's text, we can see that very clearly in the output image.
So the solution, really, is not to use JPEG when you've got a huge amount of text, bearing in mind that I shrank that C down to fit into one 8x8 block. In actual fact, if you had, say, a sign, you would find that a letter took up a large part of the image, so maybe each block is only compressing one small edge of it and it won't look so bad. But certainly, if you're compressing a JPEG with text in it at 50 percent quality or lower, you're going to start to see JPEG artefacts: because these higher frequencies have been removed, you get speckles that those frequencies would otherwise have cancelled out. You might have seen this if you load up a poorly compressed text document: when you zoom in, it doesn't scale well. That's why e-readers won't use something like this; they'll try to render the text from source, as it were, and that way they don't have any of these problems.
The interesting thing is that once this damage is done, re-encoding doesn't make it worse, because the coefficients for this are now all 0; we set them to 0. If we re-encoded this as a JPEG, it's not going to get progressively worse unless we change the quality settings. It's actually just going to stay this bad.
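The point about re-encoding can be checked with a small Python sketch. It is an idealised model, under these assumptions: one greyscale block, the example luminance quantization table from the JPEG specification, and decoded pixels kept as unrounded, unclamped floats. Real codecs round and clamp pixels to 0..255 (and often subsample colour), so small drift between generations can still occur in practice. The glyph and its stroke value of 200 are hypothetical.

```python
import math

N = 8

# Example luminance quantization table from the JPEG specification (Annex K).
Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# BASIS[k][n] is the k-th orthonormal DCT-II cosine wave sampled at pixel n.
BASIS = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
          * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def dct2(block):
    """Forward 2D DCT of an 8x8 block."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * block[x][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2D DCT of an 8x8 coefficient block."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * coeffs[u][v]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[u][v] / Q50[u][v]) for v in range(N)] for u in range(N)]


def dequantize(quantized):
    """Multiply each stored integer back up by its table entry."""
    return [[quantized[u][v] * Q50[u][v] for v in range(N)] for u in range(N)]


# Hypothetical "C" glyph on a background of 48, level-shifted by 128 as in JPEG.
PATTERN = ["........", "..####..", ".#......", ".#......",
           ".#......", ".#......", "..####..", "........"]
shifted = [[(200 if ch == "#" else 48) - 128 for ch in row] for row in PATTERN]

gen1 = quantize(dct2(shifted))         # first encode: this is where information is lost
decoded1 = idct2(dequantize(gen1))     # first decode (floats, not rounded or clamped)
gen2 = quantize(dct2(decoded1))        # encode the decoded block again, same table

zeros = sum(1 for row in gen1 for q in row if q == 0)
print("coefficients set to 0 by the first encode:", zeros, "of", N * N)
print("second encode stores the same coefficients:", gen1 == gen2)
```

In this model the second encode stores exactly the same integers as the first, because the decoded block's coefficients are already exact multiples of the table entries. Changing the table between generations, which is what changing the quality setting does, breaks that property.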
So essentially, this is a bad but JPEG-able version of the image, which you should stick to if you want to keep using JPEG. But otherwise, avoid it.
...absolutely useless in almost any other domain. If you put a chess AI in a Google self-driving car, not only can it not drive the car, it doesn't have the concept, it doesn't know what a car is...
Info
Channel: Computerphile
Views: 586,182
Keywords: computers, computerphile, JPEG (File Format), Computer Science (Field Of Study), university of nottingham, compression, image compression, DCT, discrete cosine transform
Id: yBX8GFqt6GA
Length: 5min 37sec (337 seconds)
Published: Tue Jun 09 2015