How are Images Compressed? [46MB ↘↘ 4.07MB] JPEG In Depth

Video Statistics and Information

Reddit Comments

Do i look like i know what a jpeg is? i just want a picture of a got dang hot dog.

👍 3 👤 u/Farzle 📅 Jan 21 2022 🗫 replies
Captions
Here we have an uncompressed image, and it uses 46 megabytes of space. Over here we have the same image as a compressed JPEG, and it uses 4.1 megabytes. Can you see the difference? What about when we zoom in so that we can see the individual pixels? Well, in this video we're going to take a deep dive into the JPEG algorithm and see how images can be compressed to just a tenth of their uncompressed file size, all while keeping the same image resolution and a very high quality appearance.

To begin, let's take a quick 26 seconds to understand the importance of this algorithm, why we're making this video, and truthfully why you should stick around. First, most digital images from your phone or a camera are saved using the JPEG format. Second, I spent a couple hours on the internet recording which images were JPEG versus other formats and found that 86% of the images were JPEGs, so essentially this algorithm is everywhere. Third, video compression algorithms such as H.264... well, that's 26 seconds, so let's get back to seeing what JPEG does.

In short, JPEG goes through and analyzes each section of an image and finds and removes elements that your eyes can't easily perceive. When you compress an image via JPEG, you can use a sliding scale called quality to decide how much you want to compress the image. As the quality of an image decreases from 100% to 0%, the amount of file compression increases, thereby decreasing the amount of space the image file takes up. Here we have 12 images along with the quality and file size of each image. As we continue to compress the image, we can see that the picture's resolution, or number of pixels, stays the same, but eventually we get these defect squares, which are technically called artifacts. Let's take the 90% quality image and the 10% quality image and zoom in. Here we can see the inner workings of JPEG compression hard at work.

But wait, how exactly does JPEG work? Well, that's the focus of this video, so let's dive right in. The JPEG compression algorithm is composed of five key steps, each with a rather complicated name, but before we dive into the details, it's first important to understand the reason why JPEG works. Human eyes are not perfect; they have their nuances, and JPEG exploits these nuances to remove information that our eyes are not great at perceiving. For example, in the human eye there are two different types of light-receptive cells: rods and cones. Rods are not color sensitive and are critical for seeing in low-light conditions, whereas cones, with their color receptors of red, green, and blue, are color sensitive. Furthermore, in each eye there are 100 million rod cells, whereas there are only 6 million cone cells. As a result, your eyes are far more receptive to the brightness and darkness of an image, which is called luminance, and far less receptive to the colors contained in that image, which is called chrominance. Take this image of some tulips, for example. The black-and-white version that shows only the luminance appears to be just as detailed as the full-colored image. However, when we look at just the color alone, or the chrominance, that same image appears significantly less detailed.

So let's see how the JPEG algorithm exploits this nuance of the human eye. The first step is color space conversion. See, the original image is composed of pixels, and every pixel has a red, green, and blue component, each with a value from 0 to 255, and the combination of these three values of R, G, and B results in a color for a single pixel. The process of color space conversion takes these three R, G, and B values for every single pixel and calculates three new values: luminance, blue chrominance, and red chrominance, abbreviated Y, Cb, and Cr. This process is reversible, and no data is removed during the conversion.
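As a rough illustration of that first step, here is a minimal sketch in Python of a per-pixel RGB-to-YCbCr conversion. It uses the BT.601-style coefficients commonly used for JPEG/JFIF files; the function name and the example pixel are our own additions for illustration, not something taken from the video.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 array of 8-bit RGB values into Y, Cb, Cr planes.

    Uses BT.601-style coefficients (common for JPEG/JFIF). The conversion is
    just a weighted sum per pixel, so it is reversible up to rounding and no
    information is removed at this stage.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b          # luminance
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0  # blue chrominance
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0  # red chrominance
    return y, cb, cr

# Example: a single orange-ish pixel
y, cb, cr = rgb_to_ycbcr(np.array([[[200, 120, 50]]], dtype=np.uint8))
```

Because Y, Cb, and Cr are only weighted combinations of R, G, and B (plus an offset), the mapping can be inverted, which matches the point above that no data is lost in this step.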
However, the next step, called chrominance downsampling, removes a considerable amount of data. Remember how we said that our eyes are bad at detecting color, or chrominance, versus brightness, or luminance? Well, in downsampling we take both the blue and red chrominance component images and divide them into two-by-two blocks of pixels. Then we calculate the average value for each block, remove the repetitive information, and shrink the image so that each average value of a four-pixel block takes up a single pixel. As a result, the information that our eyes are poor at perceiving, the red and blue chrominance component images, is shrunk to one quarter of the original size, but the luminance remains the same. Now, with just two steps, the image is half the original size. Note that when reassembling the picture, the blue and red chrominance images are rescaled to match the size of the luminance component, with the RGB values being recalculated from luminance, blue chrominance, and red chrominance, and because the luminance changes from pixel to pixel, the recalculated RGB values can change from pixel to pixel as well.
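Here is a minimal sketch, again in Python, of the two-by-two block averaging described above. It assumes a chrominance plane whose height and width are even, and the function name is illustrative rather than anything standard.

```python
import numpy as np

def downsample_chroma_2x2(chroma):
    """Average each 2x2 block of a chrominance plane into one value.

    This is the subsampling step described above: the Cb and Cr planes shrink
    to a quarter of their original area while the luminance plane is untouched.
    Assumes the plane's height and width are both even.
    """
    h, w = chroma.shape
    blocks = chroma.reshape(h // 2, 2, w // 2, 2)   # group pixels into 2x2 blocks
    return blocks.mean(axis=(1, 3))                 # one average per block

cb = np.arange(16, dtype=np.float64).reshape(4, 4)
print(downsample_chroma_2x2(cb))                    # a 2x2 plane of block averages
```

Replacing every four-pixel block with one average is exactly why each chrominance plane ends up at a quarter of its original size.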
The next two steps are definitely a little more complicated, and they're called discrete cosine transform, or DCT, and quantization. Together these two steps also remove information, but they do it by exploiting the fact that our eyes aren't good at perceiving high-frequency elements within images. What does that mean? Well, let's take a look at this picture of the woods. Our eyes are great at seeing the edge of a tree or the outline of a rock, but when it comes to focusing on and distinguishing high-frequency color data, such as single blades of grass, individual leaves in a cluster of leaves, or variations in the shadows created by the leaves of a tree, our eyes can't really pick out the details. Furthermore, most nature or landscape photography has portions of the image that are out of focus, and removing high-frequency color variation to create smoother textures is unnoticeable. So then, how does the JPEG algorithm exploit this nuance of the human eye? Well, essentially, the discrete cosine transform and quantization steps go through each section of the image and find areas that have a high frequency of alternating chrominance or luminance. These elements that our eyes aren't able to perceive are then removed.

This process is rather complicated, but bear with us. Let's use the luminance component image as our example, but know that the same process happens with the two chrominance components. The first step is to divide the entire image into 8-by-8 sections called blocks, each with 64 pixels whose values from 0 to 255 represent the luminance at every pixel. Next we shift each value by subtracting 128 from it, so the range becomes -128 to 127, where -128 is black and 127 is white. The next step is complicated, so let's start with an analogy. Pretend you have a painting that you want to recreate, and you only have a dozen different colors. In order to recreate this painting, you'll need perhaps 15 parts of the first paint, then 3 parts of the second paint, followed by 8 parts of the third paint, all the way up until you use perhaps 11 parts of the last paint, and in the end we have recreated our original painting. The discrete cosine transform works kind of like this; however, instead of paint, we use these 64 base images. Just like in our analogy, we can rebuild any block of 64 pixels using a combination of these 64 base images, with each base image multiplied by a value, or constant, saying how much of that base image is used. Thus the 64-pixel block, each pixel containing a value, is transformed into 64 values, or constants, that represent how much of each base image is used. Let's take this letter A, for example. We can rebuild this letter A using this set of 64 base images: with a constant multiplied by each base image, we add up all the base images times their respective constants, and as a result we get this letter A.

Nothing in DCT actually compresses or shrinks the image, but the next step, quantization, does. So how does quantization work? Well, here we have our table of constants corresponding to the utilization of each base image. The next step is to divide each value in the table of constants by the corresponding value in the quantization table and round each result to the closest integer. This quantization table has higher numbers in the bottom right, where the high-frequency data that your eyes aren't great at perceiving is located, and smaller numbers in the top left, where more distinct patterns are located. After we divide each constant by the corresponding value in the quantization table and round to the nearest integer, our block's data looks like this: it has just a few numbers and a lot of zeros. In this step we're throwing away data, but really we're just throwing away data that our eyes don't perceive, so we can't even tell the difference. We also use a second quantization table for the chrominance components, whose values are larger, and thus we generate even more zeros in the resulting table. In essence, throughout the discrete cosine transform and quantization steps, the entire image uses a set of 64 base images, which are always the same, and two quantization tables, one for luminance and the other for chrominance, in order to transform every 8-by-8 block of pixels into just a few numbers and a whole bunch of zeros.
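To make the per-block arithmetic concrete, here is a minimal sketch of the level shift, 2-D DCT, and quantization on one 8-by-8 luminance block, assuming NumPy and SciPy are available. The table shown is the widely published example luminance quantization table from the JPEG specification; real encoders scale such a table up or down according to the quality setting, and the sample block is invented for illustration.

```python
import numpy as np
from scipy.fft import dctn

# Example luminance quantization table from the JPEG spec (encoders typically
# scale a table like this according to the chosen "quality" level).
Q_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def encode_block(block):
    """Level-shift, 2-D DCT, and quantize one 8x8 luminance block (values 0-255)."""
    shifted = block.astype(np.float64) - 128.0      # range becomes -128..127
    coeffs = dctn(shifted, norm='ortho')            # 64 constants, one per base image
    return np.round(coeffs / Q_LUMA).astype(int)    # divide by the table and round

block = np.full((8, 8), 140, dtype=np.uint8)        # a nearly flat sample block
block[4:, :] = 150
print(encode_block(block))                          # mostly zeros after quantization
```

Run on a nearly flat block like this one, almost every coefficient divides down to zero, which is the "few numbers and a whole bunch of zeros" described above.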
The last step is called run-length and Huffman encoding, and in it we list all the values for every block in both the luminance and chrominance images. However, when we list the numbers, we use a zigzag pattern like this, because it's more likely that the non-zero numbers will be found up here. Next we use a run-length encoding algorithm, where we list the numbers and then, instead of listing all the zeros, we just say how many zeros there are. Perhaps you can see that this list of just a couple dozen numbers is far more compressed than 64 pixels each being represented by a number from 0 to 255. After that we use a Huffman encoding scheme, which is a whole separate encoding algorithm that's covered pretty well in this video by Tom Scott, which you should take a look at after we discuss the H.264 video compression algorithm, how the image is rebuilt, and a few caveats.

The H.264 video compression algorithm, also called Advanced Video Coding, or AVC, is currently the recommended video compression algorithm for uploading videos to YouTube, and it uses techniques such as chrominance downsampling, or chroma subsampling, as well as variations of discrete cosine transform and quantization. However, H.264 is more complicated, because instead of compressing a single static image as in JPEG, video compression must compress 24 to 60 or more frames for every second of video. The very short explanation is that it uses intra frames, or I-frames, which are similar to JPEG images, for one out of every 30 frames, and then for the other 29 frames it uses prediction or bi-directional prediction to code only for the difference and motion, while using previously decoded frames as reference. Note that the frequency of I-frames varies widely, and there is typically an I-frame at the start of every scene change, as prediction doesn't work well across scene changes. These topics are incredibly complex, so they'll have to be covered in a separate video, but let's now get back to JPEG.

In order to rebuild the original image, we follow the reverse set of steps. First we undo the Huffman encoding and the run-length encoding and lay the values back into our 8-by-8 blocks. Next we multiply each value by the corresponding entry in the quantization table, then multiply the resulting constants by the corresponding base images and add all the constituent base images together. Then we upscale the red chrominance and blue chrominance images and reconvert the luminance and chrominance values into the red, green, and blue color space. With this we can see how four blocks of luminance and two sets of chrominance blocks yield a 16-by-16 grid of pixels. Finally, when we zoom out, we have something that looks nearly identical to our original uncompressed image.

It's truly amazing how your smartphone can take images composed of millions of pixels, perform calculations on every 8-by-8 block of pixels, compressing all that data into just a couple dozen numbers, and then turn around and uncompress the image faster than it takes you to swipe your finger across the screen. For example, this picture is 4032 by 3024 pixels, which yields a total of 190,512 blocks, and in order to compress or uncompress this image, every single block must go through each step of the algorithm. Indeed, our smartphones are truly impressive.
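For completeness, here is a minimal sketch of that reverse path for a single block, dequantizing and applying the inverse 2-D DCT. It is meant to pair with the encode sketch shown earlier (reusing its quantization table), and the function name is again just illustrative.

```python
import numpy as np
from scipy.fft import idctn

def decode_block(quantized, q_table):
    """Reverse the per-block steps: dequantize, inverse 2-D DCT, undo the level shift.

    `quantized` is the 8x8 table of rounded integers produced by the encode
    sketch above, and `q_table` is the same quantization table used there.
    """
    coeffs = quantized * q_table                               # multiply back by the quantization table
    shifted = idctn(coeffs.astype(np.float64), norm='ortho')   # sum of base images times constants
    return np.clip(np.round(shifted + 128.0), 0, 255).astype(np.uint8)

# e.g. decode_block(encode_block(block), Q_LUMA) returns a block that is close
# to, but because of the rounding not exactly equal to, the original pixels.
```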
But wait, we're not yet done with this video. There are some additional notes and major shortcomings of the JPEG algorithm that we should discuss. First, sometimes you can select how much you want to compress an image, and this scaling level of compression changes the values in the quantization table. Because the algorithm divides by these quantization tables and then rounds to the nearest result, if we increase the values in the table, we will inevitably get more zeros in the resulting discrete-cosine-transformed and quantized block, and as a result the file will be smaller. However, with too much compression you get artifacts, or issues with the compressed image, that look like blurry splotches on the edges of square blocks; you can see how many blocks end up resembling the low-frequency base images in the top left of the discrete cosine transform table. The next note is that earlier we mentioned that quantization removes high-frequency data, which is partially correct. In reality, quantization reduces the precision of an image block, and it reduces the precision more for the high-frequency data than for the low-frequency data, thus making the image block less accurate. The third note is that JPEG is great at compressing pictures taken with a camera, because natural-world pictures tend to have a lot of smooth textures, and because no camera is perfectly in focus, it's hard to tell the difference between the uncompressed and compressed image. However, it doesn't perform well at compressing vector graphics like this, and in fact you get rather noticeable artifacts close to the boundary lines in vector graphics. This is because the JPEG algorithm needs to reconstruct these straight lines using the base images, which don't work perfectly when the data is compressed. Therefore it's recommended not to compress vector graphics using the JPEG algorithm. Finally, JPEG is by far the most common image format because it's old, well understood, and royalty free, but there are a number of other image formats, some with comparatively better compression capabilities.

Rather fittingly, this video is sponsored by Brilliant, a website and app that teaches you all kinds of STEM topics in hands-on, interactive ways, from the basics such as foundational math or computer science fundamentals all the way to complex topics such as astrophysics and quantum computing. In this video we just scratched the surface of algorithms by showing you the inside of one algorithm, but if you want to learn more about the algorithms that run our technology-filled world, we recommend you look at Brilliant's course on algorithm fundamentals. Brilliant uses interactive courses to bring explanations, and thus your understanding, to the next level. Textbooks, boring lectures, and PowerPoint presentations are out, and fun animations and interactives are in. For the viewers of this channel, Brilliant is offering 20 percent off an annual subscription to the first 200 people to sign up. Just go to brilliant.org slash Branch Education; you can find that link in the description below. Thank you again to Brilliant for sponsoring this video.

That's pretty much it for JPEG compression. We believe the future will require a strong emphasis on engineering education, and we're thankful to all of our Patreon and YouTube membership sponsors for supporting this dream. If you want to support us on YouTube memberships or Patreon, you can find the links in the description. Also, remember to subscribe, comment below, and share this video with others. This is Branch Education; thanks for watching to the end.
Info
Channel: Branch Education
Views: 2,460,033
Keywords: JPEG, Camera, Picture, Compression, JPEG Compression, JPEG Compression Algorithm, JPEG compression algorithm with example, Compression Algorithm, Image Compression, Video Compression, Cameras, Pictures, Jpegs, .jpeg, saving images, saving pictures, iphone camera, smartphone camera
Id: Kv1Hiv3ox8I
Length: 18min 47sec (1127 seconds)
Published: Thu Dec 23 2021