Explaining File Compression Formats

Video Statistics and Information

Captions
[Music] Welcome to another video from ExplainingComputers.com. This time I'm going to talk about zip and other file compression formats. This will include a bit of history, as well as discussing the software required to create and access different formats of compressed file in different operating systems. So let's go and get started.

File compression reduces the size of a file to create a smaller version that uses less space. Often the compressed file is in an archive format, which means it can include multiple files. File compression methods can be lossy or non-lossy. In lossy compression, such as the process used to make JPEG image files, some data is discarded. However, this video is about non-lossy file compression, which maintains all data by identifying repeated data patterns and replacing them with codes that take up less space.

File compression methods have been in development since the 1940s. For example, in 1952 David A. Huffman published a paper called "A Method for the Construction of Minimum-Redundancy Codes". This described a technique now known as Huffman coding, which uses fewer bits of data to encode information patterns that occur more frequently. In 1977 and 1978, Abraham Lempel and Jacob Ziv also published seminal papers that defined lossless compression methods called LZ77 and LZ78. These are known as dictionary or substitution coders, as they replace repeated data patterns with a reference to an entry in a dictionary that is created during the compression process. And like Huffman coding, LZ77 and LZ78 still form the basis of many of today's data compression standards.

Talking of which, in 1986 Phil Katz founded a company called PKWARE to create data compression software. In 1989 PKWARE introduced a lossless file compression format called zip, along with a DOS program called PKZIP to compress and decompress zip files. PKWARE also released the zip file format into the public domain, so enabling its widespread use. Zip is an archiving format that permits the use of a number of compression algorithms, the most common of which is Deflate, which uses a combination of LZ77 and Huffman coding.

In the 1990s PKZIP became available for Windows, and as we can see, from the early days it allowed zip files to be password protected. And indeed, still today, if you want to email or otherwise transfer some files with an added level of security, a decent option is to create a password protected zip file.

Since Windows Me, Windows has natively included the option to create a zip file. It is however worth noting that if you want to create a password protected zip file in Windows, you'll need to use commercial software such as PKZIP or WinZip, or a free open-source application such as 7-Zip, and we'll compare these and other file compression programs later in the video. Meanwhile, since Windows Me, the functionality to access non-password protected zip files has been built in to all versions of Windows.

Beyond Windows, zip compression and decompression is included in iOS, macOS, Linux and FreeBSD. For example, in a modern Linux distro, to create a zip file we just select content and right click, followed by Compress. Exactly what options are available will then depend on the distro, but here in Zorin OS 17 we can create a standard or password protected zip file, or alternatively a tar.xz or 7z file. And this leads us to the subject of other file compression formats.

Alongside zip, many other file compression formats are in common use. Detailing all of these would make this video very long indeed, so let's run through some of the most common, before summarizing which applications can create and access them and comparing compression efficiency.

In 1992 Jean-loup Gailly and Mark Adler released the gzip format and software for the GNU Project. Gzip files have a gz extension and use the Deflate compression algorithm commonly used in zip. However, unlike zip, gzip is not an archiving format, so can only compress a single file.

In 1993 Eugene Roshal created the Roshal Archive format, or RAR. While this remains proprietary, the license allows anybody to create software capable of decompressing RAR, so lots of file compression packages can read RAR files. However, RAR files can only be compressed using commercial applications such as WinRAR from RARLAB.

Back in the public domain, in 1999 Igor Pavlov released a free file compression application called 7-Zip, which has a native archiving compression format called 7z. 7z can use a number of different compression algorithms, including LZMA and LZMA2. LZMA stands for the Lempel-Ziv-Markov chain algorithm, is based on a variant of LZ77, and was also created by Igor Pavlov. LZMA2 compression is faster than LZMA as it makes better use of multiple processor cores.

LZMA also has its own non-archiving file compression format with the extension lzma. A free set of command line tools called XZ Utils, previously called LZMA Utils, can create lzma files. XZ Utils also has its own xz native format, which is non-archiving and uses LZMA compression.

Bzip2 is another open-source file compression utility. It has the native format bz2 and uses a compression algorithm called the Burrows-Wheeler transform, or BWT.

Next on our non-exhaustive list we come to compressed tar files. In 1979 the tar, or tape archive, format was developed by AT&T Bell Labs. As this suggests, tar was created to package files together for storage on tape, and is not a file compression format. However, tar files are often compressed using a variety of methods. For example, gzip can be used to create compressed tar files with a tgz or tar.gz extension, and tar.xz and tar.bz2 files are also common.

Finally, it's worth noting that in 2008 the WinZip file compression program introduced a new version of the zip format called zipx. This stands for zip extended, and creates smaller files by using XZ compression.

So which file compression format should you use? Well, the two key factors to consider are how your files will be compressed and decompressed, and how efficient the compression will be. Not that many years ago it was also worth considering the time and computing resources required to compress and decompress data, but with modern hardware this is unlikely to be a significant concern or constraint for most users.

As we've noted, all mainstream operating systems can create and access zip files. In general, Windows has the weakest native support for other formats, although in October 2023 Microsoft issued an optional update for Windows 11 22H2 that added the ability to access many more compression formats, including 7z, gz, RAR, tar.xz and tar.bz2, providing that the files are not password protected. However, if you want to create compressed files in Windows in a format other than zip, or if you want to create password protected zip files, additional software is needed. So let's look at the capabilities of those applications mentioned earlier.

Firstly we have PKZIP, which can be used to create zip files and is commercial software with a free trial. For desktop users the program is available for Windows and Linux, and like all popular file compression applications, as we can see here, it can decompress a lot of other file formats in addition to its own zip files.

Moving on, WinZip is a commercial Windows application with a 21-day free trial. The package can create zip and zipx files, whilst being able to access most other compressed file formats.

Next is 7-Zip, which is a free open-source application for Windows, Linux and Mac. The package can create zip as well as its own 7z format files, as well as being able to access loads of other compressed file formats. And for most users, 7-Zip is my recommended file compression application.

Moving on, we come to WinRAR for a Windows PC, which is also available under the name RAR for macOS, Linux, FreeBSD and Android. This is again a commercial application with a free trial, but it's far less intrusive when you install it than, for example, WinZip. The package can compress RAR and zip files, as well as being able to open many other formats, as we can see here in its settings screen. And personally, if I was going to purchase a file compression application, it would be WinRAR.

Lastly we get to gzip, XZ Utils and bzip2, all of which are command line utilities, and all of which are pre-installed in most Linux distros. Here, for example, we're working with gzip in a terminal, where if I press Enter we will compress the final [unclear] file for my book Digital Genesis. And there we are, it's done, and if we do a list like that, we can see the original file at the top and the compressed gz file there. Similarly, we could also compress the file using XZ Utils like this. Very similar syntax, just change gzip there to xz, like that. Takes a little bit longer, but it will complete. There we go. Or we could compress using bzip2. A bit about the same again. There we are. And if we again do a list like that, we can see all three compressed files, with the xz compressed file being the smallest and the gz file the largest. And this brings us to the subject of file compression efficiency using different file compression formats.

Greetings. Here we are back in Zorin OS, where I've been using the functionality in the file manager to create compressed files of this sample content, which is about 90 megabytes in size. And I've already created 7z and tar.xz files, so let's now create a zip file. Just a right click and then Compress. Zip is our default here. I'll just change the name, there we go, and there we are, we now have a zip file. So we just close that down like that, and we can see we've now got three compressed files. And earlier I used WinRAR to create a RAR file of the same content, so let's just paste that in as well. There it is. And so now we have four compressed files, there they are, of the same content, and as we can see, the zip file is the largest, followed by the RAR file and then the tar.xz file, with the 7z file being the most efficiently compressed.

Now it's extremely important to note that different formats and algorithms work best on different types of content, and that some formats can use different compression algorithms. So our results here are very much just indicative, based on one set of content and compression settings. However, they hopefully provide an illustration of the relative efficiency of four of the most common file compression formats.

Today, drive capacities and online transfer speeds have never been greater, and therefore our need to use compressed file formats is not as critical as it once was. This said, compressed archiving formats like zip or 7z or RAR remain important, not just to save space, but also to allow us to bring together lots of files in one container that can potentially be password protected.

But now that's it for another video. If you've enjoyed what you've seen here, please press that like button. If you haven't subscribed, please subscribe. And I hope to talk to you again very soon. [Music]
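The size comparison described in the transcript can be roughly reproduced in code. The following is a minimal sketch, not something shown in the video, that uses Python's standard `zlib`, `bz2` and `lzma` modules to compress the same data with Deflate, bzip2 and xz and report the resulting sizes. The function name `compressed_sizes` and the sample text are assumptions made for illustration. Note that `zlib.compress` produces a raw zlib stream rather than a `.gz` file, although it uses the same Deflate algorithm as gzip and zip.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes of `data` for each codec."""
    return {
        # Deflate (LZ77 plus Huffman coding), as used by gzip and zip
        "deflate": len(zlib.compress(data, 9)),
        # Burrows-Wheeler transform based compression, as used by bzip2
        "bzip2": len(bz2.compress(data, 9)),
        # The .xz container format, as written by XZ Utils
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ)),
    }


# Hypothetical sample content: highly repetitive text, which compresses well.
sample = b"file compression replaces repeated data patterns with codes. " * 2000

if __name__ == "__main__":
    print(f"original: {len(sample)} bytes")
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {size} bytes")
```

As the transcript stresses, which codec comes out smallest depends on the content and the settings, so the numbers this prints for one input are only indicative.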
Info
Channel: ExplainingComputers
Views: 129,432
Keywords: file compression, file compression formats, zip, zipx, WinRAR, rar, WinZip, PKZIP, Huffman coding, lzma, lz77, lz78, 7-Zip, 7z, zip vs 7z, compressed tar, tar.xz, BZip2, GZip, gz, bz2, Christopher Barnatt, Barnatt
Id: VPj_dILDK6I
Length: 15min 25sec (925 seconds)
Published: Sun Jan 21 2024