Pacman Parallel Downloads Way Better Than Expected

Video Statistics and Information

Captions
Ages ago I did a video on a really awesome feature being added in the pacman 6 beta, and that was going to be parallel downloads. Now, when I did that video I didn't actually test out the feature, mainly because I was waiting for it to be properly fleshed out and I didn't exactly want to run a beta package manager on my system. I mainly focused that video around tools like powerpill, which is basically a wrapper around pacman that gives you parallel downloads through a tool called aria2. However, about a month ago this feature was actually released with pacman 6, and I sort of forgot to do a video about it, so here's that video, I guess.

So that we have a point of reference, let's install a package without the feature enabled. One program that has an absolute ton of dependencies is pandoc. Normally I would have this installed, but I did go and uninstall it and clear my package cache, so let's go and install that now. So sudo pacman, and we'll do an -Syu just so we can update our databases as well, and install pandoc. It's going to go and synchronize those one by one by one. Let's go and download the packages, and you've seen this tons and tons of times before: it's going to download everything one by one by one. It's going to download all 125 packages and eventually it'll be done. I'm not going to go through all of this, because you guys have probably installed packages before, but you get the general idea of how it would normally work.

Now, enabling this feature is actually going to be incredibly easy. If you've just installed Arch Linux with the latest ISO, you're already going to have pacman 6 installed, and the config file that it comes with already has the option we need; we just need to uncomment it. If you're using an old install like I am, though, we just have to add a new line into the config. So what we need to do is go into our /etc directory, into our pacman.conf. This is where all of our pacman configuration
lives, and the line we need to uncomment or add is this one right here: ParallelDownloads = and then a number. Where this is placed in the config doesn't exactly matter. I believe the default number for this is going to be five, and that is basically how many downloads to do at the exact same time. What you set this number to very much depends on what your hardware and your network actually look like. The general safe rule for any threaded operation is to match the number of threads on your CPU. In my case I have a 3600X, which is a six-core, 12-thread CPU, so I should have this number set to 12, but you don't have to set it to exactly that number. Going way too far above that number, let's say I went up to like 100 or 200 threads, definitely can lead to a performance reduction, because every single extra thread you add is another thread that needs to be managed by your system. But you're never going to have a situation where the only threads running on your system are the threads for this application. If nothing else, if you have nothing else running in user space, there will always be some sort of drivers running in kernel space that are still eating up some of those CPU cycles.

My suggestion is going to be: set it to the number of threads on your CPU and then incrementally raise that until you're no longer seeing any performance gains, whether that's because you've, say, maxed out a CPU core and reached a CPU bottleneck, or maybe you're just splitting up your network connection so many times that doing it more isn't actually giving you any sort of benefit.

All we need to do now is save the config file, and then basically it's done. If it's not actually working, what you've probably done is misspelled "parallel". I don't know if there are other people out there who are as bad at spelling as I am, but I occasionally forget how many L's are supposed to be in that word, so make sure you don't do that.

Anyway, let's actually go and test out the feature. Off camera I
did go and uninstall pandoc and clear my package cache, so let's see what happens now. So sudo pacman -Syu, and we'll do pandoc again. The first thing you'll notice is that it's actually going to synchronize all the mirrors at the exact same time. While that's cool and all, that's not where you're going to see the main performance gains.

Let's actually go and install the package. As you can see, all of this stuff is going at the exact same time, and because all of these are very, very small packages, we're getting a massive, massive performance gain here, because setting up each of those downloads is going to have some sort of, I guess, what's the word for it, some sort of setup time allocated to it. But in this case we're doing all that setup at the exact same time, allowing us to start the downloads much, much quicker.

While this isn't going to give you a massive performance gain when you're installing really big packages, let's say Firefox, Chromium, or Kdenlive, where you already have all the dependencies installed, most programs aren't going to be like that, where you have one really big package and then a couple of little things. Most are going to be more like pandoc, or like when you're doing, say, a system update. Let's say there are a hundred updates to install at the exact same time: most of these packages are going to be these very small programs, whether it's, say, an update to cp or an update to ls or an update to any of these little Haskell libraries. Most things you have installed on your system are only really a couple of meg, and that's where something like this really, really shines.

You still only have one network connection, so shouldn't the downloads be considerably slower when you're downloading, let's say, 16 things at the exact same time? And yes, each individual download very well may be slower, but the total throughput is probably going to be higher. At my house I have a 50 megabit connection, which is by no means a super fast connection in 2021. This would be basically, I think,
six-ish megabytes a second if I was using the entire connection on one singular download, but most of the time when I'm downloading something I'm only getting, at max, maybe about half of that. Am I being lied to, then? Do I just not really have 50 megabit, and it's closer to like 25 or so? Well, that might be the case, but networks aren't really that simple. Unless the server is, like, in your house, most of the time when you connect to a server you're not making a direct connection to it. You're going to jump from router to router to router, all around the world, and then eventually you'll get to the server. Then for that data to come back to you it needs to take some other path, maybe the same one, maybe a different one, it doesn't really matter, and then the data will eventually get back to you. Every single extra hop along that path is going to slow down the connection, because each of those routers needs to do a little bit of processing on that packet to make sure it's being sent to the correct next location so it ultimately gets to its destination. And if one of those routers happens to be, say, a router from 15 years ago and is actually really slow, well, the network connection is only going to be as fast as the slowest thing on the route.

So if you're not going to make use of the entire connection anyway, is there any harm in splitting that connection up into multiple small connections and sending them all out at the exact same time? The answer to that is absolutely not.

And even if there isn't any sort of jam, there is another problem that needs to be considered as well, and that is the server you're actually sending that data to. Unless you've set up the Arch mirror yourself, you're probably not downloading from a personal mirror, in which case that mirror is probably there to serve thousands of other people in the community, and it wouldn't really be fair for that server to give the first person who connected, say, the entire server's bandwidth to themselves. What
makes more sense is to split up that connection between each of the users who are trying to connect to it, so you're probably not going to be getting your entire network connection straight to that server anyway, even if it is physically possible. Just to make it easy, let's say the mirror has a one megabit cap for each of its connections. If you're only downloading one thing at a time, no matter how many extra things you need to download, you're still only going to be getting a total throughput of one megabit. But let's say instead you send 16 connections out at the exact same time: well, now your total throughput is going to be 16 times what it was. I know that's a very simplified example of how that actually works, but I hope it does get the point across.

Another thing you need to address is that you probably don't have your entire network connection going to the single thing you're trying to download. You're usually going to have some other network connections running on your network. Let's say you're trying to watch YouTube at the same time, or someone in your house is gaming; if everything was allocated to this one connection, it wouldn't really be fair to the other users on the network. While each of the individual connections probably will be slower, this should hopefully lead to a much higher throughput.

Now, there is a big caveat with everything I've said about network speeds, and that is: as long as your network connection isn't absolute garbage. If you have, I don't know, a dial-up connection, it's probably going to make things slower by doing this. But most people out there have at least 5 or 10 megabit, and in that case you are going to see some sort of benefit. The amount of benefit you see is very much going to depend on how fast your connection actually is.

Last time I talked about the parallel downloads feature, someone said, wouldn't mirrors just throttle people who are making parallel connections to their mirror? While this could very well happen, I kind of doubt it
because this is an official feature built into pacman. But let's just assume that it did happen: that's why you would include multiple mirrors inside your mirrorlist, because if one of them is being slow then it could just jump over to the next one.

That's it for me, and before I go I would like to thank my supporters. So a special thank you to joachim donald logan michael andrew mitchell nathan david caldwell brennan chica bender jamie joseph josh mike repeatedly steven t has the return to shah and all my two dollar supporters. If you'd like to go and support me, the links are down below to my Patreon, SubscribeStar, Liberapay, that sort of stuff. I've got my podcast, Tech Over Tea, available basically anywhere. I've got a gaming channel called Brodie Robertson Plays, where I'm live usually twice a week and upload usually about five YouTube Shorts; sometimes it changes, but that's the general gist. I'm also over on Odysee. That's it for me, and I'm out. [Applause] [Music]
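
The config change described in the captions (uncomment the ParallelDownloads line on a fresh install, or add it on an older one) could be sketched in Python roughly like this. It is only a minimal sketch and nothing like it appears in the video: `enable_parallel_downloads` is a hypothetical helper that works on the text of a pacman.conf rather than on the real file, and it assumes the setting belongs in the `[options]` section, which is where the stock config keeps its commented-out copy.

```python
import re

# Matches an active or commented-out ParallelDownloads line.
_SETTING = re.compile(r"^\s*#?\s*ParallelDownloads\s*=")


def enable_parallel_downloads(conf_text: str, count: int = 5) -> str:
    """Return conf_text with ParallelDownloads set to count.

    Hypothetical helper for illustration; it never touches /etc/pacman.conf.
    """
    if count < 1:
        raise ValueError("count must be a positive integer")

    new_line = f"ParallelDownloads = {count}"
    out = []
    done = False

    # Fresh install case: rewrite the existing (possibly commented) line.
    for line in conf_text.splitlines():
        if _SETTING.match(line):
            if not done:
                out.append(new_line)
                done = True
            continue  # drop any duplicate copies of the setting
        out.append(line)

    # Old install case: add the line right under the [options] header
    # (assumption: the setting is only read from that section).
    if not done:
        for i, line in enumerate(out):
            if line.strip() == "[options]":
                out.insert(i + 1, new_line)
                done = True
                break

    if not done:
        raise ValueError("no [options] section found")

    return "\n".join(out) + "\n"


# Made-up sample config text, not a real system file.
sample = (
    "[options]\n"
    "#Color\n"
    "#ParallelDownloads = 5\n"
    "\n"
    "[core]\n"
    "Include = /etc/pacman.d/mirrorlist\n"
)
print(enable_parallel_downloads(sample, 12))
```

Working on a string keeps the sketch safe to run anywhere. Applying the result to the real /etc/pacman.conf would need root, and in practice most people would just edit the one line by hand as shown in the video.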
Info
Channel: Brodie Robertson
Views: 2,652
Keywords: brodie robertson, pacman, arch linux pacman, pacman package manager, pacman arch linux, pacman parallel downloads, parallel downloads arch, arch linux parallel downloads, parallel downloads arch linux, arch linux package manager, package manager linux, pacman linux, linux package manager, pacman package manager arch linux, pacman 6 arch linux, pacman 6 arch, arch linux pacman 6, arch pacman 6
Id: p8e3uEAsQO8
Length: 11min 9sec (669 seconds)
Published: Sat Jul 17 2021