Linux Server Course - System Configuration and Operation

Captions
Linux is a popular operating system for server administration because it's secure, stable, and flexible. In this course, Shawn Powers from CBT Nuggets will detail every part of configuring, monitoring, and supporting a server that runs the Linux operating system. This course will teach you everything you need to know to configure Linux servers, including the boot process, kernel modules, network connection parameters, localization, groups, and more. Let's get started.

BIOS and UEFI are two tools that do basically the same sort of thing, but it can be confusing to know which one you need to use or support on a given computer. The nice thing is you usually have either BIOS or UEFI. BIOS is the older program; it stands for Basic Input/Output System. UEFI is the new kid on the block and stands for Unified Extensible Firmware Interface. That sounds complicated, but it's really just the way we interact between the hardware and the operating system.

Think of two very different vehicles: a sports car and an awesome yellow Volkswagen Beetle (I actually have a yellow Beetle, and it's awesome). Even though they're ridiculously different vehicles, they share common interfaces: brake pedals, steering wheels, windshields. The underlying systems differ, though. The sports car probably has really nice disc brakes, while the old Volkswagen has drum brakes. Drum brakes aren't as good, but the interface is the same: you push the brake pedal and you stop. That's what BIOS and UEFI are, interfaces between the hardware and the operating system itself.

They do work a little differently. With BIOS, the old method of booting a computer, the very first sector on the hard drive is the boot sector, where the MBR (master boot record) lives. That tells the computer where the partitions are and where to boot from. There are a lot of limitations with BIOS: with the BIOS and MBR combination you can only have four primary partitions (there are hacks, like extended partitions inside a partition, but that's a whole other Nugget), and there's a limit on how much drive space the master boot record can reference, roughly two terabytes instead of exabytes of data.

UEFI is the replacement for the BIOS technology. Rather than a single boot sector, there's an entire partition on the system that holds the boot code for whatever operating systems might be on the computer, so instead of a boot sector pointing at the rest of the drive, there's a specialized partition where the UEFI boot code is stored. UEFI also uses a different partition scheme, so you can have tons of partitions and address much larger drives. There are other features, too, like Secure Boot, which UEFI supports and BIOS doesn't.
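One quick way to check which firmware a running Linux system booted with is a minimal sketch like the following; on modern kernels, the EFI directory only exists in sysfs when the system booted via UEFI:

    # If this directory exists, the system booted via UEFI; otherwise it's legacy BIOS.
    [ -d /sys/firmware/efi ] && echo "UEFI boot" || echo "Legacy BIOS boot"

    # On UEFI systems, efibootmgr (if installed) lists the firmware's boot entries.
    efibootmgr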
The bottom line: UEFI is the replacement for BIOS. It takes over the same job of connecting the hardware to the software of the operating system. The good news is that if you have an older computer, or even a new one that still ships with BIOS, there are hacks that let you use very large hard drives or do some of the things that don't normally work out of the box. Most computers now come with UEFI, and from an installer's point of view there's very little to do: the operating system sees it was booted using UEFI, creates a UEFI partition, and puts all the boot code in there. From the end user's or installer's standpoint there's very little difference, but under the hood a lot of new, cool stuff is going on, which is why UEFI is the way of the future.

Remember those magazines in the doctor's waiting room where you had to find the difference among a bunch of similar-looking figures? Here are nine jokers, and one of them is a little different: it has a tiny spot on its shirt that the others don't. They're very, very similar, and the same is true of GRUB and GRUB 2, which is of course the next iteration of GRUB. GRUB stands for GRand Unified Bootloader, and it's the way the computer transitions from the BIOS or UEFI boot environment into the operating system itself; it's what tells the computer where its partitions are. It's easy to confuse which one is on your system, which seems silly but can be very embarrassing when you sit down at a machine and have to ask, "Is this running GRUB or GRUB 2?" They do the same thing (they both boot the computer), but they do have some minor, nuanced differences.

GRUB is older (you probably guessed that from the missing 2 at the end of its name) and is often called GRUB Legacy. It's still on a few operating systems, not many; I think Slackware still uses GRUB Legacy. The biggest and easiest way to figure out which you're dealing with is to look inside the /boot/grub folder. If you see menu.lst or grub.conf, you're running GRUB Legacy; GRUB 2 does not have those configuration files. GRUB 2 uses the configuration file grub.cfg, and since it's easy to confuse .cfg with .conf, I always just look for menu.lst: if it exists, you're on GRUB Legacy.

The thing about GRUB Legacy is that it's kind of difficult to modify. It's very easy for automated systems, like a new kernel install, to update the menu and the boot code, but it's really difficult for the end user to change that stuff. Its boot menu usually appears during startup and says something like "press a key within 10 seconds" to change the way it boots, so it just shows up for you. GRUB 2 is a lot more customizable: /etc/default/grub is an easy-to-read configuration file that lets you change the way the system boots and how the menu looks when it comes up.
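As a sketch of that workflow on a Debian or Ubuntu style system (the timeout values are just illustrative, and GRUB_TIMEOUT_STYLE needs a reasonably recent GRUB 2):

    ls /boot/grub                 # menu.lst or grub.conf -> GRUB Legacy; grub.cfg -> GRUB 2

    # /etc/default/grub (excerpt) -- show the menu for a few seconds instead of hiding it
    GRUB_TIMEOUT_STYLE=menu
    GRUB_TIMEOUT=5

    sudo update-grub              # regenerates /boot/grub/grub.cfg on Debian/Ubuntu
    sudo grub-mkconfig -o /boot/grub/grub.cfg   # equivalent on distributions without update-grub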
GRUB 2 also has a ton of other cool features. It can boot from an ISO file or a USB disk, and it can identify hard drives by their UUID or by their device name, like /dev/sda, so it's a lot more advanced than GRUB Legacy. One advancement that's also a frustration is that the boot menu I mentioned, the one that comes up with GRUB Legacy, is hidden with GRUB 2: the system goes straight to the login screen and you never see anything from GRUB 2 at all. If you don't know the little trick, it can be a real bugger to get into that menu when you want to change something during boot. Just hold down the Shift key while the computer is booting and, boom, all of a sudden you're in the GRUB 2 interactive menu, where you can change boot options on the fly. It's really cool.

I'm here on an Ubuntu system, which has GRUB 2. If we look at /etc/default/grub, we see it's just a configuration file we can change. Once you make a change, you do have to run sudo update-grub, which updates the boot code inside the /boot folder. If we go into /boot/grub and type ls, we see grub.cfg and no menu.lst, so we know this is GRUB 2. I'll show you one more thing before we end: this is a computer that's turned off. If I start it and hold down the Shift key, as long as I hold Shift, boom, we get the menu; but if I don't hold it, we never see any menu for us to interact with GRUB at all. Figuring out whether you have GRUB or GRUB 2 can be a little challenging if you don't know the simple trick of looking for menu.lst. They both do the same thing, telling the computer how to boot and how to mount its partitions; GRUB 2 is just definitely an advancement: it's more configurable, it does more, and it's easier for the end user to manage. I hope this has been informative for you, and I'd like to thank you for viewing.

Linux is so flexible it can boot from an incredible number of methods or sources. There are several things we need to understand about the boot process, like whether a given step is hardware based or software based (I'll explain what I mean in a minute), but there is a multitude of ways a Linux system can boot. It can boot over PXE, or "pixie" boot; if you haven't seen that, it's just the coolest thing ever: the machine boots up completely over the network. We can boot from USB or from CD. iPXE is a more advanced version of PXE, and I'll talk about that too. And then there are ISO images: you can make a GRUB 2 entry that boots directly from a .iso file, even though you haven't burned it to a disc or a USB drive; it can just live on your system, with a GRUB menu item that boots straight into that ISO image. It's really flexible.

But that hardware versus software distinction matters, because where each part of the boot process takes place is really important; some of it is Linux specific and some of it is not. First, the hardware side. When I say hardware, I mean the BIOS or UEFI, the part that runs before Linux is ever introduced, and PXE starts out as one of those hardware-side methods.
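For the ISO trick, here is a minimal sketch of a GRUB 2 loopback entry; the ISO path, the partition, and the kernel/initrd paths inside the ISO are placeholders that vary by distribution and release:

    # /etc/grub.d/40_custom -- illustrative loopback entry (paths are placeholders)
    menuentry "Boot installer ISO from disk" {
        set isofile="/isos/ubuntu-desktop-amd64.iso"
        loopback loop (hd0,1)$isofile
        linux (loop)/casper/vmlinuz boot=casper iso-scan/filename=$isofile quiet
        initrd (loop)/casper/initrd
    }

    sudo update-grub    # regenerate grub.cfg so the new entry shows up in the menu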
PXE stands for Pre-boot eXecution Environment. It's the way the hardware says, "I don't see a hard drive (or I'm not set up to use one), so I'm going to query the network." It queries a DHCP server, and the DHCP server responds with not only an IP address, which is what DHCP servers normally do, but also a boot file name and a TFTP location. A TFTP server is just a place you can store files on a network. The computer downloads that boot image from the TFTP server, and that's where Linux comes into play: that boot file is the Linux kernel. It downloads it over your local network, puts it in memory, and boots itself from there. So PXE starts as a hardware thing and then turns into software. There's also iPXE, which is very similar, but instead of using TFTP to download the boot file it can use HTTP, which is faster and usually more reliable than old-fashioned TFTP. It's the same concept, if your computer supports it.

Then there's USB: the hardware determines exactly how to boot from the USB device, but the Linux code itself is on the USB drive. The same goes for CD and, to be quite honest, the hard drive too; the computer itself understands how to boot from those. The software side of things is what happens once Linux (or GRUB) has loaded. The ISO booting with GRUB 2 is one example: once we're in GRUB, we either mount a partition or point at an ISO file and use it as our operating system just as if it were burned to a CD. This is also where you select a kernel or run a memtest. A lot of this is done in software, but everything has to start in the hardware, or you'll never get to the point where the software takes control.

It's important to know that PXE and iPXE are not Linux-specific boot methods; they're boot methods the computer supports, and Linux supports them in that they can provide the boot file. So don't confuse PXE network booting with something specific to Linux. The same is true of USB and CD: we can boot Windows from those too, because the hardware supports it. If I'm honest, probably the most fun way to boot a computer is PXE, only because there's no media at all: it just boots directly off the network, and for some reason that's really awesome to see. There are multiple ways to boot Linux, and it's important to know that they're all there.
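The server side of PXE is just DHCP plus TFTP. As a minimal sketch (not from the course), dnsmasq can provide both; the interface, address range, and boot file name here are placeholders:

    # /etc/dnsmasq.conf (excerpt) -- illustrative PXE setup
    interface=eth0
    dhcp-range=10.10.10.100,10.10.10.200,12h     # hand out addresses on the lab network
    enable-tftp                                  # serve files over TFTP as well
    tftp-root=/srv/tftp                          # pxelinux.0 and the kernel/initrd live here
    dhcp-boot=pxelinux.0                         # boot file name sent to PXE clients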
It's okay if you don't know exactly how to boot from an ISO file using GRUB 2; almost anybody is going to have to google the specifics to make it work. But knowing that all these different boot processes exist is vital, because it will help you learn and help you troubleshoot when you run into a booting issue.

When it comes to actually booting the Linux kernel, there are a lot of problems to solve, and it's sometimes a chicken-and-egg scenario: you have to do one thing, but you can't do it until another thing is done, so what do you do first, and how? It can be really confusing, but basically we want to get the full kernel, with all of its modules, running. The boot process is very complicated, and it might seem over-complicated, but the goal is to get the kernel running and then let it access the modules that are stored on the hard drive. If I'm honest, I've been a system administrator for over 20 years, I've passed lots of certification tests, and I have never fully understood every step of the kernel boot process, so you're going to be the king of the next nerdy party you go to, because you'll know all the Trivial Pursuit answers about the Linux kernel.

Let's look at what the boot process actually is, and then I'll show you on a system where those files live. I have a little flow chart here, and a lot of these steps we're probably already familiar with. First, the computer has either BIOS or UEFI; that's the hardware. It looks for something that GRUB or GRUB 2 (one or the other) provides, and that boot code points us to the kernel. This is where it starts to get, not confusing so much as complicated; it's almost like a fine dance. Normally we have the actual Linux kernel itself, which is one file, called vmlinux, or, as you'll more likely see it today, vmlinuz with a z. They're the same file; the only difference is that the z means it's compressed, which is just a space-saving measure, so we usually use vmlinuz to save some room on the system. That is the base kernel with no modules. Once the kernel boots up, it mounts the filesystems and then has access to all the modules it needs to insert to make things work: your USB mouse, your keyboard, your monitor, your video card. All of those are modules loaded into the kernel; they're not part of the static kernel. We could build a huge kernel that includes everything, but that's just a waste of resources, so we have a stripped-down kernel with just enough in it to mount the filesystem, get access to all of its modules, and become the full running kernel on our system.

There's a problem, though: what if all of those modules live on a filesystem this stripped-down kernel doesn't know how to mount? It might say, "Well, I don't know how to get onto a RAID device," or "I have no idea how to mount this fancy new SSD you put in the PCIe slot." That's where the initrd, the initial RAM disk, comes into play.
The initrd is just enough information, module and driver information, for the Linux kernel to access the filesystem so it can get to its modules. Rather than building a custom, bigger kernel for every specific system, we have a generic stripped-down kernel, and the initrd contains the stuff we need to load modules by mounting the hard drive, or, say, the NFS pieces needed to mount a remote disk if the modules live on a network drive. It's like a temporary staging area for kernel stuff, and it's inserted right alongside the running stripped-down kernel.

I want to mention this because a lot of people confuse the initrd (initial RAM disk) with the initramfs (initial RAM filesystem). They are fairly similar conceptually, except that the initramfs is actually part of the kernel itself, part of that vmlinuz or vmlinux file; it's a filesystem, just like the name says, that the kernel creates and mounts in RAM. I'll also mention, even though we won't go into it much, that dracut is a tool that has made all of this extremely generic, using udev and similar mechanisms, so it's very flexible. Either way, this is the tiny little filesystem the kernel uses to do what it needs to do to get to the point where it can load the full kernel by loading modules. The initramfs is a filesystem loaded into RAM that's part of the kernel itself; the initrd is a RAM disk mounted alongside the kernel that lets it get to the full kernel, and it is not used after the system is booted. It's just a temporary staging ground.

That was a lot of information, but it's really fairly straightforward once you see why it's doing all of those complicated things. Now I want to quickly show you some of the files we talked about. Here I am inside Ubuntu, in the /boot folder. We can see the initrd, sure enough; I have a couple of different kernels in here because there was an update, and the kernel version numbers are listed. Here's the initrd image, and here is the actual kernel file itself. There are a couple of other files as well: System.map tells the kernel where on the filesystem all of its modules live, and config is the configuration file used when the kernel was compiled, in case you want to see the options that were chosen. One other thing: the modules themselves live in /lib/modules/ under the name of the actual running kernel, and inside there you'll see the different modules that get loaded to build up that full kernel. Those are the basics of starting a Linux kernel on your system. I know we covered a lot of terms and concepts, but they should make sense once you go through that flow chart of what's happening and why. In the end, which comes first, the chicken or the egg? Well, it depends on what you need to do: if you need an egg, the egg comes first, and if the chicken can handle it on its own, no eggs are needed.
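To poke at those pieces yourself, a few commands (a sketch; lsinitramfs is the Debian/Ubuntu tool, while dracut-based distributions use lsinitrd instead):

    ls -l /boot                            # vmlinuz-*, initrd.img-*, System.map-*, config-*
    uname -r                               # version of the currently running kernel
    ls /lib/modules/$(uname -r)/kernel     # the module tree that matches that kernel
    lsinitramfs /boot/initrd.img-$(uname -r) | less   # peek inside the initrd (Debian/Ubuntu)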
A kernel panic certainly seems like a really good time to panic, but I assure you, just because the colonel panics doesn't mean you should, soldier. A kernel panic really just means something went wrong with the kernel. There are a couple of common causes, and a couple of things we can do depending on the root cause. If you suddenly start getting kernel panics, you haven't done any system updates recently, and it's hit or miss (it happens during one task but not another), chances are you have a piece of faulty hardware; that's one of the most common ways to experience a kernel panic. An overclocked CPU can sometimes cause panics, a bad stick of RAM will really do it, and add-on cards like video cards are sometimes guilty too. Really, any hardware on your system that fails can cause a kernel panic. There are ways to narrow it down: if you have multiple sticks of RAM, pull one out and see if you still get panics; if you do, put it back and keep pulling them out one at a time. If the panics go away when you remove one particular stick, chances are that stick is bad. It can be frustrating to track down hardware-related kernel panics.

There's another very common kind of kernel panic, and that's the one that happens when you upgrade your system. Say you just ran a system update that installed a new kernel, you reboot the computer, and boom, you get something like "failure to init" or a kernel panic message. If that happens, there's something fairly easy you can do to rescue your system, so to speak: when it boots up, go into GRUB and pick an older kernel. When you update a system, it keeps the older kernel for exactly this reason; if something goes wrong, you can pick a different kernel, boot the system, and then fix it.

I'll show you what I mean. Here we are on our Ubuntu system; I just recently updated it, and during the update it said it needs a restart. So I restart the system, everything should go well, and then boom. This one doesn't actually print the words "kernel panic," because I did something fairly simple to simulate what would happen if the kernel were corrupted, but it says it's unable to find the initrd; sometimes you might see "kernel panic" or a message about being unable to kill init. It says "press any key to continue," but nothing is going to happen; it's locked up because it errored out with no initrd. So we restart the system, and as we do, we bring up the GRUB menu by holding down the Shift key. Sure enough, here's our GRUB menu; under Advanced options we see multiple kernels. The newest one is the one it tried to boot by default, so I'm just going to go back to the last kernel that booted successfully, press Enter, and the system comes up just fine. Then all we need to do is remove that broken kernel package with apt and reinstall it, so the initrd gets put back in place, and we should be fine.
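As a sketch of that cleanup on a Debian/Ubuntu system (the kernel version string here is purely illustrative):

    dpkg -l 'linux-image-*' | grep ^ii                           # which kernel packages are installed
    sudo apt install --reinstall linux-image-5.4.0-42-generic    # reinstall the broken kernel package
    sudo update-initramfs -u -k 5.4.0-42-generic                 # regenerate the missing initrd
    sudo update-grub                                             # refresh the boot menu entries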
It's nice that Linux keeps a few of the older kernels around so we can do exactly this: if something goes wrong with one kernel, or we've deleted modules, or something just doesn't load, or we get a kernel panic, we can select a previously successful kernel, and here we are with the system working perfectly while we go through, remove the broken package, and fix it. If you do get a kernel panic, especially on a production system, it can be really scary, but know that there are ways to fix it. If it's just kernel corruption, you can boot to a previous kernel. If it's a hardware issue, it's a little harder to troubleshoot, but you can also boot from a USB drive or a CD and at the very least access the hard drives, so you can get your data off the computer and onto a new one if a hardware failure is causing the problem. Kernel panics are no reason to panic; it's just one more thing we need to understand so we can fix it, troubleshoot it, and move on.

We know the Linux kernel is modular, in that it doesn't load every driver it could possibly ever need by default. It has modules that it dynamically loads when it discovers it needs, say, the driver for a certain sound card at boot. We can also configure it to load specific modules automatically by setting them up in a configuration file, which we're going to look at. When it loads a module, behind the scenes there's some incredible dependency checking going on, and it's really, really good at knowing which other modules a particular module depends on. There isn't actually a robot called "modbot" in your system or mine; I just like to picture a cheesy robot plugging modules into my computer. But the computer is really good at finding dependencies; in fact, sometimes it's too good, and we need to blacklist certain modules so modbot doesn't say, "oh, this one will work," and plug it in. I'm going to show you why we would do that and how. So we're looking at two concepts today: how to automatically load a module on boot, and how to make sure the wrong module isn't loaded automatically because the system detected a dependency you don't want it to use.

This is an Ubuntu system, but all Linux distributions have kernels. In the /etc folder there's a file called modules. Right now it's empty apart from the comments: "kernel modules to load at boot time; this file contains the names of kernel modules that should be loaded at boot time, one per line; lines beginning with # are ignored." What we would do is put the name of a module in here, something like e1000, which is the name of an Intel network card module. Normally the system automatically detects the hardware and knows what to load, but I want to give you an example of how you can manually load a kernel module and have it done automatically on boot, even if the system doesn't detect new hardware on its own. We're not actually going to save this, because I don't want to load that module here.
But let's say that module does get loaded. If it has dependencies, the system will also load those: if the Intel e1000 card also requires the PCI bus to work, it loads the PCI bus module as well. It's really smart. But suppose there were two different PCI modules, say a version 1 and a version 37, and by default it loads version 1 and we don't want that. That's when we blacklist it. Go into /etc/modprobe.d; there are a bunch of files in here, and anything ending in .conf is read in, so it doesn't really matter which one you use; there are just some conventions. Anything we add manually we're going to put into blacklist.conf. If we edit that file, we see a bunch of things already blacklisted, which just means the kernel won't load them automatically. Look at this Ethernet one, eth1394: that's the network driver for a FireWire port. This entry is saying we never want to use a FireWire network device, so even if a FireWire device is detected, do not load eth1394. With blacklist in place, the kernel won't load it even if we load the FireWire module itself to activate the port; it keeps the kernel, or modbot, or however you like to visualize it, from saying, "oh, you're plugging that in, why don't we also activate the Ethernet ability of that port?" We don't want that, so we blacklist it. I've done this the most with sound cards, because there are something like 150 different drivers for the AC'97 family of sound cards, so a lot of times you have to blacklist a bunch of different models; there's a blacklisted sound card driver right here, saying don't load this module, and it looks like it's because of an Ubuntu bug. So this is where we tell the kernel what not to load, even if it's compiled and available as a module: in any of the files in this folder, any line that reads "blacklist" followed by a module name gets blacklisted. Honestly, like I mentioned, the Linux kernel is usually really good at detecting hardware and loading all the right modules on its own, but when you want to manually load something at boot, that goes in the /etc/modules file, and blacklisting is really important when something doesn't work with a particular kernel module and you want the kernel to pick a different one and try to activate that instead.
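Putting those two files side by side, a minimal sketch (the module names are just examples):

    # /etc/modules -- one module name per line, loaded at boot even if nothing auto-detects it
    e1000

    # /etc/modprobe.d/blacklist.conf -- keep the kernel from auto-loading these
    blacklist eth1394        # never bring up Ethernet over FireWire
    blacklist snd_intel8x0m  # example of an unwanted sound driver; the name is illustrative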
One of the things that makes the Linux kernel very special is that it's modular: we don't have to install all the drivers for all the potential hardware in a system, just the drivers and modules for the things that actually exist. That makes the kernel very efficient, because we don't have all that bloat, all those unneeded things, sitting in memory where they're never going to be used. But to take full advantage of that, we have to make sure we put the right modules in the right place with all their dependencies, so we don't end up with a kernel that looks like a puzzle with pieces that don't belong.

Thankfully there's a handful of tools we can use to properly manipulate kernel modules, and the first two I want to look at are insmod and modprobe, because on the surface they appear to do the same thing: insert modules into the running kernel after the system is booted. insmod is a very basic program. You have to give it the full path of the kernel module you want to load, it doesn't do any dependency checking, and it just kind of slams the module into the running kernel. If that doesn't work, because it's the wrong kernel version or the module's dependencies aren't loaded, it simply fails without giving you an explanation: "well, I crammed it in there and it didn't work, boom." modprobe, on the other hand, is a more advanced program. You can give it just the name of the module, no full path required, and it will determine all the dependencies that module needs; if one module depends on another, it says, "let's load that other one first." It does a really good job, but it does need a map of all the modules and dependencies on the system, and there's a program for that too: we just make sure we run depmod after installing a new kernel module, and it recreates that map so modprobe knows where to find the dependencies.

Now you might be thinking, "okay, obviously I'm going to use modprobe when I insert modules," and exactly, that's what you should do. So why does insmod even exist? Here's the deal: modprobe figures out which modules to install and then, behind the scenes, uses insmod to do the actual inserting. modprobe is really an intelligent front end; insmod is still the basic tool that crams modules into the running kernel, and modprobe just knows which modules to load and in what order. modprobe is what the humans use, and insmod is what modprobe uses, if that makes sense.

It's really easy to see this in action. I'm root on Ubuntu; let's go into /lib/modules, where we see a folder for each kernel installed on the system. We go into the one we're currently running, the latest one, and inside we see a lot of the map files modprobe uses to know where the kernel modules live. Go into the kernel folder where the modules live, type ls, then into the drivers folder, ls again (still a lot of drivers), then into the net folder, and there we actually see some kernel modules. We're going to play with thunderbolt-net; let's say we want to load it so we can use Thunderbolt as a network device. With insmod we'd have to type insmod followed by the entire path to the module file under /lib/modules. Press Enter and it says it can't do that: "unknown symbol in module." It has a dependency, and it doesn't tell us what that dependency is. But with modprobe we just say modprobe thunderbolt-net, and that's all; we don't have to give a path or put .ko on the end.
Just press Enter and, boom, it's automatically installed. How do we know? For one thing, we didn't get an error. But if we type lsmod, it shows all the loaded modules, and if we scroll a bit toward the top we see that thunderbolt-net is installed into the running kernel, and its dependency, thunderbolt, was loaded too; modprobe knew to do that. There's another tool for removing modules, and that's rmmod. There isn't a super-smart version and a dumb version of this one; rmmod is kind of halfway between, because if we say rmmod thunderbolt, it won't do it, it says the module is in use, but at least it tells us what it's in use by: thunderbolt_net (which is a little strange, since the name we loaded used a dash rather than an underscore). To get rid of it we first rmmod thunderbolt-net, which works with no errors, and then, pressing the up arrow twice, we can rmmod the thunderbolt module, again with no problem, because we removed the dependent module first. Now lsmod shows, if we scroll up, that neither of them is in the running kernel anymore. That's how you can manipulate kernel modules on your system; some of the tools are smarter than others, but it's really important to understand why they're smarter, and insmod is still important even though you don't want to use it on the command line for very much at all.

One more thing I want to tell you about: if you install a new kernel module, say you download a piece of hardware that has its own kernel module, compile it, and drop the resulting file into the folder for your currently running kernel, you have to type depmod so that the dependency database, that system map, gets updated and modprobe knows where the new module lives and what dependencies it might have. It takes a while, and once it's done you can load that new kernel module, because the system knows all of its dependencies. In my picture here I've jammed the wrong puzzle pieces into the puzzle, but that's actually a lot harder to accomplish with kernel modules than you would think, because the tools just fail: they say, "oh, this won't fit," or "this is the wrong kernel module." Some of them are smarter about it than others, but it's really difficult to put the wrong kernel module into a running kernel. What I want to make sure you understand is that all of these tools exist, how they work, what they look for, and the difference between them.
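Condensed into a quick reference, that whole demo looks roughly like this (thunderbolt-net is simply the module used in the example):

    sudo modprobe thunderbolt-net      # loads thunderbolt-net plus its dependency, thunderbolt
    lsmod | grep thunderbolt           # confirm both modules are in the running kernel
    sudo rmmod thunderbolt_net         # remove the dependent module first...
    sudo rmmod thunderbolt             # ...then the module it depended on
    sudo depmod                        # rebuild the dependency map after adding a new .ko file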
Now we're talking about network connectivity, and the first thing I want to show you is: always check the cable. It seems silly; of course the cable is fine, it's plugged in just fine... ah, see, it wasn't plugged in tight. Whether it's a jokester in the office or somebody who ran over a cable with the vacuum cleaner, sometimes it is the cable, and you can spend a ton of time on the command line and be very frustrated to discover it was just a bad cable. So that's a pro tip: check the cable, make sure it's plugged in tightly, and make sure the little lights on the back are flashing and doing their thing.

We could probably do an entire course on troubleshooting a network connection that isn't working right, but today I want to go over just a couple of quick tools so you can determine very quickly what your issue might be: ping, plus checking the address on our computer. Here's the scenario: we're on the computer trying to get to Google, and it says "unable to connect." The first thing I would do is open a command line window and try to ping Google: ping google.com, and we get "network is unreachable." I know the Google DNS server has the IP address 8.8.8.8, so the next thing I'd do is eliminate DNS; maybe it's DNS that's not working. So: ping 8.8.8.8, and the network is still unreachable, which means the issue is bigger than DNS; there's a problem with my network.

Let's look at what our network address looks like. There are a couple of tools to try: ifconfig, which isn't on this newer system, so instead type ip addr (for address), and it lists the IP addresses of the network devices on your local computer. The lo interface, 127.0.0.1, is the localhost, just a virtual interface that says "this is local," so we skip that one. The Ethernet port, eth0, has the address 10.10.10.10, so let's ping that to make sure the IP stack itself is working. It responds, so the stack is fine; I'll hit Ctrl-C. But for some reason I'm unable to ping out of the network. My gateway address is 10.10.10.1, and ping 10.10.10.1 works, so I can ping other computers on my network; that also tells me it's not my cable, since I can reach a different machine across the network. But I'm still unable to ping Google, so for some reason my internet connection is down.

Let's look at the routing information: type ip route and press Enter. It shows that 10.10.10.0/24, our local network, is directly accessible via eth0; that's why we could ping the gateway. But there's a line here that's missing, and you might not notice it: we don't have a default route set. That means that while the computer can reach other machines on our local network, it doesn't know where to send packets destined for any other network; there's no routing information for that. If this were an IP routing Nugget I'd say, "okay, let's add a route manually," but this is a troubleshooting Nugget, and here's the deal: probably when we got our DHCP address we didn't get all the information, or something fell apart. The first thing to try is the standard IT-guy response: turn it off and turn it back on. In this case we can just go to the wired network interface, turn the interface itself off, and turn it back on. Now let's look at the ip route output again: when we brought the interface down and back up, a default route was added. Now ping google.com works, and if we go over to the browser we can reach Google as well, and sure enough, there's Google. We've successfully troubleshot the problem with our network connection; in our case it was a routing issue, and simply bringing the network interface down and back up was the way to solve it.
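Here is a compact recap of that flow, with the manual fix you'd reach for on a server where you can't just toggle the interface in a GUI (the addresses come from this example network):

    ping -c 3 google.com                    # fails: is it DNS or the network?
    ping -c 3 8.8.8.8                       # also fails, so it isn't just DNS
    ip addr                                 # do we even have an address on eth0?
    ip route                                # is the "default via ..." line missing?
    sudo ip route add default via 10.10.10.1 dev eth0   # add the missing default route by hand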
The important part is going through the process of determining where the problem is. We knew it wasn't the network cable, because we were able to ping other computers on our network, so it's just a matter of troubleshooting, going down the list: where can I connect, and what might cause the failures I'm seeing? Figuring out connectivity problems is sometimes more of an art form than a science. You have to put yourself in the mindset of a packet: what information do I need to get from point A to point B, and what is stopping me in between? Tools like ping let you confirm whether you can reach remote computers, ip addr shows you what your address is, and ip route shows the routing information, which is where we discovered we didn't have a default route. Tools like these are what you use to troubleshoot your connectivity and figure out what's going on, but don't forget to always check that cable, because sometimes it's just a physical cable that's unplugged.

DNS, the Domain Name System, is the way your computer converts a URL or domain name like google.com into the numbers, the IP address, that the computer actually connects to using IP routing. google.com isn't really helpful for anything except DNS; what the computer needs is the IP address so it can get there. There are a couple of tools we can use to test DNS on our system. One of the most common is ping: you just ping a name and see if it can reach it; if it gets a response, hey, it's working, great. But there are three other common tools I want to cover: dig, nslookup, and host. Let's look at them individually, because they all do about the same thing, just in slightly different ways. dig is the program I use most often, only because it has a cool name; there's really no other reason. You type dig and then the host you want to look up, like dig google.com, and it uses your default DNS server for the lookup. If you want to specify a different DNS server, which is a way to test a particular server, you put the @ symbol and then the IP address of the server you want to query. The other two programs can do the same thing, they just do it in a different order: with nslookup you give the host you want to look up and then, optionally, the server to ask; the host command works the same way, the hostname you want to look up first and the optional server after it, and if you leave the server off, it uses your default. So let's look at them quickly, and then I'll talk about querying different servers and doing a little bit of troubleshooting, because testing DNS is kind of vital when things on your system don't seem to be working. First, the defaults: dig google.com. This is the response I get: it queried my default server, which is listed right there (8.8.8.8), and it returned a bunch of A records for google.com.
Now let me clear the screen; we can do the same thing with nslookup google.com. This gives us a slightly different format: our server is listed right there (the same default server, of course), and then it gives us all the addresses, the A records, with a little less detail, but the same information. If somebody tells you that nslookup is deprecated, isn't used anymore, and is going to be abandoned, well, that was true until it wasn't: the plan was to get rid of nslookup and replace it with dig, but with BIND 9.3 they decided for some reason that nslookup was going to stick around, so we have both nslookup and dig. Lastly, let's clear the screen again and just say host google.com. This gives us information as well; it doesn't actually say which default server it's using, but it lists all the A records, all the IP addresses, and this one also shows the mail handlers, the MX records, for google.com's domain, which is kind of interesting. Just pick the one you like best, and that's what you can use; the reason it's important to pick one and use it often is so that you're comfortable with it.

Like I said, I usually use dig, which is a little funny because it's the least common way of specifying a server: with the other two you put the server after the name you're looking up, while with dig you say dig @server. For example, dig @127.0.0.1 google.com queries the DNS server on localhost, and we get a response for google.com, but the address it returns is different, and it came from our local server. It did not return the same answers we got from 8.8.8.8, our default server; if we say dig @8.8.8.8 google.com, we get a totally different set of IP addresses. That's why being able to query different servers is really important: on this computer, if we try to ping google.com, it tries that other address and we just get "time to live exceeded"; it's never going to work, so there's something wrong with the DNS setup on localhost. I'll show you what the problem is, because I've actually sabotaged it: in the /etc/hosts file, which is where your computer looks first for any name lookups, I've kind of bamboozled things by putting a wrong IP address in for google.com. If I take that entry back out and restart things, pinging google.com is just fine again, because it uses the real address it looked up on the internet. That's how you can use the different tools to specify not only which host you want to look up but which server you want to query when you look it up. Whichever one makes the most sense to you, or whichever formatted output you like best, just pick one and use it, because you can specify a server with each of the different commands, and knowing how is important: troubleshooting usually means querying more than one DNS server so you can figure out what the heck is going on.
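For reference, the three lookups side by side, each pointed at a specific server (8.8.8.8 is just the resolver used throughout the demo):

    dig google.com                  # use the system's default resolver
    dig @8.8.8.8 google.com         # dig puts the server first, after an @
    nslookup google.com 8.8.8.8     # nslookup and host take the server as the last argument
    host google.com 8.8.8.8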
One of my favorite things about Linux is that everything is configured with text files; it's just plain text files, and it's awesome. The network is no different: it's configured with text files. Now, there are a few differences between distributions. If you're on Debian or if you're on CentOS, some of the configuration files are different, because they configure their networks differently. There's nothing wrong with being unique, and I really like the differences between the distributions, but there are some files that are consistent regardless of which distribution you're using. We'll call those the common files, and that's what we're looking at in this Nugget: the files that are standard Linux, consistent across the board. I want to show you what they are, where they live, how they're configured, and what they do. There will be distro-specific files too, but here we're just looking at what's common to Linux in general, and in this case we're looking at the files on Ubuntu.

The first one I want to look at is the /etc/hosts file. Let's become root, since we have to be root to edit these files, and look at /etc/hosts (there's no extension at the end, it's just hosts). This file acts kind of like a first-resort DNS lookup: before the system even looks things up via DNS, it looks in here. There are a couple of entries already, like localhost and "ubuntu," which is our hostname. We can add something too. Say we wanted to make sure users were never able to get to Google: we could add a line saying that 127.0.0.1, our localhost, is now google.com. Anyone who tries to go to google.com will hit our local machine, and since we don't have a web server at all, it will just error out. Save that, and now on this computer google.com comes back "unable to connect," even though our network is working just fine (we could still go to Yahoo if we wanted); it's just the DNS entry we kind of broke in the /etc/hosts file.

Another place we can look is the /etc/nsswitch.conf file. This configures a bunch of lookups on the system, like the group and passwd databases, but what I want to show you specifically is the hosts line. This says where the system looks for host name lookups, and in what order. The first entry is "files," which points to /etc/hosts; because it's listed first, the system checks that file before it ever queries a DNS server. After that, in order, it does mDNS for local lookups and then uses a DNS server. You can change the order, but I don't recommend it, because checking the hosts file first is usually exactly what we want.
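Here's roughly what those two files look like on the demo machine, with the deliberately broken google.com entry included; the exact hosts line varies a little between releases:

    # /etc/hosts -- consulted before DNS, because "files" comes first in nsswitch.conf
    127.0.0.1   localhost
    127.0.1.1   ubuntu
    127.0.0.1   google.com      # the sabotage from the demo; delete this line to restore normal lookups

    # /etc/nsswitch.conf (hosts line) -- lookup order
    hosts: files mdns4_minimal [NOTFOUND=return] dns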
One more common file is /etc/resolv.conf, which is a little confusing, because while it tells the computer which name server to use, the comment at the top says "do not edit this file." Back in the day you would just put a nameserver line in here with the server you wanted to use, but now this is all handled by the systemd-resolved service, which creates this file on the fly, so we really don't edit it at all; NetworkManager, or whatever your distribution uses to configure the network, handles all of the name server entries. Still, if you want to see which server your particular computer is using, you can look in here. This machine is querying 127.0.0.53, which is another address on the local computer, meaning it's running some sort of local caching DNS server and isn't querying the internet directly. You probably noticed that all of these common files have to do with DNS, and that's fine; it's interesting that DNS is configured in a fairly common way across distributions, while configuring the individual network devices is drastically different from one distribution to the next. The common files are generally DNS files, and those are what we looked at in this Nugget.

Ubuntu is one of the most common distributions out there, but it's based on and built upon Debian, so configuring one is very similar to configuring the other, and that's the case with network files. I want to show you where to find the different network files on an Ubuntu or Debian system, but it's important to note that with Ubuntu's most recent long-term-support release, a change was made to how the network is configured, which files are used, and how those files are written. There is one commonality between them: if you use NetworkManager to manage your network instead of editing the configuration files, that hasn't changed, and you can still use it across all of the current Ubuntu and Debian versions. The two file-based methods, though, have changed. First I want to look at the older version of Ubuntu, and by "older" I mean one that's still valid to use as of today. Checking /etc/os-release, this machine is Ubuntu 16.04.1 LTS, which as of today is still a valid version of Ubuntu to use in production. Here the network is configured in the /etc/network directory, which holds a single file called interfaces. Looking at interfaces, this is how you configure a network interface on an older version of Ubuntu or Debian: there's a stanza for each interface, and there's usually something in there already that you can look at and base your changes on. We're not going to go through every option; it's pretty straightforward, and you don't have to indent, that's just for ease of readability. We could still use NetworkManager if we were on a GUI system (this machine has a GUI), but I'm actually SSH'd into a server that has no graphical interface, and that's exactly where these network files come in. That's how you configure the older releases.
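A minimal static stanza in that older style might look like this; the addresses are placeholders, and the dns-nameservers line assumes the resolvconf package is present:

    # /etc/network/interfaces -- classic Debian/Ubuntu configuration
    auto eth0
    iface eth0 inet static
        address 10.10.10.50
        netmask 255.255.255.0
        gateway 10.10.10.1
        dns-nameservers 8.8.8.8

    # Apply by bouncing the interface:
    sudo ifdown eth0 && sudo ifup eth0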
A modern release, like Ubuntu 18.04 or the most recent Debian, uses a completely different network configuration system. Here we're on a newer version of Ubuntu; checking /etc/os-release shows 18.04.1, another long-term-support release, and this one uses the netplan system for configuring network interfaces. Go into /etc/netplan and you'll find some sort of file in there; depending on the kind of system, it will be something like 01-network-manager.yaml on a desktop or 50-cloud-init.yaml on a server. The thing about YAML files is that they depend on indentation to be valid: with the old /etc/network/interfaces it didn't matter how things were indented, but here it really, really does. This is where we would configure something like a static IP address; here are our addresses, gateway, and name servers. The format is a little different, but it's fairly similar conceptually to the old system, apart from that whitespace, which has to be indented properly. If you make a change, run sudo netplan apply and it activates the new settings. I do want to point out again that this is a GUI system, so it has NetworkManager, but there's also nmtui, the NetworkManager text user interface, which lets you edit interfaces using text boxes on your screen if you don't have a GUI installed. That's one more way to configure the network on Ubuntu or Debian if you'd rather use NetworkManager than edit the files on your own. Whether you have an older system using /etc/network/interfaces or a newer one using configuration files in /etc/netplan, configuring the network on Debian or Ubuntu is not difficult; it's usually just a matter of changing what's already there, and with the GUI or text NetworkManager tools available as well, it's very flexible. Even with the change between versions, one isn't a whole lot harder than the other.
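A netplan file for a static address might look roughly like this; treat the file name, interface name, and addresses as placeholders, and remember that the indentation is significant:

    # /etc/netplan/01-static.yaml -- illustrative static configuration
    network:
      version: 2
      ethernets:
        eth0:
          addresses: [10.10.10.50/24]
          gateway4: 10.10.10.1
          nameservers:
            addresses: [8.8.8.8]

    # Activate the change:
    sudo netplan apply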
a quick sudo service network restart we would be able to restart our service and if we were to do ifconfig we're going to see scroll up here sure enough that is our ip address now now there are a couple problems we didn't add like netmask or anything so let's go up into network manager wired settings click on the configuration thing so we're going to edit this in network manager if we go over to ipv4 we can see it went from it used to be dhcp now it's manual this is the ip address i put in and it just guessed this netmask because the 10 range is a class a ip address so it guessed that i wanted to use a class a netmask but i don't i want to use 255.255.255.0 and i didn't supply a gateway so let's do that now 10.10.10.1 and i didn't supply any dns server so we can do that here 8.8.8.8 and now we'll save this we'll click apply if we go back and edit that script sudo vi we're going to see that some changes have been made now this stuff is still the same boot proto equals static ip address but look down here it added the gateway address that we added it added the dns server that we added and it added the prefix 255.255.255.0 translates to a 24-bit netmask so put that in there just use a different form and we can see that it made the changes to the actual configuration files rather than having like two different systems and you have to decide which one you want to use centos allows us to make changes in one place and whether we're using the gui interface or just this text-based interface it's going to allow us to make those changes i love that it has all the change places in a single file as opposed to conflicting and trying to decide which method you're going to use if you use network manager you you're using this configuration file if you don't use network manager you're using this configuration file so it's a really elegant way to handle network interface configuration like i said at the beginning i'm a debian ubuntu man that's what i usually use in my server situations but when it comes to configuring network interfaces man centos has really stolen the show they have an elegant way of doing it either using network manager gui tools or using the scripts inside the network scripts folder of etc sysconfig network bonding in linux is really just a way to utilize computers that have more than one network port and most servers nowadays come with that let's say you have a server and it has three network ports built into it well if you have to connect it to a switch all each of these connected to a switch you would ideally like to use all of that bandwidth and you don't want to have to supply three different ip addresses to the computer you just want to use all of the available bandwidth on all three of those wires and that's where network bonding comes into place now there's basically two different kinds of network bonds that we're going to look at those that require special switch support and those that are generic and don't require the switch to know at all what's going on now linux does provide some pretty cool options when it comes to there but the options can be a little overwhelming i mean this screen is like oh my goodness there's so many choices well that's okay because your choices are going to be fairly limited once you learn what all of these are now the first thing i want you to look at when we're looking at this list is does it require switch support and what i mean by that is a lot of switches especially layer three switches or smart switches they're sometimes called will support link 
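just to make that concrete here is a minimal sketch of the two styles of static configuration we just walked through - the file names and the choice of renderer are only assumptions for illustration, and the addresses are the same example values used in the demo:

    # ubuntu 18.04 style - a file in /etc/netplan, e.g. 01-netcfg.yaml (example name)
    # remember the yaml indentation really matters
    network:
      version: 2
      renderer: networkd
      ethernets:
        eth0:
          dhcp4: no
          addresses: [10.10.10.111/24]
          gateway4: 10.10.10.1
          nameservers:
            addresses: [8.8.8.8]
    # then activate it with: sudo netplan apply

    # centos / red hat style - /etc/sysconfig/network-scripts/ifcfg-eth0 (example values)
    TYPE=Ethernet
    DEVICE=eth0
    NAME=eth0
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=10.10.10.111
    PREFIX=24
    GATEWAY=10.10.10.1
    DNS1=8.8.8.8
    # then restart networking with: sudo service network restart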
network bonding in linux is really just a way to utilize computers that have more than one network port and most servers nowadays come with that let's say you have a server and it has three network ports built into it well if you connect it to a switch with each of these ports connected to the switch you would ideally like to use all of that bandwidth and you don't want to have to supply three different ip addresses to the computer you just want to use all of the available bandwidth on all three of those wires and that's where network bonding comes into play now there's basically two different kinds of network bonds that we're going to look at those that require special switch support and those that are generic and don't require the switch to know at all what's going on now linux does provide some pretty cool options when it comes to bonding but the options can be a little overwhelming i mean this screen is like oh my goodness there's so many choices well that's okay because your choices are going to be fairly limited once you learn what all of these are now the first thing i want you to look at when we're looking at this list is does it require switch support and what i mean by that is a lot of switches especially layer three switches or smart switches as they're sometimes called will support link aggregation and different vendors call it different things but it's going to be either link aggregation or lacp or etherchannel the idea is a smart switch will have built-in code that will allow the ports to work together so if you have a smart switch chances are that you have the ability to use link aggregation now some of these modes require a switch that supports that i'm going to start with the confusing one here so balance round robin basically how balance round robin works is you have multiple ports and it says okay when i transmit packet 1 i'm going to use this port packet 2 is this port packet 3 is this port packet 4 is this port packet 5 is this port packet 6 is this and it just keeps going through packet by packet and transmitting across all of your interfaces now the reason i said it sort of requires switch support is if you're plugging these into a switch it does require a switch that supports link aggregation but what a lot of people do is they'll use this balance round robin and they will connect two servers together with crossover cables and actually with gigabit we don't need crossover cables anymore but they'll connect two servers together and they'll use balance round robin and in that case you don't need a switch that supports it because there is no switch involved right they're just directly connecting the two computers together and so like if you have a file server that you want to connect to your other server a lot of times this is a way that you can increase throughput without requiring any special switch support so if you're connecting computers directly you don't need to have switch support if you're connecting to a switch you do all right that's mode zero mode one is active backup and this one's pretty easy to understand basically however many ports you have on your system only one is going to be active and if that one fails then another one is going to turn active and that's just how it works there's always one active this is fault tolerance but this doesn't speed anything up it just means if one fails another one's going to come online all right this doesn't require switch support because well you know it basically just only uses one port unless the active one fails and then you switch to the next port and the switch is like okay so now we're going to use this port great balance xor does require special switch support and this is kind of cool how balance xor works basically you have your computer here your linux computer and it does a hash based on your mac address and the client's mac address and so here's another client over here and basically it says okay based on this hash i'm always going to use port 1 to connect to this client and then over here the hash is different so i'm always going to use a different port to connect to this computer so it's basically a way to just spread out which computers use which port but this computer is always going to use this port and this computer is always going to use this port now this isn't used very often because if you have switch support which is required chances are you're going to use 802.3ad which is the industry standard link aggregation protocol right this means that your switch knows what to do and your client knows what to do and it's just a really smart way of increasing throughput and availability and fault tolerance so if your switch supports link aggregation you should use 802.3ad or mode 4 which is the industry standard okay so even though balance xor is cool it's not very often used because if you can use it you might as well
be using the one that's even better broadcast i kind of skipped over that broadcast is only used in very specific cases it just takes all of your ports on the server and spews all of the data out all of the ports at once it's not used very often and it definitely requires your switch to know what on earth is happening now the two i do want to focus on a little bit are these bottom two let's say you have a dumb switch it's not a switch that supports link aggregation it's just one you bought from amazon for 40 bucks it's gigabit but it doesn't have any smarts built in linux is smart enough to be able to utilize the dumb switch and increase bandwidth and throughput and reliability and there's two different ways it does it balance-tlb which is transmit load balancing and balance-alb which is adaptive load balancing and what this does is let's say we have four ports on our server okay with tlb it's going to transmit from whichever port is currently the least busy so all the transmit is going to be balanced out now incoming is still going to always go to one active port so that's not as good as alb which is the same concept except it does it for incoming and outgoing so basically the least busy port gets the traffic and this is really brilliant and how it does it is it constantly changes the mac address on these ports or on these ethernet cards so the switch is like oh you moved again oh you moved again oh you moved again and from the switch's standpoint it doesn't care how many times you switch the mac address so it generally works just fine now some people have issues with this but i've used this in production for years and never had a problem so balance-alb if you have a dumb switch i highly recommend you use mode six okay if your switch supports link aggregation i highly recommend mode 4 because that's the industry standard and if you're just connecting two servers together with multiple cables for sure mode 0 works really great to be honest the most difficult part of linux network bonding is figuring out which mode to use but really it's not that tough of a decision if your switch supports it use 802.3ad there's really no reason not to if your switch is a dumb switch and doesn't support it i highly recommend mode 6 or balance-alb if you want to take advantage of bonded network interfaces but you don't have the expensive hardware that will support it
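as a quick reference before we configure anything here's a summary of those modes using the names the kernel bonding module uses - this is just a recap of what we covered above:

    mode 0  balance-rr     round robin across all ports - needs switch link aggregation, or a direct server-to-server link
    mode 1  active-backup  one port active, another takes over if it fails - no switch support needed
    mode 2  balance-xor    port picked by a hash of mac addresses - needs switch support
    mode 3  broadcast      everything goes out every port - rarely used
    mode 4  802.3ad        lacp, the industry standard link aggregation - needs a smart switch
    mode 5  balance-tlb    transmit load balancing - works with a dumb switch
    mode 6  balance-alb    adaptive load balancing, transmit and receive - works with a dumb switch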
once you know the type of bond that you want to set up for your two or more ethernet connections on the computer configuring them is pretty straightforward although it's drastically different depending on whether you're on ubuntu or centos so i'm going to show you how to do it on both and then i'll show you also how to test it to make sure that the bonding modules are working so here i am on ubuntu i've actually already configured it so i've gone into /etc/netplan and i have my file in here so let's look at this file and what i've done is i've set up the proper yaml formatted with indentation and everything to set up bonding so i'll go over it really quickly first of all we need to use networkd as our renderer we can't use network manager in order to configure bonds because it just doesn't support bonding we do have to define the ethernet cards themselves so under ethernets there's eth0 and i set dhcp4 to false because i don't want it to actually assign an address to eth0 after all that is done we don't need to set up anything more for the ethernet ports themselves then we need to set up a bonded interface so a new section here called bonds the name of it is bond0 dhcp4 is false because i'm going to assign a static ip address and interfaces is a section where we tell it what interfaces are going to be a part of this bond in our case we only have one which is silly i mean we're only bonding one interface but you can do one or more it's just silly that we only have one to bond that's just all this computer has i've set up the addresses the gateway the name server set up just like it would be if we were setting it up for eth0 really the only new difference down here is the parameters and in the parameters what i've done is mode active-backup that's just one of our modes it happens to be mode 1 but we actually say active-backup we specify the mode by name here and then we save this file and do netplan apply now how you can tell if it's working we can do ip addr to see that you know sure enough it's up here's our bond0 interface here's our eth0 interface which is up but doesn't have an ip address and another way you can test is to look at so we're just going to cat the file it's in the virtual file system at /proc/net/bonding/bond0 so if we look at this file it's going to tell us the information about it and say the bonding mode is fault-tolerance (active-backup) primary slave none but the current active slave is eth0 it's our only one that we have it's up and it looks like everything is working good here's our slave interface eth0 down here and that's all working well as well so we have it all set up and it's working now with centos the setup is a little bit different because centos is set up differently in centos in order to configure our ethernet ports we go into /etc/sysconfig/network-scripts and in here we have our ifcfg files so i'm going to first look at the changes we have to make to ifcfg-eth0 so here we basically make it pretty simple ethernet type boot protocol is none the name and the device are eth0 on boot we want it to come up but notice i haven't given any ip addressing information and here are the two key things the master is bond0 and slave is yes this is going to be a slave to this bond all right so those are the changes we make here if we look at ifcfg-bond0 this is where we set up bond0 we name it bond0 bonding master is yes i did give this one an ip address and a prefix which means the subnet mask on boot we want it to come up boot protocol is none and then here in the bonding options we tell it what we want it to do i actually set this one up to mode 6 mode 6 is balance-alb adaptive load balancing so it does its own version of load balancing for incoming and outgoing traffic again i only have the one interface on this computer but that's okay let's do the same thing now checking to make sure it's working is similar we can do ip addr and you can see sure enough here's the bond set up right here and our ethernet is right here and really for configuring the bonds themselves that's all there is to it ubuntu and centos are configured differently but underneath they're both doing the same thing they're using the bonding kernel module and they're allowing you to do some awesome things with multiple ethernet ports on your computer
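here's a rough sketch of both styles of bond configuration described above - the file names and addresses are illustrative assumptions, and the interface names will be whatever your system actually has:

    # ubuntu - a netplan file like /etc/netplan/01-bond.yaml (example name)
    network:
      version: 2
      renderer: networkd          # network manager can't render bonds
      ethernets:
        eth0:
          dhcp4: no
      bonds:
        bond0:
          dhcp4: no
          interfaces: [eth0]
          addresses: [10.10.10.50/24]
          gateway4: 10.10.10.1
          nameservers:
            addresses: [8.8.8.8]
          parameters:
            mode: active-backup
    # apply with: sudo netplan apply, then check: cat /proc/net/bonding/bond0

    # centos - /etc/sysconfig/network-scripts/ifcfg-eth0
    TYPE=Ethernet
    DEVICE=eth0
    BOOTPROTO=none
    ONBOOT=yes
    MASTER=bond0
    SLAVE=yes

    # centos - /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BONDING_MASTER=yes
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=10.10.10.50
    PREFIX=24
    BONDING_OPTS="mode=6"
    # restart networking, then check: cat /proc/net/bonding/bond0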
gpt and mbr are two different ways of taking a hard drive and chopping it up into pieces so that those pieces can be recognized and mounted as different drives and stuff on your system so they do the same thing but gpt is much newer and much more feature-rich than the old-school mbr now there's also a really cool thing called protective mbr i want to talk about but basically we're going to talk about the differences between gpt and mbr here i have two big squares and i'm going to say that these actually they're rectangles i'm going to say that they represent the drives themselves and they're going to be partitioned with two different systems so we'll start over here with the old one mbr which stands for master boot record okay now how this would work is let's say this drive is device sda okay so this is the device on the linux system it recognizes this drive now the mbr is a little tiny bit of reserved space at the very beginning of the disk and this is basically like a table of contents and it describes how this is chopped up okay so this is chopped up into let's say four pieces there's four primary partitions you can also cut each one of the primary partitions into four so we could have this one have four partitions there if we wanted but basically what this means is once the mbr defines in this little first part of the drive right here what is what we come up with other devices on the system we have dev sda1 dev sda2 and it goes on down the line and so each of these chunks is referenced on the system by its partition name now this is the raw device dev sda doesn't go anywhere it's still there but the individual partitions get their own device and that's how we reference them on the system so mbr has this little tiny table of contents at the beginning and it says where things are chopped up now gpt or guid partition table which i think is interesting because there's an acronym wrapped inside of an acronym this stands for globally unique identifier partition table and basically it does conceptually the same thing right it will chop up this drive and you'll end up with dev sda for the raw device and so on down the line the differences are though gpt does have a spot at the beginning with the table of contents but then it also scatters copies of it around the drive so that if something were to happen with one of the copies of the gpt it can still figure out what's on the disk it also does some crc checking on the partition table itself so that it knows if there's some corruption and it can fix it it's just very very robust along with the ability to use much bigger drives mbr is limited to two terabytes gpt is limited to something up in the zettabyte range i mean there's no practical limit at this point of how much drive space it can talk about rather than being limited to four partitions you can have tons of partitions on here so you know we could chop it up into as many as we wanted and it just goes down the line sda1 sda2 sda3 and one other really interesting thing i wanted to talk about is the protective mbr so gpt is used on newer systems it's specifically part of uefi that replaces bios but bios systems can still see gpt drives and part of the reason they can do that is at the very beginning there's this section that looks exactly like an mbr basically it's an mbr record here that says okay there is one big partition on this drive the entire drive is a partition there's no room available on this drive and it takes up the entire drive now why would that exist it's basically so systems don't say oh well there's no mbr on this drive so this must be a blank drive with nothing on it well we don't want that to happen so basically what it means is it allows the mbr system or a bios system that is used to seeing mbr it will allow it to realize that this is not
an empty drive it's just something it can't read and then there's some really cool stuff where part of this initial mbr can be hints to the operating system itself as to what's happening underneath and so some older bio systems are able to boot from a gpt drive even though it shouldn't be able to because the linux system or you know the operating system itself says okay i see mbr is saying that um you know it's really a gpt drive but i know enough about gpt myself that i'm going to be able to find the software and you know the the drive files and partitions on here myself and so you can actually use it on there but it's because of that protective mbr that all that is possible so when it comes down to it mbr is the old way of doing things gpt is the new kid on the block honestly there's no reason not to use gpt it does everything mbr does it does it more efficiently more reliably it's just the better way to go so i recommend that you use gpt and while there are significant differences a lot of those backwards compatibility issues make it so that you don't even have to worry about them the linux file system is actually really cool because all of the network mounts and different hard drives and usb drives are all on one giant file system it's not there's no a drive b drive c drive d drive there's nothing like that it's just all one big file system hierarchy and we're gonna look at the different things because whether it's a real or virtual file system it's on the same file system whether it's relative or absolute this is just how you traverse the actual file system we're gonna look on the command line about that network mounts they're just you know on a remote system but they just show up as a folder on your local computer it's just it's really cool so first of all let's let's start at the very beginning it's a very good place to start so we have a hard drive right so we're going to say it's dev sda1 and that has the root partition on it you know just forward slash so this is like where everything starts the root partition is the basis of our linux file system it's it's on a hard drive of some sort in our case it's this one now inside there there's tons of files and folders some of the files are directly on that dev sda drive right on the root file system or folders within that root file system they're actually stored on that hard drive some of the folders and the files in there are a virtual file system if you look inside the root directory there's a proc folder and a sys folder these are dynamically created file systems that are just a way to interact with the kernel itself so if you want to make a change to the running kernel you can make a change to one of the files in these folders and it's going to affect the running kernel but it's not actually files it's a virtual file system it's just a way that they represent that interface with the kernel is this virtual file system then we also have like a remote nfs server it could be nfs or samba or you know whatever your network protocol of choice is but it's mounted on the folder inside of your root system so this could be like the home folder on your system might not be on the actual hard drive it might actually be on a remote system but it's mounted inside your file system and from you know from a layman's position just looking at the system you don't know if it's a remote system or a local system or a virtual system because they all look the same same thing when you insert a usb drive it doesn't come through as like you know like the e 
drive or the f drive like on a windows system it's just mounted somewhere on a folder inside that same one solid monolithic file system even if you put in another hard drive so this is we're saying this is dev sda1 like the partition one on the sda drive even if you were to put in a second hard drive in the system it's still going to mount inside a folder or a subfolder of this root drive right it's going to be like in mount data or wherever you happen to mount it it's just going to appear as a folder inside this root file system so everything is inside that root file system it's really a neat way to handle all of the various things that can be stored in linux and the virtual file system is very unique in that it's not really a file system it's more like an interface designed as a file system so you can interact with the kernel itself all right so let's go to our file system here and i just want to show you kind of how it looks i want to talk about this whole absolute versus relative thing all right so i'm just going to show you the file system here we have ls and we have a folder called pictures we can do ls picture gotta spell it right ls pictures and i have a folder cbt gold which is this background and then a trips folder uh what if we were to do ls trips okay there's a folder called grocery store mexico orlando now there's a cool program called tree i'm going to use so let's say tree pictures it's going to show us a tree representation of all the files and folders inside here so we can look we have the pictures folder there's a one file in there called cbt gold then there's a folder called trips inside that folder are three different folders grocery store mexico orlando inside each of those there are files that are stored in there this is the hierarchy of that now honestly this mexico folder this could be like a remote nfs share we don't know because it's all on the local file system that is you know mounted everywhere on the system it's just one file system so we don't know where these are we just interact with them as if they're all one local file system so that's just a really cool thing about it but this whole absolute versus relative let's go over to the pictures folder and trips i'm going to clear the screen and if we were to type ls minus a we're going to see all of the things in here now we already saw some of these orlando mexico grocery store but these dot and dot dot now these are special folder entries that mean the dot means the current folder so if i were to say cd to dot i'm in the same folder dot dot means the directory above me so if we're in trips dot dot is going to be pictures so if we were to say cd dot dot all of a sudden you'll see now we're in the pictures folder that's pretty cool right now we can use that as a folder name anywhere so let's go into trips orlando okay and inside here let's do an ls minus la we can see here are all the the pictures that we have and then these two folders dot dot and dot in every single folder you're gonna find a dot and a dot dot because it's just a pointer that means this directory the directory above me okay so we can put that in a string too let's say i want to cd to dot dot dot and see where this takes us two pictures now why did it do that well because dot dot if we're in orlando dot dot means trips and then we're in trips the second dot dot means pictures so that's interesting we could do something really complicated and it'll end with this we could say cd trips orlando dot dot grocery store oh okay what happened there 
we'll end with this one like i said so we started in pictures and we went to the trips folder orlando back to the trips folder and then into the grocery store folder and sure enough that's where we ended up pictures trips grocery stores so you can use dot dot anywhere in your thing to talk about the relative path of where you're where you're going okay now the absolute path of this is going to be cd home bob pictures trips grocery store and then sure enough we're in the same folder this is the absolute path starting at the root level but we can use relatives paths using things like dot dots one other thing i said i was going to end there and i guess i lied to you this tilde image or this tilde thing is a shortcut for your home directory so if we were to say cd to the tilde which is usually to the left of your one key boom now we are in the tilde which if we type pwd we'll see is home bob all right so those are the relative path tools that you can use to construct where you want to go but that's really how it works and the tree program is just kind of a cool way to look at things you can see everything else spelled out there the big takeaway is that the linux file system is just one big monolithic file system and whether you're mounting network shares or you're mounting usb drives or second hard drives or even a virtual file system to interact with the kernel it's all under the same bunch of file system in this big hierarchy partitions are really just organizational units that are on a hard drive you kind of chop a hard drive up into different partitions and those partitions are used for different things like swap space or you know a particular folder on your system you can use partitions for a lot of things and depending on the system that you're using whether it's going to be an mbr master boot record or gpt gui id partition table you can have a multitude of partitions or just a few and depending on which scheme you use it depends how big your hard drive can be now we're going to look at a handful of tools like parted and g parted and this stands for partition editor and graphical partition editor and the old school f disk which is just a command line tool that didn't used to handle the gpt partition table but now it does so there's this one size fits all tool called fdisk and then i do want to show you a couple tools that will let you see what block devices are available to partition on your system so let's actually go right to the command line so that we can check out what's on our system and i chose to use centos today for no other reason then i thought let's use centos so i want to show you first of all how we can identify the block devices on our system we can type lsblk and it's going to show us this really nice little cool tree thing right so we have fd0 floppy disk which actually is a lie there's not a floppy disk on this system it's a virtual machine but then sda is the main hard drive on the system and it's partitioned into two partitions the sda1 which is the boot partition and then sda2 which is used in lvm which we'll talk about in a different nugget then there are a bunch of other hard drives installed in the system that don't have anything on them at all there are four 10 gigabyte drives sdb sdcsd and sde we can also see these things if we look at proc partitions this is again the virtual file system that shows us an interaction with the kernel and we can see sure enough here they are sdb c d and e and this is how big they are the number of blocks so they're 10 gigabytes 
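just as a compact recap of what we're doing here these are the kinds of commands for finding block devices - the device names will be whatever your system actually has:

    lsblk                    # tree view of drives and the partitions on them
    cat /proc/partitions     # the kernel's view, with sizes in blocks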
and of course if we just looked in the dev folder and gripped for sd we would see all these devices too right sd b c d and e so what i want to do is just partition one of them we're going to pick sdb and there's a couple ways we could do this we could use g parted now notice i'm root we have to be rude if we're going to do the system level stuff like partition a hard drive but we could say g parted and it'll start up and scan for all the drives available and there's a little drop down here sdb very cool we would have to start by creating a partition table and then we could create a partition with gui tools it's really really easy to use g parted but a lot of times the system doesn't have a gui so i don't want to do it that way now we could also use parted or partition editor this is the cli version of the program that we just used we can type help and it'll show us it does the exact same things it's just used you know use text to tell it what to do all right so i'm not going to use this one either but i want you to know that they exist and if you're more comfortable with them they're a little bit newer and they have a few more features and that sort of a thing but what i want to show you is the old school f disk and i want to show you this because if you're on a server it may not have partition editor you know part ed or especially g parted the graphical one so i want to show you what almost every linux system is going to have and we're going to use that one to actually do our work so we're going to say fdisk dev sd b we have to tell it what device we want to use so it's a good thing we know how to find the available devices on our system type this and type m for help now we just have these single letter commands but that's okay the first thing we're going to need to do is to create a partition table and it can be one of two types we can create a gpt partition table by pressing g or we can create the old mbr or ms-dos partition table using oh i'm just gonna pick since it's not over two terabytes in size either one is going to work so let's just use gpt for the heck of it so i'm going to say g press enter it says okay it built a new gpt disk label this is its id and now we have a partition table but no partitions to create a partition we can actually type p to see what's there there's gonna be nothing there so to create one let's press m again so we hit the whole help screen we can say n for add a new partition so new and now it says partition number one through 128 right this is not mbr mbr would only support one through four this is gpt so one through 128 i'm just going to pick the default first sector the default is going to be the first one available last sector the default is going to be the last one available right this is our range and it chose the last one so i want to fill up the entire partition or the entire drive with this one partition so if we type p now we're going to see sure enough we have a partition there it's 10 gigabytes linux file system type that's okay and now to quit let's press m again for help to quit we could just press q and it would quit without saving changes but we actually want to make these changes so we want to make the change to the disk so we're going to use w to exit which means right so w enter and now our partition table has been done we can say lsblk and look at that now it shows up that we have a partition created on our system it happens to be a gpt partition table and it's a single partition that takes up the entire drive i didn't actually go 
through the process of using parted or gparted they're very straightforward and they walk you through it and the help screen is right there and easy to use so if you want to use those by all means go ahead but i wanted to show you the more complicated fdisk because it's going to be on every single system that you run across there's a wide variety of file systems that you can put on an empty partition in linux but the idea of all of them is the same you take a big empty partition like a big field and you divide it up so that you can store data efficiently like if this parking lot were an example of a file system each car would be a piece of data that is stored logically in its place and and some file systems have you know bigger spaces for parking buses some have littler spaces for parking compact cars but conceptually they all do the same thing now like i said there's a bunch of different options available in linux ext is the most common family of file systems that you can use on linux it's very mature it's been around forever the later versions support journals so that if a computer gets powered down before the reads and writes are synced up usually you can salvage the data on there so it's fairly robust as well xfs is a file system that's been around for a long time this used to be what you had to use for really really big drives but now everything supports big drive so that's not really the issue anymore it is still used by centos and red hat though so xfs is still widely used interestingly it has its own set of tools like for file checking fixing things manipulating the the xfs file system itself rather than using standard linux tools it has its own set of xfs tools there's a couple others btrfs it's often called butter fs this was the new kit on the block right this has awesome features like snapshotting and it's just awesome unfortunately it's kind of been abandoned which is weird but nonetheless it's kind of what happened so btrfs is still functional but it's not widely used anymore and then of course i'm just going to mention them dos or the windows world has their own file systems like ntfs vfat fat32 things that you're familiar with if you're in the windows world and linux can usually read and even write to most of these file systems but when we're talking about linux file systems we're generally talking about linux specific ones and ext is awesome i'll be honest i like ext there's generally three there's e xt 2 ext3 and ext4 now this isn't necessarily like one is better than the next they each have their own features but ext4 is the newest and has the most features i'll be honest i almost always pick ext4 as my default file system not because it's necessarily better than any other option but because it's so widely used that means there are a ton of tools and utilities and tutorials online to get data back if you have some corruption so ext4 is my file system of choice mainly because it's used in so many places when it comes to creating a file system basically you need a partitioned hard drive so that you can you know have a partition or that empty field in order to draw the lines for your parking lot or put that file system on now i'm going to say lsblk and we're going to see these are the different block devices on our system i have two partitions created i have sdb1 and sdc1 these are just 10 gigabyte partitions on 10 gigabyte drive so it takes up the whole drive and we're going to format each of them all right so i'm going to say oh here's another really cool thing about 
linux the file formatting or the hard drive formatting programs all start with mk for make fs file system and then just hit tab a couple times and you're going to see all the various tools for creating the different types of file systems so let's say we're going to make a butterfs file system so we would say mkfs.btrfs and then what partition we want to create that file system on in our case dev sdb1 alright so it's created it it says we have 10 gigabytes idu's one that's the path okay so it looks like it did that without any problems let's create another uh file system so uh mkfs hit tab a couple times so i can see our options now like i said i usually use ext4 so i'm going to say ext4 dev sdc one which was the other partition that i had created on this system and it went through and it created that file system and now a really cool thing if we use the lsblk but we add the dash f it's going to show us the file systems that are on the particular block devices as well so if we do that we're going to see up here we have sda1 which remember i said red hat and sent to us use xfs well sure enough there it is xfs is used sdb1 is that butterfs file system that we created and then down here is an ext4 file system that we created on our system so that's all there is to creating the various file systems and really you can pick whichever one you want but you do have to have a partition in existence before you can create a file system on it and while yes there are a whole bunch of file systems that you can use on linux and it supports a bunch i really recommend that ext family of file systems if you have a choice and you're just trying to decide what one to use largely because it's used so often there's so much support if something goes wrong you can find a lot of help online if we want to be able to access the data that's on the drives or the partitions of the drives that we put into our system we have to mount them into our local file system now we can do that manually using tools like mount or u-mount or we can use the etc tab file to do it automatically on boot but conceptually what's going on is that we have a new hard drive or a partitioned hard drive and we want to incorporate that into our file system now it's really important to note that when you mount a partition or a hard drive it goes into a folder and that folder becomes what is in that drive now we couldn't mount it on this folder because this folder already has things inside of it it has to be mounted on an empty folder because it doesn't make sense to have all the things in this hard drive in this folder and then have these other folders alongside of it so it has to be in an empty folder that we mount a partition or a drive so we could actually take this hard drive and mount it into this folder if it's an empty folder and that's what we're going to do on our system is find an empty folder and then this all the files inside this drive will become part of our bigger linux file system starting at this mount point now i'm here on an ubuntu system and i have the regular route mounted hard drive and then i also have an additional 10 gig drive and if we do lsblk we can see that we have well a bunch of things here loop loopback devices but we have down here sdb1 okay this is a 10 gigabyte partition and it's all set up and ready to mount but it's not currently mounted on our system so in order to mount that i actually want to mount it in a folder if we look inside mnt there's a folder called 10 gig and inside that folder is nothing it's an 
empty folder so i want to mount it on there now if we do blk id we can see a little bit more information about the device itself so here is dev sdb1 now here is the universally unique identifier for this partition uh keep note that this is something that's specified about the partition we'll look at that in a second but we know that it's type ext4 okay that's good to know and we know that this is where the actual partition lives so we can say mount now i'm going to say dash t for type ext4 usually mount can automatically figure out what kind of file system it is but if we know there's no reason not to do that so i'm going to say mount dev sdb1 and where do i want it to mount on mnt 10 gig press enter and now if we go into mnt 10 gig we're gonna see sure enough now there's a lost and found thing this is the root level of that hard drive but now it actually lives in our file system right here we can type mount alone on the on the line it will show us where it's mounted right so right here it's mounted on mnt 10 gig so let me type cd and then we can unmount it by typing u-mount mnt 10 gig it's not unmount it's u-mount all right so there now that that's how we manually mounted and unmounted if we wanted to have it mount automatically on boot we would edit the file etc fs tab all right and i'm actually going to stretch this out so we can see everything on here we can see there's already an entry in here for the root file system sda or s yeah sda1 but notice it doesn't specify it by its device it specifies it by its universally unique identifier which we could get by using that blk id command it'll get us that uuid for the particular partition now we could use the uuid for a dev sdb1 and we could put that in here to mount it but i'm actually going to use let me make another entry here i'm actually going to say that the file system is on dev sd b1 we can specify it by device or by uuid or we can even use the drive label if we want but in this case i'm just going to use the the device itself and then tab over the next field here is the mount point so i want this mounted on mnt 10 gig tab over the next thing is type it's an ext4 options i'm just going to say defaults and then the last two fields are dump and pass now dump is an old school backup program it used to dump the files to a backup this is really deprecated it's not really used anymore at all so dump you're going to want to put 0 for dump the last line though pass this means do you want it to run a file system check 0 means never run a file system check one means run the file system check first so you put a one on the root partition any other partition that you wanna have checked when the system boots up you're gonna put a two so i'm going to put dump of 0 you always put 0 for dump and then pass is 2. it's the second most important because the root is the first one you want to have scanned and then everything else is going to be 2. so you can have like five different partitions mounted they could all have pass of 2. 
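as a sketch the fstab line for a partition like the one above might look something like this - the mount point and device are just the example ones, and you could use the uuid from blkid instead of the device name:

    # <device>    <mount point>  <type>  <options>  <dump>  <pass>
    /dev/sdb1     /mnt/10gig     ext4    defaults   0       2
    # or, using the uuid reported by blkid:
    # UUID=<uuid-from-blkid>  /mnt/10gig  ext4  defaults  0  2
    # test it without rebooting: sudo mount -a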
now there's more to it to get it to scan automatically on boot but when you're setting up the fstab file this is where you do it so save this and now we could just type mount minus a and it's going to mount everything that is specified in fstab and we can look by saying mount and it'll show us that sure enough it remounted that because we defined it in fstab and if we reboot the computer it's going to automatically boot it or it's going to automatically mount it as well it's really easy to mount partitions using the manual tools it's also pretty easy to use the fstab file to specify it and you can either specify it by device name or the uuid that we can find out using the blk id program to scan a linux file system generally you use the tool fsck or just fisk as it's often referred to now the real key though is to have it scan automatically periodically on boots so you don't have to manually do it because here's the deal in order to run fisk the file system itself has to be unmounted that's not really a problem for secondary or tertiary drive mounts like the home directory or something like that but the root directory it's pretty difficult to unmount the root directory and scan it unless you're in the boot up process or you've booted from a cd or something so i want to talk specifically about how to set up the system to scan the the file system on boot including the root file system so that you can you know have it automatically maintain itself now when it comes to scanning automatically there's a few different flowchart things that go on the very first thing the kernel looks for is inside your fstab file if the pass setting is set up if it's a zero you know if you have your your dump and pass and if the pass is set to zero then it won't scan it just absolutely will refuse to scan it doesn't even look any further it just stops right there and continues booting up the system if however you have that partition set up with either a one or a two a one for the root partition a two for any other kind of partition if you have it set up then it looks at the drive itself and it says okay has the maximum number of allowed mounts been reached and if that threshold has been reached it will scan the drive before it boots up if not if it hasn't met this maximum yet then it's not going to scan it's going to increment it's going to say okay i'm going to mount one more time and add it to the number of times i've been mounted but i'm not going to scan it even if the pass is set up to scan it's not going to scan it if it hasn't met the max now the max by default is negative one which means it's never going to scan because that's just the way of saying like don't ever automatically mount so by default you're never going to get an automatic scanning which is a little bit frustrating because you do want to have your system automatically scanned so on ubuntu here i have dev sdb1 mounted on mnt 10 gig okay so this is another partition this is not my root partition but it's mounted on mnt 10 gig all right if i look inside of the fs tab file we're going to see that on this partition that i have automatically mount on boot it's set up with a pass of 2 which means that it is going to check to see if it should scan automatically okay it's not the root partition so i don't have it set up with a pass of 1 but since it's set up with 2 it's going to check and say okay if it's time i'm going to scan this so let's get out of here and how we can see what the maximum number of allowable mounts before it will scan is is to use 
the tune2fs command we're going to do -l for a listing of /dev/sdb1 and it's going to show us this now what you want to look for in here is this maximum mount count remember i said by default it doesn't ever scan that's because this is set to negative one and you're never going to reach that because that's just the way of saying disable it if we want to have it automatically scan every so often we need to change this so we would say tune2fs -c for count and i'm going to say every 10 mounts i want it to scan and /dev/sdb1 okay so now the maximum mount count is 10. we can look at that by doing the same command over again and we can see now the maximum mount count is 10. okay so what does that mean well every time the system boots it mounts the partition well we could speed that up we could say umount /dev/sdb1 and then mount /dev/sdb1 we do that and now if we look it's going to increment it by one because we unmounted it and remounted it so if we do that a bunch of times and now we look back and see okay our mount count is now 16 and our maximum count that is allowed is 10. well why didn't it automatically scan well it only does that on boot so if we were to reboot the system it would go through the flow chart and it would say okay you have your pass set to two so that means i need to check the drive and say okay drive is your mount count higher than the maximum mount count allowed before a scan and it will be so if we do a reboot and once it's booted back up we look and run that tune2fs again so sudo tune2fs -l /dev/sdb1 now we're going to see that the mount count is down to 1 which means that it scanned it it ran fsck on boot before it was mounted and it reset the mount count to zero and then of course it mounted it so now it's one so now it's not going to rescan that on boot until the mount count gets above this maximum mount count and then it will do it automatically and figuring out what number to set here can be kind of tricky because here's the deal if you're on a laptop you might reboot fairly often so you're going to want to have that number kind of high so it doesn't scan every single time but if it's a big file server maybe you rarely ever reboot it maybe once a year well in that case you might want the maximum mount count to be very low so that every time it gets rebooted it does scan for consistency's sake and then one more thing if it's a huge partition with just millions and millions of files it's going to take hours or days to scan so maybe you don't want to scan it every time it mounts it really depends on your situation so plan out how often you want to scan and the type of situation that you're in to make the best choice for your particular partition and honestly the best choice for you might actually be just to run fsck manually just unmount the partition or boot from a cd or usb drive if you need to scan the root partition maybe you never want to automate the process you only want to do it manually and for some cases that's fine too regardless it's important to know how the system works so that if you think you're automatically scanning you really are
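here's a quick sketch of the tune2fs commands from above - they only work on ext2/3/4 file systems, and the device name is just the example one:

    sudo tune2fs -l /dev/sdb1 | grep -i 'mount count'   # show mount count and maximum mount count
    sudo tune2fs -c 10 /dev/sdb1                        # fsck at boot once it has been mounted 10 times
    sudo tune2fs -c -1 /dev/sdb1                        # -1 disables the automatic check again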
lvm or logical volume manager is basically like a software version of a storage area network it allows you to take a whole bunch of physical devices and lump them into one big group that allows you to kind of carve out slices of storage for use in your local system it consists of a bunch of parts physical volumes volume groups and logical volumes they use the word volumes a lot in there but nonetheless it's a way of taking raw storage and combining it into a thing that you can just slice up and expand and contract and add more things to without disrupting the existing services here's how it actually works or looks in practice so we start with physical volumes i'm just calling them pv and we're going to say that we have four 10 gigabyte drives now in standard practice these could be actual hard drives they could be raid devices or they could be partitions on a hard drive it doesn't really matter what they are they're just chunks of storage and you create physical volumes out of them and then you combine those physical volumes into volume groups and so then all of these are combined into a volume group and if you add 10 times 4 you get 40 gigs so then you have this volume group which is like just a big bucket of storage and that bucket of storage has no protection now i know that's not like a feature right but i really want you to know that if you just have 10 gigabyte drives all in a bucket so you have 40 gigabytes of storage and you're just gonna take and carve out chunks of that that doesn't offer you any protection so if one of these drives fails all of a sudden you could have your end result be completely corrupt and useless so physical volumes being raw devices and not having any redundancy is a little bit scary so anyway just wanted to throw that in there once you have this volume group you carve out a slice of it and it can be a small slice like here i just said this is about seven gigs i just kind of spatially guessed how much of 40 gigs that would be but you can carve out a big chunk so you could have like a 30 gig slice or you could have like a 32 gig slice that would use like two drives plus a little of another drive basically you don't know what's going on underneath the volume group is just a big chunk of storage that you carve out a slice of and then this slice is called a logical volume that logical volume is what you format with a file system and mount on your local file system and what you install linux on so it's a multi-step process but it allows for a lot of flexibility and so i want to show you what it looks like in practice even though it's not this robust system with multiple physical volumes if you have centos installed they actually use lvm even if you only have one drive so i'm going to show you here if we look at the /etc/fstab file what we have here is our device called /dev/mapper/centos-root okay now this is like dev mapper what is that well this is where logical volume manager creates those logical volumes for us to use and you'll see here this /dev/mapper/centos-root is mounted on the root of the file system it has xfs as a file system and it's installed same thing down here they have another carved out slice called centos-swap and that's mounted as swap space on the system so if we look inside /dev/mapper we're going to see we have centos-root and centos-swap now i want to show you really quickly a handful of commands so you can see what's going on if we look at pvdisplay this is physical volume display it's going to show us what's going on behind the scenes right like what makes up the bucket of data that we're going to use we only have one physical volume i know it seems weird right why am i making a bucket out of one device well you can expand it if you want so they're giving you the room to expand later if you want to do it after it's already installed so we have one physical volume and it's a partition
it's dev sda2 all right it's in a volume group named centos and that gives us it looks like about 19 gigabytes of storage okay so this physical volume is inside a volume group called centos so our our bucket with all the data is called sent to us and it looks like we carved out root and swap from that bucket so let's look at that really quick we're just going to look at lv display logical volume display and we should see two and sure enough we have two logical volumes i'm going to scroll up a little bit here our first one is named swap it's in the centos volume group because that's the only volume group we have it happens to be two gigabytes in size and it lives in dev centos swap this is an interesting thing you can you can use dev mapper and the name of it or you could use dev the name of the volume group the name of the logical volume as like in folders here so that's another way that you can reference it and find the actual logical volumes and then the same thing down here dev sent to us root the logical volume name is root it's carved out of the centos volume group and this one looks like it's about 17 gigabytes so if we look inside dev sent to us we're going to see sure enough there we have root and swap as our two different logical volumes that we could one is formatted right this is formatted with xfs and this is just swap space i know we didn't go through the process of actually creating all of the different parts but hopefully you understand exactly what lvm is doing taking physical volumes whether partitions or hard drives or raid devices combining them into a volume group and then carving out logical volumes that you can use as regular devices on your system and it just allows for flexibility kind of like a software-based san in your own computer building an lvm is actually one of the most straightforward things that you can do when it comes to block storage devices on linux it's surprisingly consistent all the way through the process for you know the names of the tools and it really is kind of fun and once it's built you can expand it by you know adding more drives to the system or you know stretching your existing volumes but let's go ahead and actually create from start to finish an lvm system on our linux device now here i am on ubuntu and if we do lsblk we're going to see that well we have a bunch of stuff but down here this is the drive that our system is installed on so we're not going to touch this one we're going to use these four devices so sdb c d and e which are 10 gigabyte devices notice i don't have partitions created on these now you can create partitions some people prefer to use partitions for their physical volumes in an lvm some people prefer to use the raw devices either one works fine they work the same the advantage of setting up a partition is that if somebody else comes to your system they're going to see that there's partitions on the system and they're going to know that something is already done there whereas if you leave them raw devices they might think oh look empty drives now it's kind of far-fetched and you're not just going to like start formatting drives in somebody's system but that's the reason some people like to use partitions i'm just going to use the raw devices and to turn these raw devices into physical volumes we use pv create and we just make a list of the devices we want to use so dev sdb devs dc dev sdd and dev sde so it created them now we can do pv display if we want and it's going to show all the devices that we have now each of 
them is 10 gigabytes and the name of it is just the device itself dev sdc so the next step is to create a volume group now to do that we just do vg create which is very very nice uh the first flag or the first command argument here is the name of the volume group so i'm going to call this bucket because it's just a big bucket of our hard drives right of our data that's available so i'm going to call the volume group bucket and then we just make a list of the physical volumes that we want to add to it so those same volumes we just did before and then press enter our volume group bucket was created we can look at that by doing vg display there's four metadata areas meaning we have four devices our current active pv are four we have 40 gigabytes of storage just about all together because each one was 10 and now we have this thing called bucket that we can carve a slice out of if we want and use that slice and put a file system on it so to do that we're going to use i'm sure you guessed it lv create i love the consistency of these tools it's really nice here we do dash capital l and how big of a slice we want in this case let's do something that's going to span all four disks so i'm going to say 32 capital g for 32 gigabytes and then dash n the name that i want to call it i'm going to call this big underscore slice and then we have to tell it where we want to get the data from well in our case it's in bucket the name of our volume group right there is bucket so what we're doing is creating a 32 gigabyte slice called big slice out of the volume group bucket press enter logical volume big slice is created so we can do lv display and sure enough lv size is 32 gigabytes so we know if that worked the name is big slice it's in bucket the lv path is dev bucket big slice which is exactly what we would expect because we've created this volume group called bucket and now this logical volume lives inside there so the last step is that we would actually use this as an actual block device or as a hard drive in and of itself and then we could just do something like mkfs ext4 dev bucket big slice and there we go so now we've done that we could mount it somewhere and every time the system starts it's going to be available in dev bucket big slice and then if we put it in fstab it's going to be mounted on boot and then there are other tools that we can go and like lv extend if we wanted to make it bigger so we could do something like lv extend dash l i'm going to say plus 5 gigs dev bucket big slice and now it says the size has gone from 32 gigabytes to 37 gigabytes it was resized and we did that without adding anything to the system we just used the tools to change the size of our logical volume the thing to remember again about lvm is that it provides flexibility in your system it doesn't provide any redundancy so if you have one physical volume fail it's going to crash the entire volume group and logical volumes are going to get messed up so you want those pvs to be like a raid device if you're worried about something going wrong underneath and losing data but setting up lvm as you can tell is very simple and honestly even kind of fun raid is a redundant array of independent disks or drives and basically what it means is that it you take a bunch of drives and put them together and you end up with a larger pool of storage now the cool thing about raid is that it doesn't just pull things together like lvm and give you a bunch of data it allows you to do some pretty neat things with performance or redundancy so 
i want to talk about the different raid levels that we can offer using raid specifically linux rate linux has a software version of raid which is very powerful very robust and surprisingly efficient so basically i want to make sure we cover all the different raid types and to do that i want to show you this big scary dragon now here's the idea i'm going to i'm going to talk about raid but instead of a redundant array of independent drives we're going to say they're a redundant array of independent knights of the round table okay and in true 80s fashion our knights are going to be just squares like from atari's adventure anyway here's the deal with raid 0 our drives are set up in a stripe which means that they work together reads and writes happen across two drives very very quickly and so with raid 0 or a striped array it's very fast right you can get dual writes and dual reads at the same time the problem is let's say that the knights are attacking the dragon and the dragon is attacking the knights if we lose one of the knights well then the dragon can get right through to the king because if you lose a single drive in a striped array or in a raid 0 all of a sudden your data is gone because half of your data is written on that other drive so while it's very fast they better kill the dragon quick because if one of them fails all of their data is gone and the king gets destroyed now raid 1 is kind of the opposite it still has multiple drives but they're set up in a mirror which means every time you write something to the first drive you write it to the second drive so you have a complete copy of both now because they're in a line like this they can only attack the dragon one at a time right they can't both attack the dragon like over here in the striped setup but the advantage here is if one of them dies there's still another full night there protecting the king same thing with your data right if one of these dies you still have a full set of your data because you've been writing it to two drives the entire time so a mirror doesn't give you any speed increase because you're not spreading the writes across here you're actually writing all of your data two times once to each drive so you don't get any advantage over a single drive when it comes to speed but you do get that advantage of either drive can die and you still have full protection for your data now raid 5 this is a little bit different raid 5 uses a parity disk and how that actually works is like some digital magic that is beyond the scope of this nugget and to be quite honest it still stumps me a little bit but conceptually how it works is you have multiple drives three minimum and any of these drives can die any one of them it doesn't matter which one and all of your data is still in place so you get the advantage of being able to lose any drive in your array the disadvantage is you lose one drive's worth of storage now what do i mean by that let's say these are all five gigabyte drives okay if these are all five gigabyte drives together that would be 15 gigabytes but you lose one drive's worth of storage for that parity magic and so if you have three five gigabyte drives you're only going to have 10 gigabytes of usable space but the advantage is it's writing to multiple drives as it's going along and if one of the drives dies you still have all of your data represented in the remaining drives now if you lose two of course then your date is corrupt and and you're done but you can lose any drive and it's not just like this is the 
drive you can lose no you can lose any one of these and all of your data is still there it really is magical and so that means that you can lose a knight and the dragon is still going to be stopped by whatever two of the knights are still protecting the king hopefully that makes sense with the different raid levels and just briefly i want to talk about there are some hybrid levels as well so raid 0 could be raid 0 1 where you have like four drives and what happens here is you have a stripe of mirrors your drives are mirrored and then striped across the mirror or raid 1-0 which is a mirror of stripes or vice versa but basically four drives and you're mirroring two two of them and then striping those two mirrors and vice versa you're going to stripe them and then mirror those two stripes with raid 5 there's actually a raid 6 which is cool but it requires another drive and then you can lose up to two drives and still have your data there's two of those parity drives in place now the downside is you lose two drives worth of storage on your full array but it's awesome because you can lose more than one drive now even if you don't follow along with my awesome dragons and knights kind of scenario hopefully the raid levels make sense now my trick my little mental trick to remember what it is i look at the number after raid and i say how many disks can i lose because for years i would confuse raid 0 and raid 1. but here's the deal with raid 0 you can lose zero drives right because if you lose one you lose your data with raid 1 you can lose one drive and you still have another drive raid five i guess you could lose one of five i don't know it kind of falls apart but rate zero and raid one are the ones that i would always struggle with so i think of that number as how many drives i could lose and still have my data you can go to the store and buy a raid card like a hardware raid device and then you can use that on your system and you'll be able to have hardware raid but linux has a really awesome and powerful software raid program that will use kernel level tools to allow you to create your own raid devices without needing any specialty hardware at all now there's a couple things we need to discuss like partitions versus using raw devices i want to make sure we cover all the configuration stuff but conceptually it's really easy rather than having a hardware based card we just use our regular sata controllers and then our hard drives can all work together in a raid array that we choose now when i talk about partitions versus raw devices let's say we have 200 gigabyte hard drives but they're from different manufacturers now they both say that they're 100 gigabytes however if you look close they might have slightly different number of sectors and slightly different size so this one says it's 100 gigabytes but it might be actually 1028 megabytes and this one says 100 gigabytes but it might actually be 1022 megabytes now they round for marketing purposes and that's perfectly fine and usually on a system it doesn't matter but if you have like this drive fail in a raid array and you need to replace it with another drive and you try to use this drive and all of a sudden oh you created a raw disk device raid array unit and it has 1028 megabytes of space and you try to replace it with another 100 gigabit drive but this one only has 1022 megabytes you're not going to be able to work it because this isn't big enough so generally what people do is you take and make a partition inside of your drive that is 
slightly smaller than the hard drive itself, so it might be 99.9 gigabytes, and then when you have a new drive you're going to have enough room to create a 99.9 gigabyte partition. so even though the drive itself is slightly different, the underlying partition is going to be the exact same size and you're going to be able to use it to replace a failed drive in an array. that's why we generally use partitions, even though using a raw device would work right up until you need to replace a drive and your replacement turns out to be slightly smaller. now here on our ubuntu system, if we do lsblk we're going to see we have four 10 gigabyte drives installed, and we're going to make a raid array with those. this sda1 is our root partition, this is where our system is installed, but these drives down here are the ones we're going to use to create our raid array. i've already partitioned the first three, and you'll notice each is a 10 gigabyte drive and the partitions are 9.9 gigabytes. we're going to do sde, the last one, together. so fdisk dev sde, and the first thing, well, we can press m to see all of our different options, but i'm going to go kind of quickly. i'm going to say o, which creates a new empty dos partition table. it doesn't have to be dos, but i just decided dos. then i'm going to say n for a new partition, i want the partition to be primary, so default, partition number one, default, the start point on the drive, default, and here's where, if i choose the default end point, it's going to be 10 gigabytes in size, which would be fine as long as our replacement was the exact same kind of drive, but i want it slightly smaller than the drive itself, so here i'm going to say plus 9.9 g and press enter, and now we have it, 9.9 gigabytes in size. any 10 gigabyte drive will be able to replace this one, because we'll just create a 9.9 gigabyte partition inside it. now one last thing, if we do t for type and press enter, it's going to ask what partition type we want. it says it created a new partition with a type of linux, but if we type capital l we're going to see all of the available type codes. this is not a format, it's just a flag that gives the kernel a hint as to what sort of partition this is supposed to be, and the one we want is fd, for linux raid autodetect. so i type fd and it says changed type of partition to linux raid autodetect. i press w to write this change to the disk, and now if we do lsblk we see we have all of them. again, that partition type is just a hint to the kernel, so if you put these drives in a new system it's going to say, oh, look at those partitions, those are part of a raid array, and treat them as such. but it's just a hint, and raid works even if you don't change that partition type.
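if you'd rather script that partition prep than walk through fdisk interactively, parted can do the same job non-interactively. this is my own substitution for the fdisk session, with the size and device name from the demo, so treat it as a sketch.

  # interactive fdisk keystrokes from the demo: o, n, p, 1, <enter>, +9.9G, t, fd, w
  sudo parted -s /dev/sde mklabel msdos                  # new empty dos partition table
  sudo parted -s /dev/sde mkpart primary 1MiB 9900MiB    # one partition, slightly smaller than the disk
  sudo parted -s /dev/sde set 1 raid on                  # flag it as a linux raid autodetect partition
  lsblk                                                  # confirm sde1 shows up a little smaller than the disk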
all right, so now it's pretty easy to create the actual raid device. we're going to create a raid 5 device with four roughly 10 gigabyte devices, so we should end up with about 30 gigabytes of usable space in our raid 5 array. the tool we use is mdadm, and we're going to say dash dash create because we want to create a brand new array, and dash dash verbose just so we can see it do things as we type it in. now, what device do i want to create? well, the devices are dev md plus the number of the raid device you're creating, so we're going to start with md0 because it's our first device and we don't have any raid devices on here yet. then i'm going to do dash dash level equals 5 because i want it to be a raid 5 device, and dash dash raid devices equals 4 because we have 4 devices, and then we list those devices out, and we're going to list the actual partitions, so dev sdb1, dev sdc1, dev sdd1 and dev sde1, and press enter. boom, it created it that quickly. now we can see the details of it if we look at proc mdstat, that's in the virtual file system proc, and it shows us the current raid arrays in our system. here we have it, it's currently a raid 5 array, it shows our devices, lots of information, and it says recovery, that's because it's still building the array, but we can use the array while it's doing that, which is really awesome. okay, so if we look in dev and grep for md, we're going to see there we have md0, so we have a device all created and we can now use it as a hard drive in our system. before we do that, though, i want to save the configuration of this raid 5 array into our system so that on boot it knows exactly what sort of array to build. to do that we just do mdadm dash dash detail dash dash scan, and that shows us the configuration for our current array. what we want to do is save that, so i'm going to redirect it into etc mdadm mdadm dot conf, and now every time we boot the system md0 is going to be created, and then we just treat it like any other hard drive on our system. so mkfs dev md0, and boom, it created a file system, and now it's part of our system, about 30 gigs in size. let's see, lsblk, and if we look down here md0 shows up as 29.6 gigabytes, about 30 gigabytes of raid 5 storage on our system. if you're thinking that was a little bit too easy, well, you're right, linux raid is awesome. it's super simple to set up, it lets you take the regular drives in your system and set them up as a raid device, and as long as you save that detail scan output into mdadm.conf and you know to check the progress or status of your current raid array in proc mdstat, you are really set. that's all it takes to use raid on a system using nothing but software provided with the linux kernel.
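here's the whole software raid sequence from this demo in one place. a minimal sketch: the partitions are the ones from the demo, and the config file path is the ubuntu-style /etc/mdadm/mdadm.conf, on some distributions it lives at /etc/mdadm.conf instead.

  sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
  cat /proc/mdstat                                               # watch the array build / check its status
  sudo sh -c 'mdadm --detail --scan >> /etc/mdadm/mdadm.conf'    # save the layout so it assembles on boot
  sudo mkfs.ext4 /dev/md0                                        # then treat md0 like any other disk
  lsblk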
installing tarballs sounds like some sort of prank you might pull on somebody in high school, but really this is the way we installed software on linux for years before package management systems came out. you can still download tarball files of source code for programs and install them, although it's not terribly common anymore. we used to refer to this as the three-step, we would extract, compile and install, and that's the process of turning source code into an executable program. i'll show you how to do that really quick. i've already downloaded a very simple program as a tarball, it's called sunwait, a simple program that waits until the sun goes down and then executes, so if you have a script you want to run at sunset it's kind of a cool tool to use. the first step is to extract it, so we're going to say tar minus zxvf and the sunwait tarball, then go into the folder that's created. if we type ls we'll see there are a couple of files in here, the source code and also a makefile. sometimes there are more complicated things like dependencies, and sometimes there's going to be a configure script, in which case you run that first and go through that process, and it'll tell you if you need other dependencies. since we don't have a configure script, i'm just going to type make, because make will compile it into an executable program. now, you'll notice we got some warnings, but we didn't get any errors. a lot of times if it's a big program it'll say, oh, i need this dependency, or, oh, you forgot this library, and you'll have to download and install those dependencies before you can compile it, but this one is very simple. if we type ls again we're going to see we have a result, the sunwait program, and we can execute it right here by saying dot forward slash sunwait, and sure enough there it runs, so we could use this to wait until sunset and then execute a program. if we want to install it, though, we have to either copy it to our usr local bin folder, or sometimes the makefile includes an installer so we can say something like make install. this one doesn't actually have an install target, it's a very simple program, so if you get an error like that you just simply say, okay, i'm going to move sunwait to usr local bin, and now if we type sunwait it's going to execute, because it's in our path. i told you that three-step process is really simple, and it is, you basically type make, it compiles, and then you have a binary that you can install either using a script or by putting it in usr local bin. there is a big disadvantage, though, if you compile things from source, and that is there's no update mechanism for getting a newer version. a package manager will update old software, but if you compile something yourself and install it manually there's no way to update it, and that can be a real problem, especially when security concerns come up. so while it's important to understand how to use tarballs to compile and install programs, it's not really the best way to go about it if you have any other options.
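to recap the three-step in one runnable sketch, assuming a generic tarball name (the exact filename isn't shown in the video) and a project that may or may not ship a configure script:

  tar -zxvf sunwait-xxx.tar.gz        # 1. extract (the real filename will vary)
  cd sunwait-xxx/
  ./configure                         # only if the project provides a configure script
  make                                # 2. compile
  sudo make install                   # 3. install, if the makefile has an install target...
  sudo cp sunwait /usr/local/bin/     # ...otherwise just copy the binary somewhere in your path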
deb files are the way that programs are packaged up in the debian and ubuntu world. now there are a couple of different ways we can manage that subsystem, and there's a little bit of confusion as to which tool to use. behind the scenes they all use dpkg, which is the lowest level of interaction with deb files, and i'll show you why that's not what you use on a regular basis to actually install packages on debian or ubuntu. there are three different options for installing packages using the proper system: apt, apt-get and aptitude. i want to talk about them because they do the same thing, it's just a matter of older tools being replaced by something better. apt, just apt all on its own, is the newest program for interacting with the apt system. it's new, it's simple, this is the one to use, so just use apt, and i'm going to try to use apt if i can. aptitude, you may find online instructions telling you how to install packages with aptitude, it's older but it still works. i don't recommend you use it though, it's a little bit strange when it comes to dependencies and it's been outmoded, apt is now the way to go. and then apt-get has been around for a very long time, it's the oldest of the three. it still works, but i don't recommend it either, because again apt is by far the best way to go about installing packages. here's the real problem though, i've been installing packages on debian and ubuntu for so long that sometimes, if you're watching me in a nugget, i might use apt-get by mistake just out of habit. it still works, there's nothing wrong with doing it, but the proper way is to use apt, and that's what i'm going to try to do and show you now. on our ubuntu system, if we type ls we're going to see i have a deb file for kate. kate is a text editor that works in the kde environment, and this is the installer, the deb file. now remember i said dpkg is the program used behind the scenes, that's how you interact with deb files. here's the problem: i'm going to say sudo dpkg minus i for install kate.deb, and it's going to try, but it doesn't resolve any dependencies. so if i want to use dpkg to install it, i'm going to have to find every one of these dependency deb files on the internet, download them, install them one by one, find out if they have dependencies of their own, and it can be a real mess. thankfully the apt system takes care of all the dependencies for us. so i have to erase this first, i'm going to say sudo dpkg minus r kate and it's going to undo the mess that i made, and now we're back to square one. rather than downloading the deb file, we can use the apt package management system and just say sudo apt install kate, and it's going to look for the latest version and get all of the dependencies, and you can see there are a ton, it would have taken me a week to come up with all of these. if i say yes it's going to install them all, and it's finally finished, that took over three minutes, i sped it up so you didn't have to sit here with me the whole time. now all we have to do is run kate, because it's installed with all of its dependencies on the system, and here it is, our little text editor kate. another nice thing about using a package management system is we can keep things updated. we could say sudo apt update, which downloads the latest repository information to find out what updates are out there for us, and then once we have the freshest information about what's available we use that same program and say sudo apt upgrade, and it looks like we have a couple of things we could upgrade, hit enter and it's going to keep our system up to date, that easy, not worrying about dependencies, it does all that on its own. so while all three of these tools will technically work for installing and updating packages, you really want to use the simplest one, which is apt, it's the newest and it's the easiest to remember.
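pulled together, the deb workflow from this demo looks like this. the apt --fix-broken line is my own addition, it's a common way to pull in dependencies that a bare dpkg install left unresolved, and kate is just the example package from the video.

  sudo dpkg -i kate.deb            # low-level install, will complain about missing dependencies
  sudo apt --fix-broken install    # assumption: pull in whatever dpkg left unresolved
  sudo dpkg -r kate                # or back the low-level install out entirely
  sudo apt install kate            # the normal way: apt resolves and installs every dependency
  sudo apt update                  # refresh repository information
  sudo apt upgrade                 # then apply available updates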
rpm is the red hat package manager, and it does just that: it manages packages on red hat based systems, meaning red hat or centos or anything else that uses the rpm system, and that's how those systems handle their dependencies, their installations, their programs, updates and so on. now there are a few tools we need to know in order to really utilize rpm. yum is kind of the de facto standard, there's a new kid on the block called dnf that i'll talk about, and then of course rpm itself is not only the name of the packaging system but also the low-level tool we use to actually handle packages. the cool thing here is that there's no two-step process required when it comes to installing, and i'll explain what i mean by that in a moment. so here are the programs i just mentioned. yum is the yellowdog updater, modified, which may seem silly, but for a brief moment yellowdog was a linux distribution that ran on powerpc, old apple hardware, and its claim to fame was that it started this yum program for managing packages. the operating system itself didn't do well after the powerpc platform faded out, but yum is still around today, and in fact it's the program we use on almost every rpm distribution. it handles dependencies, and it updates the repo information as it's installing and upgrading, so unlike apt you don't have to say yum update and then yum upgrade, yum updates its information before it does anything else. now dnf, and i'm not kidding here, it stands for dandified yum, is the new program. it's in fedora right now, it's not in centos yet, but it's going to replace yum. it's basically a rewrite, it has some features that work better, but it's structured similarly and works similarly. and then of course rpm is what happens behind the scenes, there's a program called rpm, and it's the low-level tool, but it doesn't handle dependencies, so we generally don't use the rpm tool on its own. let me show you what i mean. i'm on centos here and we're going to use yum to install packages, but first i want to show you rpm. if we look, i have downloaded this program called kate, just a text editor that works in the kde environment, but if we try to install it with rpm we have a problem. if we say rpm dash i for install and the name of the package, it's going to say i can't do that because you have a hundred and ten billion different dependencies that aren't installed. so what you'd have to do is find every dependency, every rpm, install them one by one, and those probably have dependencies of their own that we'd have to track down. thankfully that's where the package management system yum comes into play. we can simply say yum install kate, and it's going to go to our repositories, update the cached information, show us the latest information from those repositories, search for kate, and find all of the dependencies, and the dependencies of the dependencies, and now it says would you like to install it along with the 77 dependent packages. okay, i'm going to say yes, actually i'm going to say no. if we installed it, it would just download and install all of those packages, life would be good and we would have kate installed, but that's how we would install a package and what i want to show you now is updating. if we want to update the system, all we need to do is type yum upgrade, and it's going to query all of our repositories, download all of the package and dependency information we need, and now it's telling us that to get our system updated we'll have to download 116 packages, install one new package, and it's going to take 308 megabytes of space. do we want to do that? i'm just going to say yes, and it's going to go through the whole process of downloading all those rpms, using the rpm tool in the background to install each one of them one by one until they're all updated and installed to the latest version. so yum is very simple to use, and knowing that it comes from a distribution that isn't even a valid distribution anymore, i don't know, it's just cool to me that yum is still around even though yellowdog hasn't been a distro for many years. now, i don't have a fedora system to show you dnf firsthand, but it's a very similar program to yum and it works very similarly, so if you find yourself on fedora just use it much like you'd use yum and you'll be fine. one of the big things i want to point out, though, is that there's no two-step process like there is with the apt environment, meaning you don't have to update your repositories before you install, you just run yum and it's going to update and upgrade all the
information before it does any installing or upgrading one of the nicest things about the app package management system is that you can add repositories which are just different groups of software packages so if something's not in the standard ubuntu or debian system and somebody else has written something you can add a whole repository of new software and then the app system can use it just like any other package on your system it's pretty cool now we're going to look how to add something and there are a few gotchas that we need to look at but thankfully it's a fairly simple system when it comes to configuring new repositories so i'm here in the etc apt folder on an ubuntu system if we do an ls we're going to see there are two things i want to show you there's sources.list sources.list.d this is a folder and there's actually nothing inside there right now but we could create a new file.list in here and it would be read just like this systemdefaultsources.list so it doesn't matter where things go i'll just add something to the existing sources.list so we're going to look and you'll see there's already a bunch of repositories added each one of these lines is a repository containing software that the app system can install now some of them are commented so we would just uncomment the existing ones like here this is a partner archive we could uncomment this and then this would be an active repository but if we want to add a third party repository we can like i said either add it at the end of this file or create a new file in that sources.list.d folder and call it like new program.list but i'm just going to add it in here and i'm going to add the opera browser repository in here let's say we wanted to add the opera browser it looks like this it's okay if every field here doesn't make perfect sense to you this is just the format it says what kind of a package it is where it's stored what folder in there and then what version of the software we want to actually add the stable and then it's non-free as far as like what what type of repository it is so this is the line right this is where the opera browser is stored on the opera website so if we save this we would be able to do apt update no dash apt space update and if you look it is updating them all but i do want to show you back at the top here it gave us a little warning slash error oh and it repeated the error down here so i'm going to show you here it says this repository is not signed so what that means is we don't have their key because we don't want there to be a man in middle attack where somebody you know takes over opera.com and then starts sending us bogus packages so they do signing key signing in order to make sure that we get the right software so if we want to use the repository we have to add their gpg key to our system now it's not difficult to do we just do kind of a two-step here we have to download the key itself so wget so we're going to download the file and then pipe that into app key add all right so we'll do that it's going to download the key and it's going to add it to our system and it just says ok we can see if it's there by typing apt key list and it's going to list all of the keys in our system and if we look we should be able to find sure enough here's the opera one so there's the upper key that we installed now if we do app update notice there's no errors at all it updated and now we could just install it using our apt system app install opera stable and it would install our needed packages you know it would 
resolve dependencies even if they're in other repositories, and it would install it for us just fine. i'm going to say no, because i want to show you another way to add a repository, and this is kind of a cool thing that ubuntu added, it's called a ppa, or personal package archive. there's a particular text editor i really like called atom, and they have a ppa, which is a repository, and to add it you just say add-apt-repository ppa colon and then where it lives, what user it is, webupd8team, and the name of the repository, atom. press enter and it does a couple of things, it adds it to our sources.list, but it also downloads the signing key for us, so all we have to do is press enter and it installs the key, and it even does an apt update for us. then we would just do apt install atom and sure enough it would be able to install it from that repository we just added. it can be a bit overwhelming once you have to start adding gpg keys for the repositories you put into your sources.list, but ppas make it really simple, they do everything for you. i really like ppas, i think they've been a wonderful addition to the way we handle apt repositories. repositories in an rpm system using yum are very similar to those in an apt system, adding them is maybe even easier, and editing the config is about the same, just with different files. let me show you what i mean. the main configuration file in centos is going to be the yum.conf file, so let's look at that, it's in etc and it's yum.conf. here's where we do a couple of things, we set the main configuration, things like where the cache is stored, and one thing i want to point out in here is this gpgcheck setting. you can actually turn off yum's ability to verify packages using signing keys, and that will just allow it to install from any repo you add to your system. it's not a good idea to turn this off though, because again this is a safety check to make sure you're not getting a man-in-the-middle attack. so the main section is at the top, and you can put new repos at the bottom, or, like it says right here, you can put them in etc yum.repos.d as individual files named something dot repo. that's usually what's done, but we could add them here. let's look inside the yum.repos.d folder, so let's get out of here, go into etc yum.repos.d, and here's where all of the repos currently installed on our system live. let's look at one really quick, the base file here, and we're going to see how it's set up. there's the base configuration up here, the name of it, the release, where the actual files are stored, gpgcheck, which you can turn on and off for individual repos as well, so if you set up a repo and you don't have a signing key you could turn it off for just that one repo, and then this tells us where the actual key file lives if the check is turned on, so we can point to it manually and yum knows where to look to find that key. now normally these aren't added manually, but you could type all of these in, these are all individual repos defined in this bracketed section. usually what you do though, and it's really elegant, is, let's get out of here, you just install a package. one of the really popular repositories is called epel, which stands for extra packages for enterprise linux, and to add it we just do yum install epel-release. i'm going to say yes to install it, and now if we look in here we're going to see, here we go, the epel.repo file has been added, so now if we were to install packages, epel would also be one of the places we could pull packages from. we can just cat it to look at it, and we can see it's enabled, the gpg check is on, and it tells us to use this key file for checking the signatures. that's all there is to installing a repository in yum, and since yum automatically updates we don't have to update the cache when we install a package, it fills in the blanks for us automatically. yum is an incredibly awesome and powerful package manager, and we can tell just by how easy it is to add a repo and edit the config, either using the yum.conf file or the individual files inside etc yum.repos.d.
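in command form, the centos repo side of this demo boils down to something like this. the kate install at the end is just a plausible follow-up, the video doesn't actually run it to completion.

  cat /etc/yum.conf                    # main config, the gpgcheck setting lives here
  ls /etc/yum.repos.d/                 # one .repo file per repository
  sudo yum install epel-release        # drops epel.repo into /etc/yum.repos.d/ along with its key
  cat /etc/yum.repos.d/epel.repo       # enabled=1, gpgcheck=1, gpgkey=...
  sudo yum install kate                # epel is now just another place yum can pull packages from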
apt and yum are certainly the most common package managers out there, but there are a few less common package managers that you should still be aware of. arch linux uses a program called pacman for managing its packages, and opensuse uses zypper. now, i'm not saying the mascot for zypper is a purple horse, i'm just saying maybe it should be. nevertheless it's not too difficult to use them even if you're not familiar. here i have two terminal windows to two different linux distributions. the first one is arch linux, which uses the pacman package manager, so if we just do pacman minus h we're going to see a list of the commands. now it's not immediately clear how you go about installing a package, unfortunately it's not just install, it's actually capital s for sync, we kind of want to sync the system into the state that we want. so to install a package it isn't difficult once you know what to do, just pacman minus capital s, and let's install vim, my favorite text editor. you do have to spell pacman correctly: pacman minus capital s vim. it'll ask if you want to install it, say yes, it's installed, and now if we type vim you can see, sure enough, there's vim, my favorite text editor. now over here on opensuse it's a little bit different, here we use a program called zypper, z y p p e r, and you do zypper minus h and it'll show us all of the help commands that are available. this one is pretty easy, you just do install, or you can shorten it to just in, so we could say zypper in vim to install vim again. press enter, it's going to retrieve the repository information online just like yum or apt would do, and then it's going to install the vim package for us so we can use it on opensuse. it'll ask if we want to continue, i'm going to say yes, and it installs all of the packages and the dependencies. so now, same thing, we start up vim and sure enough there's vim, this time on opensuse. pacman and zypper are really two of the more popular alternative package managers, but there are some others, and if you're on slackware you're going to have to install things by hand using tar.gz files, but these two along with apt and yum will get you through most systems.
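side by side, the two demos look like this. the full-system update lines are my own additions for completeness, they aren't run in the video.

  # arch linux
  sudo pacman -S vim        # -S (sync) installs a package
  sudo pacman -Syu          # assumption: refresh the repos and upgrade everything

  # opensuse
  sudo zypper in vim        # "in" is short for install
  sudo zypper update        # assumption: apply available updates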
managing local users on your server is really easy, and there's a bunch of tools that make it even easier. there are the standard command line tools that let you add, modify and delete accounts, and there's this super cool script that i really like which makes adding a user very easy. there are a whole bunch of different facts about a user stored in the system, full name, username, password, all of these things plus some others that aren't even listed here, and it's important to know that the tools can manipulate all of these, but you don't have to specify every single thing every time, especially things like office number, those aren't even really used anymore, although you still have the option of recording that kind of information for a local account. now i just want to go right to the command line so we can actually start adding users, and at first we're going to use the low-level tools that come with the system, useradd, userdel and usermod. first of all i want to run useradd, because we don't have any extra users on here yet, and i'm going to do dash h, which gives us the help screen. you can see there are tons of options for adding users, but the format is pretty much the same, useradd, whatever options you want, and then the login or username for the new user. we're going to use just a couple, and i want to show you the problem with using useradd as opposed to that fancy adduser script: we have to specify things like a home directory and a user shell and all sorts of things like that. so let's make a user, and notice i'm root, we have to be root to make a system user. we say useradd, and i want to do minus d home susie, minus s for shell, i'm going to use the bin bash shell, and her username is going to be susie. press enter, it's all done. now there is no password for susie, we have to set that with the passwd command, so we'll say passwd susie, and now she has a password. let's start a new window, ssh susie at localhost, and we're logged in as susie, oh, but see, it says unable to change to directory home susie, no such file or directory. well doggone it, we said that's her home directory, but by default useradd doesn't make that directory. we'd have to make it ourselves, or there's another flag, dash m, which creates it as we're adding the user. so it's possible to use this tool and set all of those different flags as we create the user, but it's much easier to use the adduser script, and i'll show you what i mean. adduser frank, and now it says, okay, i'm creating the frank user, i added a new group for frank, created frank's home directory, copied all the files from etc skel, and now it's asking me to give frank a password, so i'm going to do that. then the full name of frank, his room number, which again we don't really use room number, work phone, all those things anymore, but full name is nice to have in there, and is this information correct, i'm going to say yes. so now we have a really nicely set up user, and if we open a new tab again and do ssh frank at localhost, log in with frank, i have to type frank's password correctly, and now you'll see, sure enough, if we do pwd we're in frank's home directory. he has a perfectly usable account, because that adduser script goes through and remembers all the various things that we need to do. so let's get out of here. now there is another tool we can use, userdel. we can say userdel, let's do minus h so we can see the options, and there's really only one option that's ever really important, that's minus r, which removes the user's files. so we could say userdel minus r for remove, frank, and it removed all those files, it did say there wasn't any mail spool so it couldn't remove that, but now frank's home directory is gone. now we could modify susie's account with usermod, so let's see, usermod minus h, and we see the same thing, a whole bunch of options, and this is just for modifying an existing user. we could do something like change her shell, like right now it's bin bash, but we could change her shell, and we'll do that next.
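here's the adding-and-removing part of this demo in one place, changing an existing account with usermod continues below. a minimal sketch with the usernames from the video.

  sudo useradd -m -d /home/susie -s /bin/bash susie   # -m actually creates the home directory
  sudo passwd susie                                   # set her password separately
  sudo adduser frank        # the friendly script: group, home dir, /etc/skel copy, password, full name
  sudo userdel -r frank     # -r removes the home directory and mail spool too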
we could say user mod minus s let's say bin false susie what this is going to do it's going to change her shell so if she tries to log in it's going to fail so let's open up a new tab and try to log in as her ssh suzy at localhost it logged in and then immediately logged out see it says it still doesn't have a user account we are a user directory we didn't make her a home directory but then it says connection to localhost closed that's because as soon as it logged in her shell is been false which immediately exits and then we're back to being logged out we're logged in as bob here so the ability to modify a user account is really important but if you're going to add a user i highly recommend you use that script you can certainly use the manual tools with all the flags to add that account but add user just makes it so easy by remembering all the steps and i'll be honest i often have a difficult time remembering is it add user or user ad add user user ad for my own sake i like to think okay alphabetically add user is first and that's the first tool i want to use for adding a user so that's just the trick i use regardless of what tools you use to add users it's important to understand that modifying them on the command line is fairly simple and not that difficult to learn local groups on a linux system are fairly straightforward to handle but it's important to understand the difference between primary groups and secondary or supplementary groups on each individual user now every user on the system is going to have their own personal primary group now usually that's going to be the same as their username so bob is going to have a username of bob and he's also going to have a primary group of bob and that's usually how it goes it could be a it could be a different group that says primary group but almost always that's how the primary group is going to be on a linux system and then there are all of the supplementary groups which happen to be like things that he belongs to like maybe he's in the admin group maybe he's in the sales group marketing third floor public all of these groups are in addition to his primary group and it will give him access to certain folders on the system that he might not have access to if he didn't have these different group memberships now in order to actually create groups the tools are very straightforward just like adding users with much fewer options so let's actually look at that and i want to show you how to manipulate users in groups on a system now here we are on an ubuntu system and we can simply say group add public it's going to add the group public to our system we could say group add sales and it's going to do that now we could look at the different options group add minus h you see there aren't too many options we could specify a group id number if we wanted we could do some things with group mod to change some of those features like the group id and then of course we could delete them with group dell and the same sort of thing you know we just delete them as we would use user dell on a system to get rid of a user so this is pretty straightforward but the part about primary and secondary can be confusing so right now if we type groups bob we're going to see that bob is in all of these groups the first one listed is his primary group so bob's primary group is bob and then these are all of his supplementary or secondary groups now i want to show you user mod minus h because it shows us how we can manipulate groups right here so lowercase g is how we force 
a user to a new primary group so we could change bob's primary group using the lowercase g flag on user mod in order to change his secondary or supplementary groups we use a capital g but here's where the gotcha comes into play if we do like minus capital g public it's going to delete all of his other supplementary groups and only make him part of the public group and that's not what you almost ever want so there's this nice dash a which means append the user to another supplementary group without getting rid of the additional groups that he already belongs to so let's actually do that in practice so we're going to say user mod minus a for append minus capital g public bob okay and now if we do groups bob we're going to see that bob still belongs to all those secondary groups and also to the public group which is exactly what we wanted to have happen because if we wouldn't have used that a this is what would have happened if we would have done user mod minus g let's do sales for bob this seems like it's going to do the right thing no errors but then if we do groups bob we're going to see he still has his primary group of bob and then just sales and so we'd have to manually go back through and add him to each individual group which is really a pain in the butt so you don't want to do that you want to always remember to use the dash a now in function the difference between a primary group and a secondary group the primary group is what is used if bob were to create a file so let's open up a new tab here we are bob say this is in our home folder if bob were to touch a file and we do ls minus l we're going to see the file that we just created is owned by bob and the group membership is bob's primary group of bob so that's really the difference between primary and secondary is when you have a primary group that's what the group membership of a new file you create is going to belong to now creating groups is really easy with the group ad group mod group dell tool so i didn't even go into that very much in depth the real important takeaway is the idea of primary versus secondary or supplementary and how to take individual users and put them into groups without deleting all of their other supplementary groups that they already belong to i hope this has been informative for you and i'd like to thank you for viewing figuring out what users are on your system and what accounts they're using and what they're doing is really an important part of forensics but also it's an important part of a sanity check like why is my computer running slow or what user am i logged in as and there's a handful of simple command line tools that we're going to go over that will just help you figure out the users on your system now the first scenario this happens a lot if you log into embedded systems like routers and stuff where you don't get a prompt that tells you who you are you just get like a hashtag here well what you can do is say who am i which seems a little bit silly but it will just give you the user that you're currently logged in as if you're starting to use sudo and su and you're sshing from one computer to another sometimes just figuring out what user you're logged in as is really important so that's it seems silly but it's something that i use in nuggets even you'll see me use that quite a bit another one if we just do who it'll show us who is logged into our current system it actually gives us quite a bit of information we have bob frank and looks like susie is logged in twice and it tells us when that 
person logged in and where they're logged in from so bob is our current user and i'm logged in on the display zero meaning i'm using an x windows session here and then frank and susie are both logged into localhost probably ssh and then susie is logged in from a remote computer probably over ssh as well so who tells you that w which is like what although you don't have to type the rest of it it's just a w gives you the same information with a little bit more so these are the users where they're coming from when they logged in how long they've been idle and then what again that's where that what came from what they're actually doing so here i'm using a gdmx session looks like frank is sleeping on the job frank judging you buddy and then susie's just logged into a bash terminal so these are tools that give you information another one is pinky which seems like a silly tool but it replaces the older tool finger that has kind of been abandoned about 10 years ago but this gives you even more information but not the what's going on this gives you the login name full name when they logged in how long they've been idle where they're coming from and a combination of these really usually any one of them will give you the information that you want but these are all here to give you information on who's logged into the system now if you want further information about it you could also do id susie and it will give you information about susie's id including her user id her primary group id what groups she belongs to looks like she only belongs to the susie group so let's do an id on bob and yeah bob belongs to a whole bunch of other groups so it'll give you all of that and then lastly lastly there's the command last hahaha if you type last it gives you a history of the people that have been logged into the system recently so it looks like we go back all the way to march 7th here today's march 13th and i logged in a bunch of times just so we would get some results here and we can see who logged in where they logged in from and when they logged out or if they're still logged in so you can see i logged in at 1001 logged out at 1001 up here logged in at 1003 still logged in from the local host computer and that is suzy so there's a lot of tools that you can use to figure out who's logged into your system how long they've been there where they're coming from what they're doing and these really might seem like throwaway commands like why would i need to figure out who's logged into the system but knowing the simple little tools like who and w and last can really be convenient when you're tracking down what's going on in your server who's logged in who was logged in did somebody log in bob said he logged in did he really log in well check out last and it'll tell you the point is there's a bunch of tools that are available that'll give you information on users on your system whether they're logged in not logged in or have been logged in in the past passwords and group memberships are things in linux that are stored in text files just like everything else but the problem is we don't want people to have access to our passwords and so there's kind of this elegant system of shadow files that has been invented this is fairly recent in the world of technology it used to be everything was stored in a single file called etc password but things have been changed now so that everybody doesn't have access to seeing the encrypted password so how it works now is this let's say this is our user bob now bob when he logs into a 
system has to be able to see what his home directory is so he needs to have access to a lot of information about his account however we don't want bob to have access to everybody's encrypted passwords so that's kept over in another file called a shadow file so what we have in the password file is we have bob's username and then literally just an x that takes the place of where the password used to be stored and if this x is here then the system knows okay i need to go over and look in the shadow file for the actual encrypted password of bob and then it does system authentication with root access as opposed to bob's access which is to read the password file but he doesn't have access to everybody's encrypted password the reason we don't want that is if he were to do like a brute force attack if he had access to everybody's encrypted password he could just keep hammering away at it until he finally figured out what the password was we don't want that so we don't want every user to have access we just want root or the system to be able to authenticate and check out you know the encrypted password files so anyway there's a neat and elegant system of how this works now notice obviously the etc password file is readable by all the et cetera shadow file is only accessible by the root user but when it comes to editing those there are special tools to make sure that we do it properly now first let's actually look at the file so if we do an ls minus all of etc pass sswd we're going to see that sure enough it's owned by root and everybody on the system can read it right only root can write to it but everybody can read it but if we look at the etc shadow file we're going to see that only root can write to it and only people in the shadow group can read to it everybody else in the system can't even read it so our encrypted passwords are protected now there's also if we do ls minus l etc group we'll see the same scenario where this is the group definitions for users on the system and the same settings permission wise as etc password has and if we do ls minus l etc g shadow we're gonna see it's the exact same thing as the shadow file now groups don't normally have passwords associated with them but they can so the shadow system does the same with the group file as it does with the password file now if we wanted to edit one of these files we could do something like sudo vi etc pass s wd and it's going to let us edit this file using our text editor but this is not the ideal way to go about it because we want to do it the shadowy way right we want to be able to edit it and then be sure that the shadow file matches so let's get out of here let's not make any changes to properly edit these files what you do is say sudo vipw this is part of the shadow package and it's going to ask us what editor we want to use if you want to use nano this is the easiest one like it says right here easiest you can use this i prefer to use vim so i'm just going to choose selection number two but number one is perfectly fine and it opens up and looks exactly the same and we make changes here and we could like go down here you'll notice all the things about bob are listed in this line right here's his username his password is just a placeholder here as an x because the actual encrypted password is in the shadow file but his user id is group id his full name his home directory his shell we can make any changes we wanted here and then we would just go and save the changes and this is what it would tell us it would say okay you've 
modified the etc password file you may want to modify the shadow file too and to do that you do vipw dash s for shadow so the same thing we would say sudo vi pw dash s and now this is the actual shadow file that we're editing and we can go through and if we wanted to make changes here this is bob's encrypted password now this is obviously not bob's plain text password his plain text password is just the word bob we know that but this is what it's like encrypted but since this is only accessible to root there's nobody who's going to be able to do a brute force attack to try to decrypt this because they don't have access to the passwords encrypted themselves so we'll get out of here and now the exact same thing with the group password or the group file in the group shadow file is done too we can say sudo vi gr and this is going to edit the group file and sudo vi gr s and that's going to edit the g shadow file all right so that's the proper way to go about editing those files manually if you man if you edit these files it's going to do the same thing as if we did like user mod and changed somebody's home directory it just does it by editing the underlying configuration files so yes group and password files are still just text files but there is this elegant shadow system that allows us to make sure that the right people have access to the encrypted passwords and not everyone on the system can see everyone else's encrypted passwords quotas are the way that we make sure users or groups don't overuse the hard drive we don't want a particular user a particular group to use up too much of a hard drive and stop other people from saving files now there's soft quota limits and hard quota limits the difference is with a soft limit you're warned every day hey you've gone over your limit hey you've gone over your limit whereas if you reach the hard limit you're no longer able to save files at all now there's a couple things that we have to do to get our system ready for using quotas and keeping track of things but the first thing is we have to make sure that our partition is mounted correctly so on our system here i have a disk mounted or a 10 gigabyte drive mounted on mnt disk okay now if we look into our etc fs tab file i'm going to show you how you mount it so that quotas are enabled so here's our drive it's dev sdb1 and it's mounted on mnt disk it's ext4 i've used default mounting options and then i've added a comma and usr quota now we could also put grp quota we put another comma and grp quota i'm just going to do user quotas group quotas work the same way so we'll learn one and we'll know how to use both but we have to make sure that it's mounted this way so if you're making this change you want going to want to reboot your system to make sure that it actually takes effect and then if we type mount we're gonna see that sure enough mount disk or mnt disk is mounted with quotas enabled specifically user quota management okay so we know that the drive is able to support it but out of the box quotas are not turned on so what we need to do is first of all scan the existing drive for files owned by a particular user so we need to actually do sudo quota check dash a for all partitions that support quotas dash u for user owned files press enter it's going to go through it's going to check our drive and now if we look in mnt disk we're going to see sure enough now there's a quota file that has been created and it shows just you know all the files on the drive who owns them which is none right now but 
we're going to change that in a minute. now the other thing is, that prepares the drive, but we actually have to turn quotas on, so we're going to say sudo quotaon dash a for all supported partitions, and now quota enforcement is actually turned on. but if we want to set a particular quota for a user we have to use edquota, so sudo edquota, and i want to do this for the user bob on our system. we get this list, and it shows us all the file systems that support quotas, in our case just this one, dev sdb1. now i have to explain really quickly, there are two kinds of quotas we can set, quotas on inodes or quotas on block usage. an inode basically means a file, so we could limit how many files a person can store on a particular partition, but that isn't all that useful, right, i mean what if they have two files but those two files are like 27 gigabytes each. so rather than set limits on inodes i tend to set them based on blocks, and by default these are one kilobyte blocks. this zero means how many are currently in use, and there's nothing on the drive owned by bob, so that's zero right now, but i'm going to make some changes here, i'm going to say i want the soft limit to be 500 kilobytes and the hard limit to be 1000 kilobytes, about one megabyte. this is not a practical number, you'd probably do something bigger in real life, but we're going to save this, say yes, and now, as bob, if we go over to mnt disk and ls, we can see the drive, but there's no usage from bob currently. let's say i were to create a file. to do that i'm going to use dd, and it's okay if you don't use the dd command, basically we're saying an input file of dev zero, an output file of file1, block size equals one kilobyte so that we know exactly how many kilobytes we're using up, and count equals, let's say, 400, so this should make a 400 kilobyte file. i press enter and do ls minus l, and sure enough we have a 400 kilobyte file on here, and this is fine, this isn't hitting our quota at all, we haven't done anything bad, we're not even up to our soft limit. but if we do this again, i'm just going to push the up arrow and change this to file2, and press enter, okay, it's done the same thing, we do ls minus l and now there's 800 kilobytes stored on this particular disk. what this means is it hasn't stopped us from creating the file, but every day we're going to get an email from the system that says, hey, you've gone over your soft limit, you really need to delete some files, and maybe we'll do that, maybe we won't. but here's what happens if we try to create another 400 kilobyte file, which will take us over the limit, because 400 plus 400 plus 400 would be 1200 but we only have a thousand kilobyte hard limit. if we press enter it says there's an error writing file3, the disk quota is exceeded. so let's do ls minus l and see what happened. it looks like it went along, creating fine, creating fine, it got to 200 kilobytes and all of a sudden it couldn't write anymore, and that makes sense, because 400 plus 400 is 800, plus 200 is a thousand, so we hit our hard limit. and that's exactly how quotas work on the system. once quotas are turned on it's really a hands-off kind of thing, they take care of themselves, the warnings go out automatically every day if the user goes over their soft quota, and at the hard quota it stops them no matter what. so quotas are easy to set up: remember to use quotaon to turn them on, make sure the partition is mounted with the proper options, and run that initial quotacheck so the system knows what files are on there and when a user is or isn't getting close to the quota that you set.
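pulled together, the quota setup from this section looks roughly like this. the fstab line and the repquota report are written from memory rather than shown verbatim in the video, so treat those as assumptions.

  # /etc/fstab entry for the quota-enabled filesystem (assumption: ext4 on /dev/sdb1 at /mnt/disk)
  #   /dev/sdb1  /mnt/disk  ext4  defaults,usrquota  0  2
  sudo quotacheck -au        # scan quota-enabled filesystems and build the user quota files
  sudo quotaon -a            # turn enforcement on for all supported partitions
  sudo edquota bob           # set bob's soft/hard block limits in an editor
  sudo repquota -a           # assumption: report current usage and limits per user
  dd if=/dev/zero of=file1 bs=1k count=400   # as bob, a 400 kB test file to exercise the limits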
quota on to turn it on make sure that it's mounted with the proper options and run that initial quota check so that it knows what files are on there so it knows when it is or isn't getting close to the actual quota that you set user profiles are where initial settings are set for a particular user like if if they're going to set up like aliases or they need their path variables those are the sort of things that profiles will do and there are system-wide profiles and also individual profiles and it can be a little bit overwhelming because not every linux system is the same now there are some commonalities usually there's an etc environment file and that file will often but not always set up the path variables so that the users who log in get a particular path set up now there's almost always an etc profile in the system-wide file in a profile is something that is run on a login shell so like the very first time you log into a system like if you're logging into a gui that first time you log in you will execute the profile these are settings that only need to be executed one time like it doesn't matter if you're going to open a new terminal you only need to set these settings one time and then there's also going to be one of these not both of these it's usually either one or the other either the etc bash rc or etc bash dot bash rc uh this one is pretty common in ubuntu this one is pretty common in centos but nonetheless these are they serve the same function so you're going to have one or the other and these are profile settings that need to be set every time you open a shell so let's say you're already logged in to x windows like in a gui session you're in there and you click on an icon to open up a new terminal window well that's not considered a login shell this is just considered a sub shell of your main system login so you will not execute a profile you will only execute your bash rc now this is again system wide so that happens to everybody when you log in you get the systemid profiles applied and then every individual has the possibility to have these individual files in their home folders now they all start with a dot so they're all hidden you have to do ls minus a if you want to see them but these are the same things up here it's just if you have any changes or additions you want to make to the system-wide settings you put them in your own personal folder and you put it in like bash rc and that'll run every time you open a sub shell everything in here will be set up and then the dot profile or bash underscore profile depending on which system you have you're going to have one or the other one of these but this is executed the very first time you log in just like the system-wide profile your personal profile only gets executed that initial time when you log into the system any sub shells will only apply the bashrc files but that's the way it works i'll show you really quick how it's set up on a system but the hierarchy is really the important thing to understand now this is bob's home folder i did an ls minus la so you can see all of the things in here he has a dot bash rc file and he also has a dot profile that means that these are going to be applied after the system-wide settings because the system-wide settings are given to everybody and then any personalizations like if you have an alias that you want to set on your own you would put it in your own personal bashrc file now inside etc there are those common files like ls minus l grep4 profile we're going to see we have etc profile 
and also etc profile.d this is a folder if we go in there we're going to see there's a bunch of sh scripts all of these are included in the dot profile in fact i'll show you what i mean vi dot dot let's look at that system-wide profile file and if we look all the way down on the bottom here it's going to call in all of those files inside profile.d so when i say that the profile is executed by everybody not only etc profile but also etc profile.d everything in here is going to be executed as well to every user on the system now there's a couple important things to remember don't worry so much about what exactly is the name of the file when you look in the etc folder you're either going to find an etc bash rc file or an etc bash dot bash rc file don't worry too much about which system has which whichever one is there is the one that you need to use now as far as the hierarchy goes it's important to remember that the system-wide stuff is executed for everybody so everything in etc is executed for all users and then if you have changes or additions you put them in your personal dot profile or dot bash rc user profiles are pretty easy to track down and once you understand how the system-wide and individual settings work it's a snap to figure out which comes first if you're working on the command line a text editor is going to be an invaluable tool because pretty much everything in linux is text based now you should use nano nano is the editor you should use it's a wonderfully simplistic straightforward intuitive text editor that works just fine and then there's vi vi is clunky it's hard to use it doesn't make a whole lot of sense and it's the editor that i use almost exclusively now i know that doesn't make any sense but here's the deal vi has been around for a very very long time like since the beginning so even though it's difficult to use i've managed to learn to use it and it's just what my fingers do with muscle memory so i encourage you use nano unless you've been using vi long enough that it's the only option that seems to make sense now there is one scenario that you may want to learn at least the basics of vi sometimes you're going to come across a system that doesn't have nano installed most systems do but if you end up on a system that only has vi these couple commands are going to kind of save your bacon so here's the deal this is what makes vi so confusing there are two modes there's command mode which is what it starts in an insert mode which is what you use when you actually type text now the way that i can kind of describe this is if you're sitting down and typing you're going to be in insert mode because insert mode is where you insert text and delete text and use your arrow keys to you know go around and change text but then if you need to do some command like save or quit or anything like that you're gonna go into command mode and the way i think about it is let's say you are on a standard word processor if you're typing you're in insert mode if all of a sudden you need to reach for the mouse and click on something you're going to be in command mode so while it's not a perfect analogy if you want to do something like save you're going to want to go into command mode so that you can save and quit now since there's no mouse it's all still text things that you're doing but think mentally okay i'm in the mode where i'd be using my mouse to save things instead of just typing out text now to go back and forth that can be confusing too so you start out when you open vi you're in 
command mode if you want to start typing something you press either i for insert or a for append meaning like do you want to insert right where the cursor is or to the right of where the cursor is but either one is going to work fine so either i or a and then if you want to get back into command mode you press escape so those are your magic keys to go back and forth right i or a i usually use i to go into your typing text mode or insert mode escape to go back so that back and forth that's how it works now the actual commands to save or quit or save and quit are right here and they may not make sense but if you're in command mode you're going to press colon and then type w and then q and press enter that's going to save your document and quit a lot of times people get stuck in vi and have no idea how to get out it can be so frustrating so this will get you out also if you want to quit without saving like you've accidentally made changes and you didn't mean to you press escape again to get into command mode and then you press colon q exclamation point and press enter and that will that will exit without saving and then if you just want to save halfway through a document you can just do colon w enter and it's going to save but you'll stay in then you'll press i to go back into insert mode and continue making edits i'll show you really quickly well i'll show you what nano is and then i'll show you vi just so you can see it in practice like i said almost every distribution is going to have both nano and vi so i'll show you nano first if we do an ls we'll see i have this text file dot so if you just type nanotextfile.txt it's going to open the editor with this and you can just use arrow keys and you can start editing right away this is like you would expect any text editor to work okay so press enter it's going to insert blank lines and then if you want to save you can look right down here we have control x to exit and you can do other things too there's all sorts of commands but i'm going to show you the basics here control x to exit and then it says would you like to save your changes and you can say see the options here y for yes and for no control c for cancel i'm going to say yes and then it says what file name would you like to write well it'll default to the current text file but if you wanted to save it as like copy 2 you could i'm just going to hit enter and boom we're done the text file has been edited it's very simple very easy to use and again i recommend you use it now vi i'm going to look at the same text file with vi so vi text file text and here we have we're in what mode are we in we're in command mode now we can still use the arrow keys to get around but we can't edit any text or insert any text if we want to add text we press i and then look down here it says insert so this is a little cheat it tells you that you're in insert mode if we're gonna go back into command mode press escape i'm gonna go into insert mode press i escape to get out insert mode and once you're in insert mode you can type text and then if you wanna save it again you have to press escape and then colon w q enter and then boom we've saved the file we can look and see the text file has been changed all this changes were saved and that's how you use vi it's confusing but that's how it works so again use nano it just makes sense it's easy that's what i recommend you use but if you have to use vi at least now you know the couple shortcuts that are going to get you through so that you can actually use it to edit 
text now remember i said i use vi all the time and it's true but the funny thing is that it's followed me into things like word processors so sometimes in my microsoft word documents even on the very bottom you'll see colon wq because my head i just automatically do that when i'm done editing text anyway use nano but vi is fun and it's a good skill to have i hope this has been informative for you and i'd like to thank you for viewing viewing text files is an extremely common thing for a system administrator to do on a linux system so we're going to look at a bunch of tools that allow us to examine text files in a way that allows us to view but also to search and i'm just going to go right to the command line so that we can see these things work in real time now i've created in my folder here a file called two cities now this is just the public domain tale of two cities this is the first chapter i'll just type cat so we can look at it two cities see it's just a tale of two cities the first chapter all right so let's clear the screen now the first thing i'm going to show you is the head command and what it does it'll show you the first lines of a text file so the head or the beginning of it so we can just say head two cities and it's going to show us the first 10 lines of the story so the best of times the worst times that's part that we're familiar with now we can change that how many lines it shows us if we were to do head dash and 20 it's going to show us 20 lines the first 20 lines of the file so see it's a little bit longer now and it's shown us all 20 lines now head isn't usually as commonly used as its companion which is tail so let me clear the screen oop and it spell clear right clear the screen now if we were to do tail two cities this shows us you've probably already guessed it the last 10 lines and we could do the same thing with the dash n and a number we could decide how many lines of the file we want to see now this is really useful if you're looking at log files and that's almost exclusively where i use the tail command if i'm looking at a log file i just want to see the last things that were written to a log file so i'll do a tail of the log file in question and i'll see what was added to the very end so i don't have to look like 27 megabytes of text for all of the logs just the last little bit of it and so that's a very useful command and again you can use n20 if you want to see 20 or whatever number you want to see 10 is the default now the other ones i want to show you are less and more we'll start with more this is the older command so let me clear the screen if we were to type more two cities this is going to show us the entire file and if we wanna scroll through it we press the spacebar and it'll go to the next page spacebar go to the next page the enter key will go line by line but this is that's it so we're all the way to the end of the into the first chapter but that's how more works you just kind of go through it like that you can also search but it only searches down and it's not one that i use very often anymore because it's been outmoded by the much more powerful although it has a more diminutive name less so actually let me clear the screen so if we were to say less two cities now we have what looks like a similar kind of interface but we can use our arrow keys to scroll up and down page up and page down work so we don't have to worry about like hitting the space bar to go down a page we can spacebar will go down a page but then we can scroll back up with the 
up key now the other really nice thing about less and more does this to an extent but less is even more powerful if you type forward slash in a term so let's search for france press enter it's going to take us to the first entry of the word that we searched for it's going to highlight it and it's highlighted all of them so if we were to press forward slash and enter again it's going to repeat the same search and here we are francis found again france is down here again forward slash it'll take us to that one put it right to the top of the screen so we can search through an entire text file as well so it's very very powerful to use the command that seems like it would be less powerful because it's named less but really it's a lot better okay so to get out of here and this actually confused me for a long time if you just press q just the letter q it'll exit the less command and get you out of it so less more head tail they're very commonly used i usually use less and tail more commonly than the other two but that's just because i want to see the end of a log file and i want to be able to go up and down when i'm scrolling through a text file and search really powerfully it's not a really tough nugget because these are pretty straightforward tools they're all useful and you'll probably find yourself using them fairly frequently on the command line sometimes when i'm at the grocery store i wish i could search for where things are like i think pizza sauce should be right next to spaghetti sauce but it almost never is well thankfully when it comes to searching for text in a linux system there is an awesome tool that allows you to do just that narrow down what you're looking for with tool called grep now greb does use regular expressions or regex and if you're interested in the you know very fine-tuned filters you can get with regex i cover that really great in the linux foundations course but today i want to talk about searching for strings of text using grep because it can be a real powerful way to get the information you want really quickly now i said that grep uses regular expressions so if you want to make sure that it's absolutely searching just for strings of text you can use the dash capital f flag and that means just fixed strings usually you don't have to do that because if you just search for a string it's going to generally find it in the file but occasionally your string might be regular expression characters and you can cause yourself some headaches so if you want to be safe use dash capital f i usually don't because it's usually not an issue but i just want you to be aware that if you do grep minus capital f it's going to just search for strings now there's two different ways that we can use grep we can say graph the string from a file and it's going to search the file for the string that we specified and that works really well but there's also another way you can do it you can do this cat file or anything that has a text output like ls or anything that's going to output text and then you can use the pipe symbol and kind of push it through grep and search for a string i'll show you why this is a really powerful way to use grep because it seems a little backwards like why wouldn't we just say you know grep this string from this file i'm on an ubuntu system here and i'm just going to search a log file okay so i'm going to say grep now we can say dash capital f or we can leave that off i'm just searching for strings i just again want you to know that capital f is going to force it to 
just use strings but i want to grep for dhcp from the var log syslog file press enter and it's going to find all of the lines in that text file that have dhcp in it actually even highlights the dhcp which is really convenient now there's another place we could get some system log information and that is using the dmesg command but here's the problem that's a command that has output but it's not a file we can't grep the d message command so that's where the pipe symbol comes into play and it works really really well we could just say d message and instead of just having it print to the screen we can use the pipe symbol which is usually above the enter key in a us keyboard grep i'm going to use minus f this time we don't have to necessarily dhcp and then it's going to take all of that output from d message and grep for dhcp and sure enough there's two lines that have dhcp now another really useful way that we can use the pipe symbol let's clear the screen because i want to show you that first one that we did right we we actually grew up for dhcp from that file and we got these results let's say we had just pages and pages of results and we wanted just to look for things that mentioned init init what we could do is kind of like chain greps along we could say grep dhcp from var syslog and then pipe those results into grep init and press enter and now we're just going to get the lines that were in this result that also contained the word init and so here now we've filtered all the way down to these two lines of text from the log files so even if you're not getting super fancy with regular expressions you can do some really powerful searching of strings using the grep tool for things like log files or any kind of text that you want to search for and remember you can chain those grep commands together so that you get a really fine filter looking for exactly what you're looking for every application in the linux system has three sort of like pipes it has standard input standard output and standard error and basically it's just a way to get information in and out it's an io type situation for every individual app now i have just an application here uh drawn out and i want to show you the difference between the three so standard input is pretty easy to understand right this is like if you're putting something into a program we use the pipe symbol if we're gonna pipe something into it or we can use less than if we want to just assign a file to the standard input now a lot of programs don't accept things on standard input but some of them do so if you've ever seen me pipe a command into another command what i'm doing is piping the results of one command into another command so it can work on it i'll show you how that works on the command line but then there's two other pipes and one of them is the standard output this is what happens if you type ls and it shows you know the contents on the screen that's the output that it shows you is the standard output now there's also standard error if an error occurs it also prints the things out on the screen but they're different pipes now we don't realize the difference because they both end up on the command line that's like the default place for standard output and standard error to go but you can treat them differently so if you want to redirect the output of a file of an application into a file use the greater than symbol if you want to redirect the standard error or like you know an error message you have to use two greater than because it's a different 
pipe and you have to redirect it separately so let me show you what i mean first of all let's talk about standard input now i have in here a file called file.txt i'll show you what's inside of it okay so this is what we have okay just the text file with some text in it now if we wanted to use grep to search for text we could just say grep text from file.text and it would show us the text that's in there but we could also redirect to the standard input rather than telling grep what to use so we could say cat file.txt and then pipe the results into standard input of grep and then have grep look for text we should get the exact same results now what we've done though rather than telling grep you know what file to choose from we just piped the results of cat into standard input and then grep used that as its input for grepping for the word text now that is using the pipe symbol we can also say grep for text and i want you to use file.txt as your standard input now this looks very similar to this up here but it's drastically different because what we've done is we've used redirection so this is redirecting standard input this actually functions exactly the same as this one because here we're using the pipe symbol to redirect standard input here we're using the less than symbol to redirect standard input so that's how you can do standard input it's not something you do as often apart from with this scenario i do this a lot you know piping one thing into another so that you can get the results from there now the other thing is standard output and standard error so i have a really quick way to show you so we say ls and we get these are the contents of ls we could redirect that by using greater than into a file called results.txt and we should get no output because rather than redirecting the output to our terminal window here it's actually redirected the standard output into results.txt so if we look there's a file now called results.txt and if we look at results.txt it has the contents of that ls command right it just dumped the contents into there here's a problem though what if we did this let me clear the screen what if we did ls lsf and we tried to redirect the standard output into results.txt why did we get the error message here and let's look at results well there's nothing in results now because there was no standard output this is an error there is no file called ff for us to use ls on if we wanted to redirect an error we would have to do lsf to greater than error.txt if we do that ah nothing appeared however if we look at in the file here now we have a file called error.txt if we look in that sure enough that was the error message we redirected it using a standard error redirector using standard input standard output and standard error redirection is something you're going to find yourself doing a lot because you want to see the results of things when you're not there to see it happen on the command line that's basically what log files are right they've taken errors and redirected them into a log file for you once you understand how input and output works with an application on the command line there are some really cool tricks and tips that we can learn to make life a little bit easier now there's a handful of things we want to look at but i've drawn a diagram so we can actually get a real good taste for what each thing is now devnl you may have heard of people call it the black hole or the bit bucket and basically devnull is a location on your file system that you can copy anything to 
and it will disappear forever. now that seems like a weird concept, i know, but if you have extraneous logs that you don't really want to ever see, you just want to throw them immediately in the trash, dev null is where you want them to go. on the other hand, you would not want to copy important information and redirect it to dev null, because it will just disappear forever. so dev null is just a place where everything disappears when you copy it there. now tee is an interesting command. it doesn't seem like it does very much, but it accomplishes a task that's remarkably difficult to do without the tool itself. here's how it works: we take some sort of text, like output from a file or something like that, and we put it into tee's standard input, and all it does is dump that same information out of its standard output, but it also writes it to a file. so it acts like a tee in the road, a fork in the road. it lets you see the output right in your command window, but it also copies it to a file, and that's really nice if you want to see what's happening but also want to keep a record of it in a file. and then arguably the most complicated one, but also maybe the niftiest, is called xargs. how xargs works is, let's say you have a program like ls, you run the command and you see stuff come out on standard output, and then we pipe that into xargs' standard input. what xargs does is take it and say, okay, what program do you want me to use this information on, and you tell it, like, application number two, and then it executes application two using that information as the arguments for the second command. the reason this is really powerful is that not all applications can accept things from another program on their standard input, so xargs basically forces a program to accept something from standard input by accepting that standard input itself and then putting it on the application as arguments. now let's look at all of these really quick, because the cool part is when you actually do it. first of all, dev null. we're going to say echo hello and it'll put it to the screen. if we do echo hello and then redirect it to dev null, boom, it's completely gone, right, that's how dev null works. it's just a place that never fills up, it's kind of like a teenage girl at a pizza party sleepover, right, it just never gets full, you can put as much as you want in there and it's just going to disappear. now we've redirected standard output, we could also redirect standard error, and then we should see hello, because there was no error there, right. if we did something like this, let's say ls documents and ff, we should get both standard output and standard error, and sure enough, here's our standard error, there is no ff, but here is the contents of documents, so we got both. now if you want to do something cool and redirect both standard output and standard error into one place, you can do this: you can say ls documents ff just like we did, and i'm going to redirect standard output into dev null, and here's the magic part, then i'm going to redirect standard error into ampersand one. oh my goodness, what is this? well, the ampersand one is a way we can tell it that we want standard error to get redirected into standard output, so the one is standard output. what this does is all of our standard output is getting redirected to dev null, and all of our standard error is going into standard output, which is of course also going to dev null, so this should give us absolutely no results, and sure enough, both standard output and standard error have gone into dev null.
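here's a compact sketch of those redirections, using the same example directory and the made-up ff file name:

ls documents ff > results.txt       # stdout goes to the file, the ff error still prints on screen
ls documents ff 2> error.txt        # stderr goes to the file, the directory listing still prints
ls documents ff > /dev/null 2>&1    # both streams are thrown away, nothing prints at all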
all right, so that's just a really cool thing you can do with redirection, and dev null is just a place that never fills up. all right, let's clear the screen, i want to show you tee. so here i have a file, let's look and see what's in there, all right, just a bunch of different words in a text file. what we can do is say cat file.txt and pipe its standard output into tee's standard input, and then i'm going to give tee a file name, copy.txt, and now we should see the contents of file.txt. sure enough it printed to the screen, but the tee command also created that copy.txt, which also contains this, so if we look, sure enough there's a copy of our stuff. and then last i'm going to show you xargs, and it's hard to come up with a real good example, but i think i have one. what we want to do is create a folder named after everything in this file, so i want to have a folder named red, named yellow, named blue and named tuna fish. well, it turns out to be kind of difficult to do that, but we can do it really simply if we say cat file.txt, pipe that into xargs, and then tell xargs to run mkdir, and it will take the output of this command and kind of plunk it right there at the end. so if we press enter and do ls, look at that, we have a folder with each one of those names, which is really convenient, right, it did what would have taken us quite a bit of typing just by piping it in and then pasting the results right at the end of whatever command you tell it. now we could also do something cool: if i want to clean up my mess, instead of mkdir i'm going to say rmdir, and then they're gone. standard input, standard output, standard error, they're really cool things, and there are some additional tips and tricks that make using them even more useful on the command line. dealing with text on the command line is something that's kind of fun to do, to be quite honest, and there are some tools that are pretty nifty to play with, so let's look at a couple of them right now, we'll just go right to the command line. i've already set us up with a few files, file one and file two, so let's look at them just so we know what's in them. that's what's in file one and that's what's in file two, just a list of words. now you'll notice that these are not in alphabetical order, so we could use the sort command. i could say sort file1.txt and it's going to return the contents, but notice now they're in alphabetical order, chicken fish monkey turtle. and if we wanted to save a file, call it sorted, we could redirect the output to sorted.txt, and then if we look at sorted.txt it's going to be a file with them in alphabetical order. so sort does just that, and if you look at the man page for sort you can see it does some other things, you can sort with options like ignoring case, or handling numbers, or a bunch of dates, so sort's very powerful, but that's basically what it does: it takes a text file and outputs that text file sorted however you tell it to. so let's clear the screen.
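before moving on, here's a rough one-line recap of each of those tools, with the same example file names:

cat file.txt | tee copy.txt    # prints the text and also writes it to copy.txt
cat file.txt | xargs mkdir     # makes a directory named after each word in the file
sort file1.txt > sorted.txt    # saves the alphabetized lines to a new file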
the next one i want to show you is word count. it's pretty simple, it's just wc, and it stands for word count, so we could do wc of file1.txt and it'll give us three fields. it says 4 4 and 27 in our case, and what that means is there are four lines, four words and 27 characters. now if we just want to know one of those things, we could say wc minus m for character count, file1.txt, and it's just going to show us that there are 27 characters in file1.txt. now the last two i want to show you are really the most interesting, there's cut and paste, which i know sounds like a gui thing, but let's actually look at our file again just so we know exactly what we're dealing with. so file1.txt, this is what we have, these lines. if we were to use the cut command, we could say cut, i'm going to do cut by characters, and i want it to cut out character, let's just say 1, from file1.txt. if we do that we should see just the first, oh, i did file two, my goodness, those aren't the first letters at all, so let's look at file one, dot txt, that makes more sense. so the c in chicken, the f in fish, the t in turtle, the m in monkey. we could do more than just one character, and we could do more than just the first character. let's say we wanted to do cut minus c with the third, fourth and fifth characters in the file file1.txt, so now we should see what it did: it took the third, fourth and fifth characters, but you see fish only has four characters, so there is no fifth character, so it just did sh for the fish line. but see, that's what cut does, it'll actually take it right out of the middle of the file, which is surprisingly difficult to do if you don't actually use the cut command, and paste does kind of the exact opposite.
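as a small sketch of cut by character position, with the same file:

cut -c 1 file1.txt      # just the first character of every line
cut -c 3-5 file1.txt    # characters three through five, short lines print whatever they have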
so let me clear the screen, because it's kind of full, and show file 1 and file 2 again. now let's say we wanted to put file 2 after file 1, right, we want to end up with chicken lips, fish whiskers, turtle feathers and monkey flippers. well, that's kind of difficult to do, because if we cat them together it's just going to put one file at the end of the other. so that's where paste comes into play. we can say paste file1.txt file2.txt and it's going to output, sure enough, with a tab between them, chicken lips, fish whiskers, and so on. if we wanted to we could redirect that into a file, and then if we look at join.txt, look at that, our file now has those two pasted together. now i know that most of what we did using these commands was really just playing around, but playing around is one of the best ways to learn a tool, and you're going to find that every once in a while one of these tools, cut and paste especially, is going to be extremely useful, because it's kind of hard to put things next to each other in a text file, or cut out the middle bits of a text file, without simple tools like this. awk and sed are text manipulation tools that for some reason most people are afraid of, and i honestly don't know why. yes, they can be very complicated, you can do a lot of powerful things, but you don't have to, you can do some very simple yet still powerful things with awk and sed. now what do they stand for? sed just stands for stream editor, and awk, i actually had to google this because i had no idea, i've been using awk for decades but i didn't really know what it stood for, it turns out it's the initials of the people who first wrote it, and i'm not going to try to pronounce them all, but that's what the a, the w and the k mean. basically awk is a data extraction tool, it allows you to pull out certain bits of data from text, and sed of course is just a stream editor, which allows you to edit things without interacting with them directly. so let's actually go to the command line so we can see it work. i've already prepared a couple files, so if we do ls we're going to see we have file 1, file 2 and join, so let's just look at them. this is file 1, this is file 2, and this is basically the two of them joined together, i actually used the paste command to do that. so i have these three files and i'm going to use sed and awk to do things with them. okay, so first of all, sed is a stream editor, which means you put text in and it outputs the edited version. here's how it works: i'm just going to cat one of these files, so cat file1.txt, and i'm going to pipe that into sed, the stream editor, and here's where i set up the rules. i'm going to substitute, so i say s and then a forward slash, then what i want to search for, the word monkey, then a forward slash, then what i want to replace it with, dolphin, and then a forward slash and g for global, which just means that if it occurs more than one time i want every occurrence of monkey substituted with dolphin. then i press enter and what we get is chicken, fish, turtle, dolphin, because it took the initial file and substituted dolphin for monkey. and really that's what sed does, it's a stream editor, it allows you to edit text as it flies through, so you can make changes to text as it's being manipulated. you can put this inside of a script and do things without interacting with a gui, like you would if you opened the file with vi and changed monkey to dolphin by hand. so that's what sed, the stream editor, does.
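here's that sed substitution as a sketch, the general shape is s/search/replace/g; the second form, where sed reads the file directly, is a small shortcut not shown in the demo:

cat file1.txt | sed 's/monkey/dolphin/g'    # every monkey in the output becomes dolphin
sed 's/monkey/dolphin/g' file1.txt          # same result, without the extra cat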
now i'm going to clear the screen and we're just going to look at join, i'm going to say cat join just so we can see it. awk takes a text file and allows you to pull out bits and do things with them. i'm just going to type this out and then we'll see what i'm talking about. i'm going to say awk, and then in single quotes, open curly brace, print dollar sign 1, close curly brace, close the single quote, and i'm going to use join.txt as the file. so what this is saying, and this is maybe why people are scared, there's a bit of an odd syntax here, is this is what we want awk to do with the file: i want it to print out the first field. now it will auto detect that these fields are separated by a tab, they could just be spaces and it would auto detect that too, but basically on line one this is field one and field two, on line two this is field one and field two, so this should print out the first field of every line. let's press enter, and sure enough, chicken, fish, turtle, monkey, it printed them all out. now we can do more than just one thing at a time, so let's go back over here. what if we wanted to do dollar sign two, dollar sign one? now we should get a printout of whiskers fish, flippers monkey and so on, so let's see what turns out here. now it is going to be a little hard to read, and i'll show you why. see, it did do that, right here we had chicken lips and it gave us lipschicken, whiskersfish, feathersturtle, flippersmonkey. the problem is it took those fields and just mushed them right together. so we could add another thing in there, we could build this out longer and say i also want a space in there, so hopefully that makes sense: it's going to print field 2, then it's going to print this space, and then it's going to print field 1. let's see if that's what we get, and sure enough, lips chicken, whiskers fish, feathers turtle, flippers monkey. and it doesn't have to be used just one time, right, we could say i want dollar sign two, dollar sign two, which means field two field two, and now we should get a duplicate of each one, lips lips, whiskers whiskers. so awk just takes bits of data and allows you to manipulate them and do what you want with them. honestly the only thing i can think is that the syntax for awk and sed intimidates people, but once you get used to it, and especially with sed, the syntax is very similar to the vi editor when it comes to replacing and substituting things in a text file, so hopefully you get used to them and you're not afraid of them, because they're super powerful and awesome tools to use, especially in scripts, because there's no interaction required, you can put them right inside of a script and they work without you entering more data. hard links and soft links, or symbolic links as a lot of people call them, are very similar in what you get on the file system as far as usability goes, but they work drastically differently, so let me explain the difference between them. when we have a hard drive, every file takes up a certain number of sectors on the hard drive itself, so this one, let's say, takes up three blocks, this one takes up four blocks, this one takes up seven blocks, and these are the actual files on the hard drive. but the file system only knows where those files live because of the file allocation table. this is kind of like a table of contents, right, it says okay, file one is actually right here and it extends three blocks, file two actually lives right here on the hard drive and it extends four blocks, and the same thing here, file three, it extends all the
blocks here and so on and so forth for all of the other files on the system now there is a difference though let's say this is a symbolic link okay a symbolic link doesn't point to the hard drive at all it actually just points to a file in the file allocation table so this is a standard you know table of contents link right this is pointing to this spot and these blocks on the hard drive but the symbolic link just points to filename.txt on this system itself now there's the other kind of link and that's a hard link so you probably notice there's another purple file here so let's say this is file two and it's pointing to you know these four blocks this could be like file 12 and it points to the exact same spots on the hard drive so it points to the exact same file location and the same number of blocks it's basically the exact same file but it has two different reference points in the file allocation table that can be really confusing but what's cool about it is let's say you accidentally delete this file well that's okay it's still on the hard drive and you can still reference it from this file here so let me show you what that looks like in practice in our system here we have let's do an ls minus l we have myfile.doc now if we wanted to do a symbolic link we would do ln minus s for soft or symbolic the source is my file and the destination is going to be my linked file.doc do ls minus l and we're going to see it actually shows us exactly what's happening my linked file dot doc is just pointing to the name myfile.doc in fact it's just pointing to this name itself so if we were to say move my file.doc to my new file dot doc and then we do an ls minus l this link is broken because it still points to the name my file.doc and that doesn't exist anymore so this is now a broken link on our system so symbolic links are kind of dumb they don't take up much space but they're kind of dumb in that they don't follow a file if you move it or rename it all right so that is a soft link now a hard link works differently a hard link if we were just to say ln without any flags my new file dot dock to my file dot doc and then we do ls minus l well one we've fixed the symbolic link right because now this file points to a file that exists now so all of a sudden now it's pointing to this file my file.doc and you'll notice it's the same size as the other one and if we were to move the original so we're going to say move my new file to my cool file dot doc it doesn't break the hard link that we made both of them are still there they're still fine they're their own independent file name in the file allocation table they just happen to point to the same spot on the hard drive and we can see that if we do ls minus li for inodes it's going to show us the spot on the hard drive that it's actually pointing to and sure enough these have a matching inode whereas this symbolic link has a completely different inode because it's you know just a file that only points to a file name but these two have the exact same spot on the physical hard drive now another cool thing this number here which you've probably never even thought about before but this says how many linked files are on this inode now this says there are three now of course we see two right here but that means somewhere on my file system there's another hard link to this inode now i did actually make it before we started so we could use the find command to find that i'm going to say find i'm going to look in my home directory in the same file flag and i want to 
find the same file as my file.doc or it could say mycoolfile.doc because these are the same files so it doesn't matter which one i have find look for a match for and press enter and it's going to find all three of them sure enough i have a hidden file right here that i that i did earlier before the nugget started and this is just one more hard linked file to this same inode we could look really quick ls minus li l i a so we can see the hidden files and sure enough hidden file there's that same inode reference that the other ones are referencing now honestly soft links are generally used more because they're easy to see you can do an ls and see where they're pointing to so they're a lot more convenient but hard links do have their place because each file acts as an independent file you can delete one and it doesn't ruin the reference that the other one has so hard links point to inodes soft links just point to file name references of other files on the existing file system when you're trying to figure out the location of files on your system there's basically two ways you can do it there's the find command and the locate command and both do pretty much the same thing but find is quite a bit more powerful yet has some limitations over locate i'm going to just show you how they work because trying to explain the pros and cons just seems a little bit silly when we can just actually see how it works in action now if we look in our documents folder i have a few files here i have new paper and it looks like it has camel caps here capital n capital p old file and research paper dot doc so let's actually use the locate command first because it's simple so we say locate and then what we want to look for and it can just be a substring so if we say old file and press enter it's going to give us the full path of old file.txt notice i didn't have to search for the entire file name i just searched for old file.txt or old file and it found old file.txt now we can do the same for let's see locate research underscore and it should find research paper and sure enough research paper now there was one more file in there if we type locate new paper no it doesn't find it oh did i spell it wrong well let's look over here new paper no i did not spell it wrong and that's where the limitation of locate is locate uses a database that is cached on your system which means it's super duper fast for searching for the names of files however the cache is only created once a day now we can force an update we could say sudo update db press enter and it's going to update the database of all the file names on the system and now if we just do up arrow and locate new paper now it's going to find it because we updated the cache of all the files on the system so it's very very fast but it has that limitation that it uses cached data now the find command is more powerful but it has the limitation that because it searches in real time it's a lot slower so how does it work pretty much the same way we're going to say find and you tell it where you want it to search so we can say search the root directory and now rather than just a file name find does a lot more things so we're going to say i want you to search for a file named let's say star new paper star and this should find all of the files that have new paper in them okay so i'm going to press enter and oh my goodness what is all this permission denied well find actually goes through the entire file system because i said search in the root directory and i actually don't have 
permission to look at all of these things so it's going to search through every single folder on the whole system and let's see did it actually find the file it should have but we have to look through all the error messages and if we scroll up sure enough it did find it all right it did find it just like the locate command but there were a lot of permission denied errors now we could do something like redirect the error right we could say two greater than dev null which will just pipe the errors into our dev null bit bucket and sure enough there it found it was pretty quick but it wasn't as quick as using locate now there are some other really cool things that find can do we can say find in our current home directory and what that'll do is it will allow us to search in just this home directory so it doesn't search the entire file system so that can be pretty convenient so if we say find dot and we're gonna look for the name new p oh i didn't do the stars new p it should find it for us sure enough it did find documents new paper and then it does this other thing we can actually say dash delete and it will delete it and how can we see if it's deleted if we look in the documents folder whoa it deleted that file so find does more than just locate files however it has some finicky things like this it's going to be using regular expressions to search for the files and while it's more powerful it's slower and it can be annoying when we do things like get errors from permission denied things like that so while conceptually find and locate do the same thing they do them in different ways the important thing to remember about locate is that it's always going to use old data unless you run that update db command now find is much more powerful but it works in real time so it's slower and there that means that there are pros and cons to using both tools while it's certainly possible to set up network shares for copying files from one server to another generally if you're going to copy files over the network from linux server to linux server you're going to use either ssh or really scp which stands for secure copy but it uses the ssh protocol in order to do that copying or rsync which actually will sync a whole bunch of files across the network so we're going to look at doing both but it's important to realize that ssh is the same program and the same protocol that we use to connect from one computer to another to reach its terminal so let me show you what i mean now i have two computers set up in our lab i have this ubuntu computer and i have this sent os computer they're both on the same network so i'm going to ssh from one to another i'm going to say ssh to centos it's going to ask me for bob's password on centos and then all of a sudden now i'm logged in to that remote computer sent to us in fact if we go into the desktop folder and we do an ls we're going to see over on sent to us we have things called like cool picture cool pic 2 and these things are on the remote desktop all right i'm going to exit and that's going to bring me back to ubuntu now if we look on the desktop folder we're going to see there's nothing in there because on our local ubuntu computer we don't have those things let's say we wanted to copy something over well let's go into our desktop folder again there's nothing here nothing up our sleeve we could use scp which is secure copy and this uses ssh right so we would say scp from centos now we could also specify a different user now it's the same user for us but i'm still going 
to specify, i'm going to say bob at centos, colon, then the remote path, which is going to be home bob desktop, and let's pick one of these files, i'm going to say coolpicture.jpg, and i want it to copy it to dot, which means our current directory. i'm going to press enter, it's going to ask for bob's password, bob, and now if we do an ls we're going to see, look, we have cool picture. that was copied over the network using scp, which uses the ssh protocol, and now we have a copy of it here locally. we can do the same thing in the other direction, we could say scp a local thing, and rather than the destination being our local computer we could do it backwards, right, we could say scp coolpicture.jpg to bob at centos colon home bob, i'm going to copy it just to his home folder, and so now it's sent one over to bob's home folder. now we could copy everything all at once if we wanted by using the rsync command. rsync is pretty cool in that it will even recurse into directories if we want it to, so we could actually do this: we could say rsync, i'm going to say minus a so it does all the things, including recursively going into directories, and v so it does it verbosely, so we can see what it's doing. so rsync dash av, and the remote side is set up just like with scp, so bob at centos, or we don't have to say bob at if it's the same username, we could just say centos colon and then the path, so i want home bob desktop, that entire folder, copied to here. what we should end up with is a new folder inside our ubuntu desktop folder called desktop, because it's going to copy that folder to our current location, and it's going to have, recursively, everything in the remote desktop folder. let's see if that works, press enter, it asks for bob's password, bob, and now it says receiving all of these files, and look, sure enough there's a folder called desktop, and if we look inside there, all those files that were on that remote computer are now on our local computer as well. we used rsync and it will traverse all of the directories recursively and copy them over for us. now this might seem like a throwaway nugget, something that is just nice to know but you're not going to use. i'll be honest, i use scp and rsync almost every single day. copying files back and forth using scp is so easy and so fast, you don't have to set up servers, it's just a way to get a file to another server without having to worry about installing anything, because it uses ssh, which is already installed on all of your servers. so whether you just want to copy a file or two with scp, or you want to do recursive directories with rsync, it's really easy to copy files over the network in linux. pretty much every distribution out there uses systemd to manage services, like the various programs such as web servers that are installed on the system, and systemctl is the command line tool that we use to manipulate and manage those services. now there's a couple concepts we need to understand, we need to know enable and disable versus start and stop. enable and disable is basically talking about whether the service will automatically start when the computer boots up, so you can have something that you can start and stop, but that doesn't mean it's automatically going to start or stop when the system boots up, that's where enable and disable come into play. now it's really easy to tell what a service is doing by default, and we can change it without much more difficulty at all, so let's look at a computer.
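as a rough sketch of that enable-versus-start distinction, using the httpd service from the demo that follows:

sudo systemctl enable httpd    # start automatically at every boot
sudo systemctl start httpd     # start it right now, this boot only
systemctl status httpd         # shows whether it's active and whether it's enabled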
i've installed apache 2 on this centos machine so httpd is the package name and i've installed it however if we go over here we can see localhost it's unable to connect it's not running so the first thing we would do is say systemctl status httpd and it's actually giving us more information than it first appears see it's telling us that it's actually inactive which makes sense because we can't get it to load but more importantly it's saying that the service itself is disabled and the vendor preset meaning like when you first install it it's set to disabled so we can change that because if it's disabled it means it's not going to start when the computer boots so even if we rebooted this computer it still wouldn't load because it wouldn't start by default so we can say system ctl enable httpd and press enter and now if we do that status we're going to see that it's changed okay it says it's enabled even though the vendor preset is still disabled this means that you know when we installed it it was disabled but we've changed it now so it's enabled but you'll notice it's still inactive or dead now if we did restart the computer it would automatically start up but we can start and stop it independently from whether it's enabled or disabled as not we can just say system cto start httpd and it's going to start our service we can look at status httpd and we can see sure enough now that it's active right it's running and it's still enabled so when we reboot it's going to start running automatically now even if we left this disabled we could still have started it using the start command however when the computer rebooted it wouldn't start automatically so if you want it to always start up you have to make sure that it's enabled even if it comes disabled by default just a quick look boom it's running and sure enough it's right there running for us and it will run when the computer reboots because we've changed it to enabled i really like systemd because systemctl is kind of the one-stop shop it's like the swiss army knife for managing services on a computer that's controlled with the systemd startup init service sys5 or sysv or system five it's called a lot of different things but this is an older way that linux systems would put themselves in various modes or run levels that determine the type of system whether it's a gui system whether it's just a standalone network system and we can switch those various modes we can set defaults to those modes but it's important to understand what the modes actually are and there's a whole list of them unfortunately the modes are different in debian and centos or debbie and ubuntu centos souza they actually use the various levels differently so i just want to briefly go over the difference so that if you're on one system you kind of understand what's going on so centos actually separates them the most so let's go here first we have run level zero and run level zero is basically if you go into this mode it halts the system this is like a way to power the system down run level one is single user mode there's no networking or anything and there's no asking for the root password this is the way that you would recover a root password on a sys5 computer mode 2 is multi-user with no network 3 is multi-user with network 4 is not used at all with centos and 5 is the multi-user gui system like if you have x windows installed and then lastly run level six is reboot if you switch into run level six it will then reboot your computer and it will go into whichever default is set 
now debian and ubuntu are similar halt is the same reboot is the same single user mode is the same the difference is here run level two is the full multi-user system just like run level three is here and then if there's a gui installed on the system the gui will start up there's no difference in debian and ubuntu between having a gui system and having a non-gui system when it comes to run levels that's only if the gui is installed so run level 2 is pretty much what we use all the time when we're in debian and ubuntu run levels three through five don't do anything at all they're just not used so that's the big big difference between the two we still have reboot we still have halt we still have single user mode but it's how they handle the other things that are a little bit different now switching between them and setting the defaults are exactly the same now here's the gotcha it took me a long time to find a system that still uses sys5. this is outmoded and not used in any modern distributions but if you find one that is still in use like this is centos version six it will still use it so what we can do we can say runlevel and it will show us what run level we're in we're in run level five and the previous run level was just a new boot okay now if we want to switch between run levels we can say telinit and then the run level we want to switch to so i'm going to say 3 this should drop us out of a gui and into a text only environment you can see here there's no gui there's just this text box right here so i'm going to log in so we can go back if i want to go back into the gui system i can say telinit 5 and it'll get us right back into the gui system and here we are in the gui system if i start up a terminal and we say runlevel we're gonna see we're currently in run level five our previous run level was three now if we switch into run level six it's going to reboot if we switch into run level zero it's going to just halt the computer and power it down if you want to change the default and the default is just what run level it automatically boots into we need to edit a file so i'm going to become root and we need to edit /etc/inittab and this file has a couple things we can edit but really the main thing is all the way down at the bottom which is the default run level now it gives us a little bit of a cheat sheet here and this is actually really really good advice the halt mode run level 0 do not set your initdefault to this because it'll boot up and immediately halt and that's not what we want same thing with setting it to run level 6 it will boot up and switch immediately into run level 6 which is reboot so it's going to be in a constant reboot loop so you never want to set the default to that ours is currently in run level five for the default and it's right here now we could change that to three save this file and now when we reboot the computer it's just going to automatically go into the text only mode i'll just show you really quick what happens if we telinit into run level six it's going to reboot that's what it does so if you run across an older system that uses run levels specifically system five run levels you need to know what the various modes do and remember it's going to be different whether it's ubuntu and debian or centos and red hat and then you need to know how to switch the modes using telinit and how to set those defaults and most importantly what not to set the defaults to namely run level 0 or run level 6.
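here's a rough cheat sheet of the sysv commands from that demo, assuming a centos 6 style system, and the initdefault line below is an example of what you would edit rather than an exact copy of the file:

runlevel             # prints previous and current run level, for example N 5
telinit 3            # switch to text-only multi-user mode right now
telinit 5            # switch back to the gui run level
vi /etc/inittab      # the default run level lives at the bottom of this file
id:3:initdefault:    # example default line, never set this to 0 (halt) or 6 (reboot)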
i often find myself on a system thinking what if i need to switch between modes like i have a gui machine that i want to get rid of that gui interface so it's just a server with the text mode or vice versa well switching modes and setting the defaults is done one way with the sys5 system but if you have a newer system d initialization system it can be very very confusing especially if you have that cis 5 background thankfully there are pretty simple comparisons when it comes to how it used to be and how it currently is now if you're not familiar with init 5 or with sys5 init that's all right we're just going to talk about what the various modes are basically we start with run level zero which has a correlation in the system d world as a boot target called power off now boot targets are basically just modes right these are modes that computers are are set to so that they can function in a specific way and while it doesn't seem like a mode it's a really easy way to shut your computer down by switching into the power off mode now there's also one this is single user mode in the world of systemd it's called rescue mode this is like insist five if you want to like reset your root password you need to switch into single user mode well same thing with a boot target it's just called rescue mode then there's mode three which in a centos system is going to be like a non-graphical user interface with networking support that translates to just multi-user target in system d uh the gui mode 5 translates to graphical target and of course 6 is like the the compatriot to run level zero and this is how you can reboot your system by switching into run level six or boot target reboot now switching between them is actually easier than it is with the old sys5 you can actually just use a command line tool instead of editing that init tab file that you have to do with sys5 now i'm on centos version 7 here because centos version 7 uses system d whereas centos version 6 uses sys5. 
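to keep that comparison straight, here is the run level to boot target mapping being described, as a quick summary rather than an exhaustive list:

run level 0  ->  poweroff.target
run level 1  ->  rescue.target
run level 3  ->  multi-user.target
run level 5  ->  graphical.target
run level 6  ->  reboot.target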
now the first thing you need to do is be root so i'm going to quickly become root once we're root we can type system ctl which is the way we do most things with system d but system ctl get default and this is going to tell us what the default mode is and that makes sense because our default mode here is graphical target and you can see we have a gui interface now we could change that we could say system ctl set default to multi-user.target and see now it's changed that so if we say get default it's going to tell us okay now it's multi-user but notice it didn't change we're still in the gui environment well that's because it just changed the default if we were to reboot this computer it would reboot into a text only mode now if you want to switch between modes or between targets with a computer that's already started you simply type system ctl isolate and then the name of the target in our case let's say isolate multi-user and it should drop us directly into sure enough the text only mode now if you're already used to the world of run levels you just have to kind of think what the different targets that correspond to it are but if you're not familiar with run levels like this is something that happened before your time in linux that's okay because honestly boot targets make a heck of a lot more sense than the run levels did because they actually have their description right in their names and while it's important to understand both sys5 and system d you should know that all systems going forward are going to be system d so you're gonna have to know about the various modes how to switch between them using isolate and then how to set and get the default so you know what happens to a system when it boots up so you don't have your rack servers booting up to a gui environment because that just doesn't make any sense services are the various programs that are installed on a server that are going to run and serve out like web pages or whatever you might have installed they're called services and if you have sys5 on your computer the way that you manage and start and set defaults for those individual services are by using specific programs in the et cetera init.d folder now there are tools that we can use to manage those specifically service and check config and i want to show you how they work because the services are determined to start and stop based on the run level of a particular what the computer is set to so if like it's run level three a certain system might start and if it's run level five another service might not start let me show you what i'm talking about here on our system this is centos 6 which has sys5 if we look in etc.d these are all the various services or programs that are installed on the computer we can see things you know like post fixes the email server sshd is our ssh server and we can start and stop these by using the service command so i can say service sshd start and it's going to start the service i can say service sshd stop and it will stop the service i can actually do status to see what it's currently doing so right now it's currently stopped it says but what i want to do is change how it starts or stops on system boot and what we can do is say chk config dash dash list sshd and it's going to show us what sshd is going to do on every run level so run level zero it's off one it's off two it's off three it's off four it's off five it's off and six it's off so this means it is not going to start up on system boot regardless of what run level the system is starting at 
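as a quick sketch of those two worlds side by side, using sshd as the example service:

# systemd (centos 7 and newer)
systemctl get-default                      # for example graphical.target
systemctl set-default multi-user.target    # change what the system boots into next time
systemctl isolate multi-user.target        # switch targets right now without rebooting

# sysv (centos 6)
service sshd status                        # what the service is doing right now
chkconfig --list sshd                      # what it will do at each run level on boot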
now we can set it so that it will start for all of the run levels one through five it's never going to start automatically for zero or for six because those are halt and reboot and that would just be silly but if we wanted to start on all of them we can just say chkconfig sshd on and it's going to set them to on for all of the run levels and we can do that list command again and we're going to see now it's on for two three four and five actually it doesn't do it for single user mode so if we just say on it's going to set it for run level two three four and five it's going to be turned on but we can do it individually too so first of all let's turn it back off so now they're all set to off if we look see they're all off again we can do a single one so we could say chkconfig --level 3 on and now if we look oop gotta get the format right it's chkconfig --level 3 sshd on and now if we look there we've turned it on for run level three but the other ones are still off so chkconfig is the way that we change how it boots up whereas the service command up here is how we change it immediately if we wanted to start or stop we can use the service command but if we wanted it to start on boot we need to use the chkconfig command because that's going to change the behavior at the various run levels thankfully when you install packages they create their own entries in the /etc/init.d folder we don't need to make scripts or anything in there the programs install things in there so that the service and the chkconfig commands know exactly what to do in order to start and stop or configure what happens on boot with a given system that's running sys5 modern linux systems use systemd to manage their services things like their web server their ssh server and the same tool is used to start them stop them enable them on boot and that's that swiss army knife that catch-all tool for systemd systemctl so i want to show you how to go about starting and stopping individual services but also how to affect what happens on boot when a system boots up what happens with particular services but first there is one thing that can be frustrating about systemd and that is that the service files can be scattered all over the hard drive so for example inside /etc/systemd/system we're going to find a couple service files like anything that ends in dot service is going to be a systemd service file but you'll notice like there's no ssh here well that's frustrating well let's actually search for that say locate sshd.service you'll find that this is actually located in /usr/lib/systemd/system that's where sshd.service lives so there's several places that you can find the service files whereas with sys5 it was always in the /etc/init.d folder here there are several folders that are going to house the service files for your system so that can be frustrating but nonetheless regardless of where they're stored we can still use systemctl to query them so we can say for example systemctl sshd let's do a status and this is another gotcha if you're going from sys5 right into systemd the frustrating thing is normally we would say like service sshd status well now it's backwards now we have to say systemctl status sshd it's frustrating but you get used to it unless you go back and forth between systems then it can be a little bit frustrating but nonetheless on systemd we have to say status or start or stop and then the service name whereas it's backwards with sys5 anyway we have a lot of
information here so we can see that it's active which means it's running so it's currently started there's more information here though if we look up here it says it's loaded it's enabled and enabled in systemd world means that it's going to start on system boot and there's even some more information here it says vendor preset is enabled now what that means is when we install the sshd daemon it's going to automatically be enabled now that doesn't mean it's going to start unless we restart the system but it means that it's going to be set to start on the system boot that's what the vendor preset is now we can change this easily we can say system ctl disable sshd and now if we go back and say status sshd we're going to see now it's disabled and the vendor preset is still enabled but we've changed it now so that it's not going to start on boot however this is another important thing to note it's still running because we've changed what happens on boot but we haven't changed what's currently happening on the system so if we reboot it's not going to be running but if we want it to not run we actually have to tell it that so we have to say systemctl stop sshd and now if we were to say status now we would see it's no longer running it's inactive and it's not going to start on system boot but let's change that because we definitely wanted to start on system boot so system ctl will start it up sshd and system ctl enable sshd and now if we do systemctl again this is a catch-all tool status sshd we're going to see it's back to how it should be it's running and it's enabled which means it's going to start on boot so regardless of what service we want to start stop or enable or disable that system ctl tool is what we use for just about everything in the world of system d really the only gotcha with going from sysv to system d when it comes to services you have to remember to use the systemctl tool and then you have to remember that the actual command goes before the name of the service and that's backwards from sys5. 
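pulling the start/stop versus on-boot commands from both sections together, again with sshd as the example:

# sysv: service controls right now, chkconfig controls boot behavior
service sshd start
chkconfig sshd on               # on for run levels 2 through 5
chkconfig --level 3 sshd on     # on for run level 3 only

# systemd: systemctl does both jobs, command first then service name
systemctl start sshd            # right now
systemctl enable sshd           # on boot
systemctl disable sshd          # not on boot (does not stop a running service)
systemctl stop sshd             # right now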
so we have to say like systemctl start sshd whereas with sys5 it was backwards but it's easy to get used to and i love having a one-stop tool to do all of the things that when we're planning servers on our network it's important to know that servers serve things they serve things like ntp which is a network time protocol ssh a secure shell so you can get into the computer remotely dns is domain name service which translates things like cbtnuggets.com into an ip address dhcp hands out ip addresses on a local network so you don't have to manually assign them a docker is a containerization system that allows you to run services in siloed environments and then of course configuration management tools allow you to centralize the individual configuration of servers so there's basically several kinds of services that we're going to install in our networks we have centralized things which are going to run on like one server for your entire network and then individual which are going to run on all or most of your servers so here we have things like i mentioned before dhcp dns configuration management server these are going to run on one server on your network you don't need more than one dhcp server more than one dns server apart from redundancy or high availability but generally speaking you only need this in one place individual computers all have to have an ssh server installed because you want to get into every server on your network right this is just something that's going to be installed everywhere your configuration management client is going to be installed on every computer it's a service that needs to be there so that it can take advantage of the configuration management system like chef or puppet or ansible so that it can you know work together to keep those servers in line docker is a containerization program that runs on an individual server and if you're gonna have a lot of different servers out there you may have docker installed on multiple computers so that it can host services for you now ntp is kind of the in the junction point of my venn diagram here and that's because ntp or network time protocol is the service that keeps your computer in the proper time like you know if you have some clock skew where it's a little bit too fast or a little bit too slow ntp will keep your server running in the proper time now there is a centralized ntp server very often on your network and all of the other computers or servers on your network will then query your centralized ntp server but here's the deal this same ntp server software is actually the client software as well so ntp does two things one it queries an above computer for the time you know in the case of the centralized one it's in the cloud and it also can serve out that time information to its peers or to people on your local network so usually we have a centralized ntp server but we don't even have to right all of these ntp server machines could query right out to the cloud and bypass a centralized ntp server it's just nice to have a centralized place that you have one time frame that your entire network is based on but it's the same server program so that's why i kind of put it in both camps here as individual and centralized now when you're setting up your servers on your network it's important to think through how it should work and it's actually gone through this change as computer hardware and technology has progressed there used to be a time where you would have a bare metal server for every service on your network if you 
had a dns server it would literally be a physical server sitting in your server closet and it would sit right next to your dhcp server whose sole purpose was to serve out dhcp same with ntp or you know a web server but then we said you know what now computers are getting to be so fast that we could put multiple services on a single computer because the problem here is it was very expensive right if you had to buy a physical server for every service you wanted to do it got expensive fast so what if we bought a decent sized server and then we installed dns software we installed dhcp software ntp software web software whatever we wanted to do it would all run alongside each other and be really happy the problem comes what if you need to run an update on one of these like we need to take dns offline so we can update it and maybe restart the server when we restart the server all of our services go down because we've put them all in one basket right all of our eggs are in one basket it's also a single point of failure basically it's messy and then the dawn of virtualization happened and this is where server closets got really awesome really fast that's because we had large computers you know basically the same large computer it's not that computer hardware got all that much faster but virtualization technology allowed us to instead of just installing a dns service software we could install a virtual server inside here by taking a slice of the resources from the bigger server and install a completely new server virtualized in there that would run dns and we could do that alongside another virtual server and if we had to take this offline or or restart it it wouldn't affect the others because they were their own standalone virtualized servers it was really really awesome it still is a very powerful way to go about protecting your different services from each other on the network the problem is and this is something that i've fallen prey to is sprawl potential it's really easy to spin up another server oh i want to do this i'll spin up another server oh what if we did this sir spin up another server that's where docker comes into play and i won't go too much into docker other than to tell you what docker does is it takes a server and it has a single operating system running linux and then each service has its own like isolated pocket it doesn't have its own operating system it's not like a virtual machine all it does is have its own little slice of the running system where it has its own file system and it can run its own little service here and it doesn't affect anything else because it's walled off so containerization is an even better way to take better advantage of server hardware even than virtualization when it comes to virtual servers so yes there's a lot to think about when you're installing servers on your network and where to put them but the nice thing is whether it's a local service that has to be installed on every computer or a centralized service that you just install like in one place for your whole network planning has gotten a lot easier because you don't have to worry so much about putting all of your eggs in one basket we've been able to isolate individual services without the need to buy brand new hardware so planning is kind of fun and more flexible than it's ever been before conceptually we pretty much understand how a web server works you send a request and the web server sends back the web page but when you add ssl or tls it really does add a layer of complexity but that 
complexity is for a good reason because it can secure the traffic so nobody knows what is going through your internet connection like bank account information and stuff like that so it's very important that we have ssl encrypted secured traffic now the process is going to be a little bit different than just a standard web page and part of the thing that you want to make sure you have if it's like for a bank or something is a certificate authority now let me demonstrate exactly what goes on when you try to get a web page from a web server let's say this is our web server now if you're not talking about ssl basically the guy in the computer here says hey i would like to see your web page and then the computer says okay here is my web page and that's pretty much the entire process there's no encryption at all but when you go to like your bank's website you're going to set up an ssl session so that all of your information that goes back and forth is encrypted so basically here is how the process works the client sends a message to the server and it says hey i would like to start an ssl encrypted session with you and then the web server responds okay here is my certificate this says who i am and that i'm valid and look here's a picture of my kid playing softball maybe not that bar but it sends a certificate describing who it is to prove that it's who the server says it is that it's not like some man-in-the-middle attack now the only way the end user knows that it's real is because he contacts a certificate authority which is a centralized trusted place that signs certificates basically it's this person's job to contact this web server and make it prove who it is and then once it proves who it is it gets its certificate signed by the certificate authority so then let's say this guy's name is bob bob says okay i see that at the bottom of your certificate it was signed by somebody who i trust so i'm going to trust that you're really who you say you are so then after that identification has been verified bob then sends his encryption key to the web server so the key is sent from the end user to the server and then the server uses that key that bob sent and that is what is used to encrypt the actual data that is going to go to bob's computer so the actual encryption uses bob's key that he sends to the server after the server proves who it is and then they use this tunnel back and forth and that's how they communicate using bob's key now if you set up a server in your own network you've probably heard of something called a self-signed certificate now that is exactly what it sounds like when that initial request comes from bob and he says hey server i would like to set up an ssl connection the server does respond with a copy of its certificate it says here i am this is all my information uh this is the stuff that you know describes who i am i'm promising that i am the person i say i am but there's no signature on the bottom from the certificate authority it's something that that the server signed himself so bob has to like just trust that this server is who he says it is now if it's on your own local network that's usually fine and it's okay to accept a self-signed certificate but if it's over the internet you don't want to accept a self-signed certificate because there's no way to be sure that it's actually the server it says it is and if you trust a self-signed certificate and it ends up being like a man in the middle attack you could be sending all of your banking data to a server that isn't who it 
says it is and that's very dangerous so a self-signed certificate encrypts the exact same way the problem is you're not a hundred percent positive who it is that initially set up that certificate so there's a lot of trust involved whereas if you use a certificate authority trust is taken out of the picture because you trust the certificate authority it's built into your web browser so the actual ssl encryption is the same whether it's a certificate authority or a self-signed certificate but you don't know if it's the server it says it is unless you have that certificate authority that signs the server certificate that's why it's very important especially on the internet to make sure that you don't get an error about not having a certificate signed by an authority but the process is the same either way it turns out that local network server roles have changed fairly dramatically over just the past few years now don't get me wrong things have come a long way from when we used to have a sneaker net so if we had a file we'd have to put it on a floppy disk and then you know carry it to the cubicle next to us and pass it on like that but we don't even use local servers as much as we used to now what am i talking about well i'm talking about the introduction of cloud computing now let's look at file services for example we used to have and then we actually we do still have local file services that you know if we want to save files in a local centralized place we do things whether we're on windows or linux or mac if we want to serve to windows computers we can use the samba program which allows us through file sharing that is hosted on linux but is accessible from a windows machine it's a free way very stable very scalable that we can actually share files with windows computers same thing with nfs which is network file storage and this is applicable for linux mac windows and then if you have old school apples that only use like the apple talk sort of networking stuff well there's neta talk which uses the native apple file sharing but this isn't even used anymore so much because now macintosh computers can very easily do windows shares using samba or nfs but netetalk is still around if you like that native apple file sharing stuff the point is linux computers can do local file sharing very very well the thing to think about if you're implementing a network though is should i and that's where cloud services come into play because while sure you can serve things locally on a file you might want to consider something like dropbox or onedrive or google drive that works on almost every platform and allows you to not only sync things between computers but also have an online backup which is really really vital and it saves a ton of money if you don't have to buy the servers to actually store all of your files if it's stored on a cloud service that you usually pay a service fee for you're going to save that money on maintenance and hardware purchases and etc etc so think about cloud services every time you're thinking about local services there are some cases you'll want local services but some cases it just doesn't make any sense and there are complementary cloud services to almost every one of our local services that we can offer i want to mention the ways that you can serve them locally because it does make sense sometimes for example a print server is going to be cups common unix printing system this works across the board if you're sharing a printer with mac or linux it's going to be using cups and 
even windows can print to cups servers it's like a centralized place but honestly most printers now pretty much have a robust ability to share and queue jobs on their own so we don't always have to use a centralized cup server we can all print to the same printer and it's just handled well netatalk also has printing if you have an old school apple computer you want to use that for but again that's not even used very much anymore how could your local printer be used with cloud services well it doesn't make sense until you think about no configuration printing right with google print or airprint these are ways that you can actually send your print job over the internet to a printer that may or may not be connected directly so it's something that isn't going to replace common everyday office printing but it's something to think about if you want to be able to configure or especially print from mobile devices having cloud solutions is very powerful now mail is a special case because a lot of times we want mail to be as secure as possible and that means we don't necessarily want to give the ability to another company to host our mail but remember with great power comes great responsibility keeping your locally hosted email files whether it's using postfix or xm or send mail can be a full time job because we want those to be really secure not only so people can't read our emails but so our servers aren't compromised and used to send out spam to the entire world so even though having that fine control over security of your own mail is important think about how nice it is if a huge company like google or microsoft or yahoo would have to worry about the security aspect so you can actually just focus on communicating with it so there's a lot to be said about using a third-party company for email even though you do lose some of that local control and then lastly i want to talk about a proxy now when we talk about proxies in linux we're talking about squid or squid guard which is an add-on to squid and generally proxies have historically taken the load off of your internet connection so a bunch of computers can actually request things one time from the cloud and then the proxy server kind of takes that and distributes it internally but our internet connections are very powerful now so we don't often do that now when we're thinking about proxy a lot of people mistakenly call a web filter a proxy and what a web filter does is it stops you from going to like pornographic websites and that's what squid guard does there's a lot of commercial products that keep a list of sites that shouldn't be visited by people and there are other solutions too like open dns is a way that you can set your dns server your upstream dns server so that it doesn't resolve sites that you don't want to see like pornographic websites they won't even resolve properly so your users can't get there now when it comes to actually caching or proxying large things there are some big companies like akamai that will cache entire video libraries of like netflix and stuff because that's a way that they can save bandwidth between isps but in general we don't use proxies as much as we used to although there are still a lot of use cases for things like squid guard or open dns for blocking unwanted websites that are going to waste time or expose us to things that you know we may not want to be exposed to in the company or in a school so while it's really important to know that there are local services that you can provide on your network using 
linux i encourage you to think through before you install a server on your network to do a particular task see if it really makes sense to host that locally or if going for a third party service might make more sense authentication services and database services are both things that we often do on local computers i want to talk about their purpose and their importance but it also might seem a little weird that i group them together and that's because we usually think about authentication and databases as local services that run on robust servers and that's for a reason it's because they're very very important to what we do on a regular basis and i just want to talk about while we oftentimes still run them on local computers there's still an argument to be made for putting those in the cloud as well now generally if you're talking about linux you're thinking about a sql service if you're thinking traditionally right this could be my sequel or maria database which is basically mysql only newer postgres there's a whole bunch of other sql servers that would normally run on a really robust server on our network now there are tons of other database services some of them are you know are non-sql and and some of them are good for certain types of data and bad for other types of data but there are tons and tons of database services and generally when we have a database server it's going to be on its own computer and that's just because database servers tend to be kind of robust so you're going to have something like mysql maria postgres they're going to be on their own server and authentication services are similar as well because we need to have a central place to authenticate all of our users now when you log into like gmail you're going to put in your username and password and that's all stored in a central place on google servers if you're on your local network there are tons of ways that you can store user information on your own local network in fact for years i would use nis on a linux server and i would use all of our computers in the network would authenticate to this one centralized server and it was so important that all user information was stored on there that i had a redundancy so that in case one of the servers went bad i would still have a backup that was live high availability so there's lots of ways we can do it but the vital importance of having our own server or even multiple servers is important but here's the deal you'll notice i have all of these things around the outside you can use open ldap to host your user accounts on your network and it's going to work with a bunch of different programs but honestly i would say 95 of the time user data is going to be stored on an active directory on a windows computer even if you're a linux person in a linux shop it seems like ad has taken the cake as the king when it comes to user authentication now the one alternative to that is if you're going to use an online like saml or i don't want to get too much into programming but there are ways that you can leverage online authentication for your local stuff like single sign-on and things like that so that may take the place of active directory on your network but if you're if you're talking about user authentication active directory is almost certainly going to be where a medium to large or even small office is going to host all of their user accounts so it's important to know that you can host authentication services on a linux machine but you may not end up doing that because active 
directory is probably going to be somewhere and it's a great place to centralize your users and your computers so the whole point of this nugget is really twofold one authentication can be done on linux and i did it for over a decade where everything was hosted on a local linux machine and i used nis for authentication i could have used openldap but generally you're not going to do that on a big network because you have more than linux machines that need to authenticate now the other thing though is database servers i want you to know that they're almost always going to be on their own server because they use a lot of resources a lot of memory a lot of cpu a lot of disk io so if you're setting up server roles on your network think about a database server as having its own need unless you host it out on the cloud in which case you're just paying for somebody else's resources which can often be even more effective centralized logging and monitoring is great for a big network because it's easy to lose track of a big number of servers but honestly it's great even if you have a tiny little network now i'm going to look at combining syslogs but i also want to peek at snmp because both are vital for combining information from multiple sources centralizing them if you will so that it's easier to see and easier to use for making predictions and adjustments in how your infrastructure works so first of all i want to talk about central logging okay now there are a ton of devices out there a lot of them can be you know servers you know and lots of servers on the rack and you don't want each one to have its own set of logs you want to combine them but there's other things now that can create logs but may not have the storage area to keep those logs on themselves like security cameras routers motion detectors smart bulbs printers all of these things can create log files and if you can redirect them to a centralized server that's going to allow you to comb through data in a very efficient way and in fact sometimes this is the only way you can get data from certain devices that have absolutely no storage on them but can generate logs and information now when you combine them all together it is pretty neat how it works the centralized log server is going to just have one big log file however each individual device is going to put its own name in the log file so you can sort by whatever computer is adding it so even though you have one log file that is going to contain everything it's easy to separate out the individual devices to see what they're doing using tools like grep or there are some really fancy devops tools that will allow you to sort data from combined log files when you combine log files like that you end up being able to see some trends or see some relationships that might not otherwise be easily accessible for example if none of these devices are able to access a dns server well maybe you have some problem with the network in that portion of your company and so you can help troubleshoot based on what logs are being submitted now snmp is slightly different and it stands for simple network management protocol but it's a little bit of a misnomer because traditionally this was used to not only read data but also remotely control devices using this network management again it was a two-way street protocol now there are still some instances where you can use this to manage devices but mainly it's used for pulling data and what i mean by that is let's say you have some data that you'd like to 
concatenate together for an informative data pie that's just delicious you're trying to make graphs or something you can use snmp to pull data from a device one of the most common things that i use it for is i have a router that i connect to the internet with and i would like to see some interface statistics for what's going on how much data is going through it and things like that snmp can pull that data it's not really like individual servers pushing data to a centralized logging it's kind of going the other way it's kind of using snmp to pull data from individual devices and what that looks like in practice here is this is actually the home page that every time i load up a web browser this is what loads up and i have a couple convenient links for me you know to go to see the weather in my area that sort of thing but i have these graphs that show the bandwidth usage both in my house in the town and we own a farm as well that has fiber internet connection and this shows the connection between them now you'll notice there's a lot of matching between the two and that's because i will often back all of my townhouse data to our farm because it's an off-site storage location so that makes a lot of sense but this is just a way that i can see what's happening on my network and i pull this from my routers using snmp now it's important to realize that while they accomplish sort of the same thing centralized logging allows all of your servers to push data to a centralized server so you can comb through all that data in one place whereas snmp is generally used as a protocol that you can pull data out of a server and do something with it like make cool graphs or whatever it is you want to do vpns might be something that you use every day but don't really understand what's going on so vpns or virtual private networks are really just a way to connect to a private network that isn't accessible from the internet itself now talk about the concepts then i want to talk about what options are available if you're using linux because vpn isn't just a one-size-fits-all thing there are several different protocols and stuff that you can use to connect but first of all conceptually what is going on well we have two networks let's say these networks are separated by being in different countries okay now you want to have this computer be able to interact with this computer or this server or something on the remote network but you don't want to open those ports up to the internet right because that's unsafe so what you do is your router or your linux server or something will establish what's called a tunnel and this tunnel is just a layer of encryption that goes from one side to the other and then inside that encrypted tunnel it just sets up a route like it would any other route on your network so your router just sees inside the tunnel and it sees a route using standard network addressing like let's say the vpn internal route is 10.10.0.5 or something and it just sees this as another route so that this computer can route information across the tunnel to this router and then get into here so it works the same as traditional routing the only difference is the router or like i said the linux server or whatever it is sets up this tunnel that blocks anybody on the internet from actually seeing what's going on in the route so that tunnel is set up and then the route goes inside the tunnel so nobody sees the traffic now this setup that i have here is called a site to site vpn and what it means is since there is a standard 
route set up here anything on this network is going to be go is going to be able to go over here to this network and anything on this network is going to route over to this network as if they were in the same building it's just a standard route inside that encrypted tunnel now the other type of vpn is going to be just an end user connecting to an office and if if you're a remote worker or you're like a road warrior you're going to use this a lot your computer is going to establish the same type of tunnel to block all of the internal network traffic from the internet itself once it sets up that tunnel then just your computer is going to be connected to this internal network okay so you're actually going to as if you plugged into an ethernet port in the wall next to all of your other employees or your other fellow workers who happen to be in the headquarters so what this does is it puts your computer inside this local network by establishing a tunnel and then setting up routing protocols that will put you inside there so it's a little bit different than a site to site because these computers are probably not going to like serve data from your computer it's going to work a little bit differently but conceptually the same thing is happening your remote computer is now able to connect to computers inside the remote network and it's all protected from the internet by using a vpn now i talked about different protocols to do that and there are a bunch there's openvpn which is an open source program that allows you to establish these types of connections uh there's ssh which can establish a tunnel for you to do stuff there's l2 tp there's ipsec there's all these different protocols and programs that will allow these tunnels to be created and a vast majority of these will actually run on a linux server inside the network so if your router itself doesn't support vpn that's okay you can port forward into a linux server and the linux server will handle all of the vpn routing now there are lots of nuances when it comes to vpn like can you only connect through the remote business can you connect to the internet and also to those things is all of your traffic routed through the vpn even if it's a slow link so there's a lot to think about when setting up vpns but once you understand what's going on it's a lot easier to plan how you're going to implement it so that it so that it can best serve your users or your multiple branch offices so they can communicate to each other i hope this has been informative for you and i'd like to thank you for viewing containers aren't really new technology they've been around for quite a while there have been options like lxc and you've probably heard of docker and also kubernetes comes into the mix which is actually like an orchestration tool that takes care of docker but while containers aren't new they are kind of the new kid on the block when it comes to devops and containerizing applications is something that's really really popular now conceptually they're a little bit like a virtual machine but they're they're different enough that it's important to understand the difference so let's say we have a traditional virtual machine which i'm going to say is the left hand side of this slide so how it works is you have the the big computer you know and then that computer or that host has its own operating system and then on top of that we carve out a section of the host's cpu and memory and and cards and hard drive space we carve that out and then on top of that we install 
another operating system inside this virtualized environment and then we can put applications on that now containers work in a different way they run right inside the host operating system so you just have an app running in another app running and another app running and this seems like the traditional server model right where you just install linux and then you install applications like apache on top of it the difference is with a containerized application they are running directly on the host computer but they have not really their own operating system all they have is like their own file system and they're jailed off or they're completely separate from the file system of the host operating system itself so they run in their own little world but they're still running directly on the operating system it's just like they're sectioned off in a little container but they're still running on the host operating system and that's really what makes them efficient you'll notice over here um yeah it's drawing it's not like the actual technology but it takes up a lot more hardware and storage and slices of the host system itself if you're going to install an entire operating system on top of virtualized hardware it just you're not you can't put as many things on one host operating system plus you also have then this operating system to maintain along with this operating system these don't have their own operating system so you don't need to maintain anything except the application itself and it works really cool that jail system is neat so i want to show you how that works and let's go actually over to a virtualized environment so here i am in i'm running ubuntu and i have docker installed so we're not going to get into how docker works there's a whole course on how docker works that i taught which is one of my favorite courses but first of all we're just going to start a docker container and we're going to put ourselves inside of it so this is just a little bit of free info we're going to say docker run dash it for interactive tty let's say we're going to run the ubuntu image and i want to run bin bash which is the shell command that i want to run so now boom it it was that fast right it created the container that quickly and i didn't like pause the video and wait for it it actually went that quickly and now we're inside this container which acts like its own operating system because remember it's jailed off but it's still running on the system and we can demonstrate that see i'm not i'm no longer on cbt docker now i'm on this internal container and we have our own file system but let's run a command in here we're going to run something that'll show up with cpu usage so i'm just going to say dd input file equals dev 0 i'll put file equals dev no and this is just something that's going to keep running keep running keep running and not really do anything except use up resources on the computer and i want to do that so we can come over to cbt docker here run the top command and even though this is now in its own container completely separate from cbt docker the operating system you can see look dd shows up as another command running inside this computer because even though it's separated in a container it's still using the same operating system the same kernel the same hardware that everything else along this system is it's just separated so it doesn't interfere and dependencies won't interfere with other apps it won't interfere with dependencies on the host system it's just super efficient so whether 
you're looking at lxc containers or docker containers which is what we looked at it's important to understand that containers are a lot like a virtual machine except they don't use all of that hardware and they don't completely section themselves off and most importantly they don't have their own operating system that's where containers make things much more efficient and much easier to deal with with much less overhead meaning you don't have to maintain the operating systems of individual apps it's really easy to confuse the concepts of clustering and load balancing because they kind of are the same it's kind of like the question is a hot dog a sandwich i mean they're both meat between bread but are they the same thing not exactly not quite but they both function similarly and that's what clustering and load balancing is like the difference is though clustering is an actual computer term for computers working together to do one task that can be split up whereas load balancing is more of an i.t concept that can be accomplished in the multiple ways so when we have a cluster basically we have a bunch of computers like we have here that are working together and then there's like a cluster manager it can be a computer it can be a software on you know one of the cluster computers but it actually keeps track of which computer in the cluster is doing what part of the task and these are designed to work together they know about each other these computers are a team and they really do well if susie here sends them a task that is designed to be broken up into multiple pieces so people can work on it at the same time so some jobs lend themselves to clustering solutions whereas some of them don't but basically they know about each other the cluster works together as a team to break down a bigger task into a bunch of smaller ones they can work on at the same time a load balancer like i said is more of a concept it load balances meaning it it has a big job and it splits that job up and lets different computers do the job but usually these computers don't know or care about each other at all they don't know each other exist the load balancer itself whether it's software or a hardware device it knows the entire big load that susie is sending to it and it splits it up and says okay she has 12 jobs so you do three jobs and you do three jobs and you do three jobs they don't know each other are doing jobs as far as the computers know there's only three jobs to do it's the load balancers job to keep track of everything so it's a little bit different they don't really work together they each work separately and accomplish a bigger task that the load balancer itself knows about now i said it's a concept it can work multiple ways right a very common way it's done is the load balancer will be in front of some web servers and there's a whole bunch of people that want to hit that web server and so the load balancer says okay you go to this one and now you go to this one and the next request will go to this one and it splits up the load so that one computer isn't doing all of the work now we can do that very simply on a linux machine if the linux machine is set up for round robin dns now this is not a great way to load balance just conceptually this is what it's going to do so here is my dns configuration on my network i have the domain name web it's actually set up for three different ip addresses okay so you'll see it's actually web web web but they have three different ip addresses and what my dns server is going 
to do is then round robin and it will split up the load between these three let me show you how that works here we are on the command line so i'm going to say ping web and we'll see we get the response from the web server notice the ip address here is 216 58192.238. if we do the exact same thing now boom if we get responses but notice it sent it to a different computer together so we got these from another computer and if we do it again we'll get still that third one if we do it a fourth time it's going to wrap around and give us the first one again so this is technically a load balancer we're using dns round robin load balancing and it's conceptually splitting up the load of the pings so that each computer that is responding only gets a third of the requests so i guess it's fair to say that all clustering is load balancing but not all load balancing is technically clustering because clustering is a specific way that computers work together to accomplish a task now is a hot dog a sandwich i gotta leave that one up to you i have no idea but i hope this has been informative for you and i'd like to thank you for viewing cron jobs are pretty much the linux equivalent to like the task scheduler in windows now there's some really cool things that we need to understand and that's how to set up the scheduling which is kind of complicated but also very powerful and i'll be honest it's kind of fun the other thing i want to point out though is that there are pre-made folders that you can just drop scripts in and they will execute at a regular interval i'll show you those but first let's talk about how we set up the schedules because it can be intimidating but like i said it's not that bad and it's actually kind of fun so the scheduling fields which we'll look at in practice are separated into five different fields so we have minute hour day of the month month of the year and day of the week and how it works is for example this first line that i have has all asterisks and this means everything so every minute of every hour of every day of the month of every month of the year of every day of the week it's going to happen so this means every minute for all of eternity whatever task we schedule with this string is going to execute so every minute it's going to do whatever you tell it to do now there are some shortcuts we can use for example down here i have asterisk divided by five this means every five minutes now we could actually spell it out we could say zero comma five comma ten comma fifteen all the way to fifty-five but i don't really like to do that because it's a big mess and you know one field would be this entire big string of numbers so rather than do that we can just say asterisk divided by five and this is going to be every five minutes during the third hour of the day so this means at 3 am 305 am 3 10 am 3 15 am but once it gets to 4 am it's going to stop doing it okay so this is during the third hour every 5 minutes every day of the month every month year every day of the week so this means every day it's going to do this but only between 3 a.m and 3 59 a.m and every five minutes okay now this one is very very very specific this says two minutes after the fourth hour so two and four means at 402 a.m on the 13th of july but only when that 13th of july lands on a tuesday the day of the week goes from zero to six so two is a tuesday right sunday is zero monday is one tuesday is two so this means every july 13th at 402 am if it happens to also be tuesday so this is only going to execute every 
let's actually see how it works in practice now i'm on a centos system here and i'm root because we're talking about the system wide cron jobs if we go into etc cron dot d we're going to see we have a few files in here now any file in here is going to be read by the cron daemon so let's actually look at one let's look at sysstat because this is already in there so we're going to look at sysstat and here we can see a couple things are scheduled here are the five fields that we just talked about so here's the first field so every 10 minutes of every hour every day every month of the year every day of the week so this is going to be every 10 minutes it's going to execute as root so this field talks about what user it's going to run as and then the rest of it is what it's going to actually do so we have the five scheduling fields who it runs as and then the last part however long it is is what it's going to execute so down here these are commented out but let's pretend they're not this would be every hour at zero minutes past right because this is like at one o'clock two o'clock three o'clock four o'clock it's going to run as root this command down here at 23 which is 11 pm so 11:53 pm every day see all these are asterisks so every day as root it's going to execute this so you can either add to any of these or really the best thing to do is create your own right just create a file and then put that scheduling the username and what you want it to execute and it will do that now the one other thing i wanted to mention really quick if you go back into the etc folder and let's do an ls and just look for cron we're going to see there are a bunch of folders in here there are cron daily cron hourly cron monthly cron weekly and if we go in there let's go into cron dot daily we're going to see these are just executable scripts these are not timed things right let's look at one real quick so vi logrotate notice there's no star star star star star at the top there's no scheduling in here this is just an executable script that we want to have execute every day so the folder is going to do the scheduling of everything in here once a day same with monthly same with hourly the cron dot deny file is a way that we can tell a specific user that they're not allowed to use the cron daemon for personal use cron dot d we already looked at and crontab is a single file let's look at that one really quick that'll be our last thing etc crontab and this is the same sort of thing we can put things in here if we want it even tells us very specifically all the things the five fields the username we want it to run as and then the command to be executed this is the same thing as creating a file in the etc cron dot d folder you can just add things here and they'll automatically execute now the pre-made folders are very convenient for dropping scripts in that you want to have execute every so often but really the coolest part about cron is just how flexible that scheduling system is and you can figure out how to do that by manipulating all of those different fields like we looked at in this slide here so i encourage you to just try to figure out how you would specify a particular time and just play with it it's a lot of fun to do
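just as a sketch, a file you drop into etc cron dot d follows the same shape as that sysstat file, five schedule fields, then the user, then the command. the filename and the script paths here are made up for illustration:

    # /etc/cron.d/my-maintenance    (hypothetical example)
    # min  hour  dom  mon  dow   user   command
    */10   *     *    *    *     root   /usr/local/bin/collect-stats.sh
    53     23    *    *    *     root   /usr/local/bin/nightly-cleanup.sh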
when it comes to scheduling events you have a couple options as a personal end user on a linux account you can use a personal crontab which you invoke by typing crontab minus e or we can use the at daemon which is a one time thing it's not for recurring events it's for events that happen just at a specific time so let's go right to the command line because it's not difficult to use either one and scheduling tasks is something that's really nice to be able to do so here we are on a centos machine i'm logged in just as a user notice i'm not root and the first thing i want to do is look at my personal crontab now this is a little bit different than a system wide crontab so first we type crontab minus e and it's going to bring us into an editor okay and now this is my personal crontab again it's not system wide and it's slightly different because i still have five fields if you're not familiar with the five fields look at the system-wide crontab nugget because it'll explain how this works but we have every 10 minutes of every hour of every day of every month of the year every day of the week so here we have every 10 minutes it's going to do something now notice there's no field here that specifies the user in the system wide one we have to specify what user it runs as but since this is my personal crontab it obviously runs as me so we just have the five fields and then we have the command that we want it to execute at the time scheduled here so what happens here is we have echo and this text string and we append it to a file called /home/bob/timetracker.log every 10 minutes so every 10 minutes it should add a line to our file so let's see if this is actually running because it's been here a while let's quit here if we do ls we can see oh there is a file timetracker.log and if we look at it well sure enough it looks like about a half hour has gone by since i created that crontab entry and it's been adding to this file if we do ls minus l we can see the last time that was touched was at 18:10 so if we waited around until 18:20 it would do the same thing again it would add another line to it so that's how you do a recurring event using a personal crontab now if you just have something you want to have execute one time let's clear the screen we can use the at daemon and first i'm going to do something really quickly so we'll say at and then we can specify the time and this is what's nice it's very flexible we could say tomorrow we could say next week and it will interpret all of those different commands it uses a lot of fuzzy logic to figure out what you want i'm going to say at now plus one minute and then we're gonna get this at prompt which now allows us to execute something so i want to do echo this was a one-off and i want to append that to /home/bob/timetracker.log press enter and now we could do another thing we could have a whole list of things we wanted to do at now plus one minute but i'm just going to do control d and that will put it in queue so it says job 4 is in queue and it's going to execute at 18:13
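roughly, that exchange at the prompt looks like this, the exact wording and timestamp of the confirmation line will vary a bit by system:

    $ at now + 1 minute
    at> echo "this was a one-off" >> /home/bob/timetracker.log
    at> <EOT>                      # press ctrl-d on a blank line to queue it
    job 4 at Thu Jun 20 18:13:00 2019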
okay now if we type atq oh it already happened doggone it first of all let's look cat timetracker.log look at that it did it right it put it at the end of the file but i need to do another one so at now plus one minute again and i'm gonna say echo hello into /home/bob/timetracker.log control d then atq ah there we go okay did it in time so what this shows us is the queue of things that at is going to run so job number five is scheduled for thursday june 20th at 18:14 and the user bob is who's doing it so if we keep pressing atq once the time rolls around it's going to execute that and then go away because it's a one-off right we could do multiple things we could say at tomorrow then i want to say echo test tomorrow i spelled it wrong but that's okay into /home/bob/timetracker.log control d on a blank line now if we do atq we're going to see well look job 5 executed but job 6 is going to wait until tomorrow and it does it the same time tomorrow so 24 hours from now and tomorrow at 18:14 it's going to do that command test tomorrow okay now let's say we don't want to do that we want to change our mind well then we can say atrm 6 and now atq is going to say there's no jobs because we've deleted job number six that was going to execute tomorrow but if we look at timetracker.log look at that sure enough hello was put there along with that this is the one-off and if we wait around until the next 10 minutes pass our cron job is going to add another 10 minutes has passed onto the end of this file and that's how we can schedule things with crontab for a repeating task like these or just a one-off task like this by using the at daemon it's great to be able to do system-wide things using cron but i personally like the fact that you can do it as a personal end user using crontab minus e and it's going to do it just as your user so you don't have to become root or worry about escalating privileges it's going to just execute it as you even if you're logged out same with the at daemon even if we're not logged in it's still going to execute it at the given time working with multiple processes on linux is really easy because you can put them in the foreground or the background and interact with them however you want now there are some tools that may not seem intuitive at first but once you get the hang of using them they're really easy and there's also some keystrokes that we're going to have to learn and a couple tricks that will allow us to do things that we normally couldn't do so let's go right to the command line because this is the kind of stuff you have to experience in order to really understand so first of all i am on the command line and i'm going to show you how to put a process in the background now i'm just going to use a simple process called sleep the sleep command if you're not familiar with it just pauses right so if we say sleep one it's gonna sleep for one second and then it's gonna be done so i'm gonna say sleep for a whole bunch of seconds which is like i don't know probably a couple hours or something and then i'm going to put the ampersand after it now what's going to happen is notice it gives us a number one and then this is the actual process id number so our job number is one now we can do that with another one sleep 22222 put that in the background and now we're gonna see we have job number two in the background with this process id and they're just running in the background if we type jobs we can see sure enough there they are there's our first sleep and there's sleep 22222 job one and job two
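a minimal sketch of what we just did, the sleep durations and process ids are just examples:

    $ sleep 11111 &            # start a long sleep and background it with the ampersand
    [1] 4821                   # job number 1, and its process id
    $ sleep 22222 &
    [2] 4822
    $ jobs                     # list the jobs this shell is managing
    [1]-  Running    sleep 11111 &
    [2]+  Running    sleep 22222 &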
now if we want to start using one again like let's say we want to bring one into the foreground we just type fg and then the job number so fg 2 and it tells us what the command is that's running and here we are we're just at the command line here if we wanted to stop this we'd have to do control c it stopped that process and now if we do jobs see there's only the one job running it's actually really cool now what if we did a job like sleep 33333 and press enter and i'm like oh man i really wish that was in the background i didn't mean for it to just hold my command prompt here well what we can do is do control z and it suspends it so see it says stopped if we type jobs we can see this first one that we did is still running but this one is stopped so if we want to make it run in the background we have to say bg for run in the background and then job number two and now if we do jobs we're gonna see now they're both running in the background so bg and then the job number will start that background task running and fg will bring it out of the background and bring it right to our interactive terminal okay so that's kind of neat it's a way we can start them in the background or if we've already started something and we want to put it in the background control z will suspend it briefly and then we can tell it to go in the background by doing bg and then the job number so let's do fg 1 i'm going to do control c fg 2 control c and now jobs we have no more jobs running now there is a problem because if we have a job running in the background like i have a command here called my hello and all it does is every two seconds it prints hello on the screen all right so i'm gonna control c what if i wanted to put my hello in the background it's going to run in the background it's still going to print every hello out on the screen but it's running in the background if we do exit it's still running in the background but the problem comes where if we log out and then we log back in we open a terminal window and we do a ps aux and grep for my hello we're gonna see it's no longer running this is actually just the grep process that it found here but notice it's not running in the background anymore and if we wanted it to stay running just putting it in the background wouldn't work we'd have to use a program called nohup because here's the problem when we log out of the system up here it actually sends a hang up interrupt to all the running processes and the hang up or hup tells it to stop running well we can run a program by saying nohup which means don't hang up when the user logs out then what program we want to run so my hello and then ampersand to put it in the background it's going to put it in the background just like before we can see it with jobs there it is running but it says it's ignoring the input and appending the output to nohup.out all right so i'm going to exit i'm going to log out and then i'm going to log back in open up a terminal window and i want you to notice two things one if we do a ps aux and grep for my hello it's still going to be running see here it is it's still running because we ran it with nohup but here's another cool thing if we do an ls see this nohup.out let's look at that all of the output was put into that file so all of the hellos were put into that file so we know what's going on plus when we logged out it didn't stop running so not only is it really easy to handle processes when you're on the linux command line it's also possible to do a couple cool things like nohup if you want to make sure it doesn't quit when you log out and also control z of a running process which will put it in the background suspended and then you can run bg to make sure that it continues executing in the background
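here's the whole flow in one place as a sketch, where my hello stands in for whatever long-running command or script you actually have:

    $ sleep 33333              # oops, forgot the ampersand, it's holding our prompt
    ^Z                         # ctrl-z suspends it
    [1]+  Stopped    sleep 33333
    $ bg %1                    # resume job 1 in the background
    $ fg %1                    # or pull it back to the foreground, then ctrl-c to kill it
    $ nohup ./myhello &        # survives logout, output is appended to nohup.out
    $ tail nohup.out           # see what it has printed so far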
finding specific information about local devices on your system can be a little bit challenging but thankfully there's a bunch of tools that will help us along the way dmesg is one that we can use to kind of see how things are going in real time if you plug something in dmesg is going to show you the results of plugging that in but then there's a whole suite of ls tools if you will they start with ls there's lsusb lspci lsdev lsblk lscpu all of these tools are going to be able to be used to find out information about hardware on our system and there's a nice trick to remember these so that you don't have to actually remember pci dev blk cpu all of these different things so let's check that out and see if we can find the various hardware that is in our system you have to be root to use most of these some of these you can use as an end user but since we're going to be looking at all of them i became root just so that we get a better view of what's going on now first of all dmesg you just type dmesg on the command line and it's going to show you things as they happen and this is the location where we'll see things happen if we make changes to the system like if we plug in a usb drive or plug in a sata drive or something like that a new mouse it's going to show up here and it's going to give us information about the particular device but if something's already plugged in or you just want to see what's built into the system that's where the ls tools come into play so let's clear the screen and what i like to do is just do ls and then hit tab a couple times and that's the trick because really tab completion is vital here there are so many of these commands now we know ls just means list like the file directory but they've used the same keystrokes to prefix a bunch of commands like lsblk for example which will show us the block devices it will list the block devices on our system if we do that it's going to show us all the block devices that we have like the floppy drive all of these are virtual loopback devices that were created down here these are our actual drives on our system sda and we can see right here sda1 is mounted on forward slash then we have a bunch of other drives down here that aren't being used right now but they're all 10 gigabytes in size we used those when we set up raid before but this shows us all of the block devices so if we do ls and hit tab a couple times again let's look at the next one we have lsblk we have lscpu now i actually really like this one because it will give you all of the information about the cpu in the system so here we can see that it's a 64-bit processor little endian it's an intel it actually shows us the actual processor itself the model number it shows us the clock speed all sorts of stuff we can see that we have vt-x enabled so we can do virtualization and it shows us all of the awesome things that our cpu has including all the flags that it supports for when we're compiling things anyway that's lscpu if we hit ls and tab a couple times again i'm going to do this every time because it's the quickest way to see what's there
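as a quick sketch, the pattern for this family of commands looks like this, run as root so you can see everything:

    # dmesg | tail         # recent kernel messages, like what shows up when you plug in a usb drive
    # ls<Tab><Tab>         # tab completion lists all of the ls* tools available
    # lsblk                # block devices, disks and their partitions and mount points
    # lscpu                # cpu model, clock speed, virtualization support, flags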
next we have lsdev you have to be root to do this and what this shows us might not be as useful as it was years and years ago when we would have to troubleshoot hardware more frequently but what this is going to show us is the device like what device it is like our floppy disk our keyboard a pci port here and it's going to show us what direct memory access or dma number it uses what irq number it's using and the i/o ports meaning what parts of memory it's using for i/o and it will show us all these things now if we're trying to find conflicts on our system this might be useful but i'll be honest i've never once had to use this in practice it's important to know that it's there though lsdev will show you all the devices on the system now if you do ls tab tab one that i actually do use fairly often is lspci where is that up here lspci will show us all the pci devices so here we have a vga compatible controller this is the acpi controller the ide controller so these are the pci devices that are plugged into our system we can do lsusb now i don't have anything plugged into our usb ports but if we did lsusb would show us what's there same thing with lspcmcia if you're a laptop user you might have things that show up if you type that and one last one we'll look at is lsmem and this will just give us the range of the memory and how it's being used and where it is in our system but really the big takeaway from this is ls tab tab and it'll show you all of the different ways that you can look at the devices on your system so whether you're trying to get an inventory of the things that are currently on your system or you want to see changes as they happen in real time there's some really simple built-in tools on our local linux machine for detecting and looking at specifics on devices you ever try to solve a problem only to make it worse yeah me too for example we had a wall with a nail in it and that nail would get loose after hanging stuff on it for a long time because it was just in the thin layer of drywall this nail just went in the drywall and then the more we hung stuff on it the more it got loose so i thought i would make things better by getting rid of the nail drilling a hole in the drywall and then getting one of those really nice hollow wall anchors that you could then put a nail or a screw into and it would be really nice and sturdy the problem is i drilled the hole way too big the hollow wall anchor didn't work at all and then i just had this gigantic hole in the wall that my wife was really upset about that's kind of what happened in the world of linux virtual file systems and let me tell you what i'm talking about here because the idea of virtual file systems has been around since the unix days basically it's a file system that's created when the system boots up and one of the really popular ones is called proc and what this was used for is process information like of running apps and running programs on the system they would store all of their runtime information in this virtual folder in memory called proc and then people said hey that's a really neat place to store things like process id number 181 for that application and 2556 for that application what if we also put things about like the cpu in there or the network card in there and then people started adding things to the proc folder and it started to get a little bit confusing and so we thought hey wouldn't it be a great idea if we
separated all of that kernel information so that process information was stored in the proc folder and kernel information was stored in a sys folder they're both virtual file systems that work the same way but some organization seemed to make a lot of sense right now the same thing with dev dev is a folder where we thought why don't we start putting things like information on different devices like hard drives or mouses mice mouse anyway when we put those things into the dev folder oh here's the issue the proc folder had been around for a long time and it already had a mix of things in it so we said okay we'll keep all of those things and then we'll just put new things in the sys folder the problem is there's not that many new things so while the sys folder is very well maintained and very neat and organized it doesn't contain a lot of the things that we use on a daily basis because they were already in the proc file system and now they're there for backwards compatibility now that doesn't mean proc and sys are any less useful if we go into the proc folder and do an ls we're going to see here are all of those numbers these are the process ids that i talked about right and if we do an ls of oh the process id 47 inside here it's going to be all of the various things that this particular application is doing like its memory its i/o what's mounted the status all of the things about this running thing are in here so if you want to find out information about whatever process id number 47 is you can look at these various things but notice over here on the right hand side there's also a bunch of things that aren't process ids and that's where things like mdstat meminfo cpuinfo live let's look at that so if we look at cpuinfo this is going to just be a complete breakdown of the cpu that's running this system right now which is really useful information but it might make more sense if it was over here in the sys folder because the sys folder is very well laid out right there's not a whole bunch of crud here this is laid out in very nice sections we could go into oh let's say the kernel folder and inside the kernel folder it's separated into various things about the kernel like the current configuration all of these things are nice and organized but it's not all inclusive meaning it doesn't contain a lot of the information if we scroll up that's still stored in the proc folder so here's how it ends up working a lot of the interactive stuff is going to be in the proc folder because it's always been there but a lot of the information about the running kernel is organized very well in the sys folder and that's where a lot of times programs will look if they want information about the current running kernel like do i work on this system well i don't know let's see information about the hypervisor and kernel modules and all of those things it'll look in the sys folder so generally you'll find yourself in the proc folder more often all right now one other place that is worth looking in is the dev folder and this just shows system devices right these are hardware devices that are plugged into the system either soldered in like the real-time clock or hardware like hard drives that we put in here and this is going to be binary information generally available to the system talking about the various bits of hardware that are plugged in now this is fairly well organized too and a lot of times programs will look here to find out information like hey is there hardware that is going to be compatible with my software well this is where it would look for hardware information
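a quick sketch of poking around all three of these virtual file systems, where process id 47 is just whatever pid happens to exist on your own box:

    $ ls /proc             # process ids plus legacy entries like cpuinfo meminfo mdstat
    $ cat /proc/cpuinfo    # complete breakdown of the running cpu
    $ ls /proc/47          # memory, io, mounts, status for process id 47
    $ ls /sys/kernel       # neatly organized info about the running kernel
    $ ls /dev              # device nodes for the hardware plugged into the system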
so while it sounds like it's a big mess honestly it's not that bad there's three main places to look the proc folder which is what we see here the sys folder which we see here and then the dev folder which we see here all of them contain different information sys and proc have a little bit of crossover there's going to be some kernel information in sys that's already living in the proc file system but if you're going to be interacting with it it's mostly going to be in the proc file system and applications are going to be hitting the sys file system pretty hard now i know that's a lot of information about linux virtual file systems the key takeaway is that the information we need about the running system is generally going to be in one of three places in the virtual file systems that are created when the computer boots up the proc file system the sys file system and the dev file system if you look in all three of those places you're going to find what you're looking for but things might be spread out in places that don't make a whole lot of sense just because we keep them there for backwards compatibility i hope this has been informative for you and i'd like to thank you for viewing on a linux system the printing is going to be handled by cups which is the common unix printing system which interestingly is actually owned by apple of all people but nonetheless this is what every modern linux distribution is going to use to handle printing now the nice thing is there are some backwards compatible command line tools so we can print things from the command line and if you don't have a gui to actually install the printer like if you don't have gnome or kde or anything like that installed you can use a really nice web interface that will allow you to interact with the cups system and install modify and do all the things you need to do to a printer we're going to look at the web interface really quickly just to show you where it is and then i want to show you how to use the command line tools so you can print from a command line system even if it's a headless system on a rack now i'm logged in here on my local ubuntu computer on my local network and you'll see up here i went to localhost port 631 now that's important because cups listens for incoming connections on port 631 and if you go there it'll allow you to log in so you can do things like administration for example we are here and i have one printer installed we can click on it and it'll give us information about this particular printer called office underscore laser we can see it's a socket-based connection on this ip address on port 9100 and basically it's installed on the computer i actually installed it using the web interface instead of going into the gui on this computer and installing it with the ubuntu tools i used the web-based interface to make sure that it would be able to install correctly and it did once we're here we can actually do things from firefox like print the page but if we're on the command line things are a little bit different there are a couple tools that start with lp and if we type lp and then hit tab a couple times for tab autocomplete we're going to see some of those here now they're not all things we're going to look at right now but mainly lpr is going to send jobs to the printer and lpq is going to show us the queue of things waiting to be printed
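as a sketch, the command line side of printing looks roughly like this, where mydocument is just the text file from this demo and the job number is whatever lpq reports:

    $ echo "this is a test" | lpr     # pipe text straight to the default printer
    $ lpr mydocument                  # print a file
    $ lpq                             # show the print queue for the default printer
    $ lprm 10                         # remove job number 10 from the queue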
so if we type lpq we're going to see office laser is our default printer it says it's ready but there are no entries meaning there's nothing that's currently being printed now we can use lpr to print things in a couple different ways we could actually just say echo this is a test and we can pipe that into the lpr command and if we look over here on my printer it's going to print this page right out and we can see sure enough it says this is a test so it'll do text just like that now we don't have to do it line by line we can also do something like in my folder here i have a file called mydocument so if we look at mydocument you can see it's just a string of text and we can say lpr and then the name of the text file mydocument and that will print it and as it prints out on the page if we look at lpq we're going to see we have a job right here job number 10 and we could use lprm if we're quick enough to get rid of job 10 oh it's already completed but that is how we would delete a job that we had already sent so if there's a whole bunch of things queued up and something's wrong we can use lprm in order to fix it and if we come back over here we're going to see sure enough it printed out that entire file for us right there on the page so it's really easy to print using the command line once it's set up with cups and thankfully cups is really easy to set up too because it's all web-based and that web interface honestly is very powerful we can use it to queue up jobs stop jobs do test pages we can use cups for gui environment printing but the most important thing to realize is that we have access to the cups system by using the command line tools that are installed on our system udev is the userspace device manager that has replaced devfs on older linux systems now what's really cool is it uses sysfs which is a virtual file system that has information about the hardware and then it follows rules in order to keep devices with consistent names now what i mean by that is if you've ever had a linux system with multiple anything but we'll say hard drives the order that you put them in the system determined the name they would get so the first one the system recognized would be sda then sdb and then sdc but on the next boot if they came up in a different order well all of a sudden this one might be sda and this one would be sdb and this one might still be sdc and that's a real pain in the butt so what udev does is it creates devices based on specifics of the hardware so it'll take like its uuid and it will create a solid consistent device name that's always going to be the same so if you put them in a different order or the buses come alive in a different order on boot they're still going to be the same devices which makes things a lot simpler when it comes to mounting drives and things like that now it also allows us to do some other things like create rules so first of all i want to show you some of the things you can do if you're root you have more access to do this so we're going to say udevadm this is the administrator command that allows us to do some things with the udev system so the first thing i want to do is say udevadm info and let's actually look at dev sr0 which is actually our dvd drive and it gives us all sorts of information specific to this drive so we'll see a couple things like let's scroll up we see this n means the name so this is the name that it has assigned it this is the place on the pci bus where it exists this is all the specific information about it
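a sketch of that query, the exact fields in the output will vary by system:

    # udevadm info --name=/dev/sr0       # everything udev knows about the dvd drive
    # udevadm info -a --name=/dev/sr0    # walk the parent devices and their sysfs attributes
    # ls /dev | grep sr0                 # confirm the node exists in the dev folder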
and if we do an ls of the dev folder and grep for sr0 we're going to see sure enough it's right there in the dev folder sr0 it's dynamically created when the system detects it and puts it in now sr0 is not terribly useful at least for me to recognize a hard drive or a dvd drive when it's in the system i can see it mounted right here but that doesn't help me sr0 doesn't mean dvd to me so we could do something like make a rule that creates a shortcut to it every time it's recognized in the system so we're going to go into the etc udev rules.d folder and in here let's see what there is there's one called snap core rules but i'm going to create a new one i'm going to say vi and we'll say number 10 so it loads first then sean dot rules sean rules nice anyway it has to end in dot rules and then we're going to create our own rule now the format here is something that you just kind of have to get the hang of but the kernel is equal to sr0 meaning this is what the kernel knows the device as it's in the subsystem equal to block which it actually showed us when we did that command and then what i want to do is create a symlink and i want to call that symlink my underscore dvd okay so what we've done is created this rule anytime it has a device called sr0 in the block subsystem meaning a block device like a hard drive or a cd-rom i want it to create a symlink called my dvd in the dev folder so i can reference it there instead of remembering sr0 so let's save this now we could reboot the computer or we could just do udevadm trigger okay and now if we go into the dev folder and we do an ls we should see in here look at that my dvd it's created that symlink to well in fact let's do ls minus l and grep for dvd my dvd is a link to sr0 it looks like there already was a dvd shortcut to sr0 but we created a new one by making our own udev rule called my dvd and that's pointing to sr0 and we can use this when we're referencing the device in anything that we want like mounting and that sort of a thing and the cool part is this isn't just a symlink sitting on our system this is something that's going to be created every time the dvd is recognized and set up by the system so on boot even though the dev file system is a virtual file system it's always going to have that symlink my dvd pointing to sr0 now we've just scratched the surface of the things we can do with udev rules we can actually make things happen when certain usb drives are plugged in that sort of a thing but one of the key things i want you to take away is that it's really smart and it uses sysfs in order to know information about the drivers and everything else how it interacts with the kernel it uses the information in sysfs to create the entries in the dev folder i hope this has been informative for you and i'd like to thank you for viewing
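to recap that last nugget, the rule file from the demo written out in actual udev syntax would look something like this, note the double equals for matching and the plus equals for adding the symlink:

    # /etc/udev/rules.d/10-sean.rules
    KERNEL=="sr0", SUBSYSTEM=="block", SYMLINK+="my_dvd"

    # apply it without rebooting, then check the result
    # udevadm trigger
    # ls -l /dev | grep dvd        # my_dvd -> sr0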
Info
Channel: freeCodeCamp.org
Views: 298,514
Rating: 4.9759483 out of 5
Keywords: linux training, linux administration, linux server administration, linux admin course, linux server administration for beginners, linux server administration tutorial, linux system administration, linux commands, linux command line, linux network configuration files, linux kernel modules, linux managing local users, linux configure yum repository, install rpm packages, install yum packages, cbt nuggets, shawn powers
Id: WMy3OzvBWc0
Length: 326min 46sec (19606 seconds)
Published: Tue Mar 09 2021