Learning Awk Is Essential For Linux Users

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
one of my favorite command line utilities is a program called awk awk is a text processing utility that means you give it some text and it can grab certain columns certain rows certain fields from that text for you you can tell it to go search for a certain string patterns in the text and even replace those string patterns with other strings it's a really powerful program and it's one i use all the time i probably overuse awk i use awk everywhere especially in my shell scripting because i'm really comfortable with it and it's one of those programs that once you learn awk you wonder why you didn't learn it sooner because it's such a powerful program so let me switch over to a terminal emulator and let's play around a little bit with awk so i'm going to run the ps command just because this command gives us some output that we can manipulate the text a little bit with and it's nice and in columns and that's kind of what we want columned information because then alt can go in and grab certain columns from this output for us so if i take the output from ps and then run it through alt and then single quotes and then give it inside braces and action the action i wanted to perform is i want it to print and then space dollar symbol one that stands for the very first field the first column so print the first column from each line so i should get the output of p id and then these numbers here and if i hit enter that's exactly what we get now if instead of the first column you wanted the second column you know you just change dollar symbol one to dollar symbol two and how it determines what the columns are spaces or columns in awk that's what it is by default you can change the field separator from spaces to any character you like but by default if you don't change it it treats a space as what separates the columns for the print command dollar symbol 0 if you do print dollar symbol 0 that means print everything essentially that's essentially like cutting a file and uh or in this case we're just doing the same thing as the ps itself it's like we're not manipulating anything we're just printing the output exactly how it came to us if i did print without anything at all that's essentially print dollar symbol zero again it's kind of like a cat so one of the files that people love to use alkon on gnu slash linux systems is the slash etsy slash pass wd file that is a file that lists all the users on your linux system so if i get a cat on slash etsy slash passwd this is what the output looks like there's not a lot of spaces to it there are columns i mean it is separated into columns but the columns they are separated by colons here remember alk treats spaces as the column delineator so what what you would have to do here is you would have to do something like awk dash capital f for field separator and then inside double quotes let's tell it instead of using spaces as the field separator let's use the colon as the field separator and then let's do inside single quotes and then the braces print is the action and dollar symbol one that should give us a list of users on the system and then we of course we need to specify where we're getting all of this from we're getting this from the slash it's a slash past wd file and that is a very convenient way to get every single user on the system on a linux system so you use that all the time you'll use it just you know system administration or sometimes through scripting sometimes you want to get a list of all the users in a script now let's take that same command so i'll do an up arrow and instead of just printing the very first column what if i wanted you know several columns well you can do that you can print the first column and then i could do dollar symbol six for the six column dollar symbol seven for the seventh column and what this is saying i i want you to print columns one six and seven from the slash etsy slash pass wd file using colons as the separator and that's what the output of that would look like now it's not very readable because we didn't tell it to separate the columns with spaces or colons or anything we told it hey print columns 1 6 and 7 basically all jumbled up together is what that is actually telling it to do but we can tell it hey you know what put some spaces in here so i can do double quotes and then space and then double quotes inside the double quotes a space in between the columns here if i run that we added spaces to that now that still doesn't look very readable probably better than spaces inside the double quotes would be a forward slash t and what that will do is that will give us a proper tab in between the columns and hopefully that lines up a little better and it does it it's much more readable that way now other than specifying a field separator to search for and use you know to determine what the columns are you can actually specify all to actually print out the field separator as well and you can tell it to change the field separator to a different character as part of the output so if i ran this command right here what this is is inside the single quotes now i have two different actions so the very first action is begin and then fs equals colon and then semicolon ofs equals dash so this is saying hey find the the colons treat them as the field separator to perform these actions but as part of the output print a field separator that is a dash and what i want you to print i want you to print the first sixth and seventh columns on slash etsy slash pass wd now if i run this you can see the output now i have columns one six and seven and the separator is dashes and that's actually printed out this time it actually prints out the separator where by default all typically doesn't print out separators now one cool thing that you can do with awk as far as printing columns there's a very easy way to print the last column of each line it's with print dollar symbol capital n capital f let me show you this what i'll do is let's cat the slash it's a slash uh shells file here and this lists all the shells that are available on my system right now and what i could do is if i wanted to print out all the shells on my system but i didn't want the full path name you know i just want sh bash get shell csh fish you know the proper names without the full path what i could do is i could pipe this through all or i could just use the alt command i don't need to cat and then pipe it into awk i can just do this alk dash capital f and then inside the double quotes here i'm going to tell it use the uh slash as the field separator then what i want you to do is then i want you to search for a particular string here so i'm going to do slash slash and what this is is inside these slashes here awk is going to search for a pattern what is the pattern i want you to search for i want you to find the beginning of every line so that's the carrot symbol that begins with a slash now you have to escape that slash in between the two outer slashes so it's going to search for basically every line that begins with a slash that is what we're looking for that way it ignores these first three lines which are you know not information i'm interested in and then after that what we want to do is we want to actually print and dollar symbol capital n capital f print the last field and if we did this right and of course we didn't do this right because i get no output because obviously you have to specify slash etsy slash shells at the end my bad and then i get a list of all the available shells on my system but you see there are some duplicates i have two zsh i have two fish so what i could also do is other than you know just running the alt command i could then take the output from that all command and then pipe it into unique yeah and then it removes the duplicated entries so every entry now has to be unique and that is actually a list of all the available shields on my system right now and of course through the magic of piping if i wanted to properly sort this alphabetically you know to make this even more readable i could do that as well and let me clear the screen and let's run the df command because this is another common command that has nice columned information that people love performing all con so let's do df and i'm going to run this through alk i'm going to go ahead and pipe it into awk and let's look for a particular string of information so what i'm going to do is inside these slashes what do i want to search for i only want to return the lines that have slash div actually let's uh narrow it down even further i want the lines that include slash div and slash loop so that particular string slash dev slash loop and then what i want you to do is i want you to take those lines that have that string and then i want you to print uh columns one let's just do one okay and then let's print a few other lines and let's do again let's just do tabs here so let's print the second field and then another tab and then print the third field so we're going to take the first second and third column on every line that includes the string slash dev slash loop and there we go on that now one neat thing you could do here is uh awk is a proper scripting language in itself and it does do simple math if i instead of printing columns two and three maybe i wanted to perform some kind of arithmetic you know some mathematics on these uh these numbers here so what i could do is i could do something like a print column one and then a tab and then print two plus three and there you get column one and then you get the sum of columns two and three the numbers you know added together and of course if i instead of plus wanted to do minus i could but on the loops those columns are actually the same number so it should be zeros across the board now one of the cool things you can do with awk is you can filter results by the length of the line itself so let me cat slash etsy slash shells one more time because it's such a simple file to work with here and what i could do is i could all and then single quotes again and this time what i want you to do is run this function length and then inside parentheses dollar symbol zero so i want the the length of the line and i want it to be greater than seven characters and then give it a file to actually run this on slash etsy slash shells and it's only going to return the lines that were greater than seven characters because there was one line that was less than seven characters slash ben slash sh was only six characters and you see i filtered that out by using the length function if you wanted to see just slash bin slash sh what i could do is rerun that command itself instead of length uh greater than seven length less than seven and now the only line that is returned actually was the blank line there was one empty line let me rerun that command and do less than eight and now i have two lines now i have the empty line and then i also have the slash bin slash sh line another example of all you know as an almost scripting language in and of itself is let's do the ps command again for the processes i'm going to do ps dash ef this is a list of all the processes that are running on my machine right now what i could do is i could pipe this into awk and i'm going to do inside the single quotes this time and then a brace pair here i'm going to do an if statement so let's do if and then inside parentheses i'm going to do dollar symbol nf so if the last field because remember dollar symbol nf is the last field space equals equals and then if the last field equals slash bin slash fish because fish is my default shell so there's going to be several instances of fish running on the system right now probably at least i'm assuming this will not work obviously if fish isn't running but i'm going to print dollar symbol zero so every line where the last field is slash bin slash fish and right now there are four processes running on my system that had slash bin slash fish as the name of the process another cool scripting example that you could do with all you could do something like this let's do awk and then let's do begin and then let's go ahead and get the squirrely braces out and instead of an if statement this time why don't we do a for statement i'm going to do 4 and then let's go ahead and declare a value of an integer so i'm going to do i equals 1 and then semicolon i'm also going to declare that i is less than or equal to 10 and then semicolon and then i plus plus so i want you to incrementally add 1 to i until you get to 10. right we're going to take i from 1 to 10. and then what i want you to do is let's do a print statement after that and then i'm going to do double quotes and inside the double quotes i'm going to do the square root of and then comma space i comma space the word is comma space and then i times i and then semi colon at the end so can you guys see what that's going to do we're going to take every number 1 through 10 and get the square root of it right when it's going to print out the square root of i is the actual square root i misspelled square though let's go ahead and actually make that the correct spelling and there you go the square root of 1 through 10. another thing you can do with awk is you can have it print out every line of a document that begins with a certain character or a certain string so here's an example of that from my bash rc so what i'm doing in this is i'm telling alk and then inside the single quotes i'm saying the first column needs to be it needs to match this search pattern and it needs to match the beginning of the line needs to be either a b or a c and then what i want you to do is print the entire line so print every line where the very first character is either a b or a c and those are all the lines from my bash rc that meet that criteria now let me clear the screen i'm going to cat out this file that i i created just as an example file here because you often have this kind of stuff on your system or you know sometimes you're dealing with these column layout kind of files especially where you have line numbers so you know one column is just numbering one through whatever and then you have a second column and you know sometimes you want to print out just the second column now of course with all what i could do is i could tell it just to print out the second column right because spaces spaces are the field separator but you know sometimes you know sometimes you may want to do it a different way depending on the layout of the file so let me show you an alternative method we could use i could do something like alt and then inside single quotes print and this time i'm going to use this function here this substring function and then inside parentheses we're going to tell it to print the dollar symbol 0 comma and then 4. so what i'm saying in this statement here is i want you to print every line but i don't want you to print until the fourth character so ignore the first three characters of every line and if i run that you see no longer do i have the line numbers you know that very first column now i just have the second column which was a list of uh window managers now another neat thing we could do to that number dot text file is i could run a command like this here what this is going to do is it's going to find every line what i want to do is i want to find every window manager in that list that had the letter o as part of the name so that is what the match function is doing it's saying match on every line the letter o and then every line that has the letter o in it i want you to run print what you what i want you to do is print that line and then follow it with this string has o character at and then the function r start that is the index location where the letter o actually appeared on the line i hope that makes sense let me show you this in action so this is the output line one x monad has o as part of the name so x monet has o character at the index position is six meaning o is actually the sixth character on the line now sometimes you want to print a range of lines so there's the df command but you know i don't want all of this information maybe i want only a section of the output from the df command well what i could do is i could df i could pipe that into all let's get into the single quotes and i'm going to tell it nr which is the number of records so think of the number of records as the line numbers essentially n r equals equals seven so you need two equals for it to actually mean equals so nr equals equals seven and then i want to do a range from line seven to line let's do line eleven and then space and then we want to run this action here print nr the number of records dollar symbol 0. so if i run that what we should get is lines 6 through 11 printed out from the df command and there you go and we even get the line numbers as part of the command because that's what the in the print nr means now if i didn't want the line numbers as part of the output i just delete the nr part and now we get the output without the line numbers a common thing you can also do with awk that i find myself doing occasionally is getting a line count with awk so what you could do is let's do this on the slash etsy slash shells file that we were working with earlier so i'm going to do awk and then inside the single quotes i'm going to do end space and then let's do the print action again print nr again that's the number of records that's the number of lines on slash etsy slash shells and we get 13 return that's how many lines are in slash etsy slash shells now what's really cool is you can actually do this on multiple files and it will actually add the number of lines together so if i wanted the number of lines all together in my slash etsy slash shells file and also my slash xl slash pass w default i get 55 returned so that's just a little bit of awk there i'm not a master at alk by any means but i know a lot of the the alt commands i know enough to where it's one of my go-to things especially with shell scripting if i need to manipulate text in some way you know typically i do it with awk even if it may make sense to do it with other tools you know you have other things like grip and sid and a bunch of other command line utilities i'm always talking like i said in the beginning of this video i probably overuse awk a little bit but but it's such a cool command that i think everybody needs to know about especially if you really want to start getting serious as far as shell scripting you really need to know how awk works now before i go i need to thank a few special people i need to thank the producers of this episode absi gabe james mitchell akami allen chuck david dilling gregory eryon paul polytech scott stevens finn west and willie these guys they're my highest tiered patrons over on patreon without these guys this episode about the alt command would not have been possible the show is also brought to you by each and every one of these ladies and gentlemen as well all these names you're seeing on the screen all these fine ladies and gentlemen they help support my work over on patreon because i don't have any corporate sponsors it's just me and you guys the community if you'd like to support my work i'd appreciate it look for distrotube over on patreon alright guys peace
Info
Channel: DistroTube
Views: 106,082
Rating: 4.9537468 out of 5
Keywords: awk command, awk command in linux, awk tutorial, shell scripting, linux tutorial, awk tutorial in linux, linux commands, linux basic commands, gnu awk, awk scripting, linux awk, awk command in unix, shell scripting tutorial, linux, gnu linux, awk tutorial examples, awk examples, how to use awk, linux command line tutorial, linux administrator, command line interface, command line basics, command line tools, command line linux, distrotube
Id: 9YOZmI-zWok
Channel Id: undefined
Length: 20min 1sec (1201 seconds)
Published: Wed Jun 16 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.