Python Tutorial: Automate Parsing and Renaming of Multiple Files

Video Statistics and Information

Captions
Hey, how's it going, everybody? In this video I'm going to do something slightly different from my other tutorials. From time to time I'll run into a problem that can be solved with a quick and easy Python script, and I thought I could start recording these videos so you can see how these scripts can help automate a process that would otherwise be boring, repetitive, or just prone to mistakes. Let me know how you all like this kind of video and if you find it useful. I'm hoping that seeing how we solve these simple real-world problems might give you some ideas for how you can automate your own repetitive tasks.

So the problem that I ran into today was that I downloaded a lot of videos from a free class online that I wanted to watch while I was traveling back home for the holidays. But when I downloaded these videos, the titles were formatted in such a way that they wouldn't be sorted properly in my phone's playlist. This is just a small example here. These aren't the actual class files that I downloaded, just a small example that I made up so that you can get an idea of what the problem was.

You can see here that they put a title at the beginning of the file name, then a dash, then the course name, and then a number which shows in what order these videos should be watched. So, for example, number 1 is the first video in the class, number 2 should be the second video, number 3 should be the third, and so on. But these file names get sorted alphabetically, so you can see how having a custom title at the beginning of the file name can make the order in which these videos are supposed to be played all out of whack. You can imagine, if I had hundreds of videos for a specific class, how this would start to be kind of a pain: if I was doing this on my phone and I watched the first video, I'd have to scroll through a bunch of videos to find the second one. It'd be really nice if they were just in
order so that they could just autoplay in the order that they're supposed to. So really, I want to rename all these files, and I want this number here to be at the beginning. Now, I could go in and do this manually, but the classes that I downloaded were a lot more than this. There were hundreds of them, so it would take forever to go in and manually rename all of these so that the number is at the beginning. I'd really just like to write a quick and easy Python script to do this for me, so let's go ahead and do that.

Okay, so I have a blank Python file here, and the first thing I'm going to do is import the os module. This lets us navigate the file system, change file names, and things like that. One of the first things I'm going to want to do is change my directory to the directory that holds all of my files, and we can do that with os.chdir, and then we need to put in the path to the folder that holds those video files. Now, there might be a faster way to do this, but to get a file path quickly, what I like to do is open up Finder and drag the folder that I want into the terminal, and it will autocomplete that entire path. From there I can copy that and paste it into my program here and save that.

Now I'm going to check and make sure that I'm in the directory that I want to be in, so I'm going to do a print of os.getcwd(), which gets the current working directory, and I'm going to run that. You can see that after it changes directory to this location, it prints out that we are in the correct location where our files are. So now that we know we're in the right directory, let's go ahead and try to print out all of the files that are in this directory. What I'm going to do is a for f in os.listdir(), and this will list everything in the directory. Just to start off here, I'm going to do a print(f) to see if I'm getting all the correct file names, and you can see here at the
output that it did print out all of those file names. You're going to see that when you write scripts like this, there's a lot of trial and error, so instead of directly jumping in and trying to write out a solution, you'll likely want to do one thing at a time. At first we changed directory and printed that out to make sure we were in the right spot, then we looped through all the files in that directory and printed them out to make sure that we were getting those correctly, and then we can just build up a solution one step at a time.

So now that we can see that we have all of our file names here, let's go ahead and split off the extension from the rest of the file name. The way that we do this is with a function called os.splitext, and we'll pass in that file. Now if I print this out, then you can see that what it gives us... actually, that's not os.splitext, that's os.path.splitext. So if I save that and run it, you can see that what it gives us is a tuple, and in each tuple the first element is the file name without the extension and the second is the extension. I'm going to use this tuple and set it equal to two variables: the first one I'm going to call file_name, then a comma, and the next one I'm going to call file_ext, and I'll set those equal to that tuple. Now underneath here, if I print out file_name, we have the file name without the extension.

So now let's remember what we're trying to do. We're trying to rename this file so that these numbers are at the beginning, and in this specific example we can see that there are hyphens between the title, the course name, and the number. So let's see what happens if we take this file name and do a split on that hyphen. If I save that and run it, you can see that we have three elements: the title, the course name, and the number. So now,
just like the line above, I'm going to take what we just printed out and set three different variables here. I'm going to call the first one f_title, the second one f_course, and the last one f_num, and I'll set those equal to those elements. Just to make sure that we did that right, I'll print out one of these at a time. You can see that we have the title, and if I print out the course, they should all be the same, and if I print the number, they should all be different.

So now let's see if I can print out a formatted string in the way that I want my file name to be represented. In a formatted string I can put in placeholders, so let's say that I want the number here with a dash, then the course name, then a dash, then the title at the end. I'll also want the extension, so I'm going to put in a placeholder for that too. Then I'll do a .format, and this is where we fill in what those placeholders were. First I wanted the number, so I'll copy that and paste it in first. Then I wanted the course, so I'll grab that and put that in. Then I wanted the title, so I'll copy that and put that in, and then I also wanted the file extension here. And actually, just to make this consistent with the rest of the program, I'm going to call this f_ext and I'm going to call this f_name, so let me replace those.

Okay, so now let me print this out. You can see that this is close to what we want, but we do have some weird spaces here between these hyphens and before the file extension, so let's go ahead and take care of that. In order to remove those, I'm going to take these three variables and set them equal to themselves, but with a .strip() on the end, and this will strip away any whitespace on the left or the right. So now if I save that and run it,
you can see that when we print this out, those spaces are taken care of. So this is looking pretty good; we're just about finished up. Now, I could probably go ahead and rename these files to this output and be done with it. It would sort a lot of the files in the way that I want them sorted, but there are a couple of things here that, for personal preference, I don't like. For example, I don't like the number sign being here at the beginning of the file name, so I'm going to strip that off by grabbing everything from the second character on. I'm going to open this up and do a 1 and then a colon to go to the end of that string. Now if I run that, you can see that it stripped off that number sign at the beginning.

Another thing that I notice here is that if we're going to be sorting these by file name, this 1 and this 10 will most likely be next to each other. After it plays the first video, the next one in line will probably be the 10, because it'll sort based on the first character, which is a 1, so all the ones will be grouped together. One way that we can get around this is to pad the single digits with zeros, so instead it'll be 01, 02, and that'll put all of the single digits at the beginning and the 10 at the end. The way that you can zero-pad a string is with a string method called zfill. So I'm going to go up here to the number and do a zfill, and the parameter that you pass in is how wide you want the string to be. I want the string to be two digits wide, like 01. If I save that and run it, you can see that all my single digits are padded with a 0, and if the number is already two digits, as the 10 is, it won't do anything to it.

So now this is looking good, and we can pretty much rename this in any way that we want based on our personal preference. So really, now that I'm looking
at it, I don't think that I need the course name there either, so I'm going to go ahead and take that out. I'm going to take out the placeholder for the course and then take it out of the format, and if I rerun it, you can see that it's gone. This is looking good to me, so I'm going to go ahead and rename all of the files to this new format. I'm going to take away this print statement, and that string that we printed out I'm going to set to a variable called new_name. Now, if we remember, we're within a for loop, so we have the original file here as f. To rename this, I'm going to do an os.rename, and within here I'm going to pass in the original file and then what I want to rename it to, which is this new_name.

So I'll save that, and before I run it I'm going to let this take up half the screen and let the folder that I was working in take up the other half. I'm going to go ahead and run it now, and you can see that within this directory it did exactly what we wanted: it took our new name and replaced the old file name with the new one.

So you can see how a simple script like this would save a ton of time if you had tens or hundreds of files that you had to rename. Instead of going in and doing them manually, where you could also easily make mistakes, this allows you to do everything in one shot, and it's less prone to errors. You can also save these short, simple scripts for later use if you ever run into the problem again.

So that's going to do it for this video. Let me know what you guys think about me doing this kind of video from time to time. I know that I didn't go into as much step-by-step detail, but the idea here is that if you see a specific problem being solved with a quick and easy script, maybe it'll give you an idea for how you can solve some real-world problems and easily automate some
of the tasks that you've had to do that were repetitive or prone to mistakes, and things like that. So hopefully you guys found this useful. If you have any questions, just ask in the comment section below. Be sure to subscribe for future videos, and thank you all for watching.
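
The steps described in the captions can be pulled together into a short sketch. This is not the code shown on screen, only a reconstruction under stated assumptions: file names are assumed to look like "Title - Course - #N.ext", the example names are made up, and the helper names build_new_name and rename_all are hypothetical. Two details differ from the walkthrough: the sketch joins the directory onto each name instead of calling os.chdir first, and it skips any file that does not split into exactly three hyphen-separated parts.

```python
import os


def build_new_name(old_name):
    """Turn 'Title - Course - #N.ext' into 'NN-Title.ext' (assumed layout)."""
    # Separate the extension from the rest of the name.
    f_name, f_ext = os.path.splitext(old_name)
    # Raises ValueError unless there are exactly three hyphen-separated parts.
    f_title, f_course, f_num = f_name.split('-')
    f_title = f_title.strip()
    # Drop the leading '#', then zero-pad to two digits so 10 sorts after 09.
    f_num = f_num.strip()[1:].zfill(2)
    # The course name is left out of the new name, as in the final version.
    return '{}-{}{}'.format(f_num, f_title, f_ext)


def rename_all(directory):
    """Rename every matching file in `directory`; other files are skipped."""
    for f in os.listdir(directory):
        try:
            new_name = build_new_name(f)
        except ValueError:
            continue  # assumption: ignore names that do not fit the pattern
        os.rename(os.path.join(directory, f),
                  os.path.join(directory, new_name))


# Made-up example name, only to show the transformation:
print(build_new_name('Earth - Our Solar System - #4.mp4'))  # 04-Earth.mp4

# rename_all('/path/to/videos')  # hypothetical path; renames files in place
```

Keeping the name-building step in its own function means the new names can be printed and checked first, in the same one-step-at-a-time spirit as the walkthrough, before anything on disk is renamed.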
Info
Channel: Corey Schafer
Views: 336,775
Rating: 4.983294 out of 5
Keywords: Python, Python Tutorial, Python Script, Python File, File Parsing, Parse File, Rename File, Python Rename File, Python os Module, os module, Python os, Python for Beginners, Learn Python, Programming, Computer Science, Programming Tutorials, Software Engineering
Id: ve2pmm5JqmI
Length: 12min 34sec (754 seconds)
Published: Wed Dec 23 2015