Python Tutorial: File Objects - Reading and Writing to Files

Captions
Hey there, how's it going everybody? In this video, we'll be learning how to work with file objects in Python and some of the useful things that we can do with these objects. Whether you use Python for desktop or web applications, you're likely going to be interacting with files a good bit, so it's definitely a good skill to know how to properly interact with these file objects. Okay, so let's go ahead and dive in.

To get a file object, we can use the built-in open function. I have a file here called test.txt in the same directory as my Python file. If I open this file, we can see that it's just a plain text file with multiple lines. So let's see how we can open and read this file from within Python. Now, the way I'm going to open the file right now isn't the way that's normally recommended. It's usually recommended to use a context manager, which I'll show you in just a second, but to show you why a context manager is useful, let me first show you this method for opening files.

What we're going to do here is say f = open, and we're just going to open that test.txt file. If you're working with files from different directories, then you're going to have to pass the path to that file into the open function, but since this file is in the same directory as my Python file, I can just pass in the name of the file. If you want to learn more about how paths work, we touch on that a little bit in the tutorial I did on the os module.

Okay, so the open function allows us to specify whether we want to open this file for reading, writing, appending, or reading and writing. If we don't specify anything, then it defaults to opening the file for reading, but I usually like to be explicit here. So let's say that we want to open this file for reading, and we can do this by passing in a second argument, which is just the string of a lowercase 'r'. We'll touch on some of these later, but if I wanted
to write to a file, then it would just be a lowercase 'w' that I pass in. Appending to a file is a lowercase 'a', and if I wanted to read and write to a file, then I could use 'r+'. But for now, we just want to read the contents of the file, so let's just pass in a lowercase 'r'.

Okay, so now the file is actually open, and we can print the name of the file with print(f.name). Before I run this and print the name of the file, there's one more thing that we have to do. If we open a file like we just did here, then we need to explicitly close the file when we're done using it, and I'm going to do that by saying f.close(). So now that we've closed that file, let's go ahead and run this, and you can see that it printed out the name of the file that we opened. The file object has some more information that we can print out also. If we wanted to print the mode that the file is currently opened with, I can do f.mode, and if I run that, you can see it prints out a lowercase 'r' because we opened the file for reading.

Okay, so even though this works the way that we've just done it, let me show you how to instead open the file using a context manager, and why for most use cases you'll want to work with files this way. If we open the file like we did here, then we have to remember to explicitly close the file. If we don't close the file, then you can end up with leaks that cause you to run over the maximum allowed file descriptors on your system, and your applications could throw an error. So it's always important to make sure that you close the files that you open.

To use a context manager, it's kind of similar, but we do this using the with keyword. So we can say with, and then I'm just going to copy all of this, so: with open test.txt in read mode, and then at the end I'm going to say as f, and then I'm going to put in a colon to start our block here. Now, this can be a little confusing to people at first because the variable name is actually
over here on the right, using as f, instead of over on the left like when we said f = open. But the benefit of these context managers is that they allow us to work with files from within this block, and after we exit that block of code, it'll automatically close the file for us, so we don't have to worry about whether or not we add in these closes. This will also close the file if there are any exceptions thrown or anything like that. That's why using these context managers is considered a best practice: it just automatically takes care of all that cleanup for us. So now I'm going to go ahead and delete my outside open and close statements.

Now, one thing that some people don't realize (for now I'm just going to say pass within this context manager) is that we actually have access to this file object variable after we exit the context manager, but the file will just be closed. For example, if I print the closed attribute on f now and run that, you can see that it returns True. But even though we have access to this variable here, it is closed, so it's not like we can read from it. If I try to read the contents from the file and print that out, then you can see that it throws a ValueError, and it says I/O operation on closed file. So for what we want, we're going to have to work with this file from within this context manager. For the rest of the video, I'll be using these context managers to work with files since it's a good practice, but I wanted to show you the other way first in case you see it in examples or wondered why I wasn't doing it that way.

Okay, so back to our file. We just tried to read the contents from the closed file and got our error, but let's look at how we can read the contents from the file from within our context manager. Let's create a variable called f_contents, and this will just hold the contents of our file. Now if we do an f.read(), and actually I need to print out that f_contents, so if I save that and print that out, then you can see that it printed out all of the contents of our file. If you have a small file, then this is probably what you want, but what if we have an extremely large file that we want to read, but we don't want to load all of the contents of that file into memory? Well, there are a couple of other methods that we have available for reading file contents instead of f.read().

Just to look at a couple of those, I could say f.readlines(), and if I print this out, then you can see that we get a list of all of the lines in the file. It looks a little weird because we have our newline characters in there, but if we look through this list, it actually gets every line of the file as a different element of that list. Now, instead of f.readlines(), I could do f.readline(), and if I save that and run it, you can see that readline grabbed the first line of our file. Every time that we run f.readline(), it gets the next line in our file. So if I was to copy all of this and then do it again and run that, you can see that it got the first and the second lines from the file. This printed out a little weird because the print function ends with a newline by default, but if I go up here and pass in an empty string as the end argument of our print statements, then it will no longer add in that extra newline, and now you can see that the lines appear the way they are in the file.

Okay, but we still haven't solved our problem of how we can read all of the content from an extremely large file. If we read the entire file in all at once, then we could run out of memory, and we don't want to go through and do f.readline() thousands of times. So what we're going to do here, instead of using readline or readlines, is simply iterate over the lines in the file by saying, let me go to a new line here, for line in f, and
then from here we can just print that line. So I'm going to copy that and save that. Now let me go ahead and comment out these lines and run this iteration over the lines, and you can see that it printed out all of the lines in our file. This is efficient because it's not reading in all of the contents of our file at once, so it's not a memory issue that we have to worry about. What it's going to do is just go through and get one line at a time from the file.

Now, this is usually good enough for most people, but sometimes you may want more control over exactly what you're reading from the file. I'm going to go ahead and delete this line, and if we go back to our f.readline() here, I'm going to get rid of that one and go back to using f.read(). If you remember, this read in the entire contents of the file, so if I run that, you can see that we got the exact same thing. Well, with f.read() we can actually specify the amount of data that we want to read at a time by passing in the size as an argument. So if I pass in 100 to our read method and then print this out, you can see that it printed out the first 100 characters of our file instead of printing the whole thing at once. Now if I was to copy this and run this again, then you can see that it printed out the rest of the file, because it picked up where it left off and read 100 more characters of the file. When we reach the end of the file, read will just return an empty string. So if I was to copy this for a third time and rerun this, then you can see that nothing happens, because when we reach the end of the file, read just returns an empty string, so this print statement is just printing out an empty string.

Okay, so how are we going to use this technique to read in a large file? Since we don't know exactly how long the file will be, we're going to have to use some kind of loop that just iterates over small chunks at a time. So instead
of hard-coding in 100 here, I'm going to create a variable called size_to_read, and for now we'll just set that equal to 100. So now, instead of passing in 100 to f.read(), let's just pass in this size_to_read. Okay, so this will grab the first 100 characters of our file. Now remember, when we hit the end of the file, read will just return an empty string. So if we do a while loop and say while the length of f_contents is greater than 0, then we will print out the contents that we got from read. Now, don't run it like this yet, because this will be an infinite loop: we're never advancing through the contents of the file. After it prints the contents, we want to read in the next chunk of characters, so in order to do that, we just have to again say f_contents equals f.read() of that size_to_read chunk. What it's going to do after this line is kick us back out to the while loop, and it will check whether we've hit the end of the file, because f.read() will return an empty string and it won't meet this conditional. So now if I go ahead and run this, you can see that it printed out all of the contents of our file.

To get a better idea of what's going on here, let's change size_to_read to 10 characters instead of 100, and every time that we print out f_contents, instead of ending with an empty string, let's make this an asterisk. So now if I run this, you can see it's a little more clear that we're looping through 10 characters at a time, and it's printing out these asterisks on every loop. You can see that it came through the loop here and printed out these characters and then the asterisk, so that we know that's just one chunk, then it printed out the next 10 characters, and then the next 10 characters, and so on until we got to the end of the file.

Now, when we read from files, you can see that it advances its position every time, and we can actually see the current position using f.tell(). So what I'm going to do is
I'm going to comment out this while loop, and down here I'm going to say print, and we'll print out f.tell(). If I go ahead and run that, you can see that f.tell() returned 10, so it's saying that we're currently at the tenth position in the file, and that's because we've already read in 10 characters here. We can manipulate our current position using the seek method. To show an example of this, let me print the first 20 characters of the file by running f.read() twice. I'm going to go ahead and print out the contents after the first 10 characters there, and then I'm going to do this a second time to get the next 10 characters, and I'm going to take out this second empty string there so that it pushes our finished statement out of the way. Now let me get rid of f.tell() here and go ahead and run this, and we can see that it printed out the first 20 characters of our file. When we read in this second chunk, it picked up at the tenth position and read in the next 10 characters like we would expect. But what if I wanted that second read to instead start back at the beginning of the file? We can do this with f.seek(). So between these two reads, if I was to do an f.seek(0) and save that and run it, you can see that it set our position back to the beginning of the file, so the second time we read in our contents, it starts back at the beginning instead of picking up where we left off after the first read. We used seek(0) to start at the beginning of the file, but you can use it to change the position to any location that you'd like.

Okay, so now let's take a look at writing to files, and a lot of this will be similar to reading. First of all, what happens if we try to write to a file that we have opened in read mode? Let's go ahead and try that. I'll do an f.write(), and I'll just do an f.write() of Test. I'm going to go ahead and get rid of that while loop also and save that. So you see when I have a
file open in read mode and try to write, we get an error that says that this is not writable, so we have to have the file open in write mode. Now back up here within our open statement, let's create a new file called test2.txt, and instead of reading, we are going to write to it. In order to do that, we can just say a lowercase 'w' instead of that lowercase 'r'. You can see over here in our directory that we don't have a test2.txt yet. If the file doesn't exist already, then this will go ahead and create it. If the file does exist, then it will overwrite it, so be careful if you're writing to a file that already exists. If you don't want to overwrite a file, then you can use a lowercase 'a' for appending to the file, but we're going to go ahead and overwrite this file if it exists.

First of all, instead of writing to this file, I'm just going to put in a pass statement here, which basically says don't do anything. I'm going to go ahead and run this, and you can see that it created test2.txt. So I didn't actually have to write anything to the file in order to create it; just using open with the write mode will create the file. Now, in order to write to this file, we can just do what we did before: we can do an f.write() of Test. I'm going to go ahead and run that, and if we go over here to test2.txt, then you can see that it wrote Test to our file. If I go back here and do another write to this file, then it's going to pick up where we left off, just like the read method did. So now if I run this and go back to the file, you can see that it wrote Test twice, back to back.

You can actually use seek when writing files also, to set the position back to the beginning of the file. If I go back here between these two write statements and do an f.seek(0), now if I run this, you can see that the second Test overwrote the first one. seek can get a little confusing for file writes because it doesn't
overwrite everything, only what it needs to overwrite. So for example, instead of writing the same thing twice, if I was to do an f.seek() to the beginning and write out an R as my second write there, and now if I run that and go back to the file, then you can see that the R only overwrote the T in Test; it didn't delete the rest of the content. So using seek whenever I'm writing to files can get a little confusing, and I don't use it a whole lot, but maybe there are some use cases out there that you guys will find it useful for.

Okay, so let's go ahead and pull all of this together and use read and write on multiple files at the same time. We're going to use this to make a copy of our test.txt file. Let's go ahead and delete our test2.txt file here so that we don't confuse the two, and I'm going to go ahead and close that there. Okay, so I'm going to get rid of these here. First, let's open our original test.txt file in read mode, and instead of f here, I'm going to use rf, and I'll just say rf there for read file, since this is the file that we're going to read from in order to write to our copy. Now within this with statement, let's go ahead and copy all of this and paste another open within here. I'm going to call this one test_copy.txt, I'm going to open it in write mode, and I'm going to call this wf for write file. Now, you can actually put both of these open statements on a single line separated by a comma, but I think readability here is pretty important, and mixing those two on one line is sometimes difficult to understand, at least for me. So this is usually how I work with multiple files at a time: putting them on two different lines, one nested within the other.

Okay, so now within here we have two files open: rf for reading our original file and wf for writing to our copy. Now to do this, it's just as easy as saying for line in rf, we want to do a wf.write() of that line. Okay, so now
let's walk over this one more time. We have our original file opened and we're reading from that file, and we have a file that doesn't exist yet, that's our copy, and we're writing to that file. We're saying, for each line in our original file, write that line to wf, which is the file that we are copying to. So if I go ahead and run that, you can see that it created this test_copy.txt file, and if I open this, you can see that it is an exact copy of our original.

Okay, and lastly, let's look at how we can do something similar and copy a large picture file. Now, this is going to be slightly different. If I look in my current directory that has the Python script that I'm currently running, I also have a picture of my dog here when he was a puppy, so let's go ahead and try to copy this picture file using file objects in Python. This file is called Bronx.jpg, and I'll just try to replace our text files with these picture files, and down here I'll call this Bronx_copy.jpg. This is exactly the same as our previous example, but we're trying to use a picture instead of a text file. Now if I try to run this, you can see that we got an error down here that says the UTF-8 codec can't decode a byte in position 0. In order to work with images, we're going to have to open these files in binary mode, and all that means is that we're going to be reading and writing bytes instead of working with text. I'm not going to go into the specifics, but if anyone is curious about the differences, then I'll try to leave some resources in the description section as to what exactly that means. For this case, in order to work with these pictures and use binary mode, we can just append a 'b' to our read 'r' here and our write 'w' there. So now with that one simple change, if I save that and run it, then you can see that we do have this copied picture file here, and if I go over to Finder, you can see that the file copied exactly the way that the original is. Okay, so last thing here. Now
I said earlier that sometimes you want more control over exactly what you're reading and writing. So instead of doing this line by line, let's do this in specific chunk sizes, and we saw something like this earlier when we were learning how to read our files. To do this, let's create a chunk_size and set it equal to 4096. You can choose different sizes, but this is the one that I'm choosing here. Now let's create an rf_chunk, and we're just going to read in a chunk of our read file, so I'll say rf.read() and pass in this chunk_size. So now we're reading that much data from our original picture. Now let's create a loop that will write these chunks to our copy until there's nothing left to read from the original. If you remember from earlier, to do this we can use a while loop: while the length of this chunk is greater than zero, we want to take our copy file and write that chunk to it. So I'm going to write this chunk to our copy, and then to keep this from being an infinite loop, I have to read in the next chunk, so I'll paste that in there to read in the next chunk from the original file. Now if I come up here and delete the copy that we just created, and then rerun it using the code that we just wrote, you can see that it made that copy there, and if I go back over to Finder and open up that copy, you can see that it made an exact copy of our original.

Okay, so I think that's going to do it for this video. There's a lot more that we could look at with files, and I plan on putting together some tutorials in the near future on how to work with temporary files, in-memory files, and things like that, but I hope that this was a good introduction to working with files and some of the useful things that we can do with them. If you do have any questions, feel free to ask in the comment section below and I'll do my best to answer those. Be sure to subscribe for
future videos, and thank you all for watching.
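
For readers following along without the video, the techniques described in the captions can be pulled together roughly as follows. This is only a sketch reconstructed from the narration, not the code shown on screen: the helper names read_in_chunks and copy_in_chunks, the temporary directory, and the sample file contents are assumptions made for illustration, while the variable names (f_contents, size_to_read, rf, wf, rf_chunk, chunk_size) follow the ones spoken in the video.

```python
import os
import tempfile


def read_in_chunks(path, size_to_read=10):
    """Return the text of a file as a list of chunks of size_to_read characters."""
    chunks = []
    with open(path, 'r') as f:
        f_contents = f.read(size_to_read)
        # read() returns an empty string at the end of the file,
        # which ends the loop.
        while len(f_contents) > 0:
            chunks.append(f_contents)
            f_contents = f.read(size_to_read)
    return chunks


def copy_in_chunks(src_path, dst_path, chunk_size=4096):
    """Copy a file in binary mode, chunk_size bytes at a time."""
    with open(src_path, 'rb') as rf:
        with open(dst_path, 'wb') as wf:
            rf_chunk = rf.read(chunk_size)
            while len(rf_chunk) > 0:
                wf.write(rf_chunk)
                rf_chunk = rf.read(chunk_size)


# Work in a throwaway directory so no real files are touched (an assumption
# of this sketch; the video works in the script's own directory).
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, 'test.txt')
dst = os.path.join(demo_dir, 'test_copy.txt')

# 'w' creates the file, or overwrites it if it already exists.
with open(src, 'w') as f:
    f.write('1) This is a test file\n2) With multiple lines\n')

# The context manager closes the file when the block exits.
with open(src, 'r') as f:
    first = f.read(10)     # first 10 characters
    position = f.tell()    # current position after that read
    f.seek(0)              # jump back to the beginning
    again = f.read(10)     # same 10 characters again

chunks = read_in_chunks(src)
copy_in_chunks(src, dst)

print(first)
print(position)
print(f.closed)  # the variable is still accessible, but the file is closed
```

Binary mode is used for the copy because it works for text and pictures alike; the line-by-line text copy from the video would fail on a .jpg for the reason given in the captions.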
Info
Channel: Corey Schafer
Views: 1,139,251
Rating: 4.943409 out of 5
Keywords: Python, Python Files, File io, Python File io, Python Read File, Python Write File, Read File, Write File, File, Files, Python File i/o, File i/o, Python File Read, Python File Write, Copy File, Python Tutorial, Python Tutorials, file i/o tutorial, Learn Python, Python open, Context Manager, Context Managers, Python for Beginners, Programming Tutorials, Software Engineering, open file, Python open file
Id: Uh2ebFW8OYM
Length: 24min 33sec (1473 seconds)
Published: Fri Apr 29 2016