>> SPEAKER: So far, it's likely
that most of your programs have been a bit ephemeral. You run a program like Mario or Greedy. It does something, it maybe prompts
the user for some information, prints some output to the screen,
but then when your program's over, there's really no evidence that
it was ever run in the first place. I mean, sure, you might have left
it open in the terminal window, but if you clear your screen, there's
really no evidence that it existed. We don't have a means of storing
persistent information, information that exists after our
program has stopped running, or we haven't up to this point. Fortunately though, C does
provide us with the ability to do this by implementing
something called a FILE, a structure that basically
represents a file that you would double click on your computer, if you're
used to a graphical user environment. Generally when working
with C, we're actually going to be working with
pointers to files, FILE stars (FILE *). Except for a little bit
when we talk about a couple of the functions that
work with file pointers, you don't need to have really dug
too deep into understanding pointers themselves. There's a little teeny bit
where we will talk about them, but generally file pointers and
pointers, while interrelated, are not exactly the same thing. >> Now what do I mean when
I say persistent data? What is persistent data? Why do we care about it? Say, for example, that
you're running a program or you've written a
program that's a game, and you want to keep track
of all of the user's moves so that maybe if something goes wrong,
you can review the file after the game. That's what we mean when we
talk about persistent data. >> In the course of running your
program, a file is created. And when your program
has stopped running, that file still exists on your system. And we can look at it and examine it. And so that program would be said to
have created some persistent data, data that exists after the program
has finished running. >> Now all of these functions that work
with creating files and manipulating them in various ways
live in stdio.h, which is a header file that
you've likely been pound including at the top of pretty
much all of your programs because it contains one of the
most useful functions for us, printf, which also
lives in stdio.h. So you don't need to pound include
any additional files probably in order to work with file pointers. >> Now every single file pointer function,
or every single file I/O, input output function, accepts as one
of its parameters or inputs a file pointer-- except
for one, fopen, which is what you use to get the file
pointer in the first place. But after you've opened the
file and you get file pointers, you can then pass them as
arguments to the various functions we're going to talk about
today, as well as many others so that you can work with files. >> So there are six pretty
common basic ones that we're going to talk about today. fopen and its companion
function fclose, fgetc and its companion function fputc,
and fread and its companion function, fwrite. So let's get right into it. >> fopen-- what does it do? Well, it opens a file and it
gives you a file pointer to it, so that you can then use that
file pointer as an argument to any of the other file I/O functions. The most important thing
to remember with fopen is that after you have opened the
file or made a call like the one here, you need to check to make sure
that the pointer that you got back is not equal to NULL. If you haven't watched the video on
pointers, this might not make sense. But recall that if you try to dereference
a NULL pointer, your program will probably suffer
a segmentation fault. We want to make sure that we
got a legitimate pointer back. The vast majority of the time we will
have gotten a legitimate pointer back and it won't be a problem. >> So how do we make a call to fopen? It looks pretty much like this. FILE star ptr, ptr being a generic
name for a file pointer, equals fopen, and we pass in two things: a file name
and an operation we want to undertake. So we might have a call that looks like
this: FILE *ptr1 = fopen("file1.txt", "r"). And the operation I've chosen is r. >> So what do you think r is here? What are the kinds of things we
might be able to do to files? So r is the operation that we
choose when we want to read a file. So we would basically when
we make a call like this be getting ourselves a file pointer
such that we could then read information from file1.txt. >> Similarly, we could open file2.txt
for writing, with the operation w, and so we can pass ptr2, the file pointer I've created here,
as an argument to any function that writes information to a file. And similar to writing, there's
also the option to append, a. The difference between
writing and appending being that when you write to a file,
if you make a call to fopen for writing and that file already exists, it's
going to overwrite the entire file. It's going to start
at the very beginning, deleting all the information
that's already there. >> Whereas if you open it for appending,
it will go to the end of the file if there's already text in
it or information in it, and it will then start
writing from there. So you won't lose any of the
information you've written before. Whether you want to write or append
sort of depends on the situation. But you'll probably know what the
right operation is when the time comes. So that's fopen. >> What about fclose? Well, pretty simply, fclose
just accepts the file pointer. And as you might expect,
it closes that file. And once we've closed a file, we can't
perform any more file I/O functions, reading or writing, on that file. We have to re-open the
file another time in order to continue working with
it using the I/O functions. So fclose means we're done
working with this file. And all we need to pass in is
the name of a file pointer. So a couple of slides ago, we
fopened file1.txt for reading and we assigned that
file pointer to ptr1. Now we've decided we're
done reading from that file. We don't need to do any more with it. We can just fclose ptr1. And similarly, we could
fclose the other ones. All right. So that's opening and closing. Those are the two basic
starting operations. >> Now we want to actually
do some interesting stuff, and the first function that we'll
see that will do that is fgetc-- file get a character. That's what fgetc generally
would translate to. Its goal in life is to
read the next character, or if this is your very
first call to fgetc for a particular file,
the first character. But then after that,
you get the next one, the very next character of that file,
and it stores that character in a variable. As we've done here,
char ch equals fgetc, passing in the name of a file pointer. Again, it's very
important here to remember that in order to have
this operation succeed, the file pointer itself must've
been opened for reading. We can't read a character from a file
pointer that we opened for writing. So that's one of the
limitations of the fopen operations we've seen so far, right? We have to restrict
ourselves to only performing one operation with one file pointer. If we wanted to read and
write from the same file, we would have to open two separate
file pointers to the same file, one for reading, one for writing. >> So again, the only reason
I bring that up now is because if we're going to make a call
to fgetc, that file pointer must've been opened for reading. And then pretty simply,
all we need to do is pass in the name of the file pointer. So char ch equals fgetc ptr1. >> That's going to get us
the next character-- or again, if this is the first
time we've made this call, the first character-- of whatever
file is pointed to by ptr1. Recall that that was file1.txt. It'll get the first character of that
and we'll store it in the variable ch. Pretty straightforward. So we've only looked at three
functions and already we can do something pretty neat. >> So if we take this ability
of getting a character and we loop it-- so we
continue to get characters from a file over and
over and over-- now we can read every single
character of a file. And if we print every character
immediately after we read it, we have now read from a file and
printed its contents to the screen. We've effectively concatenated
that file on the screen. And that's what the
Linux command cat does. >> If you type cat and the file name, it
will print out the entire contents of the file in your terminal window. And so this little loop here is
only three lines of code, but it effectively duplicates
the Linux command cat. So this syntax might
look a little weird, but here's what's happening here. While ch equals fgetc of ptr is not
equal to EOF, that is, while ((ch = fgetc(ptr)) != EOF). It's a whole mouthful, but let's break it down just
so we're clear on the syntax. I've consolidated it
for the sake of space, although it's a little
syntactically tricky. >> So this part in green right
now, what is it doing? Well, that's just our fgetc call, right? We've seen that before. It's obtaining one
character from the file. Then we compare that
character against EOF. EOF is a special value that's
defined in stdio.h, which marks the end of the file. So basically what's going to happen
is this loop will read a character, compare it to EOF, the
end of file value. If they don't match, so we haven't
reached the end of the file, we'll print that character out. Then we'll go back to the
beginning of the loop again. We'll get a character, check
against EOF, print it out, and so on and so on and so on,
looping through in that way until we've reached the end of the file. And then by that point,
we will have printed out the entire contents of the file. So again, we've only seen
fopen, fclose, and fgetc and already we can duplicate
a Linux terminal command. >> As I said at the beginning,
we had fgetc and fputc, and fputc was the companion
function of fgetc. And so, as you might imagine,
it is the writing equivalent. It allows us to write a
single character to a file. >> Again, the caveat being, just
like it was with fgetc, the file that we're writing to must've been
opened for writing or for appending. If we try and use fputc on a file
that we've opened for reading, we're going to suffer
a bit of a mistake. But the call is pretty simple. fputc('A', ptr2): all
that's going to do is write the letter
A into file2.txt, which was the name of the
file that we opened and assigned the pointer to ptr2. So we're going to write a
capital A to file2.txt. And we'll write an exclamation
point to file3.txt, which was pointed to by ptr3. So again, pretty straightforward here. >> But now we can do another thing. We have this example
we were just going over about being able to replicate the cat
Linux command, the one that prints out to the screen. Well, now that we have the ability
to read characters from files and write characters to files,
why don't we just substitute that call to printf with a call to fputc. >> And now we've duplicated cp,
a very basic Linux command that we talked about way long
ago in the Linux commands video. We've effectively
duplicated that right here. We're reading a character and then we're
writing that character to another file. Reading from one file, writing
to another, over and over and over again until we hit EOF. We've got to the end of the
file we're trying to copy from. And by that we'll have written all
of the characters we need to the file that we're writing to. So this is cp, the Linux copy command. >> At the very beginning of
this video, I had the caveat that we would talk a
little bit about pointers. Here is specifically where we're
going to talk about pointers in addition to file pointers. So this function looks kind of scary. It's got several parameters. There's a lot going on here. There's a lot of different
colors and texts. But really, it's just the
generic version of fgetc that allows us to get any
amount of information. It can be a bit inefficient if we're
getting characters one at a time, iterating through the file
one character at a time. Wouldn't it be nicer to get
100 at a time or 500 at a time? >> Well, fread and its companion function
fwrite, which we'll talk about in a second, allow us to do just that. We can read an arbitrary amount
of information from a file and we store it somewhere temporarily. Instead of being able to just
fit it in a single variable, we might need to store it in an array. And so, we pass in four
arguments to fread-- a pointer to the location where we're
going to store information, how large each unit of information
will be, how many units of information we want to acquire, and from
which file we want to get them. Probably best illustrated
with an example here. So let's say that we declare
an array of 10 integers. We've just declared on the
stack arbitrarily int arr[10]. So that's pretty straightforward. Now what we're doing with the
fread call is reading sizeof(int) times 10 bytes of information, sizeof(int) being four, which is
the size of an integer in C on most systems. >> So what we're doing is we're reading
40 bytes worth of information from the file pointed to by ptr. And we're storing those
40 bytes somewhere where we have set aside
40 bytes worth of memory. Fortunately, we've already done that by
declaring arr, that array right there. That is capable of holding
10 four-byte units. So in total, it can hold 40
bytes worth of information. And we are now reading 40 bytes
of information from the file, and we're storing it in arr. >> Recall from the video on pointers that
the name of an array, such as arr, is really just a pointer
to its first element. So when we pass in arr there, we
are, in fact, passing in a pointer. >> Similarly we can do this--
we don't necessarily need to save our buffer on the stack. We could also dynamically allocate
a buffer like this, using malloc. Remember, when we
dynamically allocate memory, we're saving it on the
heap, not the stack. But it's still a buffer. >> It still, in this case, is
holding 640 bytes of information because a double takes up eight bytes. And we're asking for 80 of them. We want to have space
to hold 80 doubles. So 80 times 8 is 640 bytes of information. And that call to fread is
collecting 640 bytes of information from the file pointed to by
ptr and storing it now in arr2. >> Now we can also treat fread
just like a call to fgetc. In this case, we're just trying to
get one character from the file. And we don't need an
array to hold a character. We can just store it in
a character variable. >> The catch, though, is that
when we just have a variable, we need to pass in the
address of that variable because recall that the
first argument to fread is a pointer to the location in memory
where we want to store the information. Again, the name of an
array is a pointer. So we don't need to do ampersand arr. But c, the character c
here, is not an array. It's just a variable. And so we need to pass in
ampersand c (&c) to indicate that that's the address where we want
to store this one byte of information, this one character that
we're collecting from ptr. >> fwrite, and I'll go through
this a little more quickly, is pretty much the
exact equivalent of fread except it's for writing
instead of reading, just like the other pairs. We've had open
and close, get a character, write a character. Now it's get an arbitrary
amount of information, write an arbitrary amount of information. So just like before, we can
have an array of 10 integers where we already have
information stored, perhaps. >> There would probably be some lines of code
that should go between these two where I fill arr with
something meaningful. I fill it with 10 different integers. And instead, what I'm
doing is writing from arr and collecting the information from arr. And I'm taking that information
and putting it into the file. >> So instead of it being from
the file to the buffer, we're now going from
the buffer to the file. So it's just the reverse. So again, just like before, we can
also have a heap chunk of memory that we've dynamically
allocated and read from that and write that to the file. >> And we can also have a single variable
capable of holding one byte of information, such as a character. But again, we need to pass in
the address of that variable when we want to read from it. So we can write the information
we find at that address to the file pointer, ptr. >> There are lots of other
great file I/O functions that do various things besides
the ones we've talked about today. A couple of the ones
you might find useful are fgets and fputs,
which are the equivalent of fgetc and fputc but for reading
and writing a single string. Instead of a single character,
fgets will read an entire string. fprintf basically allows
you to use printf to write to a file. So just like you can do the
variable substitution using the placeholders percent i and
percent d, and so on, with printf, you can similarly take the
printf string and print something like that to a file. >> fseek-- if you have a DVD player
is the analogy I usually use here-- is sort of like using your
rewind and fast forward buttons to move around the movie. Similarly, you can move around the file. One of the things inside
that FILE structure that C creates for you is an indicator
of where you are in the file. Are you at the very
beginning, at byte zero? Are you at byte 100,
byte 1,000, and so on? You can use fseek to arbitrarily move
that indicator forward or backward. >> And ftell, again
similar to a DVD player, is like a little clock that tells
you how many minutes and seconds you are into a particular movie. Similarly, ftell tells you how
many bytes you are into the file. feof is a different version
of detecting whether you've reached the end of the file. And ferror is a function
that you can use to detect whether something has
gone wrong working with a file. Again, this is just
scratching the surface. There are still plenty more file I/O
functions in stdio.h. But this will probably get you
started working with file pointers. I'm Doug Lloyd. This is CS50.
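
Editor's sketch for the fopen and fclose discussion: a minimal illustration of checking the pointer against NULL and of the difference between the w and a operations. The helper names put_char_with_mode and count_chars are hypothetical, invented here for illustration, and are not from the talk.

```c
#include <stdio.h>

// Hypothetical helper: opens filename with the given operation
// ("w" to overwrite, "a" to append) and writes one character.
// Returns 0 on success, 1 if anything went wrong.
int put_char_with_mode(const char *filename, const char *mode, char c)
{
    FILE *ptr = fopen(filename, mode);
    if (ptr == NULL)
    {
        // fopen failed, so ptr must not be passed to other file functions
        return 1;
    }

    int result = fputc(c, ptr);

    // fclose means we are done with this file
    if (fclose(ptr) == EOF || result == EOF)
    {
        return 1;
    }
    return 0;
}

// Hypothetical helper: counts the characters in a file.
// Returns -1 if the file could not be opened for reading.
long count_chars(const char *filename)
{
    FILE *ptr = fopen(filename, "r");
    if (ptr == NULL)
    {
        return -1;
    }

    long count = 0;
    while (fgetc(ptr) != EOF)
    {
        count++;
    }

    fclose(ptr);
    return count;
}
```

Writing 'A' with "w" and then 'B' with "a" should leave two characters in the file, while a further write with "w" starts the file over and leaves just one.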
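
Editor's sketch of the cat-style loop described in the talk, wrapped in a hypothetical function named cat_file (the name and the return value are assumptions for illustration). One deliberate difference from the spoken version: ch is declared as an int rather than a char, because fgetc returns an int and that is what lets EOF be distinguished reliably from real character values.

```c
#include <stdio.h>

// Hypothetical helper: prints every character of the named file to the
// screen, much like the cat command. Returns the number of characters
// printed, or -1 if the file could not be opened.
long cat_file(const char *filename)
{
    FILE *ptr = fopen(filename, "r");
    if (ptr == NULL)
    {
        return -1;
    }

    // int, not char, so that EOF cannot be confused with a real character
    int ch;
    long printed = 0;

    // Read one character, compare it against EOF, print it, repeat
    while ((ch = fgetc(ptr)) != EOF)
    {
        printf("%c", ch);
        printed++;
    }

    fclose(ptr);
    return printed;
}
```

A call such as cat_file("file1.txt") would print that file's contents, assuming the file exists in the current directory.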
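
Editor's sketch of the cp-style loop from the talk, where the call to printf is swapped for a call to fputc. The function name copy_file and its return convention are hypothetical, chosen here only for illustration; the real cp command does considerably more.

```c
#include <stdio.h>

// Hypothetical helper: copies source to destination one character at a
// time. Returns 0 on success, 1 if either file could not be opened or a
// write failed.
int copy_file(const char *source, const char *destination)
{
    // Open the source first so a missing source never truncates the destination
    FILE *in = fopen(source, "r");
    if (in == NULL)
    {
        return 1;
    }

    FILE *out = fopen(destination, "w");
    if (out == NULL)
    {
        fclose(in);
        return 1;
    }

    int status = 0;
    int ch;

    // Read from one file, write to the other, until we hit EOF
    while ((ch = fgetc(in)) != EOF)
    {
        if (fputc(ch, out) == EOF)
        {
            status = 1;
            break;
        }
    }

    fclose(in);
    if (fclose(out) == EOF)
    {
        status = 1;
    }
    return status;
}
```

This copies text files faithfully; for arbitrary binary data, opening with "rb" and "wb" would be the safer assumption on systems that distinguish text and binary modes.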
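
Editor's sketch of the fread and fwrite calls discussed in the talk: writing from a stack array, reading into a malloc'd buffer on the heap, and reading a single byte into a plain variable by passing its address. The function names (write_ints, read_ints, read_first_byte) are hypothetical, and the "wb" and "rb" operations are an assumption added here because the data is raw bytes rather than text.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: fills a stack array with 10 integers and writes
// them to the file. Returns how many units fwrite reports as written.
size_t write_ints(const char *filename)
{
    int arr[10];
    for (int i = 0; i < 10; i++)
    {
        arr[i] = i * i;
    }

    FILE *ptr = fopen(filename, "wb");
    if (ptr == NULL)
    {
        return 0;
    }

    // From the buffer to the file: 10 units of sizeof(int) bytes each
    size_t written = fwrite(arr, sizeof(int), 10, ptr);
    fclose(ptr);
    return written;
}

// Hypothetical helper: reads up to count integers into a heap buffer.
// The caller must free the returned pointer. The number of integers
// actually read is stored in *read_count.
int *read_ints(const char *filename, size_t count, size_t *read_count)
{
    *read_count = 0;

    FILE *ptr = fopen(filename, "rb");
    if (ptr == NULL)
    {
        return NULL;
    }

    int *buffer = malloc(sizeof(int) * count);
    if (buffer == NULL)
    {
        fclose(ptr);
        return NULL;
    }

    // From the file to the buffer; buffer is already a pointer, so no &
    *read_count = fread(buffer, sizeof(int), count, ptr);
    fclose(ptr);
    return buffer;
}

// Hypothetical helper: uses fread like fgetc to fetch one byte.
// Returns the byte's value, or -1 if nothing could be read.
int read_first_byte(const char *filename)
{
    FILE *ptr = fopen(filename, "rb");
    if (ptr == NULL)
    {
        return -1;
    }

    char c;
    // c is a plain variable, not an array, so we pass its address
    size_t got = fread(&c, sizeof(char), 1, ptr);
    fclose(ptr);

    return (got == 1) ? (unsigned char) c : -1;
}
```

Checking the count that fread and fwrite return is a design choice made in this sketch; it is how a short read, such as a file with fewer than 10 integers, would show up.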
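
Editor's sketch touching the other functions mentioned near the end of the talk: fprintf, fgets, fseek, ftell, feof, and ferror. It borrows the game-moves idea from earlier in the talk, but the helper names (log_move, read_first_line, file_size), the "1: e4" line format, and the return codes are all hypothetical. Measuring a file by seeking to its end works on typical systems, though the C standard does not strictly guarantee it.

```c
#include <stdio.h>

// Hypothetical helper: appends one formatted line, such as "1: e4",
// to a log file using fprintf. Returns 0 on success, 1 on failure.
int log_move(const char *filename, int move_number, const char *move)
{
    FILE *ptr = fopen(filename, "a");
    if (ptr == NULL)
    {
        return 1;
    }

    // Same placeholders as printf, but the output goes to the file
    int written = fprintf(ptr, "%i: %s\n", move_number, move);
    int closed = fclose(ptr);
    return (written < 0 || closed == EOF) ? 1 : 0;
}

// Hypothetical helper: reads the first line of a file with fgets.
// Returns 0 on success, 1 if the file was missing or empty, and
// 2 if a read error occurred.
int read_first_line(const char *filename, char *line, int size)
{
    FILE *ptr = fopen(filename, "r");
    if (ptr == NULL)
    {
        return 1;
    }

    int status = 0;
    if (fgets(line, size, ptr) == NULL)
    {
        // feof(ptr) would be nonzero if we simply ran out of file;
        // ferror(ptr) is nonzero if something went wrong while reading
        status = ferror(ptr) ? 2 : 1;
    }

    fclose(ptr);
    return status;
}

// Hypothetical helper: reports how many bytes are in a file by moving
// the position indicator to the end with fseek and asking ftell where
// it is. Returns -1 on failure.
long file_size(const char *filename)
{
    FILE *ptr = fopen(filename, "rb");
    if (ptr == NULL)
    {
        return -1;
    }

    long size = -1;
    if (fseek(ptr, 0, SEEK_END) == 0)
    {
        size = ftell(ptr);
    }

    fclose(ptr);
    return size;
}
```

Calling fseek(ptr, 0, SEEK_SET) afterward would move the indicator back to byte zero, the "rewind" half of the DVD analogy.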