How to write your own code libraries in C.

Captions
Hey everybody! Our topic for today is libraries. If you're a programmer, you use software libraries all the time, but you may not think about it, and many of you have probably never made your own. Today we're gonna change that and write some of our own libraries in C.
A library is a collection of pieces of software that you bundle together so you can distribute them. Maybe you put them in a collection so that you can reuse them in different programs, or maybe you have a favorite data structure (hash table, linked list, queue, whatever) and you want to be able to use it all over the place or give it to your friends. That's a good candidate for a library. The library that you use all the time but probably don't think about is libc, otherwise known as the C standard library. libc is home to malloc, calloc, realloc, free, printf, and all the other favorite functions that you call all the time but didn't write and never really thought about where they came from. They're mostly all in libc.
But today we're interested in making our own libraries, so let's make a library in C for Linux. Let's start with a header file. This is the header file that programmers will include when they want to use your library. Let's add the usual boilerplate stuff, and let's add a declaration for a function that I'm going to put in my library. We're also going to need a .c file; that's where I'm going to put the function's code.
So, the function of the day is reverse. It takes a string and reverses it in place. It doesn't copy the string; it's destructive. It just reverses the order of the bytes: it swaps the last byte with the first byte, and so on toward the middle. It also returns a pointer to the string, and that's really just for convenience. So, this is the function I'm going to play around with today. I could really use any function, though, and I could have more than one function in this library.
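As a rough sketch of what the header and the .c file described above might contain: the file names mycode.h and mycode.c, the include-guard name, and the parameter name are assumptions (the captions only mention the library name libmycode and the function reverse), and the two files are shown in one block so it compiles on its own.

```c
/* ---- mycode.h (assumed file name): what users of the library include ---- */
#ifndef MYCODE_H
#define MYCODE_H

/* Reverses the bytes of str in place and returns str for convenience. */
char *reverse(char *str);

#endif /* MYCODE_H */

/* ---- mycode.c (assumed file name): would begin with #include "mycode.h" ---- */
#include <assert.h>
#include <string.h>

char *reverse(char *str)
{
    /* Assumption: passing NULL is treated as a caller bug. */
    assert(str != NULL);

    size_t len = strlen(str);

    /* Swap the first byte with the last, the second with the
     * second-to-last, and so on until the middle is reached. */
    for (size_t i = 0; i < len / 2; i++) {
        char tmp = str[i];
        str[i] = str[len - 1 - i];
        str[len - 1 - i] = tmp;
    }
    return str;
}
```

Because the reversal happens in place, the caller has to pass a writable buffer (for example `char word[] = "hello";`), not a string literal.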
I'm also going to make another file with a test program for my library. It calls this reverse function so that we can see whether or not it actually works. The test program just prints out its first argument and then the reverse of that argument. So, it takes the argument I pass to it, reverses it, and prints it both ways.
I've also made a little Makefile to compile my code, and first off I'm compiling my library code into a .o file. Now, you've probably seen .o files before. We usually think of a .o file as an intermediate step in compilation before you get your final binary (we usually link together a bunch of .o files), but you can really think of a .o file as a simple library, for lack of a better term. I could take that .o file, copy it into another directory or another project, or send it to a friend of mine, and they could use it in their projects. So, let's do that. Let's link our .o file with our test program, and it works.
OK, now where do we go from here? Well, for one, I'm going to add a clean target to my Makefile so that I can clear out past builds. That's just for convenience. Then I'm going to add another rule to compile my library another way, as a .so file, aka a shared object or dynamically linked library. If you're on Windows and you see a .dll file, that's what we're talking about. Now, shared objects, or shared libraries, are a little different. They still hold code, but while the linker actually put that .o file into my final compiled binary, a .so file is separate. It's designed to be separate, and it's designed to be loaded at runtime.
When we build our .so file, we need a few options you may not have seen before. First, -fPIC just means the compiler is going to generate position-independent code. That's code that can be placed anywhere in memory and still run correctly.
That matters because the library gets loaded into memory at runtime, and we don't know ahead of time where in memory it's going to be put. The other option is -shared. All that means is I want a shared library, and we've already talked about what that means. OK, and then I'm going to add my new shared library to the default "all:" rule, and we can compile it.
Now, I want to use my new shared library, so let's make a new program. It's actually just my old program, but I'm going to compile it differently. The first difference is that I'm not going to pass my .o file to my compiler. Instead, I'm going to add a -L option telling the compiler to look in the current directory for libraries, and then I'm going to add a -l (lowercase L) option to tell it that I want to link the program with libmycode. Now, this might be a good time to mention that -lmycode is shorthand for libmycode: my compiler assumes that all library names begin with the letters "lib". So, libc would just be -lc, and libmycode is just -lmycode. This just tells it I want to link this program with this library, and once we specify that linkage, the compiler can figure out the rest.
OK, so compile that. Good. And then I try to run it. Not so good. The problem is that the program loader is looking for libraries and it can't find our new library, so we're going to have to help it. We can tell the loader where to find our new library by adding its directory to the LD_LIBRARY_PATH environment variable. This variable tells the loader where to look for libraries. So, I'm just going to add my directory to the front, and then I can run my program, and it works.
But what a pain! I don't want to have to type that in every time I run my program. So, the other option is to install my library in one of the directories that the program loader automatically searches for libraries at runtime, like /usr/lib, for example.
If I put our new library in one of these directories, then I won't need all that LD_LIBRARY_PATH business. I can just run my program and it will find it.
OK, but this still seems like a bit of a hassle. Why would I want to use a shared library? The reason is code size. If I use objdump to look at the symbol table, you can see that the first program assigns an address to my reverse function, but in the second one, the one that uses the shared library, the address is all zeros and the section is undefined. That's because the address is going to be assigned when the program runs. And if we look at the two programs, you'll notice that the one that uses the shared library is smaller. Now, in this example it's not a huge difference. It's only about 600 bytes, and that's because the amount of code in the library is really small. But when you're dealing with large libraries and large code bases, it can make a big difference and save you a lot of space.
So, think of it this way. On the machine I'm currently using, libc takes up about 2 megabytes of space. Now, 2 megabytes is not that big of a deal, but keep in mind that every program on this machine links to libc. So, if I don't use a shared library, every program on my machine could be up to 2 megabytes larger, and for every one of those programs that could be up to 2 megabytes more that I'd have to load into memory every time I run it. That could really add up.
The other advantage of using a shared library is maintenance. Let's say we find a bug in libc. We can patch that bug by just installing a new version of libc on the machine, and I don't have to patch every program on the machine that uses libc. So, that's a huge advantage.
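The test program and the object-file and shared-library builds described above might look roughly like the sketch below. The file and output names (mycode.h, mycode.c, test.c, test_obj, test_shared) are assumptions, as is the use of gcc; only libmycode, reverse, and the flags -fPIC, -shared, -L, and -l come from the captions. The driver is named run_test instead of main, and a stand-in copy of reverse is included, purely so the block compiles by itself.

```c
/*
 * Sketch of the test program. In the real layout, test.c would hold only
 * the driver (as main) plus #include "mycode.h", and reverse would live
 * in mycode.c. Possible commands, assuming gcc on Linux:
 *
 *   gcc -c mycode.c -o mycode.o                  # library as an object file
 *   gcc test.c mycode.o -o test_obj              # link the .o straight in
 *
 *   gcc -fPIC -shared mycode.c -o libmycode.so   # shared library
 *   gcc test.c -L. -lmycode -o test_shared       # look in ".", link libmycode
 *   LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./test_shared hello
 *
 *   objdump -t test_obj | grep reverse           # defined, has an address
 *   objdump -t test_shared | grep reverse        # undefined until load time
 */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Normally provided by the library's header. */
char *reverse(char *str);

/* Driver: prints argv[1], then its reverse. This would be main() in test.c;
 * it is called run_test here only so the sketch can be exercised directly. */
int run_test(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <string>\n", argv[0]);
        return 1;
    }
    printf("%s\n", argv[1]);          /* the original argument */
    printf("%s\n", reverse(argv[1])); /* the same buffer, reversed in place */
    return 0;
}

/* Stand-in for the library code so this block links on its own. */
char *reverse(char *str)
{
    assert(str != NULL); /* assumption: NULL is a caller bug */

    size_t len = strlen(str);
    for (size_t i = 0; i < len / 2; i++) {
        char tmp = str[i];
        str[i] = str[len - 1 - i];
        str[len - 1 - i] = tmp;
    }
    return str;
}
```

Run with the argument hello, the program prints hello and then olleh. The original line has to be printed before reverse is called, since the reversal overwrites the argument.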
But, all those advantages aside, let's say you still don't want to go the shared route. You really want the code from your library to be part of the binary, so that you don't need to worry about whether the shared library is there or whether it's installed properly. Then what you want is a static library. As I mentioned before, a .o file can kind of be thought of as a static library, but usually when we're packaging up code that's going to be linked statically, the more typical approach is to use a .a file. A .a file is made with the ar command (that stands for "archive").
So, let's add one more rule to our Makefile, and this one is going to compile our code into a static library. Now, I'm going to give it a different name so we don't confuse the linker. If I didn't have the shared one in here, we could just name it libmycode.a, but we do have the shared library in here with the same name (different extension), so I'm going to use a different name.
Then we can just use the ar command to make this .a file with the following options. "r" means replace: it's going to replace any files already in the archive that have the same name. "c" means create: we're going to create the archive if it doesn't already exist. And "s" means we're gonna generate an index that the linker uses to make sense of this library. Why is "s" for index? I have no idea. In this example I'm giving it one .o file, but if I had a bunch of .o files, I could just list them at the end and they would all be bundled up in this new static library.
So, let's compile it, and there it is: our beautiful new static library. Let's also add a rule to our Makefile that compiles our program with the new static library. It's basically the same as it was with our shared library.
The linker just looks at what kind of library you're using. If it's a static library, it stuffs all that code into the final binary, and if it's a shared library, it doesn't. OK. So, let's add our new static library to the list of things we want to make, and compile, and there it is. Notice again that the static version is bigger and the dynamic version is smaller, but the bigger static version doesn't need the library anymore. All the code is inside of it. So, I could just throw the library away at this point, and the static binary is still going to work just fine. And if we run it... OK, it works.
And now you know how to write static and shared libraries in C for Linux. The process on macOS and Windows is going to be a little bit different. You're going to have different compilers and different compiler flags, and the extensions are going to be different (you're going to have .dll or .dylib), but the concepts are the same. All of these libraries are just different ways to accomplish fundamentally the same thing, which is to help you package up code so that you can reuse it and share it.
I hope that helps, because that's all I got for you today. Tune in next time for my next video when I... well, I don't know what it's going to be about, but I'm sure it will change your life. So, happy coding, and I'll see you later.
Info
Channel: Jacob Sorber
Views: 135,778
Id: JbHmin2Wtmc
Length: 9min 56sec (596 seconds)
Published: Mon Feb 11 2019