Hey everybody! Our topic for today is
libraries. If you are a programmer you
use software libraries all the time, but
you may not think about it. And, many of
you probably have never made your own.
Today we're gonna change that and
write some of our own libraries in C. A
library is a collection of pieces of
software that you bunch together and
want to distribute. Either you put
them together in a collection so that
you can reuse them in different programs, or
maybe you have a favorite data structure (hash table, linked list, queue,
whatever) and you want to be able to use
it all over the place, or give
it to your friends. Then maybe that's a
good candidate for something you might
want to put in a library. The library
that you use all the time but you
probably don't think about is LibC,
otherwise known as the C standard library.
LibC is home to malloc, calloc,
realloc, free, printf, and all the other
favorite functions that you call all the
time but you didn't write and you didn't
really think about where they came from.
They're mostly all in LibC.
But, today we're interested in making our
own libraries so let's make a library in
C for Linux. So let's start with a header
file. This is the header file that
programmers will include when they want
to use your library. Let's add the
usual boilerplate stuff, and let's add a
function that I'm going to put in my
library, and we're going to need a .c
file. Okay, that's where I'm going to put
the function's code. Okay. So, the function
of the day is reverse. It's going to take a
string and reverse it in place. It
doesn't copy the string. It's destructive.
So, it just reverses the bytes in
order: it takes the last byte, swaps it
with the first byte, and keeps working
inward. It also returns a pointer
to the string, and that's really just for
convenience. So, this is the function I'm
going to play around with today. I could
really use any function though, and I
could have more than one function in
this library. I'm also going to make
another test file that's going to test
my library. It's going to call this reverse
function, so that we can see whether or
not it actually works. Okay. And, this test
program is just going to print out the
first argument and the reverse of the
first argument. So, it just takes that
argument I pass to the test program and
it reverses it, and prints it both ways.
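As a rough sketch of the pieces described so far, the header, the library source, and the test logic might look something like the listing below. The file names mycode.h and mycode.c, the include guard, the NULL check, and the print_both helper are assumptions for illustration. The helper stands in for the body of the test program's main, which would pass it argv[1]. Everything is shown in one listing only so that it compiles on its own.

```c
#include <stdio.h>
#include <string.h>

/* ---- mycode.h (assumed name): what users of the library include ---- */
#ifndef MYCODE_H
#define MYCODE_H

/* Reverses str in place and returns str for convenience. */
char *reverse(char *str);

#endif /* MYCODE_H */

/* ---- mycode.c (assumed name): the library code ---- */
char *reverse(char *str)
{
    if (str == NULL)
        return NULL;            /* nothing to reverse */

    size_t len = strlen(str);
    for (size_t i = 0; i < len / 2; i++) {
        /* swap the i-th byte from the front with the i-th from the back */
        char tmp = str[i];
        str[i] = str[len - 1 - i];
        str[len - 1 - i] = tmp;
    }
    return str;
}

/* ---- test logic (hypothetical helper; a real test program would
        call this from main with argv[1]) ---- */
void print_both(char *arg)
{
    printf("%s\n", arg);            /* print first: reverse() overwrites arg */
    printf("%s\n", reverse(arg));
}
```

Because reverse works in place, the original has to be printed before the call. Afterwards the same buffer holds the reversed text.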
I've also made a little Makefile to
compile my code, and—first off—I'm
compiling my library code into a .o
file. Now you've probably seen .o files
before. We usually think of a .o
file as an intermediate step in
compilation before you get your final
binary—we usually link together a bunch
of .o files, but you can really think of a
.o file as a simple...library...for lack
of a better term. I could take that .o
file, copy it into another directory, into
another project, or I could send it to a
friend of mine, and they could use it in
their projects. So, let's do that. Let's
link our .o file with our test
program, and it works. Ok, now where do we
go from here? Well, for one I'm going to
add a clean target to my Makefile, so
that I can clear out past compiles.
That's just for convenience, and then I'm
going to add another rule to compile my
library another way, as a .so file—aka
shared object or a dynamically linked
library. If you're on Windows and see a
.dll file, that's what we're talking about. Now
shared objects or shared libraries are a
little different.
They still hold code. But, while the
linker actually put that .o file into
my final compiled binary, a .so file
is separate. It's designed to be separate,
and it's designed to be loaded at
runtime. And, when we build our .so file
we need a few options you may not have
seen before.
First, -fPIC just means we're going to
generate, or the compiler is going to
generate, position independent code.
That's code that can be placed anywhere
in memory and still run correctly. That's important
because the library gets loaded into
memory at runtime, and we don't
know ahead of time where it's going to be put in
memory. The other option is "-shared".
All that means is I want a shared
library, and we've already talked about
what that means. OK, and then I'm going to
add my new shared library to the default
"all:" rule and we can compile it. OK. Now, I
want to use my new shared library. So,
let's make a new program. It's actually
just my old program but I'm going to
compile it differently. The first
difference is that I'm not going to pass my
.o file to my compiler. Instead, I'm
going to add a -L option telling the
compiler to look in the current
directory for libraries, and then I'm
going to add a -l (little L) option to tell it
that I want to link the program with libmycode.
Now, this might be a good time
to mention that this -lmycode is
shorthand for (-)libmycode. My compiler
is assuming that all libraries are
beginning with the letters "lib". Ok. So, libC would just be -lc, and
libmycode is just -lmycode. This is just
telling it I want to link this program
with this library, and once we specify
that that linkage is supposed to happen
then the compiler can figure out the
rest.
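To make the naming shorthand concrete, here is a small illustrative helper. The name lib_filename is hypothetical and not part of any toolchain. It builds the file name that the -l option stands for: the prefix "lib", the name given after -l, and an extension such as ".so" or ".a".

```c
#include <stdio.h>

/*
 * Illustration only: writes "lib" + name + ext into out.
 * Returns 0 on success, or -1 if the result does not fit in out_size bytes.
 */
int lib_filename(char *out, size_t out_size, const char *name, const char *ext)
{
    int n = snprintf(out, out_size, "lib%s%s", name, ext);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Called with "mycode" and ".so" it fills the buffer with libmycode.so, and with "c" and ".so" it gives libc.so.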
OK, so compile that. Good. OK. And, then
I try to run it—not so good.
The problem is the program loader is
looking for libraries and it can't find
our new library. So, we're going to have to
help it. We can tell the loader where to
find our new library by adding it to the
LD_LIBRARY_PATH environment variable. Now
this variable tells the loader where to
look for libraries. So, I'm just going to add
my directory to the front, and then I can
run my program, and it works. But, what a
pain!?! I don't want to have to type that in
every time I run my program. So, the other
option is, I can install my library to
one of the directories that the program
loader automatically searches for
libraries at runtime, like /usr/lib, for
example. If I put our new library in one
of these directories then I won't need
all that LD_LIBRARY_PATH business. I can just run my program and it will
find it. OK, but this still seems
like a little bit of a hassle. Why would
I want to use a shared library? The
reason is code size. If I use object dump (objdump)
to look at the symbol table, you can
see that the first program assigns an
address to my reverse function, but with
the second one—the one that uses the
shared library—the address is all zeros
and the section is undefined. That's
because it's going to be assigned when
the program runs. And, if we look at the
two different programs, you'll notice
that the one that uses the shared
library is smaller. Now in this example
it's not a huge difference. It's only
about 600 bytes, and that's because the
amount of code in the library is really
small, but when you're dealing with large
libraries and large code bases with a lot of code,
it can make a big difference and
save you a lot of space. So, think of it
this way. On the machine I'm currently
using, LibC takes up about 2 megabytes
of space. Now, two megabytes is not that
big of a deal, but keep in mind that
every program on this machine is linking
to LibC. So, if I don't use a shared
library that means that every program on
my machine is going to be 2 megabytes
larger, and it also means that every
one of those programs could need up to
two megabytes more that I would have to
load into memory every time I run a
program. So, that could really add up. The
other advantage of using a shared
library is, let's say that we find a bug
in LibC. We can patch that bug by just
installing a new version of LibC on the
machine, and I don't have to patch every
program on the machine that uses LibC.
So, that's a huge advantage in terms of
maintenance. But, all those advantages
aside, let's say you still don't want to
go the shared route, and you really want
that code from your library to be
part of the binary, so you don't need
to worry about whether the shared
library is there or whether it's installed
properly. Then what you want is a static
library, and as I mentioned before, .o
files can kind of be thought of as a
static library, but usually when we talk
about static libraries—when we're
packaging up static code that's going to
be linked statically—the more typical
approach is to use a .a file. A .a file
is made with the "ar" command (that
stands for "archive"). So, let's add one
more option to our Makefile, and this is
going to compile our code into a static
library. Now, I'm going to give it a
different name so we don't confuse the
linker. If I didn't have the shared one
in here we could just name it "libmycode.a",
but we do have the shared library
in here with the same name (different extension). So, I'm going to
use a different name. And, then we can
just use the ar command to make this .a
file using the following options: "r"
means replace, so it's going to
replace any files already in
the archive that have the same name. "c" means
create and that means we're going to
create the archive if it doesn't already
exist. And, "s" means we're gonna generate
an index that's going to be used by the
linker to make sense of this library.
Why "s" is for index? I have no idea. In
this example, I'm giving it one .o
file, but if I had a bunch of .o files
I could just list them at the end and
then they would all be bundled up in
this new static library. So, let's compile
it, and there it is—our beautiful new
static library. Let's also add a rule
to our Makefile that compiles our
program with the new static library. It's
basically the same as it was with our
shared library. The linker just looks for
what kind of library you're using and
then if it's a static library it stuffs
all that code into the final binary, and
if it's a shared library then it won't.
OK. So, let's add our new static library
to the list of things we want to make...
and compile...and
there it is. Notice again that the static
version is bigger. The dynamic
version is smaller, but the bigger static
version doesn't need the library anymore.
All the code is inside of it. So, I could
just throw the library away, at this
point, and the static binary is still
going to work just fine. And, if we run it...
oops sorry...if we run it. OK. It works.
And, now you know how to write static
and shared libraries in C for Linux. The
process in macOS and Windows is going to
be a little bit different. You're going to
have some different compilers, different
compiler flags, the extensions are going to be
different. You're going to have .dll or .dylib,
but the idea is the same. The
concepts are the same. Really, what you're
doing here is the same. All of these
libraries are just different ways to
fundamentally accomplish the same thing—
which is help you to package up code so
that you can reuse it, and you can share
it. And, I hope that helps, because that's
all I got for you today. Tune in next
time for my next video when I...well I
don't know what it's going to be about, but
I'm sure it will change your life.
So, happy coding, and I'll see you later.
