Hello, my name is Marshall Kirk McKusick, and I've been around as long as dinosaurs and mainframes have ruled the world, which is to say the sixties and seventies. However, by the 1970s a new breed of mammals had begun to show up on the scene, known as minicomputers. Although they were just toys in the 1970s, they would soon grow and take over most of the computing market. In 1970 at AT&T Bell Laboratories, two researchers, Ken Thompson and Dennis Ritchie, began developing the UNIX operating system. Ken Thompson, who was an alumnus of Berkeley, came back on a sabbatical in 1975, bringing UNIX
with him. In the year that he was there, he managed to get a number of graduate students interested
in UNIX, and by the time he left in 1976, Bill Joy had taken over running the UNIX system and was, in fact, continuing to develop software for it. Bill began packaging up the software that had been developed under Berkeley UNIX and distributing it as the Berkeley Software Distribution, whose name was quickly shortened to simply BSD. BSD continued to be distributed with yearly distributions for almost fifteen years, initially under Bill Joy and later under others, including
yours truly. By the late 1980s, interest had begun to grow in freely redistributable software, so a number of us at Berkeley began separating out the AT&T-proprietary bits of BSD from those parts that were freely redistributable. By the time of the final distribution of BSD in 1992, the entire distribution was freely redistributable. I've given a capsule history here, but if you're interested in the entire story, I have a three-and-a-half-hour epic, available from my website www.mckusick.com, that gives the entire history of Berkeley UNIX. Following the final distribution from Berkeley, two groups sprang up to continue supporting BSD. The first of these was NetBSD, whose primary
goal was to support as many different architectures as possible, everything from your microwave oven all the way up to your Cray X/MP. In fact, today NetBSD supports nearly
sixty architectures. The other group that sprang up was FreeBSD. Their goal was to bring up BSD and support
as wide a set of devices as possible on the PC architecture. They also had a goal of trying to make the system as easy to install as possible, to attract a wide group of developers. I chose to work primarily with the FreeBSD group, both doing software and also, together with George Neville-Neil, writing this book, "The Design and Implementation of the FreeBSD Operating System". Together with this book I developed a course, which runs for twelve lectures and thirty hours. The purpose of this video is to give you a taste of that course. What follows are excerpts from the first lecture
of the course which of course you can also get from my website
www.mckusick.com. Enjoy. This class is nominally about FreeBSD
because well that's what I know best and that's what
the textbook is organized around, but the fact of the matter is that it's really a class about UNIX, and that covers the broad range of things in the open-source arena, such as FreeBSD and Linux, which of course you use a lot, and it also covers the commercial systems: Solaris, HP-UX, AIX, and so on. I am going to tend more towards the open-source side of things, so it's really going to be more FreeBSD and Linux than it's going to be Solaris and HP-UX and so on. For the most part, at the level of this course,
we're dealing with the interfaces to the system, and the fact of the matter is that those interfaces are highly standardized at this point. Whether it's FreeBSD or Linux or Solaris or whatever, the socket system call has to do the same
thing: it has to take the same arguments, it has to have the same effect. And so until you get down to the really nitty-gritty details of how they actually go about implementing that, the differences are relatively minor. So I would say that sixty to seventy percent
of the material that I'm covering is just as true for FreeBSD as it would
be for Linux or for Solaris. AIX is a little bit off in the weeds, as is HP-UX, but luckily we don't have to worry too much about that.
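To make that concrete, here is a minimal sketch (an illustrative example, not from the lecture) of the standardized socket call; the same three arguments and the same semantics apply whether you compile it on FreeBSD, Linux, or Solaris:

```c
#include <sys/socket.h>         /* socket(), AF_INET, SOCK_STREAM */
#include <stdio.h>              /* perror() */

int
main(void)
{
        /* Domain, type, protocol: identical on any POSIX system. */
        int s = socket(AF_INET, SOCK_STREAM, 0);

        if (s == -1) {
                perror("socket");
                return (1);
        }
        /* connect(), read(), and write() would follow here. */
        return (0);
}
```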
Okay, so the other thing is that I'm going to assume that all of you have used the system. I get really worried when people raise their hands and ask, "Hey, what's a shell?" I don't put a lot of code up, but I once put up one piece of code and someone asked, "Why are there two pipe symbols in the middle of that if statement?" No, we're not programming the shell, we're programming in C, so hopefully you can tell the difference between shell scripts and C code.
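For the record, here is a tiny illustrative function (mine, not the lecture's) showing what those two pipe symbols mean in C, as opposed to the shell:

```c
#include <unistd.h>             /* read(), ssize_t */

/*
 * In C, "||" is logical OR: the second test runs only if the
 * first is false.  In the shell, by contrast, a single "|"
 * pipes one program's output into the next, as in "who | wc -l".
 */
ssize_t
safe_read(int fd, void *buf, size_t nbytes)
{
        if (fd < 0 || nbytes == 0)
                return (-1);
        return (read(fd, buf, nbytes));
}
```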
Okay, but I am going to assume you haven't really looked inside the system, so I'm going to start everything at a very high level. The problem is, as I have already discovered, you come from a lot of different backgrounds and levels of knowledge, and the way that I find works best to be useful to everybody is a three-pass algorithm. So what I will do is make the first pass a very
broad-brush, high-level description of what's going on; then I will go back and go through the same material again, but at a lower level of detail; then I'll finally go back and go through it at a very niggly low level of detail. The fact of this is, if you are learning new stuff as I'm doing the high-level pass, you are going to be utterly washed out by the time I get to the low-level niggly details. But since I'm going to do it topic by topic, when I get to the end of one of those low-level niggly details I'll give you a clue: I will say, "Brain reset, I'm starting a new topic," so even if you're completely lost you can now start listening again, because I'm going to get the broad brush out again. Okay, and for those of you that know a lot of this stuff already, you'll probably find the broad brush rather boring, but by the time we get down to the niggly low-level details I think you'll actually pick up some things that you will find useful and interesting. So in this way hopefully everybody will
get some useful percentage of material out of the course. I'm going to start out by just walking through and giving you the outline of what we're going to try and do here. As I said, we're going to go roughly two-and-a-half hours of lecture, about two hours forty minutes, per week, and so we will start off this week with an introduction. As I said, we're going to start from the
top and then just start working our way down. So the general thing I'm going to do is to talk about the interface, which is something that you are presumably fairly familiar with, since you've worked with the system, and then to lay out terminology: although we use normal English words, they sometimes have rather bizarre meanings compared to their common usage, so I will lay out the terminology, lay out the way we talk about how the system is structured. This week we will also talk about the basic services: what is it that the kernel is providing for us? And then, of course, we'll proceed to dive down in and see how
that is done. So here in Week 2 we're going to look at the system from the perspective of something that manages processes. One way of looking at the kernel is that it's really just a resource manager, and the resources it's managing are things to do with processes. So we'll look at a process, what its structure is, and talk about the different ways that processes can be structured. A process, for example, is an address space, and it can have one thread running in it or multiple threads running in it, so we'll talk about the different ways that we think of a process. We will look at the management of those processes: we've got to lay out the bits and pieces that
need to be managed and then talk about how we do that. We'll talk about jails; this is something that you currently find only in FreeBSD. It hasn't made it into Linux yet, although the concept is being actively worked on, so my guess is that you'll see it there fairly soon. We'll also then talk about scheduling, which is in essence how we decide what gets
to run, when it gets to run, how long it gets to run, etc. Okay, the week after that we will go into virtual memory. Signals aren't really part of virtual memory, but they didn't fit into next week's material, so I just dropped them in at the beginning; the bulk of Week 3 is going to be the management of virtual memory. So we've got a bunch of physical memory and a bunch of processes that are trying to use their address spaces, and we will talk about essentially how you make that all work. It's called virtual memory because it's sort of a cheat: we promise you the world, and then we deliver as small a number of pages as we think we can get away with. Okay. So the first three weeks essentially
get us through looking at the world as if it were all about processes. Then in Week 4 we change gears. We say, okay, well, the kernel isn't just all about processes. You can look at it orthogonally and say it's really just a giant I/O switch; it's like a traffic cop that's just managing these I/O streams. So let's look at it from that perspective. We'll start with special files; again, this is
sort of the interface when you talk about UNIX systems, what's normally the /dev interface that gets you access to the various I/O streams that are available, and we'll look at how that's organized and the structure of it, which used to be fairly simple but in the last decade has gotten incredibly complicated. We will also talk about pseudo-terminals and
job control. This is about as interesting as watching the grass grow, but unfortunately it's a major component of the system, and especially people that deal with system administration have to know far more about it than they probably ever thought they wanted to. Okay, we will then continue in Week 5 with
the kernel I/O structure. We will start with multiplexing of I/O. The kernel, of course, has always done this, but we're really talking more about how we export I/O multiplexing to user applications.
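As a taste of what that exported interface looks like, here is a minimal sketch (an illustration, not from the lecture) using poll(); the descriptor netfd is a hypothetical network connection:

```c
#include <poll.h>               /* poll(), struct pollfd, POLLIN */
#include <stdio.h>              /* perror() */
#include <unistd.h>             /* STDIN_FILENO */

/* Block until either standard input or netfd has data to read. */
int
wait_for_input(int netfd)
{
        struct pollfd fds[2];

        fds[0].fd = STDIN_FILENO;
        fds[0].events = POLLIN;
        fds[1].fd = netfd;
        fds[1].events = POLLIN;

        if (poll(fds, 2, -1) == -1) {   /* -1: wait indefinitely */
                perror("poll");
                return (-1);
        }
        return ((fds[0].revents & POLLIN) ? 0 : 1);
}
```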
We will then move into autoconfiguration strategy. Autoconfiguration is what happens, typically, or historically I guess you could say, as the system boots: all that stuff that comes out about what hardware is on the machine and how it's all interconnected, all of that is tied up in autoconfiguration. That used to happen just once, as the system boots, but in modern systems today it's an ongoing process. It happens at boot, but it also happens any time you plug in a new I/O device or a PCMCIA card, or you remove a disk or put in a new disk, or any sort of activity that changes the I/O structure of the machine; autoconfiguration has to get fired back up, figure out what's disappeared and clean up, and figure out what new has arrived
to configure it in. Then we'll talk a little bit about the configuration of the device driver. This actually gets into an area where, well, let me just give it as a bit of advice to the class, especially those of you who work in system administration: you really want to be careful that you don't learn too much about device drivers, because there are really these three things that it's not good to learn about, and if you do learn about them it's really good to keep it to yourself, because if you become an expert, or are viewed as an expert, in any of these areas, you will become the designated stuckee for that at your site, and you'll never get to do anything but that. So the three things that I highly recommend not learning very much about are device drivers, sendmail configuration files, or anything having to do with LDAP or anything in that general domain, because, as I say, they will become your life's work, and there are other things that you might find more interesting.
""Do you have a question?"" so one of my students empathizes with my point I believe you said you worked on that mail
system so you you might know something about
Sendmail configuration files but you don't have to answer that okay so we're going to talk about what a device
driver does and really just sort of the entry points to it but we're not going to talk about how you
write such a thing, how you debug such a thing or much of anything about it. I actually used
to teach an entire class, believe it or not, about device drivers, but then I realized the error of my ways, and I have since gone through and made a point of forgetting every slide in that talk. Okay, so then we will move on to filesystems, and as always we'll start at the high level: talk about the interface, what is it that is exported out of the system, and then we will start diving down in to see how we go about implementing that. So we'll start with the so-called block I/O system. It has historically been called the buffer cache, and you still hear it called that periodically, but the fact of the matter is that there isn't really a buffer cache anymore; there is just one big cache, the VM cache. The filesystem has a view into it
and the processes have a view into it, but at the end of the day you really don't want the same information on two different pages of memory, because that just leads to trouble. But filesystems think they have buffers, and so there's this maneuver where we make these things that look like what historically were buffers but that really just map into the VM system; they're still managed the way they have been managed historically. Okay, we will then get down into filesystem implementation,
the local filesystem if you will, and also into soft updates and snapshots. This, for the time being, is something that you see only in FreeBSD. The alternative to soft updates is journalling, which is more commonly used; for example, it is what is used by ext3. So I'll go through soft updates, and a lot of the issues in soft updates are the same issues that you have to deal with in journalling: what is it that we're protecting, and how do we go about doing that? The differences are in the details. There is actually a paper in the back of your
notes, if this is something that interests you; it's a comparison of journalling versus soft updates that was done about five or eight years ago, and, not to spoil the punch line, but the answer is they both work about the same. Okay, snapshots, again, are something that,
if you've worked with things like the Network Appliance box, you're probably quite aware of: what snapshots are and how they do or don't work for you. This is the same functionality in the filesystem, implemented in a somewhat different way. Okay, so Week 6 is really going to be the local
filesystem, the disk connected to the machine that we are dealing with. In Week 7, then, we get into multiple-filesystem support: how do we abstract out that filesystem layer and support multiple filesystems at the same time? So, for example, in FreeBSD you can of course run with the traditional fast filesystem, but if you happen to like the Linux filesystem better, or you have to share a disk with a Linux machine, you can run ext2 or ext3, and it will perfectly happily do that. So we will have to look at how we provide an interface so that we can plug in all these different filesystems that we want to support. Another area in which there's been a great deal of growth, at least in code complexity, is so-called volume management. In the good old days a filesystem lived on a disk, or a
piece of a disk, and that was that, but in this day and age that won't do any more, so we aggregate disks together by striping them or RAID-arraying them or various other things, and we need a whole layer in the system just to manage those disks. We'll then get to, as an example of an alternative filesystem, the Network File System, or NFS. That's not because it is the world's best remote filesystem, or the cleanest design, or any of the properties you might hope such a class as this one would pick for, but it's ubiquitous, very widely used, and so we're going to talk about that one. Okay, we'll then once again switch gears in Week 8 and turn our attention to networking and
interprocess communication, and again we'll start from the very top. We'll go through the concepts, the terminology that gets used, what's the difference between domain-based addressing and an address domain; we'll go through what the basic IPC services are, essentially what all the system calls are that have anything to do with networking, and just sort of describe what each of them does, and I'm going to go through a somewhat contrived example that makes use of every one of those interfaces, just to show how they all connect
together. For those of you that work in networking or have done any kind of network programming, if you're looking for a week to miss, Week 8 is the one to miss, because that's the most basic lecture that I'm going to give. If you're not sure whether or not you need to go through it, there is one of the papers in the back; it is an introduction to interprocess communication. Read that paper, and if you say "yeah, yeah, yeah," you are done with Week 8. On the other hand, if you don't come to Week 8, and then in Week 9 I call on you and say, "All right, what is it that the listen system call does?" and you can't tell me, you're going to get a demerit. Okay, then in Week 9 we will get into the actual networking implementation itself. We go
through the system layers as we did in all the other areas, and we will spend a significant portion of that class talking about routing. Routing, for those of you that haven't had the pleasure of dealing with it, is a black art, or at least a dark science. We'll talk about it from the perspective, first of all, of what we do locally within the machine, and then what some of the bigger strategies are that we can use for doing routing: enterprise-wide routing, or area-wide routing, something like throughout the state of California or throughout the US, whatever. This, again, like device drivers, is really
just sort of a nickel tour through what the choices are, what the basic strategies are that are used. If you're thinking you're going to walk out of here knowing how to set up routing, well, sorry, we are not going to get that far, but you should at least have a pretty good idea of what the issues are and what the general solutions are. Okay, then finally in Week 10, well, not finally,
but in the next few weeks, we will go through the Internet protocols, primarily TCP/IP: what are the algorithms that are used. I'm putting a particular emphasis, for this particular class, on changes that have been made in the protocols to deal with a lot of the attacks that we've been seeing, the SYN attacks and that sort of thing, rather than just a straight recitation of what the actual protocols are. I'll talk primarily about IPv4, but I will also try and talk a bit about IPv6 as well. All right, so the first ten weeks are sort of the kernel course; then we tack two weeks on at the end to talk about the bigger picture: system tuning, crash-dump analysis, that level of
thing. The idea is to really consolidate what we figured out or talked about in the first ten weeks and how that applies to the tools we have available to us to look at what the system is doing, analyze what the system is doing, and hopefully improve the performance of what the system is doing. For the most part, the kind of tuning that I'm talking about is not going in and hacking your kernel, because the fact of the matter is most of the time you can't do that anyway; it's more looking at it from the perspective
of saying: is this system running badly because it doesn't have enough memory on it? Or is it running badly because there isn't enough I/O capacity? Or is it running badly because it's got enough I/O capacity but certain drives are being overloaded? Or is it being overrun because we're simply trying to do too much on this machine? Etc. So that's the sort of level of thing that we're looking at, but tied into a lot of the concepts we talked about before, so we can talk about active virtual memory and what that means, and essentially measure what it is, and hopefully then you will understand, in the context of what we talked about in the VM section, what that really means. Crash-dump analysis is one of those
topics that you are going to love or hate. If you actually have to deal with crash dumps, people find it invaluable, and if you don't have to deal with crash dumps, it's an incredible mass of boring detail. The only good part of it is that the whole session is only about an hour long. If it interests you, listen closely, and if it bores you, well, it's only an hour long. Okay, lastly we'll talk a little bit about
security issues. Again, this is really more about the tools that are available to deal with security stuff, as opposed to a complete tutorial on how to implement security. So for those of you that deal with security, this is just going to be sort of Security 101. For those of you that have to deal with it but haven't really thought about it, it'll probably scare you to death, and you'll wonder how to keep your machines from being hijacked every day. Okay, so that's in essence what we're going
to try and do here. Anybody have any comments, questions, thoughts? No? All right, well, let's get started. We will begin on page fifteen with an overview of the kernel. Hopefully nobody's lost yet. What's a kernel? All right, so, starting at the very top, the big broad brush: what we have is a UNIX virtual machine, and virtual machines are actually something
that has been around as a concept since the sixties; the difference is really just the level of the interface that people have dealt with when they talk about virtual machines. In the 1960s, computers were these enormous things. Your computer room would be something three times the size of this conference room, if you had a computer. The computer itself was as tall as a refrigerator-freezer; imagine five or eight or ten of these units side by side. That itself made up the computer: one big unit would be the core processor, and one would be the floating-point unit, and several of them would be the memory, the core memory, literally core memory. And then there would be other rows of these disk drives, which were about the size of a washing machine, and then behind that, since you couldn't store
everything on disks, you had rows of tape drives, and then you had this little set of sort of munchkins that would run around and tend to the machine: they'd mount tapes and take off tapes, and mount disk packs and remove disk packs, because the drives themselves were very expensive, and so you wouldn't, as today, have one spindle dedicated to just one set of platters; you could take out a set of platters and put in another hundred-megabyte set of platters, and these were platters that are this big around, six or eight of them, with giant head assemblies that come rumbling in and out. Anyway, one of these giant, giant machines that cost many millions of dollars would run
at about ten million instructions per second, 10 MIPS, and 10 MIPS was more computing power than anybody could possibly imagine using in a single application. Just by contrast, this four-year-old laptop here is probably on the order of one or two hundred MIPS. But anyway, people couldn't really see what we would
do with a lot of computing power, and the other thing was that you didn't have the notion of an operating system that had applications running on it, because everybody wanted to write straight to the raw hardware. And so what IBM, which was a big manufacturer of machines in those days, did was come up with this thing called VM, and this was, well, you'd hardly call it an operating system, really, but what it did was clone independent copies of the machine that worked just like the original machine, so you could boot something that you thought was an operating system on top of VM. So you'd take one of these ten-MIPS machines and it would clone six identical one-MIPS copies, and then you could boot whatever you wanted on each one of those machines.
So if you were doing database stuff, you would boot your database, because your database ran on the raw hardware; or if you were doing payroll, you would boot up the payroll program; or if you actually tried to service users, you could boot a batch thing that would read card images and print stuff out; or they even had TSO, the Time Sharing Option, where you could interactively sit and type and send stuff in and get answers back, and so you could boot TSO. So whatever set of things you needed, you could boot them, and they ran independently as if they were running on their own machines, but all VM did was give you an exact raw copy of the hardware. So when UNIX came along, they sort of liked the notion of providing the concept of independent
things that you could operate in, but they wanted it at a higher level: instead of doing it at the raw hardware level, to do it at the process level. The idea, then, was that the interface you would program to would be what we think of as the system call interface today, and the idea was that you would be given a process, or a set of processes, and those were independent: your process couldn't affect the address space of another process, you couldn't reach over and mess around with their addresses, you couldn't mess around with their I/O channels. You could slow them down by being a pig, but that was about the only way that you could affect other processes. And so the interface that they had there was one with these characteristics:
it had a paged virtual address space, so you didn't have to know, as in the old days, how much physical memory was on the machine and make your application fit into that amount of memory. You just had what looked like a large uniform address space; even if the underlying hardware had segments or some other hardware brain damage, it looked to you like you just had a big uniform address space, and the size of your address space was independent of the amount of memory that was on your machine. Your address space could even be bigger than the amount of physical memory, because we move pages around underneath whatever part of the address space is actually active. There are obviously limits to this: if you are trying to run a one-gigabyte application on top of ten megabytes of memory, it's probably going to bring new meaning to same-day service, but if you're willing to wait long enough, it will eventually move the pages around and you will progress through getting your application run. Another thing was dealing with software
interrupts. In the old days you had to understand how the hardware worked in order to deal with exceptional conditions. So, for example, if you did a divide by zero, the hardware would jump through some vector location or something, and you had to know how that worked and make sure that you had your program, usually some little bit of assembly language, set up to deal with that. UNIX said, let's get away
from the hardware here, and so they did this thing called signals. They defined a set of signals such that if you do a divide by zero, you simply register a routine you want to have called; you don't have to know how the hardware figured it out, you just know that that routine is going to get called, and you can deal with it at that point.
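Here is a minimal sketch of that idea (an illustration, not from the lecture), assuming hardware on which integer division by zero traps:

```c
#include <signal.h>             /* signal(), SIGFPE */
#include <unistd.h>             /* write(), _exit() */

/* The registered routine: the program never has to know how
   the hardware vectors the exception. */
static void
fpe_handler(int sig)
{
        (void)sig;
        write(STDERR_FILENO, "caught SIGFPE\n", 14);
        _exit(1);
}

int
main(void)
{
        volatile int zero = 0;

        signal(SIGFPE, fpe_handler);    /* register the routine */
        return (1 / zero);              /* trap; the handler runs */
}
```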
Well, we also got a set of timers and counters to keep track of what we're doing; this is really more for accounting than anything else, but applications may want to have access to that. We have a set of identifiers that we're going to use for things like accounting, protection, scheduling, and so on. And one of the early philosophies of UNIX was to try
and keep it simple. Operating systems had gotten very baroque; in particular, the thing that predated UNIX was a thing called Multics. Multics was a joint project between Honeywell, a big computer manufacturer of the time; AT&T Bell Laboratories, the big industrial laboratory at that time; and MIT, a big university then and still today. Those three organizations got together to try and build this time-sharing operating system, and it just got bigger and more grandiose and more complex and never finished, because as soon as they could see, oh, we know how to do that, well, we could do this other thing too, and so then they would tear it apart, and they never really got to something that could be put into production. And so AT&T Bell Laboratories decided to pull out of
that project, and two of the people that had been working on that project, Ken Thompson and Dennis Ritchie, were sort of bummed, because they were now back to typing cards and putting them through card readers, and they had gotten used to the idea that you could actually sit at an ASR-33 Teletype and interact with your computer. And so they found an old PDP-7 sitting off in the corner that had been abandoned and started working on this little tiny operating system, which they called UNIX, which eventually moved to the PDP-11 and
became what we have today. But because they were coming, first of all, from Multics, where everything had been done in great grandiose detail, and because there were fundamentally two of them working on it and they wanted to get something done within a year or so, one of their philosophies was: let's find the one way of doing things. Let's not have eight ways from Sunday; let's just get the one way, and that's what we will provide. So what is
the sort of core set of things that we need? Well, the first thing is, when it comes to identifiers, let's not have eighty thousand different identifiers. So they came up with process identifiers, a user identifier, and, at that time, a single group identifier (later expanded), and they used that small set of identifiers for everything: it's used for accounting, used for making protection decisions, used for scheduling decisions.
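As a quick illustrative sketch (not from the lecture) of how small that set of identifiers is:

```c
#include <stdio.h>              /* printf() */
#include <unistd.h>             /* getpid(), getuid(), getgid() */

int
main(void)
{
        /* The same few identifiers serve accounting, protection,
           and scheduling decisions alike. */
        printf("pid %ld uid %ld gid %ld\n",
            (long)getpid(), (long)getuid(), (long)getgid());
        return (0);
}
```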
And again, it was the simplicity of the thing that was driving their decisions. But there are really two key ideas that they had that really made the difference, that set them apart from what everybody else had done before them, and which in retrospect have been pervasive more or less ever since. The first of these was the notion that we have a uniform descriptor space; that is, given a descriptor, it can reference
any I/O device, or indeed any kind of I/O channel. So you can have a descriptor for a terminal, or a descriptor for a file, or a descriptor for a disk, or a descriptor for a pipe, or a descriptor for a socket, and you don't need to know what it references in order to be able to read and write that thing. If I hand you a descriptor, you can read from that descriptor or you can write to that descriptor, and the correct thing will happen.
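Here is a minimal sketch of that property (an illustration, not from the lecture): one loop that copies input to output without ever knowing whether the descriptors name files, pipes, terminals, or sockets:

```c
#include <unistd.h>             /* read(), write() */

/* Copy any input descriptor to any output descriptor.  The
   caller may hand us a file, a pipe, a terminal, or a socket;
   the loop neither knows nor cares. */
int
copyfd(int in, int out)
{
        char buf[8192];
        ssize_t n;

        while ((n = read(in, buf, sizeof(buf))) > 0)
                if (write(out, buf, (size_t)n) != n)
                        return (-1);
        return (n == 0 ? 0 : -1);       /* 0: EOF; -1: error */
}
```

That loop is essentially all that a program like cat needs to be.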
And you'd say, well, that's so obvious, I mean, how else could you possibly think of doing it? Well, predating UNIX, everything was done with a little subsystem that would open a file, read a file, write a file, close a file; and there was another set of system calls which would open a terminal, read a terminal, write a terminal, close a terminal; and yet another which would create a pipe, read a pipe, write a pipe; and so on. So if you were just a drop-dead-stupid
program like, say, cat, you would have to have code in there that said: was my input a terminal, in which case I need to use read-terminal, or is it a file, in which case I need to use read-file, or is it a pipe, in which case I need to use read-pipe? So the program itself had to have all this coding in it, whereas when they went to the uniform descriptor space, cat doesn't know, doesn't need to know; it just says read my input, write my output, and it works. And when we add a new type of descriptor, cat just continues to work just as it always
did. So this proved to be a very powerful construct and pretty much every operating system after
UNIX did that. There's one exception: a large company in the Pacific Northwest that still does not have a quite uniform descriptor space, but that's part of their legacy, and they're really working on it; Longhorn will be here. Anyway, this set of facilities then makes up the UNIX virtual machine, and in some sense we still see virtual machines
being used today; in fact, we're seeing sort of a reversion back to some of the IBM stuff in things like VMware, which essentially allows you to go back to booting native operating systems again. So it's interesting to watch the pendulum going back and forth on what's the correct layer for doing virtual machines. Okay? So far so good? All right, so I said that there were two key ideas that UNIX had, the first of these being the uniform descriptor
space. The second one, which was really critical, was this notion of processes as a commodity item. So here on page 17 I've tried to lay out the components that make up a process, and what I really mean when I say a process as a commodity item. Okay, leading up to UNIX, in the systems that predated it, processes were these very large, heavyweight, expensive things. If you look at MVS, which was the operating system
that ran on IBM machines for doing multiprocessing, the system administrator would decide at boot time what degree of multiprocessing they wished to support. So they'd say, well, we'll let up to six things happen at once, and so as part of booting up they would create six processes. Now you as a user, if you wanted to do something, let's say you wanted to compile and run a program, you would be given a process, and it was up to you to figure out how to stage what you needed done, and this was often fairly complex. So you would have to write out all the
steps that you wanted in this wonderful thing called JCL, Job Control Language. Job Control Language was the sendmail configuration file of the sixties: there were people whose sole job at the company was to figure out how to put this stuff together, because all you had to do was get one extra space or a missing comma somewhere in there and the whole thing would just blow up. It would just sort of spit the card deck back at you and say, well, somewhere in there is a mistake, it's sort of in the general area of this card, and I can't deal with it. Fix it. And of course in those days it wasn't just a matter of hitting
carriage return to make the fix; you had to get your deck, pull out the card, type the new one, put it back in, and re-submit it. And heaven forbid you touch that card reader yourself; it had to be done by an operator. So the card deck would read through, it would disappear, and, if you were lucky, a few minutes later, if you were not lucky, a few hours later, you would get a printout of what had happened, and then you could look at it and say, I put a comma in the wrong place, I guess I get to do it all again. So the thing you would need to do there, for compiling and running a program, was to break it into steps. Well,
I need to run the preprocessor, so clean out whatever gunk was left over on that process from the previous user, put the preprocessor in there, and then read from this file here; let's see, I've got to put the output somewhere, so create a scratch file over on this disk, and it was excruciating detail, like how many cylinders and how many tracks and this and that, blocks, blah blah blah, and don't forget any of those parameters, because it'll spit it out if you do. And so then it would run the first step, and if it was successful, then you'd have, sitting in the scratch file that you had created, the output of the preprocessor, and then you'd load the first pass of the compiler, and you'd say now read from that scratch file and create this other scratch file over here, and when that's successful we need to delete that one, and then load the second pass, put that output into another scratch file, and then run the assembler, and the optimizer, then the loader, this and that, and finally run the program, and if all goes well, at step sixteen out comes the answer, forty-two. So UNIX said, look, this is silly: a lot of this is just bookkeeping, and computers do bookkeeping really well. And you'd recoil: yeah, but it's going to take
all these cycles; but it's like, computers are supposed to be labor-saving devices, right? So they came up with this notion that they would create processes on the fly as needed. You had a preprocessor, two
steps of the compiler, and then an optimizer and then a loader; we just create, boom, seven processes, and we connect them together with pipes, and so we take the input and run it through the pipes, and out the end you get the executable. We simply create each of these processes, and so you as a user just type the C compiler command, and it just forks these things, pipes them together, gets the result, and then, once it was done with the processes, just threw them away.
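Here is a minimal sketch of that construction (mine, not the lecture's), building a two-stage pipeline the way the shell does for "ls | wc"; error checking is omitted for brevity:

```c
#include <sys/wait.h>           /* wait() */
#include <unistd.h>             /* pipe(), fork(), dup2(), execlp() */

int
main(void)
{
        int pfd[2];

        pipe(pfd);
        if (fork() == 0) {              /* first stage: ls */
                dup2(pfd[1], STDOUT_FILENO);
                close(pfd[0]);
                close(pfd[1]);
                execlp("ls", "ls", (char *)NULL);
                _exit(127);             /* exec failed */
        }
        if (fork() == 0) {              /* second stage: wc */
                dup2(pfd[0], STDIN_FILENO);
                close(pfd[0]);
                close(pfd[1]);
                execlp("wc", "wc", (char *)NULL);
                _exit(127);
        }
        close(pfd[0]);
        close(pfd[1]);
        while (wait(NULL) > 0)          /* reap both stages */
                ;
        return (0);
}
```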
So any time you created a new process, it came to you pristine, clean. Now, behind the scenes it did put everything in intermediate files; the fact of the matter is, in the early days those computers didn't really have enough memory to support all that stuff at once, so behind you those pipes were actually implemented as files. But at least you didn't have to remember to create them and delete them and deal with them; as far as you were concerned, it just looked like stuff flowing through pipes, and of course today it just does flow through pipes in memory. Okay, so this notion that we're just going to
create processes on the fly as needed and connect them together as needed was a novel concept, and it wasn't that they had somehow mysteriously figured out how to create processes cheaply, because they hadn't; they were still really expensive to create, but that extra effort was worth it because it was saving a lot of programming time. My favorite example is running ls: we have to create a process, load the ls binary into it, it prints a line or two on your screen, and then we tear the entire thing down and return all its resources back to the system. More than ninety percent of the cost of running
ls is creating and destroying the process; a tiny fraction of it is actually running ls. But it goes so fast, who cares, right? So the point is that the concept of just creating things as needed, again, was very powerful and is one that is just pervasive today. Okay, so what is a process actually made up of? It gets some amount of CPU time, or at least we do dearly hope that it gets some amount of CPU time; the lack of getting CPU time is what makes a computer so sluggish. Of course this really boils down to scheduling, and we're going to talk about scheduling, probably more than you care to hear, in a couple of weeks' time. We have the asynchronous events; these are the external events that are coming in, so they may be either things coming in from the outside world, like
start, stop, and quit, or out-of-band data-arrival notification, that kind of thing, or they may in fact be things that the program is bringing down upon itself, such as a segment fault, a divide by zero, or some other thing that would normally be viewed as an incorrect operation, and so we'll talk about that when we talk about
signals. Every program gets some amount of memory: it gets an initial amount when it starts up and generally allocates more as it goes along. This of course we will deal with very extensively; we'll spend an entire week on it when we talk about how virtual memory is implemented. And then we get I/O descriptors. I used to say that every program had to have
at least one I/O descriptor, since if it had absolutely no input and absolutely no output, then it was sort of pointless. Of course, I had to have one of my students come up and point out to me that there is a class of programs which don't need I/O descriptors, and that is these things called benchmarks: they just compute something, and all we really care about is how long it takes them to compute it; we don't actually care what the answer is. In theory we don't; I personally like my benchmarks to stop with something I can see, so I know they're computing the right thing, but in theory that wouldn't be necessary. Outside of that class of programs, everything needs some sort of descriptors, and of course we'll talk about descriptors quite extensively as we go through the I/O subsystem. Okay, so the executive summary is that processes
are the fundamental service that is provided by UNIX, and what we're going to spend essentially the next two and a half weeks working on is what makes up processes. We'll go into much more detail about each of these four points, and then how we actually go about providing that bit of service. The next thing that I'm going to do is go through and lay
out some of the terminology that we have when we're talking about processes. So this is the big picture; here we're on page eighteen, and you can see we have sort of three bits that make up the system: we have the currently running user process, and then what we call the top half of the kernel and the bottom half of the kernel. Now, this would be the picture for a uniprocessor, so one CPU; if we had a multiprocessor, then we would have one instance of the kernel but multiple instances of the user process. But any given CPU on a multiprocessor is running exactly one process. So you may think we're running four or five processes all at once, but the fact of the matter is that at any instant in time there's only one process which is actually running, and that is the one that we have loaded in the system. Now, we give the illusion that we're running lots of things because we switch between them rather quickly, so it looks like things are happening in all
look at that had to do with each one of these parts here but just to sort of look at it from the
big picture perspective what you see here is there is boundary between the user process
and the top half of the kernel which is really just like a glorified sovereignty
call it's a lot like calling into a library routine
like calling strcat, strcpy or something like that when you do a system call we take that same set of parameters now this is sort of brick Wall here if you will that is protecting the top half of the kernel from the application I'll go more into some detail about how that
actually gets implemented, but in essence you can think of it as this sort of Wailing Wall with little chinks in it: you can push a request through, and somebody on the other side pulls it out, looks at it, and decides whether they're going to deign to provide service to you, and if they do, then they send the result back. Unlike a library, where you can just reach in and walk around if you want to; with good programming practices you don't do that, but you could. All right, so the top half of the kernel really looks
a lot like a big library; it just happens to be a library of routines that deal with things where processes need to interact with each other. In fact, many people don't understand what the difference is between the C library and the top half of the kernel. If it's something that you're doing that no other process needs to know about, then it can be in the C library. So if you call strcat to concatenate two strings together, nobody else needs to know you're doing that; you don't need to coordinate with anybody else; it's just happening, so that goes in the C library. On the other hand, if you're reading or writing
a file, there may be other processes that are also reading and writing that file, and therefore that has to be done by the kernel, because the kernel can coordinate all the different processes that are trying to access that file.
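A small sketch of the distinction (an illustration, not from the lecture): strcat never leaves the C library, while write has to trap into the top half of the kernel:

```c
#include <string.h>             /* strcat(), strlen(): C library only */
#include <unistd.h>             /* write(): a system call */

int
main(void)
{
        char buf[32] = "hello, ";

        /* No other process cares: this stays in the C library. */
        strcat(buf, "world\n");

        /* Other processes may share the output file, so the
           kernel must coordinate: this traps into the top half. */
        write(STDOUT_FILENO, buf, strlen(buf));
        return (0);
}
```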
So the top half of the kernel is pretty straightforward code; it looks a lot like any other library that you would write. If you look at top-half kernel code, you see a read come in, it's got these parameters, we muck around, we get some data, we put it in the buffer, and we return back. In fact, writing code for the top half of the kernel is not all that difficult to do; you have many of the same properties that you would have when writing user-level application code. The bottom half of the kernel is where things
start to get nasty, because the bottom half of the kernel is the part of the system that deals with all of the asynchronous events in the system: things like device drivers and timers, that level of thing, that are driven by hardware events. So, for example, a packet arrives on the network; that causes an interrupt to come in, and that will be handled by the bottom half of the kernel. Historically, when an interrupt came in, it preempted whatever
else was going on, and it ran until it finished and then returned, and it could not go to sleep to wait for resources or other things. In current systems you can actually go to sleep in the interrupt handler, waiting for some other activity to complete. It is, however, not a good idea to do that, because the usual case for most device drivers is that they can finish whatever they're doing in an interrupt without ever blocking, and so when an interrupt comes in we assume that you're not going to sleep, and if you actually then go to sleep, oh man, you didn't tell us you were going to do this; we have to go off and do a whole lot of other work that we hadn't originally planned on doing. So if you go to sleep in a device driver, you are taking a very serious performance hit, so it's highly recommended that you don't
do that, but if you have to, you can. It's because of this historic behavior of not being able to sleep in the bottom half of the kernel that certain properties have taken over in device drivers, and that is that a device driver should be handed all the resources it needs to get its job done. You don't tell a disk device driver, go read this and put it somewhere; you have to say, go read this particular block, here is a chunk of memory that I want the data put in, and notify me when it's done. Because things like allocating memory are classic places where you end up having to go to sleep to wait for stuff to happen, and historically you couldn't do that, and even currently you don't want to have to do that. So device drivers generally have all
resources pre-allocated, and then they can just go. The one place where this doesn't work is the network, and in particular, you don't know when somebody's going to send packets to you. You say, well, you're listening on open connections, but if you're doing something like IP forwarding, there's no top-half state: the packets are just coming in on one interface and being sent out on another interface, and they never pass through any part of the top half of the kernel. And so in the case of network device drivers, they need to allocate memory, and if memory gets into short supply and they try to allocate memory and it's not available, they historically couldn't wait for memory to become available, and even in practice today they don't wait for memory to become available; they simply drop the packet on the floor. It's like, well, I didn't have any place to
put it, sorry, oops. Now, that doesn't cause incorrect behavior, because the higher-level protocols will retransmit, but it does cause great performance problems, because retransmission means that connections stall; they have to back up, they have to resend data, and so on. So you really want to avoid dropping packets if you can possibly help it, and consequently we tend to pre-allocate a certain amount of memory for the network drivers, and we try very hard to make sure that we're not going to run out of memory. But if packets come fast enough, and we can't deal with them as quickly as they are arriving, then over a short period of time we get to the point where we simply have to start dropping packets. Okay, this is a part of the kernel that you do not wish to
write code for, because it is extremely difficult to debug. You get these bugs where the only time it happens is on the third Tuesday when there's a full moon, and we have a disk interrupt followed by a terminal character coming in and a network packet of size fifteen twenty-two arriving, and when all those things happen, the system panics. And of course it panics because you're following some bad pointer, something that should have been there but was freed some time in the distant past, we're not sure when. And trying to debug things like that is extremely
difficult: you can think, well, I think I found the problem, but it's not reproducible; you have to wait for the next third Tuesday with a full moon and blah blah blah to happen, and so you sort of statistically guess that you fixed it: I was getting this bug once every three days, and now it's gone for two weeks without happening; did I fix it, or have I just been lucky? And it's that, coupled with the fact that you're dealing with hardware, and hardware rarely works the way it's documented to work. So you're doing everything that it says you're supposed to do, and it still doesn't work, because you didn't set the fiddle bit over on that other place over there that's not documented anywhere, but if it's
not set, it doesn't work, occasionally. So this is another reason that you really want to avoid dealing with this part of the system if you can possibly help it. Okay, but let's go through and look at some
of the properties here, starting up at the user process. We're running with preemptive scheduling. Now, there are several caveats here: preemptive scheduling is the default, so-called shared scheduler; that is what you normally use. There are other schedulers, like the real-time scheduler, where what I'm saying isn't true; we'll talk about some of those schedulers later. But the usual scheduler that you're running on under UNIX is the shared scheduler, and under the shared scheduler, user applications run with preemptive scheduling. Preemptive scheduling means that you run at the whim of the system: if it wants you to run, you run; once you start running, you have no guarantee of how long you're going to run. It might let you run for three instructions and then decide it doesn't like you anymore and wants to run something else, or you might get to run for several seconds in a row with no intervening things interrupting you; you just don't know. Really, all you know is that they claim that they're using statistics, and that the statistics are fair, and so on average you're going to get a reasonable amount of time, but that's up to the system; you don't control that. The real point here is that you don't have any way of creating a critical section: you can't say, okay, I don't want to be interrupted during this particular sequence of things. So you have to program assuming that you may be interrupted at any point. Okay, the next thing is that when you're running
in a user process, you are running with the processor in what's called unprivileged mode. One of the requirements for running any kind of a UNIX system is that you have to have a processor that supports privileged and unprivileged, two different modes of operation. In privileged mode, which is what the kernel runs in, the entire repertoire of the hardware is available: by this I mean you can set all the registers, you can fiddle with the memory-management unit, you can initiate I/O, you can access any memory anywhere, etc. When you're running in unprivileged
mode, which is what user processes run in, there is a large subset of the instructions which you cannot execute: you cannot initiate I/O on devices, you cannot change the memory mapping, you cannot access memory that's not part of your address space, you cannot execute certain instructions like halt. So in general you are prevented from manipulating anything that's outside of your address space. This of course is desirable, because when you're running in this unprivileged mode, you're protected from other processes manipulating you, and they're protected from you manipulating them. For those of you that have had the misfortune
to have to use early versions of Windows, up to about ninety-eight, they always ran with the processor in privileged mode, even in applications, and so, either maliciously or accidentally, you could stomp on other people's address spaces, or you could stomp on the kernel, and a lot of the blue screens of death were people just following wild pointers and trashing different parts of the system, taking everything down. It also makes it far easier to implement things like viruses and worms and other things, because a user application can rewrite the boot block on the disk; it can just write straight down to the device and manipulate the registers that allow it to do whatever it wants, whereas when you're running in unprivileged mode you can't write those kinds of things. So modern versions of Windows, anything from about
2000 on, now run with privileged and unprivileged modes, but UNIX has always required that. And so when you're running in a user process, you cannot block; I mean, you cannot execute the instructions which cause a context switch to occur. You can't pick what's going to run next; you can't make a particular thing run next. All you can do is go to the operating system and say, hey, I've got nothing to do, pick somebody else to run, and the operating system is the thing that can then execute the instructions which cause a different process to be loaded and run. All right, finally, while you're in a user application, you're
running on a user stack that's part of the user's address space. So part of creating a process gives you a runtime stack as part of the virtual address space, and it can be, more or less up to the limits of the hardware, as big as you want it to be. So if you are running on a thirty-two-bit processor, your stack can get to two gigabytes, and what this means is that any time you allocate local variables, you don't have to worry about, oh, is that going to overrun my stack? So if you need a hundred thousand double-precision floating-point numbers, you can just, as a local variable, allocate an array of size one hundred thousand of type double, and it just decrements your stack pointer by eight hundred thousand bytes and away you go; it's just virtual address space. As you'll see when we get into the kernel, that ceases to be the case.
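To close, here is a minimal sketch of that last point (an illustration, not from the lecture): the local array below costs roughly 800 kilobytes of stack, yet declaring it is just a stack-pointer adjustment, with physical pages supplied on demand:

```c
#include <stdio.h>              /* printf() */

#define NSAMPLES 100000

static double
average(void)
{
        double samples[NSAMPLES];       /* ~800 KB of user stack */
        double sum = 0.0;
        int i;

        for (i = 0; i < NSAMPLES; i++)
                samples[i] = (double)i;
        for (i = 0; i < NSAMPLES; i++)
                sum += samples[i];
        return (sum / NSAMPLES);
}

int
main(void)
{
        printf("%f\n", average());
        return (0);
}
```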