PROFESSOR: OK. Today I'll use this trinity-- the science, engineering,
technology trinity-- as sort of a theme that
we're going to develop through the whole term. And we'll just briefly refresh
ourselves, or remind ourselves, what Professor Lauffenburger
talked about on Tuesday. And then I'm going
to go a little bit into some detail
about how information is used in biology, in cells. It's a way of gaining
a broad perspective of a facet of the biological
basis for bioengineering. But it's also a way
to introduce to you how we can use that information
in developing technology. OK. Now just a few odds and ends. We do have a course website. On that course website
will be articles that I may or may not download
from newspapers or magazines. Generally, if you want to
be in the biospace, read the newspapers, you know. Today the concern
is the occurrence of the flu virus in Africa now. So it's now spread from China
to Turkey from the Middle East down to Africa. There's a big question
of how it got there, and then what are the
implications for a flu pandemic. Of course, we have
this very large concern about a flu pandemic as
biologists, bioengineers, and biotechnologists. It's an area in which
our contribution, in some small part,
may play a large role. OK. So just generally be aware of
what is going on in the world. We have online, on
the website, videos that the various
bioengineering professors have produced that describe some
facet of their research. These are
professionally-produced videos. So please go to
them, because they'll be useful in augmenting
some of the seminars that you'll hear from
the other faculty. I went over the paper. The paper will be graded
generally A, B, C, or fail. Last year I think I gave an
obscene percentage of As. It's only a five-page
paper, so it's not, like, a writing requirement thing,
where you sit up the night before and write something. You really have to talk
to me about what topic do you want to
write about and how you are thinking about exploring
that topic in your paper. So there'll be a certain amount
of interchange between you and me about your paper. And as a result, you'll
have a well-crafted paper that will be relevant
to the kinds of things that we're trying to accomplish
or learn in this course. This is just the schedule
of the upcoming lectures. And so these are the professors
from different departments in engineering that
will be talking about some flavor
of bioengineering. And the course ends on March 23. That's when the
paper will be due. It'll be due in class. I'd like also to
have an email copy. But we'll go into that detail
much later in the term. So this is only a
half-term course. And I'll end with a selected
topic in bioengineering. And it's usually
something that's topical, based on what's in the news. OK. Now, Doug spent
quite a lot of time emphasizing that engineers
are both scientists and technologists. We create technology, but the
technology is based on science, on knowledge. So science is what
we can find out about something that exists. Engineers either learn from that
and develop technology or help understand that, the
basis for that science. What I'd like to do today is
to explore the bioscience basis for bioengineering. And when you talk about
biology, you really have to start with the cell. You can talk about the molecules
that comprise the cell. You can talk about how
cells are organized, into tissues and organs. But in the end, the quantal
unit of life is the cell. And so if you look at just the
scale in which we immediately find ourselves when
we talk about cells, we're in the microscopic world. We're at sizes of a micron to
tens of microns, because that's
typically the range of sizes of cells. And then if we think about how
biology or biological structure is organized relative
to cells, there are tissues, which are how
cells are organized to carry out a specific function, and then the tissues
themselves are organized into organs in higher animals. Organs carry out a
complexity of tasks. And obviously,
the hierarchy goes to the body plan of an adult.
Cells, on the other hand, are, themselves-- we'll view them as machines
that are composed of subsystems. The subsystems are built
from basic components. The basic components themselves
are some specified collection of atoms. So obviously, we can go
down to the nanoscale. But when we talk about
nanoscale in terms of biology, it's really at the level
of the macromolecules that comprise biological structure. So the size range is huge
when you talk about biology. It's from nanometers
to meters in size. We can think about
biology in terms of energy that drives or that is the
basis for biological function. We can talk about
the forces that are involved in biology,
whether it's the forces that are exerted at the cellular
level or forces that are exerted at the molecular level. So as opposed to
the physical world, the biological
world is very much a world in the piconewton
to nanonewton range. So it's not a lot of force. Piconewton is roughly the
force exerted on a bacterium by gravity. Or if I had a laser pointer and
you saw the spot on the wall, it would be the optical
pressure of the flux of photons on the wall. That's on the order
of a piconewton. So we're not talking
about a lot of force. But at the molecular level,
that's what drives biology. As far as times, biology is
very much a chemistry-based process, and so chemical
reactions, and the things that regulate and control
chemistry, set the time scales. But it's also a
mechanical process. And so there are
mechanical steps and mechanical reactions at the
molecular and cellular level. They have their own time domain. And then, obviously, there's
other time dimensions. For instance, the
onset of cancer is measured in years as
opposed to things that happen at the molecular level. So just be aware of the
space that we'll be talking about in different examples. I have this picture
that I lifted from David Goodsell's book. It's a rendering of a bacterial
cell in which he has depicted the components or the
contents of that cell in as realistic a fashion as possible. So the dimensions or the
sizes of the macromolecules, in this case the protein
and the nucleic acid, are represented, as well
as the concentration of these elements. And so one of the striking
things that you see is that a cell is packed
or is a dense collection of macromolecules. To your eye, it may seem
that this arrangement is random. But in fact, there is
order in the organization of the cell and its contents. As I said, cells range in size. They, in a lot of cases,
move at different rates. Their lifetimes,
depending on the cell, can span from minutes to years. We already talked about force. Living systems basically
combat entropy, so that requires
energy of some kind. The energy is converted from
energy sources that we ingest or we absorb from
the environment. And there are some basic
activities that cells do. All cells grow, divide, and die. Right? And then, in addition to that,
they take on some advanced functions or
specialized functions, like an immune cell
will migrate and look for pathogens in an organism,
a muscle cell will contract, a nerve cell will
transmit information as electrical pulses, and so on. Just to give you an
idea of the magnitude of biological complexity--
a bacterial cell is a unicellular organism. So its unit is one. Whereas an estimate for the
number of cells in a human is on the order
of 10 to the 13th. So that's you and me. And in our lifetime, we turn
over on the order of about 10 to the 16th cells. So that's a lot of biology
that goes on in the span of a human lifetime. Now, we're engineers. You're here because you're
interested in bioengineering. And there's an aspect
to understanding biological function
at the cellular level, at the molecular level,
or, on the other hand, at the tissue or organ level. It'll be emphasized
to you over and over again that you should view these
things as systems [INAUDIBLE]. Actually, everybody turn
off your cell phones too. So if we think about
cellular systems in biology, we can think of it in terms of
the different kinds of cells that may make up an
organism or an animal. And one way of
thinking about it is to break down biological
processes in basic units. And we tend to think
of them in terms of the chemical systems of
a cell, mechanical systems of a cell. So the chemical systems are
basically the biochemistry of cellular processes. The mechanics describe all of
the forces and structures that contribute to force,
that produce motion either at the molecular level,
or at the cellular level, or at higher levels. We know, for instance, that
cell membranes have the ability to segregate charge. When you distribute
charge, then you set up an electrical system. And then we're living beings. We, as I said, try
to combat entropy. And so there's a certain
amount of thermal energy that characterizes
living systems. That is the input
from the environment. And actually, that
amount of thermal energy from the environment
accounts for a lot of the stochastic
or random behavior that you see in
biological processes. So as a formalism, we can
think of cellular systems or cellular machines
as doing work. And then the question
is whether or not we can describe a cell in
terms of its subsystems and whether that's a
good representation. So when Doug talked about models
as an endpoint of the synthesis between experimental
biology and engineering, one of the things
that we want to do is to understand these
different subsystems to a degree that maybe we are able to,
in the end, model a cell, either at a very basic level
or at a very fine level. Because if we can model
cells or cellular processes, then, if you think ahead,
we can then devise or think about ways in which we can
interfere, or intervene, or modify the behavior of
cells, or what cells do, or things that go on in cells. OK. Now, I told you a
couple minutes ago that cells grow and they
divide, and after they divide, they become something. And in a human being, you
start with a fertilized egg, which grows and divides
into 10 to the 13th cells. So in the development
of an embryo, there's a lot of cell
divisions that go on. And in the lifetime of
a person, if there's 10 to the 16th cells,
then that means that there's on the order
of 10 to the 16th cell divisions that go on. So 10 to the 16th is
a pretty big number. And at the end of the
day, or at the end of the year, or at
the end of the decade, or at the end of some
point in your lifetime, you pretty much
want to make sure, or hope, that the cells that are
being replicated in your body as you age actually have exhibit
a high degree of fidelity in copying the initial
set of information that gave rise to your cells
when you were an embryo. Because if there are errors
in the information that specifies the division of
cells into two daughter cells, then those errors get
replicated and additional errors are introduced. They multiply. And there may be synergy
between these errors, which we call mutations, that lead
to, for instance, cancer. So if you think of this
in terms of information, the cell is somewhat of
an autonomous system. And the cell grows,
has the information for dividing into
two cells, and it has the capacity of copying
that information that specifies a cell into the two copies. OK? So this is a little
bit different way of thinking about what we
call DNA replication and cell division than how you may have
heard about it, for instance, in 7.01. So that raises a question. How many have taken 7.01,
or are taking it now? So let's put it this way. Who hasn't had
Introductory Biology? And then who hasn't
had a biology course? OK. That's all right. OK. So in 10 the 13th mitoses to
give rise to the adult body plan or the 10 to the 16th
divisions that give rise to cells in your
entire lifetime, you want to make sure
that the information that encodes this process is copied
faithfully, without error. Now, we know what
that information is. It's your genome. It's the organization
of DNA, the sequence of DNA in the cell. And
you can think of DNA as a chemical tape, as an
analog of a kind of tape. But in this case, it's
of a chemical nature in which the sequence
of just four bases specifies what ends up being
the components list of the cell. So the genome, it tells
us what different proteins are encoded in the cell. Also in the genome
are sequences that tell us where the
genes that encode these proteins
lie and some aspect of the timing in which
these genes are read out. And then the other
thing about the genome is that it's actually mirrored. It's not a single
strand of DNA, but it's a double strand of DNA. And so, if you remember back
to even high school biology, that double strand is
a type of mirroring, because one strand is a
complement of the other, meaning that if you
only have one strand, you can synthesize the
information or the sequence on the other strand. So you can think of
the genome itself-- because it's a diploid
copy, at least in us-- as a way of providing some
error-checking capability. But also, then,
in each cell, you inherit almost a 100%
accurate copy of your genome. There is an error rate
in DNA replication, and we'll talk about
that in a minute. OK. Cells grow and divide. Some cells actually stay as
a resident population of stem cells, which just grow and
divide, grow and divide, grow and divide,
spewing out cells which later will
become specialized into different kinds of cells. And so here is a
representation of cells that give rise to the cells in
our immune system and our blood system. So here is the T And B cells
from lymphoid stem cells, which originate from these
pluripotent stem cells. And then red blood cells,
or platelets, or other cells that arise, again,
from this pluripotent stem cell, but through a
myeloid stem cell lineage. So there are decisions that
are made whether or not to stay or remain a stem cell. And then if the decision is to
become a differentiated cell, there's decisions that
control, or steps that control, whether a cell
ultimately becomes one of these kinds
of cells or becomes cells of the immune system. And so on. So that's a general plan of
how complexity is carried out in these systems. Now, I said that DNA is
how information is represented and it's packaged into genomes. The genomes themselves
can be one piece of DNA. Or for very large genomes
it's convenient to break them into smaller bits
called chromosomes. This is just a plot of the
different genomes that actually have been sequenced
over the last 25 years or 30 years in which
the size of the genome is represented in the number
of nucleotides in that genome. Or another way of thinking
about how big a genome is is in its functional
units, which are the genes. And so the human genome
is 3 billion bases. It encodes roughly 24,000 genes. If I gave this-- if you sat in on this lecture
maybe five years ago-- certainly 10 years ago-- we had thought that the genome
was more like 100,000 genes. But now that the genome
has been sequenced, it's now certain that the number
is much smaller, and very close to 24,000 genes. These blue dots represent
key genomes that have been sequenced with time. And so you see that we
started with a simple virus, went to a simple organelle,
went to more complex viruses, and then eventually up the
tree of complexity to yeast and then metazoans such
as flies, worms, and humans. In the meantime, genomes have
continued to be sequenced, and the community of
genome-sequencing people has spent a lot of time
sequencing microorganisms, since they're easy to sequence
and the biological diversity is high and, plus, because
we're interested in combating disease. OK. Now, again, going back
to what you should have learned already. This is just a schematic
representation of DNA. This is a diagram of the
chemical structures of DNA in which the four different
bases are represented by different colors. And what this diagram
represents or shows is that what one strand encodes
is mirrored in the other because one particular base
will base-pair with only one of the other three bases. So we have in
double-stranded DNA an exact complement
of each strand. And then, again, if you think
of this in terms of tape, when we want to copy the
genome, it's a physical process. So think of it as a tape machine
in which you copy from one tape to another. There is a readout or a
reader, which in this case is an enzyme complex
called DNA polymerase, and other cofactors or
other associated proteins. That enzyme or that
machine will take DNA-- DNA has to spool
through the polymerase. But this is a
chemical process too. So you need nucleotides
to be fed in. They get assembled
through covalent bonds, and you get from
one strand of DNA-- its complement-- and
from the other strand, that strand's complement
is also synthesized. So think of this as a machine. The machine does chemistry. The chemistry is a
chemistry in which a chemical copy of a
chemical tape is made, but that there has to be a
process in which this machine reads the information
or recognizes each base at each position
and inserts with high fidelity its complement. OK. Now, the error rate in chemical
synthesis is on the order of 1 in 10 or 1 in 100. So if you think of
making DNA with a machine or just taking a
simple set of chemicals and throwing them
together and throwing in an enzyme that
copies them, you get a pretty good copy of DNA. 90% accurate is
pretty good, right? But when you think
of your genome as being 3 billion
bases, 10% error means that 300 million
bases are, in fact, wrong. So obviously, we can't
tolerate that rate or that level of error. So the genome is
read by a polymerase. It reads or synthesizes
a complementary strand at the rate of about 800 or
1,000 nucleotides per second. For very large genomes, then,
that would take a long time. In our body, a human cell
divides approximately once a day. And so if we have 3 billion
bases, you could do the math. Your cell doesn't have a lot
of time to do a lot of things. So for very large genomes,
the replication proceeds in parallel at different places
within a chromosome and all the chromosomes are being
replicated simultaneously. So you have this
process in which you make a copy of the
genome, and then those copies are then distributed
to the daughter cells during cell division. Now, as I said, the
frequency of error in just making a chemical
copy is on the order of 1 in 100 or 1 in 10. There are additional
steps in the process that improve this error rate. So, of course, we can't live-- we wouldn't be able to
survive if our DNA was mutated at that high rate. And so, for instance,
as a machine, as a chemical machine that
adds a nucleotide that's designed to base pair with an
existing nucleotide in a DNA strand, the polymerase
has a selectivity for certain nucleotides given
what base it needs to insert. And so that selectivity
reduces the error rate so that at the
level of polymerase, the error rate becomes 1
in 10,000 or 1 in 100,000. That's still pretty high. So what do we have, or
what's available to improve the fidelity in DNA replication? The polymerase or one of
the associated proteins actually checks which base
got stuck in, and it matches-- it looks to see what base
it should have stuck in from reading the
original strand, and it sees what base
actually got stuck in. And guess what it does? The wrong base got stuck in? It stops, takes that
base out, and replaces it with the correct base. So there's a
proofreading function, and that improves the
fidelity of replication by several more orders
of magnitude. So we're moving in the
right direction, right? So that's problems in just
the inherent machinery that we have in copying DNA
from one strand to another. Now, on top of that, there are
a whole load of other factors which are trying
to mutate that DNA. And a lot of the problems in, or
a lot of the causes of, cancer are environmental
influences that cause mutations in our genome. So, for instance-- now, I fly
a lot of miles every year. If you're in a jet
at 40,000 feet-- my exposure to ionizing
radiation is a little bit more than yours. Or if you spend a lot of time
near sources of radiation, then your DNA is
going to get damaged. There are mechanisms that look
for damaged DNA, where there's a chemical change in the
DNA base in your genome or there could be
places in which a base got deleted or inserted. Anyway, in addition to the
error correction that's inherent in the replication machinery,
there's post-replication error correction processes-- one
called mismatch repair-- and that goes back and further
cleans up the genome sequence. And so the frequency of
error after mismatch repair is now down to a level of 1 and
a billion or 1 in 10 billion. So the evolution of the
system in faithfully copying one genome into two
copies is actually a very finely-honed process,
and it's all through evolution. Now, if you think of this-- I don't think there's
a physical system-- if you copy from tape
or copy onto disk, you don't achieve this
high level of fidelity in those copies. OK. Now, all I told you about was
that information got copied from one strand or
one genome to another and that the
information is in genes. But that in itself
doesn't tell you anything about the machinery. All we know from the
genome is the parts list. All we know is that
there's 24,000 genes. We have a good idea
of the function of many of those genes. We don't know what all of those
genes are or what they do, but we have a good guess
and we are continuing to be able to make
better guesses about what those unknown genes are up to. But in the end, the genes
encode proteins and the proteins are the execution
machinery of a cell. Those are the macromolecules
that actually carry out the work of the cell. And so the question
then is, how do you go from information encoded in
DNA to, somehow, information which is really just function? All you care about is that
different proteins carry out specific functions under
certain conditions. How do you go from a string
of four bases to a function? And so, again, in
biology, you've learned that it is through the
process of transcription, in which a messenger
RNA is made, and then the process
of translation, in which the information
in nucleic acid sequence is then translated into
a protein sequence, that eventually protein
function arises. So it's an odd concept, right? It's like taking a blueprint,
a parts list of a machine and somehow trying to figure
out how that parts list knows what to do. Now, there's an
important concept that you have to keep
in mind when you're trying to study or understand
biological process in that we have several levels
of interactions that are responsible for
encoding the information. First of all, in the
case of DNA sequence, it's the string of bases and
the sequence of that string of bases which is important. But that sequence is hardwired
in that each base is covalently attached to its neighbor. So it's a chemical
bond which holds the information in its place. Now, what happens is
that information is then translated into a protein
sequence, a sequence of amino acids, and they're
co-linear in their information. So the protein sequence, if this
represents a polypeptide chain, is a string of amino
acids also held together by covalent bonds. But that in itself
doesn't encode or it doesn't represent a
function that's carried out. What happens is that the amino
acids in the polypeptide chain, at least in a local
area, will tend to interact with each
other in specified ways, that we understand very well. And what you have
is local occurrences of the polypeptide chain
aggregating, or assembling, or associating with each other
in various stereotypic ways. And so you have local
organization of polypeptide, and then these
elements then further associate with each
other and compact into something that
has, ultimately, a three-dimensional structure. So you have the information
originally hardwired in a sequence, but the
nature of the amino acids that are in that sequence then
causes that protein to fold. It's only when the
protein is folded that it becomes functional. So that's how
information encoded in a genome ultimately
encodes or inherently has in it the
information about what to do. OK. Now, this is a diagram of how
genes are turned on and off. And it's a particular
network diagram. When a phage infects
a bacterial cell, there's one of two
choices that's made. The phage either decides
to integrate itself in the bacterial
genome and hibernate or, if the environmental
conditions are right, i.e. the nutrients are high, then
that phage will replicate and generate new phage. And so this is the
decision circuitry for that in which we have,
in blue, the DNA sequence encoding, in this case, the
genes for different elements, different proteins. In front of these
genes are sequences in which other proteins bind. And those proteins,
whether they're bound in front of a gene or not,
determine whether that gene is turned on or is inactive. And you see from these
arrows that when genes are
expressed and protein made, those genes, in
combination with other genes, can form a local
decision logic circuit which then tells other genes to
either turn on or turn off. So you can look at how genes
interact and understand and codify in very specific
ways the biochemistry of a particular process. And so this is a model. It's a representation of
how we see, in this case, gene expression is controlled. And because it's a model,
we can then predict or simulate what the
outputs would be, whether it's the prediction
of whether the bacterial cell or the virus will replicate
or whether it'll hibernate. So this is, again,
a role that bioengineering plays, because
it is the element in which we can not only analyze
biological processes, but one of the goals, if you remember,
in the design and synthesis aspects of what
engineers do, modeling is an important
component in design. OK. Now, in the last
10 minutes or so, let's now take this little
bit of information-- I walked you through
at a very gross level how DNA is replicated. It's through polymerase. Now let's see what we can
do with that little finding or that little
bit of information and turn it into
something that has a technological application,
which is DNA fingerprinting. So how many watch-- what's that show-- CSI? I think more people watch CSI. Just don't be afraid. Just raise your hand. We know you're all
closet TV freaks. OK. Well, anyway. Just think about DNA
fingerprinting or genotyping as an application. The background is,
this is a chromosome. It has two arms. And just on chromosome
11 in our genomes, there's a location called TH01. It's a very specific
location in the DNA sequence. Now, first of all,
I want everybody to think of two numbers
between 6 and 12. So the first number
we'll call m. And so everybody think of
a number between 6 and 12. Write it down. m and then the number. And then everybody now
think of a second number. We'll call it f. And again, pick a number
at random between 6 and 12. This is how--
actually, 5 and 11. Sorry. We'll do 5 and 11. We'll do TH01. OK. So this is how DNA
fingerprinting works. And it's based on the
discovery that in our genome we have areas in which, in this
case, a four-base sequence, AATG, happens to be replicated,
happens to be copied in tandem. And at this position in
all of our chromosomes, there'll be anywhere
between 5 and 11 copies. Now, we're all diploids, right? We got one copy from mom,
and we got one copy from dad. So on one chromosome
inherited from mom, there's some number of repeats. And on the other copy
inherited from dad, there's another set of repeats. So the number of repeats
of these sequences determines which allele you have at
that locus. Now, these repeats
in our sequences aren't just confined to this
one part in chromosome 11, but, in fact, there's millions
of these kinds of repeats. They range in size from
two-base repeats to six-, eight-base repeats. So this is just a map
of the distribution of some commonly-used
four-base repeats that are used in diagnostics
and in determining whether you or someone are
related or are responsible for, are the origin or source of
a particular piece of DNA. So in all of our
chromosomes, we all have these simple
tandem repeats. And so the question then
in distinguishing just, say, you from somebody else-- so here's the scenario. There was a crime, and
there's a pool of blood. Just say somebody
smashed some glass, grabbed something, got cut,
left some blood on the glass. We analyze the DNA. Now, what we analyze the DNA
for are the alleles at TH01. So how many people picked, I'll
say, 7 as one of their alleles? Raise your hand. My goodness. That's more than
statistic-- well. OK, then. How many picked 10
as the other allele? OK, not as many. Now, how many picked both
7 and 10 as both alleles? So that's even fewer. OK, we have 1, 2, 3, 4, 5, 6, 7. OK. Now, obviously, by random
chance, at one locus, you matched at those
two alleles, right? Now, does that mean that
you committed that crime? And then think of
the world population. How many billion is
the world population? By random chance, if we were
only to look at one locus, how many people
would match, just if we assayed at that one site? OK. So what's the answer? Obviously, if we
look at other sites in our genome for
other repeats, then we can knock down the
number of incidences in which you, by random chance,
would match with that sample. But even with the
piece of DNA that-- just looking at that
locus, we can still knock down the number
of people that match. So how many people-- so 7, 11. What was it? 7, 10, right? 7, 10. Let's just say m
is 7 and f is 10. Now, how many match
at those two alleles? Not just 7 and 10, but that
the 7 was matched with the m. Just one? Raise your hands high. 1, 2, 3, 4, 5. OK, so we knocked out
two people, right? So if we went to your mom
and we went to your dad, got their DNA samples,
then you should have-- well, let's say, your DNA
should have at least one of the alleles
contributed by mom or dad. Now, there would be a
problem if one of the alleles didn't match mom or dad. It means that either mom or dad
wasn't your biological mother or father. And that happens. If you listen-- if you
watch on public TV, there's a series by Henry
Louis Gates, a professor at Harvard, who genotyped the
DNA of eight prominent Afro Americans, including
Oprah Winfrey, and then revealed to them where
their biological origins were. So it's all about DNA testing. So, here, the name
of the game is to determine how many repeats
there are in your genome. And so it's just a
counting exercise. And so now the technology
is, how do we count repeats? How do we find-- we know where to find
the repeats because we can use polymerase to
copy, using strands that are designed to flank
these satellites of repeats. And we use polymerase
to make a copy of one strand and the other. So this is the polymerase
chain reaction. And the number of
repeats is directly correlated with the length of
that copy, the [INAUDIBLE]. And so if we take a piece of
DNA that has these repeats, throw in these primers,
amplify across this region, we should be able to
tell, based on size, whether or not that DNA sample
correlates with one of the
repeats in a model DNA sample. So this is just a sizing
ladder of 5, 6, 7, 8, 9, 10, 11 repeats. And these are samples taken from
different people in which this corresponds to six repeats. This corresponds
to nine repeats. This corresponds
to eight repeats. This corresponds to something
a little bit less than 10. It turns out at TH01
you're missing one base, so that repeat turns out
to be a little bit shorter. So in almost all of these cases,
the alleles are different. You got one from mom
and one from dad. They turn out to be different. In some cases, they
happen to be the same. So you get one band. So it's counting these repeats. Now, of course, we want to
measure at different loci to exclude people who match by
random chance. And so we want to look
at different loci. We also want to increase the
efficiency of this assay, so we multiplex the assay. Instead of looking
at just one repeat, we look at five or six
repeats all in the same assay. And so what we do is, we size-- separate the different repeats
of the different loci on a gel. So it's a medium for
separating DNA by size. And so you see here the
alleles for Penta E, the alleles for this locus,
the alleles for TH01. So just running on one
gel, you can discriminate among five different loci. If you run two gels, you can
discriminate among nine loci. And the standard test
done in DNA forensics looks at
something like 12 loci. The chance that
you will match with a sample left at a crime scene shrinks
multiplicatively with the number of loci that you test at. So if you test at 12 loci, then
you would need a world population on the order of 10 to the 12th
individuals before you'd expect, by random chance, that two
people will match in their DNA. So this is virtually
a foolproof method for establishing identity. I'm going to speed
through the rest of it. If it's a matter
of sizing DNA, then we understand how
that process occurs. It's a matter of
how long the gel is. It's a matter of what the
selectivity of the gel is for separating one
piece of DNA from another. It's a matter of how
you inject the DNA in. And there's an
element of diffusion. So let's hold the thought. We'll go back to
this and quickly finish it up on Tuesday. And then I'll introduce
you to the next topic.
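
The back-of-the-envelope numbers from this lecture can be checked with a short Python sketch. It uses the figures quoted in the lecture: a 3-billion-base genome, roughly 1,000 nucleotides per second per polymerase, about one division per day, and the error rates at each stage of replication. The function names, and the 0.1 per-locus genotype frequency used for the 12-locus match probability, are illustrative assumptions and not values given in the lecture; real forensic calculations use measured allele frequencies for each locus.

```python
import math

GENOME_BASES = 3_000_000_000   # human genome size quoted in the lecture
POLYMERASE_RATE = 1_000        # nucleotides per second (upper figure quoted)
SECONDS_PER_DAY = 24 * 60 * 60


def expected_errors(error_rate, genome_bases=GENOME_BASES):
    """Expected number of wrong bases in one genome copy."""
    return error_rate * genome_bases


def single_polymerase_days(genome_bases=GENOME_BASES, rate=POLYMERASE_RATE):
    """Days one polymerase would need to copy the genome end to end."""
    return genome_bases / rate / SECONDS_PER_DAY


def random_match_probability(genotype_frequencies):
    """Chance that an unrelated person matches at every locus.

    Assumes the loci are inherited independently, so the per-locus
    genotype frequencies simply multiply.
    """
    return math.prod(genotype_frequencies)


if __name__ == "__main__":
    # Error rates quoted at each stage of replication fidelity.
    stages = {
        "chemical copying alone": 1e-1,
        "polymerase selectivity": 1e-5,
        "after mismatch repair": 1e-9,
    }
    for name, rate in stages.items():
        print(f"{name}: about {expected_errors(rate):,.0f} errors per copy")

    days = single_polymerase_days()
    print(f"one polymerase: about {days:.1f} days per genome")
    print(f"at least {math.ceil(days)} working in parallel for a one-day cycle")

    # Hypothetical: each of 12 loci has a genotype shared by 1 person in 10.
    p = random_match_probability([0.1] * 12)
    print(f"12-locus random match probability: about {p:.0e}")
```

The parallel-polymerase figure is only a lower bound under these assumptions. Replication occupies just part of the cell cycle, so the real number of simultaneous replication sites is considerably larger.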