(light bells ringing) (light piano) - [Narrator] We are the paradoxical ape. Bipedal, naked, large-brained, long the master of fire,
tools and language, but still trying to understand ourselves. Aware that death is inevitable,
yet filled with optimism. We grow up slowly. We hand down knowledge. We empathize and deceive. We shape the future from our shared understanding of the past. CARTA brings together experts
from diverse disciplines to exchange insights on who
we are and how we got here. An exploration made possible by the generosity of humans like you. (light electronic music)
- Thank you for this opportunity
to present some of my work. I'm going to talk about
archaic introgression and what we have learned
about introgression from these archaic hominins
in the history of non-Africans and more recent work that talks about archaic introgression
in African populations. So analyses of genetic data
have shown the broad outlines of the evolution of modern humans. We know that modern
humans evolved in Africa and then there was this
Out of Africa Exodus. What has happened in the last
10 years is this revolution in ancient DNA has given us
access to genome sequences from two archaic hominins, Neanderthals and their sister
species, the Denisovans. And by comparing these genome sequences to modern human genomes, we are learning about the interactions between archaic and
modern human populations. For example, we now know
that all non-Africans today trace a small proportion
of their genetic ancestry to the Neanderthals. On the other hand, Oceanian
populations, in addition to their Neanderthal ancestry, also trace some ancestry to the Denisovans. Now because these
introgression events introduced a large number of mutations
within a short span of time into the human gene pool,
there has been the hypothesis that these introgression
events could have had a major impact on human biology. So this has motivated a number of efforts to better understand and connect
these introgression events to their impact on specific traits. So here's an example of an early effort that I was involved in
where we were looking for genetic variants that predispose Mexican-Americans
to type 2 diabetes, and this analysis revealed
a novel genetic variant and what we found was this variant had a unique geographic distribution where the risk variant was
essentially absent in Africa. It was present at low
frequencies outside of Africa, but it was particularly present at higher frequencies
in most of the Americas and when we compared the
mutations that were present on this genetic variant
to the Neanderthal genome, we found that this was
likely to have introgressed into the modern human gene pool from the Neanderthal introgression event. So there have been several
other such notable examples of specific introgressed variants that have affected biology. For example, introgression has been documented in the STAT2 gene, which is particularly important
in immune-related function. Another extremely exciting
discovery was the EPAS1 gene. So this is a gene where specific mutations that have been found in
Tibetans have been shown to be important in adapting
to higher altitude living and a recent analysis
showed that this mutation that allows or contributes to higher altitude
adaptation is introgressed from the Denisovan population. So to better understand the
contribution of introgressed DNA to specific biological phenotypes, we'd like to go from understanding which genes might be introgressed to a more genome-wide assessment. So this has motivated
efforts to build maps of introgressed DNA. What does that mean? So if you look at a
population that is descended from introgression between a
modern human and an archaic, the genome of this population
has a mosaic structure where there are parts of the genome that are inherited from
the archaic population and others that are inherited from the modern human population and because of the way recombination
chops up the haplotypes that are passed down from
one generation to the other, if you look at these genome sequences, the lengths of these introgressed
segments are characteristic of how long ago
the introgression occurred. So how do we go about
building maps of archaic DNA? So we use statistical models which compare the genome sequence
that we are interested in to an archaic genome as well
as to a modern human genome. And by comparing the genetic variation that is shared between the test genome, the archaic genome and
the modern human genome, we can essentially build
these maps that tell us what regions of this test
genome trace their ancestry to the archaic population. So using these maps we
can begin to understand at a very fine scale how
archaic ancestry is distributed both across populations
and across the genome. For example, we looked
at a very diverse set of modern human individuals
today and we built maps of Neanderthal DNA in these populations, and what we show is that we
recover the signal where there's an enrichment
of Neanderthal DNA in populations residing outside of Africa. There's interesting variation
within these populations and I think Josh Akey's talk will touch upon the factors behind this variation. We can also build maps for Denisovan DNA and we see an enrichment
of Denisovan introgression in Oceanian populations,
but we also see populations in East Asia which have small amounts of Denisovan introgressed material. Beyond looking genome-wide
across populations, we can ask how does introgressed
DNA vary along the genomes and what we can see is that
there is a wide variation in how much introgressed
material a person carries as we move along their chromosomes. For example, there are
places in the genome where there is an enrichment
of introgressed DNA. In other words where a lot
of people present today carry introgressed DNA variants. For example, EPAS1, the
example that we started out with, has a high proportion
of introgressed variants when we look at Tibetan populations and these are variants introgressed from the Denisovan lineage. Here's another example. So this is a gene that's
a particular outlier. It lies in this locus
containing the basonuclin gene, a gene that is known to be
involved in skin-related function, and at this gene, in present-day Europeans, about half of them carry
the Neanderthal variant, compared to about 50,000 years ago, when only about two percent of them carried the Neanderthal sequence. So we can try to figure out
what might have resulted in an increase in the Neanderthal
frequency at this gene, likely because this had
some adaptive benefit. However it's not the case that all introgressed Neanderthal variants are necessarily adaptive. Indeed we think that genome-wide, most introgressed Neanderthal or Denisovan DNA is deleterious. For example, there are
large regions of the genome which we call deserts of archaic ancestry where no present-day human carries either Neanderthal or Denisovan DNA. So these are particularly interesting because these are places in the genome which seem to be
resistant to introgression and potentially they harbor mutations that are responsible for
the modern human phenotype. So here is one particularly
interesting example of a desert. This is a desert which is resistant to both Neanderthal and
Denisovan introgression and it overlaps a gene called FOXP2. So FOXP2 is this famous
gene that has been shown to be involved and important
in speech and language. So now moving beyond non-Africans, we'd like to switch our attention to introgression in Africa. So the reason why our
understanding of introgression outside of Africa has been so advanced is because of the availability
of whole genome sequences from archaic populations
like the Neanderthals and the Denisovans, but
once we turn our attention to Africa, the situation
becomes a lot less clear. The reason is we don't have ancient DNA from archaic hominin groups in Africa. It would be wonderful to have them, but the technology hasn't
yet been successful in extracting ancient DNA. So what we decided to do was to look for signals of introgression in Africa without needing access to
ancient archaic hominin genomes. So to do this, we adopted
two complementary approaches. So one is an approach that
looks at genome-wide data and it counts up the
different classes of mutations that a person carries along their genomes and it turns out that these classes of
mutations are indicative or characteristic of the history
of archaic introgression. The second line of evidence
involves building these maps, but doing so without recourse to an archaic reference genome. So let's talk about the
first line of evidence. So the statistical summary of the data we are gonna be looking
at is something called the site frequency spectrum. Briefly, the way to think of the site frequency
spectrum is we are looking at positions along a person's
genome and we are counting up what kind of mutations
occur at a given position. So here we have genomes from Africa, we have the Neanderthal
genome and we have the genome from a chimpanzee. We are going to focus on those positions where there's a difference
in the state carried by the Neanderthal and the chimpanzee and at those positions, we're gonna see what
count of African genomes carry a state that
matches the Neanderthal. So for example at this position, the Neanderthal does
not match the chimpanzee and when you look at the Africans, they have two copies of the mutation that matches the Neanderthal. When you look at this position, the Neanderthal again does
not match the chimpanzee and the Africans carry
three copies of the mutation that matches the Neanderthal. So we go along this genome and tabulate this statistical summary which we call the conditional
site frequency spectrum. Now why do we do this? It turns out that there is
some population genetic theory that tells us what we should expect to see in this statistical
summary of the data. For example, if Africans
and Neanderthals split and never interbred, then this summary of the
data is expected to be uniformly distributed across all mutational classes. So now what do we see in the data? When we look at the West
African population, the Yoruba, the conditional site frequency spectrum, which here is the blue
dots, is far from uniform. It has this U-shaped pattern. We look at other West African populations and we find the same
characteristic U-shaped pattern. So in other words, at least a simple model where Africans and Neanderthals split and went their own way
does not fit the data. We then asked, could this be explained by other models of human history? For example, we have a
fairly good understanding of the relationship between
Africans and archaic populations, and could this potentially explain the pattern? We find that, again,
current models of human history do not offer a good enough fit
to the data that we observe. So then we explored additional models that are more complicated which involve different
levels of introgression into the African population. For example, we asked whether there was structure within Africa. This is quite possible,
given all the evidence about deep structure within
different African populations. Is it possible that there was introgression from a Neanderthal-related
population into the Africans? Or is it possible that there
was a super archaic population that introgressed into Africa? And for each of these models
we tried to figure out which of them best explains our signal and the model that does explain the signal of the conditional site
frequency spectrum is one where there was introgression
into the African population from a super archaic
population that split off prior to the split between
Neanderthals and modern humans. So this is neither
Neanderthal nor Denisovan, and so we termed this a
ghost archaic population, and the key thing to remember
here is that this population is actually quite deeply diverged
from modern humans, more so than the Neanderthals
and the Denisovans. Now we can be more quantitative
about this analysis and we can try to figure out when did this population split off, when did it come back and interbreed and what proportion of archaic ancestry is present in Africans today? And so we did further analyses
and these are estimates with quite wide uncertainties, but what we estimate is a
date of about 600,000 years for the split time and
an interbreeding time of around 43,000 years. So this is still a fairly
recent interbreeding event in the history of the African population. Further, we estimate a fairly
substantial contribution from this archaic ghost
lineage of about 11%, compared to the Neanderthal and the Denisovan introgression events, which are of the order
of a couple of percent. So we tried to have a
complementary line of evidence to convince ourselves
that this was plausible and to do this, we went back and tried to extract segments of DNA
in the African population that could potentially arise
from this ghost archaic. So to do this, we had to
have a statistical model which does not require
an archaic population because we don't have
this reference genome. So we validated this model, we showed that it works
under different settings, and then we applied it to
the West African Yoruba and we got these segments of
archaic DNA, and then we went back and asked: are these closely related to one of the genomes that
we have sequence data from? So we compared the introgressed
DNA segments in Africa to hunter-gatherer genomes,
genomes from Pygmy populations. So these are populations
which have been shown to have complex interactions with the West African
populations that we've analyzed and finally to known archaic genomes like Neanderthals and Denisovans and so what we're showing here
is a measure of divergence. So on the left you are closely related, on the right you are more distantly related, and what we find is that
these archaic segments, compared to other non-archaic segments, are not particularly closely related to any of these populations
that we have genomes from. So what is this population? We don't know and so this
is one of the questions that we'd like to be able
to answer going forward. So just to summarize,
there is clear evidence that there is archaic introgression within and outside Africa and we have an increasing complexity in the picture of interactions between modern humans and archaics. So John Hawks also talked
about this preprint that came out last week
from Alan Rogers' group, which showed that there are additional archaic introgression
events in human history and so a big question for us
is to have a holistic picture which puts together these
different introgression events and asks whether some of these are coming from the same population or whether these are distinct archaic groups. This is a challenging task
and to be able to do this we need to analyze diverse modern and ancient genomes from Africa. We don't have ancient archaic genomes, but we do have ancient genomes from other modern human populations and we need to do this in the context of these more realistic models of history which take into account
deep introgression events and finally, the statistical
models that we've talked about are making certain assumptions
which are fairly simplistic and those need to be extended
to handle this complexity. With that, I'd like to
acknowledge my student Arun, who's done a lot of this work on ghost archaic introgression, as well as our funding, and I'd be happy to take questions. (audience applauding) (light electronic music)
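
The conditional site frequency spectrum described in the talk can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs, not the speaker's actual pipeline: the function name `conditional_sfs`, the tuple layout of each site, and the toy alleles are all hypothetical. It keeps only positions where the Neanderthal state differs from the chimpanzee state, and at each of those it counts how many African copies match the Neanderthal.

```python
from collections import Counter


def conditional_sfs(sites, n_african):
    """Tabulate a conditional site frequency spectrum.

    sites: iterable of (chimp_allele, neanderthal_allele, african_alleles),
           where african_alleles holds n_african sampled African alleles.
           This input layout is an assumption made for illustration.
    Returns a list whose entry k is the number of conditioned sites at which
    exactly k African copies match the Neanderthal allele.
    """
    counts = Counter()
    for chimp, neanderthal, africans in sites:
        if len(africans) != n_african:
            raise ValueError("every site needs n_african alleles")
        # Condition on sites where the Neanderthal differs from the chimpanzee.
        if neanderthal == chimp:
            continue
        # Count African copies carrying the state that matches the Neanderthal.
        k = sum(1 for allele in africans if allele == neanderthal)
        counts[k] += 1
    return [counts[k] for k in range(n_african + 1)]


# Toy data (hypothetical): four African copies per site.
toy_sites = [
    ("A", "G", ["G", "G", "A", "A"]),  # two copies match the Neanderthal
    ("C", "T", ["T", "T", "T", "C"]),  # three copies match the Neanderthal
    ("A", "A", ["A", "A", "A", "A"]),  # skipped: Neanderthal matches chimpanzee
]
spectrum = conditional_sfs(toy_sites, n_african=4)
print(spectrum)
```

On the toy data this prints `[0, 0, 1, 1, 0]`: one site in the two-copy class and one in the three-copy class, mirroring the two positions walked through in the talk. Under the simple split-without-interbreeding model mentioned there, the spectrum is expected to be roughly uniform across classes, so a U-shape is the kind of departure the talk describes.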