I'm very excited to
discuss my favorite topic in the whole world, which is high-throughput CRISPR screens, and so I would encourage
you guys to interrupt as often as you'd like to ask questions. I have a lot of material, but I'm happy to divert
the flow to talk more about what you guys would be interested in. So, Luke set up a lot of
important topics, and in general, what we're
going to try to do now is to take some of those technologies that have been engineered
to work fantastically well in individual instances,
and now scale them up to do genome-scale screens, and so, first, I'm just going to give some general motivation
for why one might want to do a genetic screen. Probably, many of you have either done one or thought about doing one, but there's lots of
reasons you might consider doing this type of thing. I'll describe, partly, by
way of historical context, but partly by way of
describing how exactly it is that the Cas9 system has been so powerful at the functional genomics scale, in comparison to some RNAi
technologies that were used early on in functional genomics, and in my own lab, we've
done some direct comparison between RNAi and CRISPR that actually is somewhat illuminating
in terms of understanding the difference between
a knock down phenotype and a knock out phenotype, and what you can do with that information. I'll describe a number
of different screens that have been done by a
number of different labs. I'm gonna focus some on the
screens that we've done, just because I can talk
in more detail about them, but I'll describe some
particular applications in terms of identifying the
targets of different drugs and phenotypes for essential genes, which are very easy
phenotypes to screen for. I'm gonna describe a
little bit of a continuation of what Luke set up,
in using CRISPRi and CRISPRa to do genome-scale screens,
and really elegant work that's been led by Luke and Max and others in the Weissman lab. And throughout, I'm gonna
sort of try to give, actually, some tips and things that
are just sort of practical, things that we've learned
in the process of doing a lot of screens, as well as
different design considerations that you might consider
if you're thinking about setting up one of these screens, and then, the last bit,
I'm gonna describe two relatively new applications
of CRISPR screening that we've been exploring in my lab, but have also been used
in a number of other labs. In particular, genetic
interaction screens, where you're knocking out
pairs of genes to look at synergies between
different pairs of genes, and a scaling of the
technology that Luke described at the very end of his talk,
which is what can you do if you can mutagenize
a large stretch of DNA. And then, at the end,
I'll just describe some useful things for what you
do when you have the results from a screen. How can you actually use that information? So, just one really basic slide, I think, to kind of understand where
screens have come from. Probably, this may be too
basic for some of you, but before the sequencing of the genome, in the olden days, and
actually still today, to some extent, the way you
would do screens would be using forward genetics, and
here, you take an organism. You mutagenize it. You select for some kind of a phenotype that you're interested in,
say resistance to a drug, and then you find the
gene that's responsible for that phenotype,
usually by complementation or something like that. And this was, for a very long time, the way that people did
genetics in an unbiased fashion. Of course, you can also
study heritable traits and then find genes that way, but this is the way that
people have been doing genetics for a long time. With the sequencing of the genome, a number of different technologies
enabled reverse genetics, which is now that you know
the sequence of those genes, you can either use, for example, an antisense oligonucleotide or RNAi, which was the technology of
choice for about 10 or 15 years, where you can specifically
encode a double stranded RNA, a short, I should say, 21
nucleotide double stranded RNA that effectively incorporates
into the cell's endogenous RNAi processing machinery,
which, in the end, causes degradation of
a target RNA molecule, and so this, I should say, has been used for quite a long time. Still actually has some uses,
as I'll describe to you, but the key idea here is
that you're introducing whether by a chemically
synthesized double stranded short RNA molecule, or a
lenti-virally encoded hairpin that, in the end, gives
you that same product. A double stranded RNA that causes degradation of its target. So, now that we have the ability to knock down specific genes
and the sequence of a genome, you can do genome-scale reverse genetics, and that allows you to potentially find all the genes involved
in your favorite process, so instead of just mutagenizing
a plate of worms or bacteria and hoping that you find
an interesting mutant that gives you the phenotype you'd like, you can really systematically scan through every possible gene, knock
it out or knock it down, and ask what is the
phenotype of that gene. And so, I say that's
easier to saturate because, if you wanted to actually
saturate the entire genome doing a chemical mutagenesis screen, you would need an enormous
vat of worms or bacteria, and you know, thousands of person-years to actually potentially get
all of the available mutants that might give you all the phenotypes, but if you know the sequence, you can actually go in and just surgically knock them down or out one at a time. So, this is a much more
efficient way of associating the gene with the phenotype, and now that we have not just
the sequence of the genes but actually all these
regulatory elements in the genome that Luke was describing, we can actually approach
a systematic annotation of the genome, both for
coding and non-coding areas. So, there's a lot of
information on this slide, but I wanna just describe
the two general strategies that have been used for
doing genome-scale screens, so on the right, you'll see
arrayed library screening. This entire slide is geared
toward CRISPR screening, but a very similar overall
workflow was used for RNAi for a long time, and the
key differences are that, in this case, you have one
well, one gene knock down, and in this case, you
have a pooled library, where you have lenti-viral, usually, plasmids encoding either
an shRNA or, later, a CRISPR guide RNA, and these are infected at low multiplicity, so that
each cell gets one virus in a pool, and then you can
subject this pooled population to selection, and so, in
the end, in both cases, you end up with some selection strategy, and then, you know,
informatics to separate the hits from the non-hits
in these types of screens. One of the nice things
about an arrayed screen is that you can potentially get much more high-content imaging, so you can put these plates,
after you knock down one gene in a well, into a microscope, take an image of what the organelles
look like, for example, and really get an enormous
amount of information for what's actually going
on in that one well. You can't get, typically,
that level of information in a pooled screen, but these
have a number of advantages. They're much, much easier
and cheaper to conduct, so I'll just tell you right now, everything I'm gonna talk
about from here on out is in pooled format, because
we love the pooled format, but I wanted to set this up because a lot of screens were done this way, and it's still actually
possible to buy, even now, CRISPR guides in arrayed libraries, because there are some
applications where you really want an arrayed one gene at a time,
one well at a time screen. Okay, so, just to kinda go back to the history of these types of screens, so for a long time, screens were done, RNAi screens were done
in an arrayed format, and typically, the workflow
was sort of summarized in that slide before, but essentially, you have a stock plate, which
has one siRNA in each well. You end up transfecting
that into the corresponding well of cells through
one or another strategy, and then you expose those
plates to some kind of a drug, for example, so in this study
from Michael White's lab, they were looking at modifiers
of a chemotherapeutic drug called paclitaxel, and they
could find a number of genes, including proteasome components and
different microtubule and kinetochore-binding proteins, that were modifiers of the toxic
activity of this cancer drug, and so, the utility of this approach was can we find genetic modifiers
that might be useful in the setting of cancer, where you use a drug like paclitaxel, which is a front line
care chemotherapeutic in breast cancers, for example. Sometimes, that drug fails. Could you find another target
that would actually improve the efficacy of that drug? So, in some cases, these worked. Here's a pooled screen. Now, each shRNA is encoded
in a single lenti-virus. These are infected into cells, and the screen here is to find, in the context of a KRAS mutant cancer, genes which, when you knock them down, further sensitize to
another kinase inhibitor, a MEK inhibitor, so MEK
inhibitors have been this, you know, powerful therapeutic
that has been used in some cancers. They almost always fail because the cancer finds another way around it, and so this was a synthetic lethal screen to try to find combinations
of second targets that would help a MEK inhibitor to kill KRAS mutant cancer
cells, which, in general, are thought to be un-druggable, so KRAS is this powerful oncogene that's mutated in a huge number of cancers and is thought to be difficult
or impossible to drug, so if you could find other
targets that would work in a KRAS cancer, you
might have a good way of killing these cells, and so, in this pooled screen,
they infected an entire genome-wide library, knocking
down one gene at a time. They treated with this
MEK inhibitor at a dose that would kill some of the cells, and then looked for things
that, when you knock them down, actually further allow
killing by the MEK inhibitor. So, this is what they found. The MEK inhibitor is called
selumetinib, I think, and so what they found is
that, in the presence of a control shRNA, the cancers still grow, but if you knock down BCL-XL, and you use this inhibitor together, now you have synergistic
killing of these cancer cells, so this is to show that,
whether you're doing an arrayed RNAi screen
or a pooled RNAi screen, the technology actually works, so I'll say, there have
been enormous improvements in CRISPR screens. Some work from Luke and work
from a number of other labs that I'll describe, where it's
possible to really improve the specificity of these screens. Having said that, it's
possible to do an RNAi screen and actually get useful results, and I'll describe a little
bit more about this. So, this was taken further,
and they actually took tumors in mice and treated
with that same combination of now an avid drug
that targets BCL-XL and the MEK inhibitor, and could actually show tumor regression, so just to show that you
can take an in vitro screen, use those findings to
actually translate it to anti-tumor therapy. Okay, so, you know, while there are
shining examples of that type in the literature, where
you could find something really useful, there were also
these truly alarming examples of how terrible RNAi can be, and this is one of the
reasons why people have been so excited about how dramatically improved the CRISPR screens are. This is sort of a classic
example in the literature of three different labs
who all did an HIV screen. So, ostensibly, this is a
virtually identical screen. I believe they used the same cell type. It was a screen for
resistance to HIV infection. I think they used
different RNAi libraries. There were three different labs, but what you see here is that
the overlap in their findings was actually probably
lower than you would expect by pure chance, which is
actually sort of impossible if you think about it, but it's one of those cautionary tales that is actually often cited in reviews as an example of how bad RNAi can be. And just to be completely
fair to these folks, it's not like they're,
you know, terrible labs that don't know what
the hell they're doing. It's that there are fundamental problems that actually need to
be considered, I think, in any screen, not just RNAi screens, but CRISPR screens as well. RNAi is subject to
massive off-target effects that are typically not seen
in most CRISPR screens, although I'll give some
caveats for that in a minute, so Luke showed this amazing figure, where you knock down GFP with CRISPRi, and that's the only
gene in the whole genome that gets changed. You would never, ever see
that in an RNAi screen, because if you put an
RNAi species in a cell, of course, depending on
how much shRNA or siRNA you express, hundreds of genes in addition to the one
you're intending to target also change, so off-target
effects are a major problem. With siRNA or shRNA, reagent
heterogeneity is a major issue. It's still an issue
with CRISPR guide RNAs, so we still don't know what is the perfect sequence of a guide RNA, but this is an even
bigger issue with shRNAs. It's actually not that easy to find a good siRNA or an shRNA. At the same time, differences
in methods for calling hits, cell lines, and the
protocols that are done in different people's labs will
always be a source of error, and so that should be kept in mind. I think, even now with, you
know, the latest and greatest CRISPR libraries and people
who are using, you know, what ostensibly should be
ATCC-derived K562 cells, you sometimes only see 60%
overlap between two labs that have done a nearly identical screen, so there are some factors
that probably won't be solved in the near future that account for differences between labs. Okay, so enter CRISPR. So, by comparison to an
shRNA screen or RNAi screen, where you're knocking down an mRNA transcript, with the initial
CRISPR screens that were done, and of course, this is
now complemented with the CRISPRi-type screens that Luke has done. You can take, for example, active Cas9, create indels in genes
and effectively cause functional gene deletion, so
there's a huge difference now. You're causing an effective
gene deletion at the DNA level versus a knock down of
transcription at the RNA level, and I should say, a lot of
the things that I'm describing in the context of RNAi are now applicable and actually even better
with the CRISPRi system, so you might just
substitute, in some cases, shRNA with CRISPRi when
you're thinking about how you might conceivably
compare a knock down phenotype with a knock out phenotype. Okay, so one actually useful figure. This is, I think, from Feng Zhang's lab, but basically, they're showing
the type of differences in gene expression you might
see in a knock out system versus a knock down system, so even though there are these
nice examples in the literature of when you target a gene
with active Cas9 and get nearly complete deletion, in fact, the amount of cutting and
editing of alleles you get is quite heterogeneous, so you know, at best, you're gonna get
genes that were never edited, some heterozygotes, and you
know, some complete deletions. This is, in fact, much more complicated in some cancer cell lines where you have more than two alleles. You may have a real spectrum of different editing phenotypes, and that accounts for
some of the heterogeneity. If you saw in the slide that Luke showed from Bruce Conklin's induced nuclease, there's a lot more
heterogeneity in that cell type than there is in the CRISPRi system, and that's partly because
you can't really control what happens when you induce editing with active Cas9. This has important implications
in a screen context, because you're now looking at
a pooled population of cells and you have to think very carefully about what do I call a hit, if my
population of 1,000 cells, each of which knocks down one gene, has an array of different
alleles being knocked down. By contrast, RNAi or
CRISPRi, while you may have not a complete knock
down, would at least have a more uniform perturbation,
so in principle, you could get an allelic series, so a uniform perturbation and different levels of knock down. This can actually be
useful in some contexts. So, this is sort of one of the examples, and from one of the early screen papers, where you can see, here's
an shRNA knock down, where you have different
levels of knock down, or a CRISPR knock out. This looks amazing, but I should say, this is a single allele GFP. Turns out to be a very
easy gene to knock out, because even frame shifts in this gene cause effective loss of fluorescence, but what you'll see,
even in this, you know, seemingly near-perfect scenario, is this population of cells that actually weren't edited at all, so this is something
that you need to consider in the context of a genome-scale screen. You will have some cells
that are not edited. Okay, I should also say
that both of these systems suffer from off-targets,
and I thought I would just briefly kind of compare and contrast some of the useful properties
of each of these systems, so CRISPR, in general, has, you know, greatly fewer off-targets. I have an asterisk there
because I'm gonna show you that, at least for active Cas9,
there are some very significant off-target effects that
should be considered, especially in the context of a screen, namely that, if you have an
active nuclease in a cell, it can cause DNA damage,
and that DNA damage actually has a measurable
growth phenotype, and potentially can be a
modifier of other phenotypes that you're interested in studying. Having said that, if
you look, for example, at what Luke had shown for
that single GFP knock down, you know, it tends to be
much, much more specific than RNAi, certainly. It's possible to completely delete genes. It's much easier to find
effective guide RNAs. Not only can you delete genes, but, as Luke has elegantly shown, you can also activate and
repress gene transcription. RNAi, well, has some
limitations, as I pointed out. One of the really nice
things is that there's a lot less engineering required, so one of the unfortunate
kind of pains in the ass about doing a CRISPR
screen is you have to get all of those components into the cells, so you have to get Cas9. You have to express it. You have to get a guide RNA. You have to express it
at high enough levels. You know, there's a lot
of reasons why you'd, obviously, want to do that, but I will say one of the nice things
about an shRNA screen is that you actually only have
to put in that one component, and you know, you still have to worry about expression levels,
but it's less engineering. Because they don't
actually induce DNA damage, even though they have their
own toxicity due to like saturation of the micro-RNA machinery and the off-target effects,
there's less nuclease toxicity in an shRNA screen. Perturbation can be more uniform. It's possible to target
specific transcripts, although, of course, it does
have some significant increase in off-target effects, so these are some pluses
and minuses to keep in mind. Some of these apply to the CRISPRi system, and it's less toxic than active Cas9 for sure, but actually, many of these
disadvantages are not present in CRISPRi, so I think increasingly, people will be sort of
shutting down shRNA screens, although, because of this type of thing, in some cases it's actually really nice to be able to pop in an
shRNA and not have to worry about getting Cas9 and a guide in. Okay, so in some of the
early work that I was doing together with Martin Kampmann
in Jonathan Weissman's lab, we actually were, at that time,
building what we had thought was a really nicely improved RNAi system, and so I'm gonna just
briefly describe this, because it actually fit
into what we then used to do the CRISPR screens, but essentially, we
got around this problem of low on-target efficacy
and the huge problem of off-target effects by just making a massively redundant library, so instead of the two or
three shRNAs or siRNAs that people had used in the past, we just put, you know, 25 shRNAs, and if you use that many shRNAs, you're bound to have some that will work, and you have enough of them that, even if there's an
off-target for one of them, you can ignore it because you look at the consensus phenotype of those shRNAs. And this greatly increased
our ability to statistically significantly pull out
hits in RNAi screens, and of course, it's possible because of the two key technologies
that enable, really, a lot of the stuff that we're doing now, the ability to synthesize
these pooled libraries in a complex micro-array
synthesis platform, and the ability to sequence
those libraries at the end. So, our platform that we
built together with Agilent, of course, builds on
work from Steve Elledge, Greg Hannon, George
Church, and many others who had used microarray-based synthesis to not just make one shRNA
(and the exact same thing can be used for guide
RNAs), but to synthesize now tens or hundreds of thousands
of different guide RNAs or shRNAs in one shot,
and you can clone these because the oligos are
high enough fidelity directly into a lenti-viral
vector in about a week, plus or minus. It allows you to do very rapid
screening in pooled format, so instead of the arrayed format, which I should have
mentioned also comes with an enormous cost and just time to actually screen
through those libraries, these pooled screens are quite fast, and importantly, for our
purposes, the platform is very highly adaptable,
so when, you know, after we had spent all this
time building what we thought was a great shRNA platform and the CRISPR revolution happened, we could just take a
guide RNA and drop it into that exact same lenti-viral vector, and effectively use the
same workflow to do now CRISPR screens instead
of these shRNA screens. And because we can also change
the algorithms that we use to design these shRNAs or guide RNAs, and because we've tweaked the vector to
make double guide RNA expression vectors, this
platform has been quite valuable for making these changes. So, the general workflow
for the pooled screens that I'm gonna spend most
of the time talking about is we build a pooled lenti-viral library that code the guide RNAs. We infect them into
cells so that each cell gets one guide RNA. These are typically
already expressing Cas9. We then do a pooled screen, and I'll describe a number
of different pooled screens that you can do but, you
know, a standard option that we might do is to look
for modifiers of a drug. I showed you two examples
of different cancer drugs that have been used, so we
do a lot of these in my lab, but a number of other labs have done this for a long time, to look for
genetic modifiers of a drug. And in the end, you
sequence these populations, and you compare the treated population with the untreated population. You count the abundance of the guide RNAs, and you can call hits. In our libraries, we use a huge
number of negative controls so you can actually call hits. I think there's another lecture on statistics of calling
hits and things like that, so I'm not gonna spend too
much time actually describing how the hits are called,
but suffice it to say, for most of the libraries that we use, we have a huge number of negative controls that allow us to carefully define what no phenotype looks like, and then you could compare the phenotypes of the guide RNAs targeting
your favorite gene to that population and very
clearly call hits from non-hits. At the end of the talk, I'm gonna describe some further work that we
have done to take the output of this type of screen and actually do genetic interaction measurements, so at the end of a screen,
you might have 100 genes or 500 genes, and you
would love to know, well, what do these things actually do, so we've come up with
a strategy to repurpose tools that have been built in yeast to do systematic pair-wise
interaction maps, and actually understand gene
function in pooled format. Now, I'll describe that toward the end. So, some typical pooled
screens that you might do and considerations, so as
I mentioned, a standard kind of workflow is to introduce a library, take your favorite drug, for example, and then look for modifiers of cell death. This tends to be incredibly easy to do, and so a lot of screens that
people do out of the box are just growth-based screens, where you're looking for
things that either increase or decrease the growth of
a cancer cell by itself or modify the activity of
a drug that'll allow you to do these sorts of screens. A number of other strategies
have also been employed, so you can, for example,
follow the activity of your favorite transcription factor by coupling it to either
a fluorescent reporter, and then you do flow
cytometry based screens, or you can couple it to
expression of a chemical, a drug resistance marker, and then do viability based screens that are reporting on the
activity of this promoter, so your favorite stress
response, for example, you could couple to
these different promoters and actually do sorting based screens based upon those pathways, and so, I think one kind of
practical tip that's useful is, if you're doing these types of
screens, it's useful to have a counter screen where you
just look for modifiers of a general transcription
or translation factor, so in addition to this, you might consider having
just a constitutive promoter that drives mCherry expression, and then you could subtract
everything that affects stress transcription or
translation of mCherry as what you might call
background in that screen, and then only look at things
that specifically modify the expression of GFP, so this is probably one useful control to consider if you're thinking about
these types of pooled screens. If you're doing these
sort of death screens, we, in my lab, for example,
have found about 50% killing works quite well. You can find modifiers in both directions, but depending on the type of screen that you're thinking about doing, sometimes people will use a 90% killing, the sort of apocalyptic death model, where everything dies
except the five genes that are absolutely
critical for a process. That can be a really useful
strategy for, you know, getting rid of a lot of noise. On the other hand, you can
lose a lot of things that way, so other people will screen at a 10% kill, and then you have much more dynamic range for things that sensitize to the drug, so depending on the particular paradigm that you're interested in, whether you're looking for
synthetic lethal combinations, whether you're looking
for critical transporters for the entry of a drug, for example, different levels of
killing might be useful for those types of screens. Okay, I'm just gonna
briefly go over this slide, because I still need to make one point, which is that, depending
on the type of cell line that you're using, a very
important consideration is just the promoter or the vector system you're using to express an
shRNA, Cas9, or a guide RNA. We have collected, from
ourselves and a number of other folks, including Luke, who's been incredibly
generous with sending us some of these expression vectors, just an array of different
promoters that have been quite useful because cell
types differ enormously in the extent to which
they will reliably express a guide RNA or a Cas9, and
so one of the biggest issues in actually conducting these screens is establishing a cell line
that reliably expresses Cas9, and so, you know, I would suggest, if you're actually
thinking about doing this, talk to someone like Luke, who has enormous amount of
experience in doing this, or myself or a number of other people, and you know, potentially
just test a number of different constructs, because actually, trial and error can
actually help enormously in finding an effective system for just expressing Cas9 and a guide. Okay, so these are sort of
practical things to keep in mind. One other practical
consideration I thought I would just mention is the
importance of maintaining library representation,
so these pooled libraries, you know, for the shRNAs
we had 25 shRNAs per gene. Fortunately, for CRISPR guide RNAs, it's possible to really shrink
the size of those libraries, 'cause it's much, much easier to find an effective guide RNA, and with the improved algorithms
that Luke has described, from Max's work, for example, you can now shrink the library
to maybe five guide RNAs. Some people are even trying one guide RNA, which I think is kind of
ludicrous, but in principle, it is so much easier to
find effective guide RNAs that you can really shrink
the size of these libraries. Having said that, it's
actually quite important to maintain effective
representation of these libraries, so when we do a screen, we
typically try to maintain 1,000 fold representation. This is the noise you see
between two replicates at 1,000 fold. Actually, not too bad,
but if it were possible to improve that coverage
to like 50,000 fold, you can see the noise goes almost to zero, so it's something to keep in mind that the amount of coverage you get
actually can really influence the noise in these types of screens, even with a really
improved CRISPR library. Okay, so I'm now gonna describe some of the genome-scale
screens that have been done, and a few of the first ones with CRISPR were using active Cas9
and were conducted by David Sabatini's lab and Feng Zhang's lab, and what they basically did
was to take an oligo design, which, in effect, incorporated
some of the same rules that Luke described in terms of optimizing guide RNA properties. A lot of this stuff has been published, and effectively targeted
what were conserved exons, and they thought to go as
five prime as they could, with the idea that they
wanted to make a guide library that would work on as many
different transcript variants as they could find, and if
you put a stop codon in as early as possible in the gene, you have the best chance
of actually effectively deleting that gene's function. So, in general, this worked pretty well. This is sort of a V1 library, and I'll show you some
other examples where some of these rules have been
tweaked and improved upon, but the idea was you take an oligonucleotide synthesis platform, you generate a genome-scale library. I think, in this case, they did maybe three to six guides per gene, and then the workflow is
basically exactly the same as what I described. You have a lenti-viral library. Each vector codes for one guide. You infect it into cells,
and then you can do a screen. And so, one of the first
screens that they did was actually just a growth screen, so this is effectively called a
screen for gene essentiality. You wanna find all the genes
that are required for growth or viability in a cell. One of the interesting things
you can see in this plot, so here, you're seeing the ratio of cells comparing day three to day 14, so this is, you know, growth
over a two week period, and what you can see is that you now, after 14 days of growth, start to see this population
of cells that are lower in abundance than they were at day three, and that is because
they're dying, of course, so you knocked out a gene that was important for their growth. You can see what some of these genes are. They're key cellular processes, RNA processing, RNA binding, ribosome biology. These are core machineries that are absolutely required for cell growth. The interesting thing, actually,
is that it's much easier to break something in a cell than it is to make it go faster. There are very, very few genes which, when you knock them out in a cancer cell, actually make those cells grow faster, compared to the thousands
of genes, actually, depending on how you
draw the threshold for growth phenotype that actually break or significantly impair cell growth. So, this was nifty. One of the other things
that they did in this paper was to do a drug resistance screen, so I just showed you a growth screen where you're knocking out a gene and you're just looking across time, at an early time point
and a late time point. In this screen, what they did
was they put the library in, and then they treated
with this BRAF inhibitor that normally kills the cell, except if they knock out these genes, which, it turned out,
are different, you know, direct members of this pathway or genes that had been previously shown to cause resistance to
this BRAF inhibitor. So, one of the cool things
that they highlighted in this work was that all
four of the guide RNAs targeting this gene, for example. You're right, Luke. It's totally impossible to see this thing, but NF2, for example, all
four guides actually show the same phenotype. You know, they're considerably
higher in the treated cells than they are in the untreated cells, and this is something you
would basically never, ever see in an shRNA screen, so you would never see all four shRNAs giving you the same phenotype. Here, you see all four guide RNAs can cause significant resistance. And the other, you know,
interesting observation is that you don't have
1,000 genes that are required for HIV resistance. You have eight genes or 20
genes or something like that, so you have a really narrow list of almost certainly right hits, and when they actually
go in and validate these, you know, 90 plus
percent of them validate, so I think it shows the
power of these screens to get really clean phenotypes
for gene knock outs. One caveat that I thought
I would point out, from our own work, is that
if you looked at this, you might say well, all
four of these guides are working really well. In fact, I just said that, but in fact, there's
a lot of heterogeneity in the performance of those guides, and you don't really see that until you really look carefully
at the phenotypes for all the different guide RNAs, not just in a resistance screen where you're really driving
all those things to cause, you know, an increase in growth, but in a drop out screen, so in this plot, what we're showing is
actually the phenotype of a really critical essential gene, ABL. BCR-ABL is the gene
that drives K562 cells. If you knock it out,
they're dead as a doornail. And what you can see here is that, of all the guide RNAs that target it, there's a huge range, and in fact, some of them are killed completely, and some of them really have
a pretty mild phenotype, and so, it's important to consider that, with any screen, you're
going to have a range of guide RNA phenotypes. Okay, so one of the other things that we've done that's been quite useful
with these types of screens is to actually do a drug
target identification, so I showed you a couple of
examples where people had looked for modifiers of a drug, and in most cases, those drugs
already have a known target, so they're targeting some kinase, which is thought to be really important for cancer cell growth, and you wanna find a second target, which is important, for example, for, you know, a potential
synergistic target for that drug or to understand how that
particular cancer works. Another useful application
of these screens has been to identify de
novo, the target of a drug that does something interesting, but you actually have no
idea how that drug works, so in this paper, together
with the Cleary Lab, who had done a chemical screen, this is actually quite common in the pharmaceutical industry. People do, you know,
giant chemical screens. They find some drug that
magically kills leukemia cells, in this case, but not normal cells, so it looks extremely interesting, but they have no idea,
actually, how this thing works. What we were able to do is
to take an shRNA library and this is actually just
a half genome screen, and systematically knock
down one gene at a time, with the idea that, if you knock down the target of that drug,
now that cell is much, much more sensitive to the drug than all the other cells, so you can see, with
actually pretty remarkable, actually maybe even too
remarkable specificity, we were able to pull
out what happened to be the molecular target of this drug, which is this NAD biosynthesis
gene called NAMPT. You might say why didn't you
find all the other NAD pathway genes? So, one thing I should say
is we probably should have just stopped doing screens
entirely at this point. It was a really nice result. It's a half genome screen, and probably, because this was an early shRNA library, we had bad shRNAs for the
rest of the components of that pathway, but it does illustrate the general principle that you can use these genome-scale screens
or near genome-scale screens to do unbiased drug target identification if you don't know the target of a drug. So, in another example of a similar type, in my lab, we compared,
we actually tried to find the target of this drug. This is now an anti-viral drug
which targets the host cell, so there are a lot of
different anti-viral drugs in the world. Typically, they target the virus itself, and so, if the virus mutates, you just need to find a new
drug for that new virus. This is a pain, and so, if
you were able to find drugs that actually target a
host process that caused a protective phenotype
against a range of viruses, that could actually be quite useful. And so, we set out to
actually identify this target, which had been identified
by GlaxoSmithKline in a similar, you know,
large scale chemical screen for compounds that inhibit the replication of a range of different viruses, so they found this thing, GSK983, that looked like it was
effective at inhibiting adenovirus, SV40, and HPV-16, and they actually reported
a kilogram-scale synthesis of this compound, so this
is another common thing that happens in the
pharmaceutical industry. They do a lot of work, probably millions of dollars'
worth of research on this drug. They synthesize kilograms
of it, which means they're testing it in animal models, and then, inexplicably, they just drop it, and they have no idea
what the compound did. Typically, that's why they drop it. They don't understand how it's toxic. We were able to, so I
shouldn't, by we, I mean, Richard Deans, who's a
chemistry student in the lab, together with Chaitan Khosla, who synthesized this compound, and we used the fact that it's toxic to actually find its target
so we could add the drug. We knew that it caused
some growth phenotype, and we could look for modifiers of it, and the reason I'm spending some time describing this setup to you is because what we did was to actually
systematically compare an shRNA screen with a CRISPR screen, and it was actually very
interesting to do this comparison, so you know, the screen
workflow is exactly the same as what I had described to you. In both cases, you're infecting
a lenti-viral library, in this case, one shRNA per cell, in this case, one guide RNA per cell, and we do parallel screens. One pool gets treated with the drug. One pool is untreated. At the end, you do
sequencing and you compare and you ask which genes, when
knocked down or knocked out, modify the activity of this drug. So, I'm gonna cut through
an enormous amount of work to describe what is really, I think, the most interesting part of this story. So, what I'm showing you in this plot here is the phenotype in the
shRNA screen on this axis and the CRISPR screen on this axis, and so, to make a very long story short, the target of this drug is this gene. It's DHODH, which is a critical
component of de novo pyrimidine nucleotide biosynthesis, so viruses, when they infect a cell, massively up-regulate nucleotide synthesis so they can support their own replication. If you can't do that, then
the virus can't replicate, and so that's why this drug was actually an effective anti-viral. Unfortunately, it's also
toxic, because the cells need those nucleotides to support
their own DNA replication. It turns out, there's
actually quite a nice therapeutic window there,
where viruses need a lot more nucleotides than a standard
cell does to stay alive, but the interesting thing
here is that you can see, when you knock down
this gene with an shRNA, you could actually, it's
one of the strongest sensitizing factors to GSK983, and that's how we were able to find it, is because we could knock it down and show that it actually sensitized
against the drug. In a CRISPR knock out screen, this is an absolutely essential gene, so if you delete it, those cells just die, so really, there was no
phenotype in the CRISPR screen, because by the time we
actually took that measurement, actually, you know,
several weeks had passed and all those cells, for the
most part, had dropped out. I think there's actually some
interesting counterpoints to this idea. I think a number of people have now seen, in a CRISPR screen, you can
sometimes see a modification of an essential gene by getting, you know, in some cases,
hypomorphs that give you, you know, deletion of one
allele or two alleles in a cell line that has five alleles, so there are some counterexamples, but I think we've seen,
in a number of cases, this very interesting difference between, you know, the utility of
a knock down phenotype, which you might get with a
shRNA screen or a CRISPRi screen and a complete knock out,
where you're really eliminating all of the function of a gene. Now, conversely, these genes, which are negative regulators of mTOR, only showed up in a CRISPR screen, so the best shRNA in the world might give you 98% knock down,
and if that is not enough to give you a phenotype, you won't see it, so these two negative regulators of mTOR, which, ironically, actually
show up in just about every screen we do for some reason. mTOR is a very important
gene that's involved in a lot of different processes. They could only be found
with the CRISPR screens, so you really need complete deletion, so the interesting kind
of take home from this is that sometimes it's
good to have knock down and sometimes it's good to have knock out, and we've actually found, by
comparing the two screens, sometimes you get a much
more complete picture of the biology. Just as a side note, we found not only the de novo synthesis pathway,
but the salvage pathway. Those cells have two ways
of taking up nucleotides. They can either make them themselves, or they can take it up from the outside. We found both of these
pathways in the screen, and we're now exploring
a combination strategy where we block de novo
synthesis and salvage to actually improve the activity of these compounds as anti-virals. One other side note, by the
way, comparing shRNA knock down and CRISPR knock out screens
is that we've actually, when we've done these types of comparisons with the same libraries side by side, it's not that one library is
crap and the other one is good. They actually both do a pretty good job of calling essential genes,
so if you take a gold standard essential gene set, so
these are genes that have been found in like 100 cell lines to always be required for
growth when you knock them down, in this case, with an shRNA library, and you say that's my gold standard, and so, whenever I find that, I'm right. If I don't find it, I'm wrong, and you draw this type of ROC curve, so up and to the left is better. You can see that they actually, both the CRISPR library
and the shRNA library actually do a pretty good job
of calling essential genes, but when you combine the two, you actually can do better,
which means that there's non-redundant information
in those two screens and it's actually useful
to combine the two. Yeah? - [Attendee] So has anybody
looked at designing, basically, for want of a better word, crappy guide RNAs that
are more likely to cause like a negative three,
negative six, negative nine, so more likely to give
you that intermediate knock down phenotype? So you make a slightly crappier protein, but you don't get an
early stop on the line? - That's a really good question. So, in some cases, I
think, if you're targeting the protein coding region, it's not a trivial problem
to figure out where, exactly, you would put those guide
RNAs to have a crappy effect. I'll show you a study
where they actually scanned across a couple protein coding regions to find what would be good,
and maybe in those examples, you could say well, this
would probably be bad or relatively bad, so that you
have an array of phenotypes. I think, in CRISPRi, it
may be easier to conceive of a system where you would really try to, I don't know, Luke, how you
guys would think of doing this, but step-wise, move away from the promoter until you realize that,
you know, instead of having as efficient knock down as you can, you have, you know, a
range of expression levels. There, it might be a
little bit more uniform to do that kind of
thing, but it's actually an important question, because
sometimes you really would like to have a range of phenotypes. Yeah. By the way, feel free to interrupt. I realize I'm kind of rambling on here, so if any of you guys have any questions, feel free to shoot away. Okay, so one of the nice
properties of this system for drug target identification
is you don't need to modify the drug, so, you know, typically, if you wanna find the target of a drug, you'd have to stick a
cross-linker on there. You potentially destroy its activity. It's not sensitive to the
strength of the interaction, and you can find those targets
in the biological context. Okay, so, in the examples I've discussed so far, typically, the guide
RNAs have been targeted to a protein coding
region, but there have been a number of other screens
that have been done where people have started to
look at the non-coding genome, so of course, in the context of a CRISPRi, you're typically targeting upstream of the transcription start site,
but in a number of studies, people have now used
active Cas9, and actually, increasingly, a number of
the other variants that Luke described to
really start to probe, you know, what are the functions of not just protein-coding genes, but actually regulatory elements, and so this is a study that
was done in Feng Zhang's lab, for example, where they tiled, you know, 100 kb upstream and
downstream of this gene, which turned out to be, you know, one of these genes that they had found in their previous BRAF screen. And the workflow is effectively the same. You take this library of guide RNAs. Now, not targeting the
protein-coding region, but targeting a very large region of upstream and downstream
regulatory elements that control the
transcription of that gene and ask for which of those
elements might actually affect the transcription of a gene. Earlier on, actually, this
group and Dan Bauer's group together with Stuart Orkin,
did a quite similar screen. I happen to like this
diagram better this way, but these guys actually did it first, to tile, again, the regulatory regions that are around BCL11A, so this is a gene which has an enhancer that actually affects
the transcription of, again, one of these
fetal hemoglobin genes. So, in this case, they're
looking for modifiers of this BRAF inhibitor. In this case, they're looking for, when I knock out this element
of a regulatory region, how do I affect the expression
of this fetal hemoglobin, and this, they actually do in a
flow cytometry-based screen, so they do an antibody
staining for fetal hemoglobin. They do cell sorting
for either things that, when you knock out that element, either increase hemoglobin or decrease it, and then they actually
call hits based upon which regulatory element modified
the expression of that gene. And this is the type of data
that they get out of it, so if they take, you
know, thousands of guides and they tile them all along
these regulatory regions, and they ask for which
things actually affect the expression of that hemoglobin, they can pull out a locus like this, where they actually see
differential expression, actually in human cells
compared to mouse cells, where you have this specific region which has a GATA1 transcription
factor binding site, so you can actually map
important regulatory regions for the transcription
of genes by using Cas9 to tile and, in principle,
create deletions that prevent the binding of those
transcription factors. And so, in the end, they can, you know, take these sort of data points, where they have modifications
in expression of a gene, and then relate those to the binding sites that have been determined
through ChIP-seq, as Luke described, where
you can actually now try to functionally annotate which of these transcription
factor binding sites or 3D contact loops are important for controlling gene expression. So, this is, you know,
a significant departure from the typical protein-coding
screens that have been done. Now, you're really asking
quite complex questions in some cases, about which of these different regulatory control elements drive expression of a gene. Another really nice
example comes from Luke, sorry, John and Max in Jonathan's lab, together with Daniel Lim. In this case, so I think
Luke had described this briefly in his previous talk, but what they were able to do is to target long non-coding RNAs in cells, and so this is actually a uniquely, incredibly powerful application
of the CRISPRi system, so if you think about it,
CRISPRi has been very useful, and I'll show you some slides in a minute where the same screens have been done extremely well with CRISPRi, but what's nearly impossible
to do with an active Cas9 screen is to actually target
these non-coding RNAs, so these are, in some
cases, very long RNAs, which are not coding for a protein, so you might put an active
Cas9 in the middle of it, but what would that actually do to the function of a
very long RNA for which, you know, particular elements
may not be important. In this case, they were
able to use CRISPRi to actually target the
promoters of those long, non-coding RNAs, and screen
for essential gene phenotypes, so some of this data,
I think, we've shown, but the general principle
was that they were able to do growth screens in
a range of cell types targeting the expression
of those different long non-coding RNAs. You know, the group were
actually able to find some very interesting differences, different cell types that had different long non-coding RNAs that
were essential for growth. So, this is an example
of a screen in which they were scanning the
protein coding region, so this is to the question
that was asked earlier. How would you design a guide
RNA that was especially bad. In this case, they were
actually looking at sort of the opposite effect, so, you know, how would you design a guide
RNA that was especially good for targeting this
chromatin modifier, BRD4. So, what they did was to take guide RNAs really tiling the entire
protein coding gene and look for guides which,
when they knock it down, actually cause a growth
phenotype in these cells, and the interesting thing that they found was that, when you put a
guide RNA in some of these really conserved, you know,
known functional elements of the protein, so these are
actually some of the active elements of the protein. These tend to be much more impactful when you knock them out, and the thought is here that, you know, in some cases, you have
a guide RNA that targets the middle of this region, and with a significant probability, you actually don't create
an indel or a stop codon, whereas these guide RNAs, even
with an in-frame deletion, may have enough of an
effect that, you know, if you have a very highly
conserved kinase domain, for example, you may
lose a single amino acid, and that may have an enormous impact on the function of that protein. That would not be present
if you were to lose a single amino acid here, so their rationale is, by
targeting highly conserved, functional regions of proteins, you could actually have more
effective gene deletion, and in some cases, they've
actually found this allows you to map the functional
domains of a protein, so if you didn't know that
these were the important functional domains, you
might actually find them by tiling across a gene
and actually seeing, well, this is the place where
the protein really hates to have an indel placed, or
even an in-frame deletion. Okay, so a really important caveat that I think should be kept in mind for a lot of these
screens with active Cas9 is that you are putting an
active nuclease in the cell, massively over-expressing it,
and expressing a guide RNA, so this is something that, in some sense, may be especially important
in the context of a screen, where you're constitutively
expressing Cas9 protein and a guide RNA over
a long period of time. In some therapeutic applications, where you're transiently
expressing a guide RNA or an RNP, it may not be
as important to consider these types of off-targets, but actually, in this case, what a number of groups have found, this is a work from Bill
Hahn or a group of Novartis. We've actually found quite
similar things in our own data, and a number of groups have found this, including Sabatini early on. What they showed was, if
you look at the behavior of guide RNAs across a
chromosome in a cancer cell line where you have clearly amplified regions, guide RNAs that target
these amplified regions are extremely toxic, whether
or not there's a gene there, and this is because if
you have enough copies of, you know, a piece of DNA,
you can actually cause a significant amount of DNA damage. In some cases, so
significant that the cell will just effectively die, because you have this massive, you know, DNA damage response, so
this is especially important given that, you know, many
of us are using these, you know, highly amplified and messed up cancer cell lines to do our work in, and K562 cells, for example, BCR-ABL locus is massively amplified. It's critical for growth. It turns out if you put
a guide RNA in between, you know, the amplified
genes, you can actually still see massive cell death,
so this is something that needs to be kept in mind when
you're doing these screens. We've actually found even a single cut has a measurable effect on growth, so I'm showing you a library that we built to try to make an effort at
controlling for this effect, so we built our own sort of genome-wide CRISPR cutting library, where we had a set of negative
controls that targeted what we thought should be
safe regions in the genome, so these are regions
that have nothing annotated, there are no genes there, there are no chromatin marks there, there are no regulatory elements, there are no, you know,
long non-coding RNAs, so across 170 cell lines,
there's nothing at all there that should be, what we think, functional, and we put a bunch of controls there with the idea that we're
gonna see what the effect of cutting at one site
would be in the genome. So, this is the work of David Morgens and Michael Wainberg in the lab. To cut to the chase, the sort of interesting
observation here is, what I'm showing you here, is
the distribution of guide RNAs for either gene targeting
guides, so you can see, there's a range of phenotypes. Some of them are very highly deleterious when you knock them out. Some of them are, you
know, slightly protective. If you compare the phenotype
of a non-targeting guide RNA, so this is a guide RNA that
we computationally predict should not have any match in the genome, even tolerating up to three mismatches, versus one of these
safe-targeting guide RNAs, which should have exactly one
cutting site in the genome, you can see that the
distribution of the safe guides is actually considerably broader than the non-targeting guides,
which means that, actually, these have much more of an
effect than non-targeting, and so, you can actually
even see this when you look at the substitution position, so depending on the mismatch position, the effect of those safe targeting guides actually differs, which
reflects known properties discovered by Jennifer and
others about, you know, where a Cas9 can tolerate a mismatch. So, the take-home message
is you should actually, if you're doing a Cas9 screen, or even an individual re-test experiment, if you're gonna use a control, use a guide that cuts somewhere in the genome, because there is a
measurable effect on growth and some actually measurable
DNA damage that occurs, so don't use a non-targeting guide, 'cause it's not really
an appropriate control. - [Attendee] This effect
probably depends a lot on what cell line you're
using, because p53 mutants, for example, have more
sensitivity to single breaks. - Yeah. (attendee speaking off microphone) That's definitely right. In fact, you can see that here, so we did it in, I believe,
three different cell lines and you can see in some cell lines it didn't quite matter as much. It was not a perfect
correlation with p53 status, but you're absolutely right. In some cases, it matters
quite a bit more than others. Yeah. Okay, so fortunately, Luke
gave a fantastically complete introduction to CRISPRi
and CRISPRa, so I'm not gonna describe this in any more
detail than he just did, other than to say those
significant problems with toxicity due to nuclease cutting can largely be eliminated
by using either CRISPRi or CRISPRa to do these
screens, so this is, because you're not using
an active nuclease, you're using the dead Cas9
just as a targeting domain, you can really, to a pretty large extent, ignore those off-target
cutting effects that have been, you know, a significant consideration. In some cases, it matters
a lot more than others, but one of the major
advantages of this approach is that you don't have to
worry about the DNA damage you cause with an active nuclease. And so, as Luke mentioned, there are several different
strategies to do this. I'm not gonna belabor that point, and the other thing that we found, even in our own screens, is
that the point that Luke made about the importance of
finding the position, the right position to put those guide RNAs is something that I think
everyone can attest to, and so what I would say is,
if you're interested in doing these CRISPRi experiments,
the new and improved library is really significantly new and improved. It really actually works a lot better than the V1 library, and so I think one of the major improvements
was just annotation of the transcription start
sites, if I'm correct, so with an improved transcription
start site annotation, and, you know, the significant improvements
in the algorithm that Max made, they can now very effectively
effect transcriptional repression across the genome, and so these libraries
work actually quite well. So, here's a couple of examples
in two different papers, so in this one and the new one. So, one of the screens they did early on was one of my favorite
toxins, the ricin toxin, near and dear to my heart. Don't eat it, but actually,
it's really an amazing toxin, actually, so it gets endo-cytosed, it traffics from the endosome
to the golgi to the ER. It gets unfolded, recognized
as a misfolded protein by the ERAD machinery,
retro-translocated back into the cytoplasm where it refolds, and only then can it kill
the cell by de-purinating ribosomal RNA and
shutting down translation, so it's just really a
kind of amazing toxin that does all these neat
trafficking and folding events, and we could use that
to find all the genes in a genome-wide screen that were involved in its trafficking or processing. What was shown in this really nice figure is that you can actually
see, depending on whether you knock down with CRISPRi
or activate with CRISPRa, the expression of these previously
demonstrated ricin hits. You can see an opposite
phenotype, which is really cool, because it means that you
can not only, you know, recapitulate some of the
same biology that was found in other screens, but you can actually be even more confident. If you could flip the
phenotype from repression to activation of sensitivity, for example, you can be even more confident in the phenotype of that gene. And this is just to show, using the same kind of gold
standard essential genes in the new and improved library,
you can see a near perfect recapitulation of essential
genes with the CRISPRi library, so these libraries work really quite well, equivalently well to
the best CRISPR active Cas9 libraries that we've used. Okay, so, in the last bit, I'm gonna describe a number of different sort of applications of these screens, so one really cool application
that was just published, so you're gonna get a
talk, I believe, tomorrow from Tom Norman, who worked
together with Brett Adamson in Jonathan's lab, and
together with Aviv Regev and a number of other labs. They had described very recently
this kind of extraordinarily powerful readout for a screen,
which I'm not gonna go into in any great depth, because
I'm sure Tom will do it, but just to highlight,
in the context of screens that you might do, it's now
possible to not just look at a single gene phenotype,
protection and viability or the activation of a reporter, but to look at, effectively, the modifications to the transcriptome, so a really high-dimensional
multi-plex output to go with a multi-plex input
of a genome-scale screen, so Tom will describe that in more detail, but I wanted to just put this
in the context of screens, because it really, in principle, expands the range of
phenotypes you could look at. So, yeah, in the last little bit, I wanna describe two
applications that we've been pursuing in my lab for
these types of screens. So, as I mentioned,
everyone and their mother is doing some kind of a screen. You end up with a long list of hits, and you might say, well, I
know what the first 10 do, but what about the next 50. How do I actually understand,
at a systematic level, how these genes actually work together? And so, early on, when I was a post-doc together with Martin
Kampmann in Jonathan's lab, we had done a genome-wide shRNA
screen for ricin modifiers, and what you can see in this map is that I'm displaying the
phenotypes of a single gene in combination with a
series of second genes, and so, if it's yellow,
that's better than expected, so the combination of the two genes, when knocked down, is
a much better phenotype than you would have expected, and blue is significantly worse than you would have expected. And so, depending on the
nature of that interaction, it can tell you something
about how those genes actually work together, so
the really cool thing about these maps is, if you
cluster genes based upon the similarity of this pattern, what you can effectively
do is reconstitute a lot of the biology of the cell, so we did a genome-wide ricin screen. We took a lot of the
hits from that screen, and we clustered them together in this map by their function, and
what you can see is that many of the genes, even if we didn't know what all those different
trafficking steps were, we could have found them because here are all the ribosomal genes. Here are all the COPI
vesicle trafficking genes. But the really cool thing
is if you have a gene of unknown function, so here is C5orf44. Previously, we had no idea
what this gene actually did, but we can see that,
genetically, it looks like a member of this vesicle
tethering TRAPP complex, and so one of the really
nice things about this was it came up with a prediction for the function of this gene. Genetically, it has a
pattern of interactions that looks like another complex that we do know the function for, so now we have a prediction
for what this thing does, and we could actually show that, indeed, this is a physical member of this complex, and in fact, defines two new complexes that are sub-complexes of TRAPP. So, these were very
powerful tools for kind of a systems level understanding
of what the hits from a screen would do. So, in my lab, we're
interested in applying sort of a new application
of this pair-wise genetic interaction screen, in this case, to look for, specifically,
cancer drug combinations, and so why might you wanna do this? Of course, cancer drug
resistance is a very widely appreciated problem. There's this famous
picture that many of you have probably seen of
this unfortunate patient who presented with a melanoma. He was treated and had a really
nearly miraculous response to this BRAF inhibitor,
which completely shrank all of his tumors, but
as is almost always the case in cancer, what ended up happening was, eventually, the patient
relapsed and all the same tumors came back and were now resistant
to this BRAF inhibitor, so there's an enormous interest in the pharmaceutical industry, and this has, of course, been an interest for a very long time in trying to find combinations of targets
that would allow you to prevent the escape
of these cancer cells to drug resistance. In addition, of course,
it's very expensive to develop these kind of drugs, and so, if you could repurpose
existing FDA approved drugs, there would be a lot of utility in that, and of course, testing
all those combinations, even if you did know how
they would work together is an enormous effort, and
so what we wanted to do was to actually model drug
action by using CRISPR to target the targets
of FDA approved drugs, so we wanted to systematically
look for drug combinations by targeting pairs of
genes for which there was a corresponding drug. This is the work of Kyuho Han
and Edwin Jeng in the lab, very talented post-doc and
a great graduate student. They were helped by David and Amy, and the idea is we generate
a very large library, in this case, 500,000 pairs of guide RNAs that target combinations of
genes which have drug targets, and the idea is can we pull out synthetic lethal combinations that might be effective in cancer. So, I'll skip over a lot
of work that these guys did to express pairs of guide RNAs. There are now a number
of different systems that get around some of the
problems that Luke described to express pairs of guide
RNAs, so we built one. The Weissman Lab built one. The Ventura Lab built one. I think, as Luke pointed out, it's nice to have some different options for how to do these things, 'cause one of the key
problems is if you have identical promoters, you
can't actually sequence the guide RNAs directly, and
you get a lot of recombination between the guides, so in the end, we built a system that
can express pairs of guides that we can actually
do deep sequencing on, directly sequence the guide pairs, and what we did was we took genes, which had a corresponding drug according to these drug databases. We selected ones that, by themselves, were not lethal because if
a single gene kills a cell, you're not gonna find a synergy, and we took genes that
had a mild phenotype. We selected the three best guide RNAs from our genome-wide screens, and we made a pair-wise
combination library. In this case, targeting about 207 drug targets, which, in the end, gives you roughly 500,000 pairs, corresponding to 20,000
possible drug combinations. In the end, the workflow
for this is exactly the same as all the screens that
I've described to you. We now take a double guide library instead of a single guide library, and we infect that into cells, and we calculate interactions based upon how different is the expected
pair from each individual gene when you knock it down,
and we want to find the really rare synthetic
lethal combinations that completely kill a
cell only when you have those two genes knocked out. So, the data are highly reproducible. It doesn't matter if you flip the guide from one position to the other. Replicates are highly reproducible, and here's what the data look like. So, largely, you have a huge cloud around zero,
so what I'm showing you here is the genetic interaction score in one replicate on this axis and another replicate on the other axis, and what you can see
is that, by and large, most genes actually, in
the set that we took, don't interact at all, so
they have zero interaction. They're exactly what
you would have expected, and this is probably not surprising, because we really took
a random set of genes that have nothing to do with each other. They just happened to have a drug target, and we decided to look
and see which of them, when you knock them out in combination, actually are synergistic,
and the answer is very few. What's reassuring is,
if you knock out a gene and then you use a second guide
RNA against that same gene, now the phenotype of the double mutant is actually better than you'd expect. That's because once you
knock out that gene, it doesn't matter if you
target it with the second guide, for the most part. There's effectively the same phenotype, and so it's better than
you would have predicted, but what's really nice is
we could pull out these rare, actually quite potent,
synthetic lethal combinations which, when you knock out
the two genes together, those cells really have
a hard time growing. So, in the end, we decided
to focus on this combination of a Bcl-xL inhibitor
and an Mcl-1 inhibitor. These are two anti-apoptotic genes, and you might say to yourself, well, of course two anti-apoptotic genes are gonna be synthetic lethal. In fact, cancer cells differ enormously in the extent to which they depend on particular anti-apoptotic
genes for their survival, and so it was actually interesting that we could pull out this pair, that happened to also have
two pretty effective drugs that can be used against them. We also did find a number
of other, you know, pairs which were synthetic lethal. Some of these make perfect sense, 'cause they're two
different isoforms of AKT, two different DNA-damaging,
DNA damage repair pathways, and so we felt like we
were getting, you know, legitimately synthetic
lethal combinations. So, to make a long story short, we were able to show that the
combination of these two drugs corresponding to the
two genes that we found to be synthetic lethal in this screen were actually quite potently synergistic when we used them together. What was really cool, though, is we could actually see
that, so here's K562 cells, where we're treating the cells with these two drugs in combination. You can see really potent synergy. When we use either
EBV-transformed LCL cells or primary cells, so these are CD34 hematopoietic stem cells, there's actually much less synergy between those two drugs in this case, and so it was actually
quite nice to see that the pair that we found were not just universal synthetic lethals. They were actually quite
specific to cancer cells, and in this case, when we
made imatinib-resistant cells, so these are K562 cells
that have been grown, so in some sense, K562 is
like the least interesting cancer to solve, because there's
already a fantastic drug, imatinib, which targets
BCR-ABL-driven cancers, but in some cases, you
still get resistance to this drug, imatinib, and
so when we generated in the lab an imatinib-resistant
version of K562 cells, it was actually really nice to see that this same combination of drugs was still lethal to those cells. So, this was kind of satisfying, because we took what was
really an enormous library of guide pairs and we screened for very rare synthetic lethal combinations, which we could pull out and then show that the corresponding drugs were actually synthetic lethal in addition. So, that was nice, although,
if you actually go back to the type of genetic interaction map that I showed at the beginning, if you did this same interaction map with the data that I just
showed you from this drug map, you would see mostly black space, because, as I mentioned, there's really very low
frequency of interaction, so if you tried to do
the type of clustering based upon similarity
of interaction patterns that I showed was useful in ricin, you get virtually nothing
in terms of useful clusters coming out of an interaction map for this drug map, and this
is not that surprising, because the frequency of interaction between just
two random genes in the genome has been shown in, actually,
a number of different studies in yeast, Drosophila, human
cells, to be actually quite low, 'cause the number of
synthetic lethal combinations with any given gene is
probably maybe 20 to 50 across the entire genome,
depending on the genes. Some genes interact much more than others, but on average, the
frequency of interactions is actually quite low. So, we went back to our
favorite toxin, ricin, and we did a genome-wide
CRISPR screen now, instead of an shRNA screen. You might say, well,
that's incredibly boring. Why the hell would you do that? But it's because you can actually
find a lot of new biology. Strikingly, a huge number
of new genes come out of a CRISPR screen, even though
we had, you know, done a virtually identical thing with shRNA. But now, with more
powerful knock out tools, we get a lot of new genes that come up. So, for example, all the
genes that are involved in glycosylation of cell surface proteins, we didn't really see in the shRNA screen, but are among the strongest
hits in the CRISPR screen, so this ricin is a lectin
that has to bind to glycosylated residues on the cell surface. If you don't make that
glycosylation residue, ricin can't get in, and so,
when we knock those genes out, the cells behave as if
they never saw ricin. It was kind of amazing. The interesting thing was, in this study, we could show that even those genes, so here are the same genes
we've found previously, these trafficking genes, but then also, for example,
all the glycosylation genes clustered next to each other, so we were happy to see that you could use the same guide pair platform
to do these same types of interaction screens we had done before. Okay, so, in the last few minutes,
I wanna tell you about this new tool that we built. So, Luke briefly outlined
the strategy for doing mutagenesis using dCas9
to recruit deaminases, and I would like to describe, sort of, one of the applications
that we've found for this, that's more of sort of a
screening type application, and so, if you think about
the way that people have used Cas9 to do mutagenesis, so traditionally, you know, work from Jennifer
and a number of others have shown that you can
create these indels, which will cause loss of function. Actually, people have used,
as I showed you in these other tiling screens, the same
active Cas9 to look for functional regions of proteins. However, another obvious application is to introduce by homologous recombination a library of variants, and so,
a number of different labs, including work from Jay Shendure's lab, have shown that you can take active Cas9, a single guide RNA, and then a library of oligonucleotide donors to
introduce one mutation at a time into a specific site. What this allows you to
do is then start with a library of point mutants that result from that
donor being integrated, and then do some kind
of a selection to find, you know, what are the
particular point mutants that affect your protein of interest. And so, what we were trying to do is, instead of having to do this every time we wanted to make a new mutation, that is make a cut, make
a new oligo-donor library, and introduce it one
at a time at each spot you wanna make a mutation, what we tried to do was to make diverse, functional mutant libraries in situ, without having to use an HR donor, and I emphasize functional here because we would
like to avoid indels, so for a lot of protein
evolution applications, you don't really wanna
destroy the protein. You wanna find a new function, so you would like to avoid, if you can, introducing stop codons. So, we turn to the way
this is done in nature, so with really remarkable specificity, during antibody development in B cells, your body has a way of
targeting mutagenesis specifically to the variable
region of an antibody, and really, almost nowhere
else in the genome. It's kind of an amazing process, but you get quite diverse
populations of point mutants developing in that diverse
region of the antibody, and that is what allows you
to have very high affinity antibodies against your favorite pathogen. This is because this enzyme, AID, generates mutations by
deaminating cytosine, which is then repaired in a
variety of different ways, and I won't really belabor that point, but just suffice it to
say you can actually get quite a diverse population of mutations just resulting from this cytosine. In fact, even away from
the actual cytosine you're removing the base
from, you can get mutations because you recruit
error-prone polymerases, which actually incorporate
other nucleotides where they shouldn't. So, this is work from Gaelen Hess, a very talented post-doc in the lab, who took dCas9, so we, being nerds, we called this CRISPR-X
to generate mutations. Fans of the X-Men, but
this is probably a mistake, but anyway, but the idea
was you can recruit, as Luke described, using these
MS2-fused variants of AID to the genome and create a diverse population of point mutants. So, Gaelen spent some time
engineering this enzyme to get a hyperactive variant of it, so he chopped off the
normal export signal, so in normal B cells, you
actually want to keep this out of the nucleus most of the time, 'cause it's actually quite
damaging to have this thing bopping around, removing cytosines, so it's normally exported. He chopped off the export signal, used a variant that had
been identified in bacteria that had a higher activity, and then fused it to Cas9, and in the end, what you could do was to
actually use a guide RNA to target a population of point mutants, really exactly where you
put the point mutant, and what was satisfying to see is, so, David Liu, actually a few months before we were able to publish work, showed that you could use
a different deaminase, so Luke pointed this out,
so they used an APOBEC. We had been working on this
for about a year and a half. It was actually kind of sad
when they first published it, but I think what we
actually ended up doing was quite different, and
it's very interesting. So, what they showed was
you can take this APOBEC and get really remarkably
precise conversion of a C to a T, effectively with their
new and improved version within like a five base pair window, so their proposed application
was if you have a disease where a C is supposed to be a T, you can put in that deaminase and actually do a therapeutic editing. In our case, what we're
doing is creating a window of actually reasonably
diverse point mutants, with the improved version
about a 100 base-pair window around the guide RNA, and in fact, we get C to basically everything, G to basically everything, and
even some measurable levels of As and Ts being converted, so you can generate a quite
diverse library of point mutants exactly where you put the guide RNA. So, this was kinda nifty,
because now you have this population of mutants,
and you can actually start to do the type of
classical genetic screens in a way that I described
on the first slide, where you create a population of mutants, and now you can ask what is the thing that gives me a new function? So, one of the initial
kind of applications that we thought about was to do evolution of GFP, because we know exactly what turns wild type GFP into EGFP. It's brighter and the mutations
that do this are known, so we put guide RNAs around this locus, and AID would cause mutants to be formed, and then we sorted for
GFP variants that had brighter fluorescence by flow cytometry, and after one round of sorting,
we could already recover this known S65T mutation, so G to C, which is what's responsible for converting wild type GFP to EGFP, so it's actually, you know, of course we knew
what should happen here, but it was satisfying to see. We could mutagenize this
population and recover a variant that gave us a new function. More interesting was to
do a tiling mutagenesis of PSMB5, so this is
a proteasome subunit, and is the target of this
chemotherapeutic drug, bortezomib, and so what we did was we
took a library of guide RNAs targeting the entire
protein coding region. We introduced this mutagenesis machinery, and then we pulsed with the drug bortezomib, and asked for cells which
now become resistant to this chemotherapeutic drug, and what you can see is we could recover in a PSMB5 targeting
library, but not in a library targeting these safe
regions in the genome, mutations that confer resistance, which was actually quite nice to see. In fact, what was interesting
is this exon three is actually not expressed in K562 cells, and we saw no mutants coming out there. So, when we looked at where
these mutations map onto PSMB5, so fortunately, the
structure of PSMB5 is known, its structure in complex
with bortezomib is known, and previously, people had identified in resistant cancer cells,
mutations in the binding pocket of PSMB5 for bortezomib,
and so we were happy to see that we could actually
recover those same mutants that sit right in the binding pocket, but we could also identify
a number of other mutations that occur even on the
complete opposite side of the protein, so would not
really have been predicted from the structure, but in fact, clearly affect the binding
somehow of this drug for its target, and so
we're really kind of excited about using this to map
drug protein interactions more generally, and you
can imagine using this type of mutagenesis for a lot
of different applications, so you know, the Liu lab
is doing these sort of therapeutic editing applications. We're envisioning more,
you know, evolution of protein function, potentially mapping of drug target interactions. Another interesting thing we
observed with this technology is you can actually cause mutations, not only in a coding region,
but actually upstream of the transcription factor binding site. Sorry, of the transcription start site. And that's potentially
interesting because previously, AID had been thought to
require transcription to induce mutations, but
what we could show is that, as far as 3KB upstream of
the transcription start site, we could put a guide RNA
in this mutation machinery and still observe mutations. These are here in yellow. So, we think that this might also open up the space of mutagenizing
regulatory elements that are not actually
transcribed to try to find, for example, promoter
variants, transcription factor binding site variants that
would affect binding activity. So, we're kind of excited about the different applications for this: generation of different antibodies, evolution of enzymes and aptamers, and the nice thing about this system is you really only need a guide RNA, but you can generate a
diverse library of mutants, effectively doing a screen by tiling a protein coding region, and I didn't mention this, but of course, you can multiplex this by just targeting multiple sites at the same time, so you can express two
guide RNAs in a cell. You can co-evolve two sites
that would potentially have some new function
if mutagenized together. Okay, does anyone have any questions so far? Otherwise, I have a
couple of quick points on kind of retesting hits I can go through. Yeah? (attendee speaking off microphone) So, not too much. With the AID system? Yeah, so you do see some basal level that's slightly elevated, but really, you know, if you look, like, it's probably not worth
going all the way back to make this point, but
you can see, in this case, for example, by and large,
you really only get mutations where you put the guide
RNA, so yes, there's some, like, infinitesimally
elevated level if you just do, you know, whole genome
sequencing, for example, but you know, at the
level of substantially enriched mutations, it's
really not detectable outside of where you put the mutations. Yeah. Okay, so, the last two minutes,
I wanted to just describe a couple of other strategies for kind of following up on hits from screens, so one of the things that
we found to be useful, you do a whole genome screen. You may not wanna clone
100 different guide RNAs against your 100 top hits to actually see which of these are actually real, and so, one of the things
we've found to be useful is to generate kind of a batch library. Here's for shRNA, but the
same thing has been useful for guide RNA libraries. If you do a whole genome
screen, you can now generate a mini-library, for
example, of your favorite, you know, 100 genes, and retest
them at much higher coverage and get much higher quality data, and that allows you to really, more efficiently process through the hits to get effective ones to follow up on. And of course, you can do this genetic interaction map strategy. We also find these
competitive growth assays to be quite useful, so
you can infect a cell with a guide RNA targeting
your favorite gene and a fluorescent marker, mix them with a wild type population, and then study the behavior
of those populations in a mixture, and,
you know, for example, if this gene causes drug resistance, you expect those cells to take over, and the population should be
100% green if you start out from a 50/50 ratio. So, this is another strategy we've used to kind of follow up on
hits from these screens. Of course, you can tag genes. I think you guys are probably
gonna be talking about gene tagging and gene knock
outs in some of these workshops, but this is a gene. We had no idea what it
did from the screen. You can use CRISPR to knock in GFP and actually localize
this thing in the cell and see where it goes, and of course, this is greatly facilitated
by these RNP studies, which I think you guys
are gonna learn about in the afternoon, so
these new tools really make tagging a gene with GFP much easier. So, that's pretty much it. I have some references,
a lot of people to thank for the work here. A lot of the early work was done together with people in the Weissman
Lab, including Luke, Max, and Martin, and people in my lab, who did the work that I showed you, Richard, David, Michael,
Edwin, and Gaelen. Thank you. So, I'll take any other
questions you guys have. (applause) (attendee speaking off microphone) - [Attendee] So you say
that AID is capable of inducing point mutations only at, basically, (speaking off microphone)
areas that are transcribed. Does that also apply to things
like long non-coding RNAs? - So, we think we can see mutations, even upstream of a
transcription start site. So, we haven't tried
long non-coding RNAs yet with this thing. When we've tried to put it
in a completely dead region in the genome, you know,
transcriptionally off, compact chromatin, we really
don't see any activity, so there probably is some requirement for at least accessibility. We don't really know yet how
much transcription is required. You know, even at the
promoter that I showed, where we can put a guide 3 kb upstream of a transcription start and
still see mutations, I can't guarantee you that there's zero transcriptional activity at that spot. There may be some, you
know, transcription going in the opposite orientation. I would expect, if the long
non-coding RNA is transcribed, you'd have a decent
chance of mutagenizing it, but we haven't tried it yet. (inspirational music)