Genome editing with the CRISPR-Cas9 system

Captions
- [Voiceover] In this module we'll discuss genome editing using the CRISPR-Cas9 system in mammalian cells. Traditional gene targeting is technically challenging and relies on the process of homologous recombination. Spontaneous homologous recombination occurs at a very low frequency, and thus is an intrinsically inefficient process, which has required the use of antibiotic selection and other tricks to isolate the rare cells in which gene mutagenesis has been successful. Genome editing takes advantage of new technologies that let you introduce double-strand breaks anywhere you like in the genome. By causing a double-strand break, you can dramatically improve the efficiency of mutagenesis, whether you're simply trying to knock out a gene, or trying to knock in a specific DNA variant or stretch of DNA. Genome editing tools have now been well validated to work both in vitro, in cells in culture, as well as in vivo, in organisms ranging from fruit flies to zebrafish to mice, to even non-human primates. The cell has two major ways in which it can repair double-strand breaks. One method is non-homologous end joining. It takes the two ends and simply puts them back together. But this is an error-prone process that often results in the insertion or deletion of nucleotides. The other method by which the cell can repair the break is homology-directed repair. Ordinarily, the cell will use a sister chromatid or chromosome as the repair template via homologous recombination. The repair template allows the area of the double-strand break to be cleanly replaced. You can exploit the homology-directed repair mechanism by providing the cells with large quantities of a traditional double-strand targeting vector. Alternatively, you can provide a single-strand DNA oligonucleotide that matches the sequence around the site of the double-strand break. In either case, you can fool the cell into inserting a mutation into the genome by putting the mutation in the middle of the repair template.
Here's a schematic of the two repair mechanisms. On the left, you can see how homology-directed repair allows you to perform site-directed mutagenesis and create a specific mutant cell line. On the right, you can see how non-homologous end joining results in the introduction of a variety of small indels into the genome, generating a variety of mutant cell lines. Over the past decade, a number of different genome editing tools have emerged into widespread use, including zinc finger nucleases, meganucleases, and TALENs. Each of the tools has its advantages and disadvantages. The most recent advance is the CRISPR-Cas9 system, which has created significant excitement in the biomedical community because of its efficacy and its ease of use. The CRISPR-Cas9 system is based on a recently characterized adaptive immune system found in bacterial species, and used by the bacteria to protect against foreign DNA molecules. The system comprises both protein and RNA components. The protein, called Cas9, has a variety of functions. It can act as a helicase and unwind the double-strand DNA. It can recognize and bind a particular DNA sequence, and recognize and bind RNA sequences. It can produce a double-strand break in DNA. In the simplified system that's now being used for genome editing in mammalian cells, the RNA component is a so-called guide RNA, or gRNA, that's about 100 nucleotides in length. This guide RNA is also known as the CRISPR RNA. Cas9 binds to this guide RNA, which itself hybridizes to one strand of double-strand DNA, as indicated by the red oval. Cas9 also binds to several adjacent nucleotides in the DNA. Thus, a triple complex of protein, RNA, and DNA is formed. The specificity of this complex is encoded in the first 20 nucleotides of the guide RNA, indicated here in blue. By changing this 20-nucleotide sequence, one can change the DNA sequence to which the protein-RNA complex will bind. Once bound, the complex will generate a double-strand break in the DNA.
There are some clear advantages to the CRISPR-Cas9 system. The Cas9 protein is a fixed component. It remains the same, regardless of which DNA sequence you wish to target. This contrasts with other genome editing tools like zinc finger nucleases and TALENs, where new proteins must be produced for each new DNA sequence that's to be targeted. With CRISPR-Cas9, it's the RNA component that's changed. In order to change the specificity of the CRISPR-Cas9 complex, all you need to do is change the first 20 to 21 nucleotides of the guide RNA. Because all this requires is very simple molecular biology, it only takes a day of laboratory work to create a new guide RNA. Indeed, it's so straightforward to make guide RNAs that you can make a large library of guide RNAs all at once, for example, a library that covers all of the genes in the genome. Another advantage of CRISPR-Cas9 is its multiplexing capacity. If you wish to target two genes at once, you can mix Cas9 with two different guide RNAs matching the two gene sequences. CRISPR-Cas9 complexes will form and create double-strand breaks in the two genes simultaneously. With the use of several guide RNAs, you could potentially target several genes at the same time. Here's a schematic showing how genome editing with the CRISPR-Cas9 system works. If you want to knock out a gene, you use a guide RNA whose first 20 or so nucleotides match a sequence in the coding portion of the gene. This DNA sequence is known as the protospacer. Of note, the protospacer must be adjacent to a DNA sequence that is known as the PAM, highlighted here in red. We'll learn more about this in a few slides. Cas9 and the guide RNA form a complex on the protospacer in genomic DNA and create a double-strand break. One way the cell can repair the break is non-homologous end joining. It takes the two ends and simply puts them back together.
But this is an error-prone process that often results in the introduction of indels, which can result in a frameshift mutation that prematurely truncates the protein. If you mutate both alleles, you can generate a full gene knockout. No homologous recombination is needed. No antibiotic selection is needed. Let's say you, instead, want to knock in a mutation. Again, you design your guide RNA to match the desired site in the genome and introduce a double-strand break. The other way the cell can repair the break is homology-directed repair. Along with Cas9 and the guide RNA, you provide a double-strand DNA vector, or a single-strand DNA oligonucleotide, containing your mutation, along with homology arms to serve as the repair template. At some frequency, the cell will incorporate the mutation into the genome. Again, no antibiotic selection or any other tricks are needed. Let's highlight a couple of common research applications of the CRISPR-Cas9 system. It's increasingly being used to generate knockout and knock-in mice. In vitro transcribed RNAs, one a messenger RNA encoding Cas9, the other the guide RNA, are injected into single-cell mouse embryos. The intent is that mutagenesis occurs at the target site in the genome in some of these embryos. The resulting blastocysts are implanted into surrogate mothers, and after three weeks, pups are born. These pups are then screened for mutations at the target site. This methodology works with high efficiency, in some cases approaching a 100% mutagenesis rate. The obvious advantages are that knockout mice can be generated without ever needing to use mouse embryonic stem cells, and the process is much quicker than the traditional approach of making knockout mice. Another common application entails the use of human pluripotent stem cells, whether human embryonic stem cells or induced pluripotent stem cells, to perform disease modeling.
One either starts with a wild-type stem cell line or with an induced pluripotent stem cell line bearing a patient-specific mutation. CRISPR-Cas9 is used to either introduce a disease-associated mutation, or to correct the patient-specific mutation. In either case, the result is the generation of isogenic stem cell lines that have the same genetic background, epigenetic background, and so forth. These matched stem cell lines are then differentiated into the cell type of interest, whether it's cardiac myocytes, endothelial cells, neurons, hepatocytes, and so forth. In principle, any phenotypic difference observed between the differentiated cell lines can be attributed to the disease mutation. A significant advantage of CRISPR-Cas9 is its efficiency. However, the danger of using a tool that's designed to cleave the genome at a target site is that it might also cleave the genome at a different site and cause so-called off-target mutagenesis. This phenomenon could potentially confound one's experiments. In general, off-target effects are thought to be most likely to occur at sites in the genome with sequence similarity to the on-target site. Accordingly, several web servers have been developed that allow you to enter your on-target site and search through the genome for potential off-target sites, with a small number of mismatches to your on-target site. This can be helpful in prioritizing among several candidate guide RNAs for a project, as you may wish to choose the guide RNA that seems to have the least potential for off-target effects. A number of variants of the CRISPR-Cas9 system are now actively being used in research applications. Almost all of them are derived from the naturally occurring system found in the bacterial species Streptococcus pyogenes. At least for now, the Strep pyogenes Cas9 and its associated gRNA architecture are the standard in the field. There has been extensive work characterizing its on-target and off-target effects.
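To make the off-target search idea concrete, here is a minimal sketch in Python of what such a search does in principle: it slides along both strands of a sequence, keeps only sites followed by an NGG PAM, and reports those that differ from the on-target protospacer by a small number of mismatches. The function names and the default threshold of three mismatches are assumptions chosen for illustration. This is not how any particular web server works, and it ignores things real tools may consider, such as mismatch position or alternative PAMs.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(sequence: str, protospacer: str, max_mismatches: int = 3):
    """List NGG-adjacent sites resembling the protospacer (hypothetical helper).

    Returns tuples of (strand, start, site, mismatches). For the "-" strand,
    start is an index into the reverse complement of the input sequence.
    """
    sequence = sequence.upper()
    protospacer = protospacer.upper()
    n = len(protospacer)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        # Each window needs n bases of protospacer plus 3 bases of PAM.
        for i in range(len(seq) - n - 2):
            pam = seq[i + n:i + n + 3]
            if pam[1:] != "GG":  # Strep pyogenes PAM: any base, then two guanines
                continue
            site = seq[i:i + n]
            mismatches = count_mismatches(site, protospacer)
            if mismatches <= max_mismatches:
                hits.append((strand, i, site, mismatches))
    return hits
```

A site with zero mismatches is the on-target site itself. When comparing candidate guide RNAs, you would generally prefer the one whose list of near-matches elsewhere in the genome is shortest.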
CRISPR-Cas9 adapted from another species, Staphylococcus aureus, has recently been introduced. One potential advantage is that Staph aureus Cas9 is about three-quarters of the size of Strep pyogenes Cas9. Staph aureus Cas9 is just small enough to fit into an adeno-associated virus, or AAV, vector, along with the guide RNA. This makes it possible to use CRISPR-Cas9 for a variety of in vivo genome editing applications. Initial studies suggest that Staph aureus CRISPR-Cas9 can have similar on-target efficiency, along with fewer off-target effects, compared to Strep pyogenes CRISPR-Cas9. Here's one system by which to introduce Strep pyogenes Cas9 and a guide RNA into mammalian cells. You can express them from DNA plasmids. The guide RNA can be expressed from a plasmid with a U6 promoter, as shown here. Remember that the first 20 nucleotides of the guide RNA can be changed so as to determine the genomic DNA sequence to which the CRISPR-Cas9 complex will bind. The remainder of the guide RNA remains exactly the same. It's very easy to custom design a guide RNA to bind to a desired DNA sequence. In this system, two complementary single-strand DNA oligonucleotides, or oligos, are used to insert the desired 20 nucleotides into the plasmid in such a way as to put them at the 5 prime end of the guide RNA. A single ligation reaction is all that's needed. Conveniently, the Cas9 protein needs no alteration. The same version of the protein can be used for targeting of any genomic DNA sequence. In the plasmid shown here, Strep pyogenes Cas9 is expressed using a strong promoter called CAG. The plasmid co-expresses a green fluorescent protein, or GFP, which is convenient for marking cells that are successfully expressing Cas9. After the guide RNA plasmid is completed with a single ligation reaction, the two plasmids can be introduced into cells, typically by using the techniques of transfection or electroporation.
Of note, the two-plasmid system shown here is one of many different systems that are available to express Strep pyogenes CRISPR-Cas9 in cells. Here is an analogous system by which to introduce Staph aureus Cas9 and a guide RNA into cells. This system also uses two plasmids, which are similar to, but not interchangeable with, the two plasmids used for Strep pyogenes that were shown on the last slide. The Staph aureus guide RNA is different from the Strep pyogenes guide RNA. One difference is that the protospacer length for Staph aureus is 21 nucleotides, rather than 20 nucleotides. Here are some rules for designing the Strep pyogenes CRISPR guide RNA. First, the protospacer is 20 nucleotides in length, so one must choose a protospacer of that length in genomic DNA. Second, the protospacer must be positioned just upstream of a 3-base pair element that matches the sequence NGG, which means any nucleotide followed by two guanines. This element is known as the protospacer-adjacent motif, or PAM. The PAM is directly recognized by Cas9. Without the PAM, no complex can form. Next, the 5 prime portion of the guide RNA must match the protospacer. It is this portion that hybridizes to the complementary strand of DNA, the mechanism by which sequence recognition occurs. Of note, because you're using the U6 promoter, there's a specific constraint. The guide RNA must start with a guanine in order for it to be transcribed. Thus, you should add a G to the beginning of the protospacer, making it a 21-base sequence that you're placing at the 5 prime end of the guide RNA. The extra base at the very beginning of the guide RNA does not affect binding of the complex to DNA. Here are some suggestions for choosing a site to target in the genome. First, it's important to note that the double-strand break generated by Cas9 occurs three base pairs upstream of the PAM in the position indicated here by the red line.
When mutations occur by non-homologous end joining, they tend to occur right at the break site. It's also important to realize that the CRISPR-Cas9 complex can form on either strand of double-strand DNA. You should always check for protospacer-PAM combinations on both strands in order to find the optimal one. In general, your goal should be to choose a guide RNA that will position the double-strand break as close as possible to the actual site at which you wish to introduce a change in the DNA sequence, whether it's an indel to knock out a gene, or a variant you're trying to knock in. When searching for well-positioned protospacer-PAM combinations, you may find several good ones. You can then prioritize among the candidates. For example, you can profile their possible off-target binding sites elsewhere in the genome and choose the one that appears to be most favorable in that respect. Finally, if possible, it's best to avoid protospacers that have lots of guanines and cytosines, or to put it another way, are GC-rich, as this has been suggested to increase the chance of off-target effects. The rules for designing the Staph aureus guide RNA are largely the same, with a few critical distinctions. The protospacer is 21 nucleotides in length, rather than 20 nucleotides. The protospacer must be positioned upstream of a different PAM. The Staph aureus PAM is more complex than the Strep pyogenes PAM, with the sequence NNGRR, where R is a purine, either guanine or adenine. The optimal PAM is thought to be slightly longer, with the sequence NNGRRT. As before, the 5 prime portion of the guide RNA must match the protospacer. Because you're still using the U6 promoter, the guide RNA must start with a guanine in order for it to be transcribed. Thus, you should add a G to the beginning of the protospacer, making it a 22-base sequence that you're placing at the 5 prime end of the guide RNA.
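The design rules just described lend themselves to a short script. What follows is a minimal sketch in Python, not a validated design tool: it scans both strands of a genomic sequence for a protospacer of the right length followed by the right PAM (20 nucleotides plus NGG for Strep pyogenes, 21 nucleotides plus NNGRR for Staph aureus), and reports the GC fraction, the predicted break position, and the sequence with the extra 5 prime G for the U6 promoter. The names `PAM_RULES`, `find_guides`, and the dictionary keys are made up for illustration. Placing the break three base pairs upstream of the PAM is stated above only for Strep pyogenes; applying the same offset to Staph aureus is an assumption of this sketch.

```python
import re

# nuclease label -> (PAM as a regular expression, protospacer length)
PAM_RULES = {
    "SpCas9": ("[ACGT]GG", 20),               # NGG
    "SaCas9": ("[ACGT][ACGT]G[AG][AG]", 21),  # NNGRR, where R is G or A
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides(genomic: str, nuclease: str = "SpCas9"):
    """List protospacer-PAM combinations on both strands (hypothetical helper).

    Coordinates for "-" strand hits are indices into the reverse complement
    of the input, not into the input itself.
    """
    pam, length = PAM_RULES[nuclease]
    # A lookahead lets overlapping candidates all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})({pam}))")
    guides = []
    for strand, seq in (("+", genomic.upper()),
                        ("-", reverse_complement(genomic.upper()))):
        for match in pattern.finditer(seq):
            protospacer = match.group(1)
            start = match.start()
            guides.append({
                "strand": strand,
                "start": start,
                "protospacer": protospacer,
                "pam": match.group(2),
                # The break falls just before this index, three base pairs
                # upstream of the PAM (assumed for SaCas9 as well).
                "cut_index": start + length - 3,
                "gc": (protospacer.count("G") + protospacer.count("C")) / length,
                # Extra G so the U6 promoter can transcribe the guide RNA.
                "u6_insert": "G" + protospacer,
            })
    return guides
```

You might then sort the candidates by how far `cut_index` is from the position you want to change, and use the `gc` value to set aside protospacers that are GC-rich.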
Here are some more general suggestions for choosing a target site in the genome for your project. If you're trying to knock out a gene, there's quite a bit of flexibility with respect to target sites, because all you need to do is introduce a frameshift early in the coding sequence of the gene. The exact location is usually not important. Because it's ideal to make the truncated protein product as short as possible, you'll typically want to target a sequence in the first exon that contains coding sequence. Sometimes, however, this may not work if the gene in question has alternative start sites, or alternative splicing of exons. It's always worth checking in the UCSC Genome Browser to see what gene transcripts have been identified, and if there is a lot of heterogeneity among the transcripts with differing start sites or splicing patterns. It's best to target the earliest coding exon that's shared by all of the transcripts. If you're trying to knock in a variant, your site selection will be constrained by the need to place the double-strand break as close as possible to the site of the variant, ideally less than 10 base pairs away. Keep in mind that when you're identifying the site of a mutation, particularly one that has been reported in the literature, you'll need to use the complementary DNA, or cDNA, sequence to do this, that is, a coding sequence in which all of the introns have been removed. However, when you're designing the guide RNA, you'll need to use the genomic sequence surrounding the site. If you use the cDNA sequence, there's a chance that your site is near an exon/intron junction, and your protospacer may inadvertently span across two exons. Of course, this guide RNA will fail to bind to the genome, since it doesn't take into account the presence of an intron in the midst of the sequence. Let's now consider an example of CRISPR design.
Imagine that we're trying to make a cellular model of the cholesterol disorder known as familial combined hypolipidemia. The responsible gene is ANGPTL3, with loss-of-function mutations resulting in the disorder. The most commonly found mutation is the S17X nonsense mutation. Here's our task: To design a Strep pyogenes guide RNA that will let us target the site of this mutation in a wild-type cell. This will potentially allow us to do two things. It will let us try to knock in the specific S17X mutation into the genome. However, because the site is very close to the beginning of the coding sequence, we could also use this guide RNA to try to simply knock out the gene by introducing frameshift mutations. Here's the start of the coding sequence of ANGPTL3. This sequence is taken from the human genome sequence, so we don't have to worry about missing exon/intron junctions. Highlighted in red is the site of the S17X dinucleotide mutation, which changes a TCC codon into a TGA stop codon. To find a suitable protospacer, we must first look for Strep pyogenes PAMs matching the sequence NGG. If you look in the vicinity of the mutation here, you'll see that there's no nearby NGG. However, remember that you can design guide RNAs that match either strand of DNA. So we can also look for PAMs matching the sequence CCN, which corresponds to NGG on the opposite strand. Here we find three CCN sequences near the desired mutation site. Let's consider each of these PAMs one by one. For the first one, because we're now working off the opposite DNA strand, the protospacer will extend in the downstream direction. The 20-base protospacer, once you've determined the reverse complement sequence, is shown. If you map the location of the double-strand break, it'll be three base pairs away from the PAM, as indicated here by the red line. The break will occur 10 base pairs away from the site of the mutation, which is okay, but not optimal. The protospacer is not GC-rich, so that's an advantage.
There's another important consideration when choosing the protospacer, and that's whether the protospacer and/or PAM overlap the site of the mutation. In the example shown here, the mutation site falls within the protospacer, which is an advantage. Why is this an advantage? Consider the following scenario where the protospacer and PAM do not overlap the site of the mutation. CRISPR-Cas9 introduces a double-strand break. The desired knock-in mutation is successfully introduced into the genome by homology-directed repair. Because the protospacer and PAM have not been changed in the knock-in mutant allele, the guide RNA is still a perfect match for the genomic sequence, and CRISPR-Cas9 can go back and re-cleave the same DNA. If an indel then occurs via non-homologous end joining, then the knock-in mutant allele will be disrupted. It's possible that the experiment will ultimately yield no clean knock-in alleles. This scenario can be avoided, or at least mitigated, if the protospacer or PAM is altered by the knock-in mutation, resulting in a sequence mismatch. Then it becomes less likely that re-cleavage will occur. It's worth noting that single or even double mismatches may not eliminate re-cleavage, especially if the mismatches occur near the end of the protospacer that's far away from the PAM. Mismatches near the PAM tend to have more of an inhibitory effect. Disruption of the PAM itself, so that it no longer matches the sequence NGG, will almost certainly eliminate re-cleavage. Here's the second possible protospacer. The break will now occur just one base pair away from the mutation site. The protospacer is not GC-rich, which is an advantage. The mutation site falls within the protospacer, which is an advantage. Here's the third possible protospacer. The break will occur a little further away than the last one, four base pairs away. The protospacer is not GC-rich, which is an advantage.
While the protospacer would not be affected by the mutation, the PAM itself would be, and that would essentially eliminate the possibility of re-cleavage, which is an advantage. On paper, the second protospacer appears to be the optimal one, since it results in cleavage very close to the mutation site. However, the best way to choose among the three candidates may be to actually test them in human cells to empirically assess which has the highest on-target efficiency in vitro. If the second protospacer shows much less activity than the first and/or third protospacer, then it may not be the best choice after all. Let's assume that the second protospacer turns out to be the best choice. The next step is to design the oligonucleotides that we can use to place the protospacer sequence into the plasmid that will express the guide RNA in cells. Recall that we have to add an extra guanine, shown in red, to the beginning of the protospacer sequence, shown in blue. We can use these template oligos to design the oligos that will specifically target the site of the ANGPTL3 S17X mutation. Note that the templates shown here are displayed in such a way as to convey how they'll hybridize to form a small double-strand DNA insert that can be ligated into the vector. When the desired protospacer is encoded into the oligos, you see the result at the bottom of the slide. At this point, we can simply purchase these oligos from a vendor. If we were simply trying to knock out the gene, we'd be done, since all we'd need are the guide RNA plasmid, which we can now complete with a single ligation step, and the fixed Cas9 plasmid that can be used as is. However, if we wanted to knock in the specific S17X mutation, we'd need a repair template as well, ideally a single-strand DNA oligonucleotide. It's worth emphasizing that knocking in a mutation relies on homology-directed repair.
However, even in the best-case scenario, non-homologous end joining will occur in parallel with homology-directed repair. So even if you're adding a single-strand DNA oligo as a repair template, you'll likely obtain a mix of cells, some with the S17X mutation, and others with indels at the site of the desired mutation. There's not yet a standard method to enrich for the first type, which we want, and prevent the second type from occurring, which we may or may not want, although such methods are under development and are starting to be reported in the literature. To design the oligo repair template, you can simply take the desired mutation and flank it with at least 40 nucleotides of homology on both sides, taken directly from the genomic sequence. In this example, the dinucleotide S17X mutation is embedded in the middle of a single-strand DNA oligo, with 40 nucleotides of homology on either side. We can simply purchase this oligo from a vendor. In practice, we'd probably choose to use even longer regions of homology, as it's feasible to obtain oligos that are up to 200 nucleotides in length, and there's data to suggest that longer homology arms will increase the efficiency of homology-directed repair. The final step is to develop a method by which to screen for mutations introduced at the target site. The most straightforward way to do this is to design PCR primers that will amplify a region surrounding the target site in the genome, ideally with the target site located in the middle of the amplicon. The amplified PCR product can be used to assess the overall mutagenesis rate through the use of assays that detect mismatches among the DNA sequences present in the PCR product. The PCR product can also be subjected to Sanger sequencing, or next-generation sequencing, in order to identify the specific mutations introduced at the target site.
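The repair template design described above, together with the earlier point about re-cleavage, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than an established tool: the function names are hypothetical, the mutation is assumed to be a substitution that doesn't change the length of the sequence, and the re-cleavage check assumes a Strep pyogenes guide whose protospacer reads along the strand you pass in (for a guide on the opposite strand, you'd pass in reverse-complemented sequences). The first function flanks the mutation with homology arms taken from the genomic sequence. The second counts how many protospacer bases the knock-in changes and whether the PAM still matches NGG.

```python
def build_repair_template(genomic: str, mut_start: int, ref: str, alt: str,
                          arm: int = 40) -> str:
    """Flank the mutant bases with `arm` nucleotides of genomic homology per side.

    mut_start is the 0-based index of the first base being replaced.
    """
    if genomic[mut_start:mut_start + len(ref)] != ref:
        raise ValueError("reference bases do not match the genomic sequence")
    left = mut_start - arm
    right = mut_start + len(ref) + arm
    if left < 0 or right > len(genomic):
        raise ValueError("genomic sequence is too short for the requested arms")
    return genomic[left:mut_start] + alt + genomic[mut_start + len(ref):right]


def recleavage_check(genomic: str, edited: str, proto_start: int,
                     length: int = 20) -> dict:
    """Compare the protospacer and PAM before and after a same-length edit."""
    if len(genomic) != len(edited):
        raise ValueError("this sketch only handles same-length substitutions")
    pam_start = proto_start + length
    before = genomic[proto_start:pam_start]
    after = edited[proto_start:pam_start]
    return {
        # Bases in the protospacer that no longer match the guide RNA.
        "protospacer_mismatches": sum(a != b for a, b in zip(before, after)),
        # True if the edited allele still carries an NGG PAM at this site.
        "pam_intact": edited[pam_start + 1:pam_start + 3] == "GG",
    }


# Usage with a made-up sequence (not the real ANGPTL3 sequence):
# a TCC codon is changed to TGA by replacing "CC" with "GA".
toy_genomic = "ACGT" * 10 + "TCC" + "TGCA" * 10
toy_template = build_repair_template(toy_genomic, 41, "CC", "GA", arm=40)
```

If the check reports no protospacer mismatches and an intact PAM, the knock-in allele would still be a perfect match for the guide RNA, which is the re-cleavage scenario described earlier. Using a larger `arm` value gives the longer homology arms mentioned above.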
Info
Channel: AHAScienceNews
Views: 38,047
Rating: 4.9631338 out of 5
Keywords: CRISPR, Cas9, Genome Editing, ATVB, 2015, American Heart Association (Nonprofit Organization), Health (Industry)
Id: h18HmFtybnQ
Length: 34min 19sec (2059 seconds)
Published: Mon Apr 06 2015