In this video, I want to briefly
introduce the idea of transgenic
organisms, also called genetically
modified organisms. These are organisms
whose genomes have been permanently
altered through genetic engineering. Note
that this is different from adding a
plasmid to the organism, because the
plasmid is not part of the genome and
can be lost from the organism. Studying a
purified protein can take us only so far
in understanding its function. Ultimately,
we want to understand its role in the
living organism, where its expression is
subject to regulation, and where it
interacts with all the other molecules
present. Manipulating a gene in an
organism's genome is a good way to gain
information about what the gene product
is doing in that organism. We can make
three main types of changes to a gene: we
can replace a normal gene with a
modified version, we can eliminate or
knock out the gene, or we can add a gene,
which generally results in higher
expression than usual. Observing the
changes that occur in any of these
situations can give valuable information
about the role of the gene. A powerful
method to edit genomes is the CRISPR-
Cas9 system. CRISPR stands for
clustered regularly interspaced short
palindromic repeats. Cas9 is an
endonuclease that creates double-strand
breaks in DNA.
Unlike restriction endonucleases, each of
which cuts DNA at a defined, unchanging
target sequence, Cas9 is directed to
cut specific sites on DNA by a guide RNA
that binds to the enzyme. Without the
guide RNA, Cas9 will not cut DNA. This
means that researchers can direct Cas9
to cut DNA at specific sites by
providing an appropriate guide RNA. The
guide RNA is about 100 nucleotides long,
and contains about 20 nucleotides that
specify the site at which Cas9 will
cut the DNA. Because the sequence used
for recognition of the cut site is so
long, a complementary sequence will occur
quite rarely just by chance. For example,
a given 20-nucleotide sequence is
expected to occur about once
in every 4 to the power of 20 bases in a
random sequence of DNA, or about once in
a trillion bases. The largest genome
known has about 149
billion bases in it, which is about 50
times larger than the human genome. So
it's quite unlikely that a targeted 20-
nucleotide sequence will be present more
than once in a typical organism's genome.
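
As a rough check on that arithmetic, here is a minimal Python sketch. It assumes a random sequence in which all four bases are equally likely, and it counts only one strand. The function name and the rounded genome sizes are illustrative choices, not part of any standard tool; the human genome figure of about 3 billion bases is the approximation implied by the 50-fold comparison above.

```python
# Expected number of chance occurrences of one specific target sequence,
# assuming random DNA with equal base frequencies (a simplifying assumption).

def expected_chance_matches(genome_size: int, target_length: int = 20) -> float:
    """Expected exact matches to a single target on one strand of random DNA."""
    possible_sequences = 4 ** target_length  # number of distinct 20-mers
    return genome_size / possible_sequences

HUMAN_GENOME = 3_000_000_000      # about 3 billion bases (approximate)
LARGEST_GENOME = 149_000_000_000  # about 149 billion bases, as quoted above

print(4 ** 20)                                  # 1099511627776, about 1.1 trillion
print(expected_chance_matches(HUMAN_GENOME))    # roughly 0.003
print(expected_chance_matches(LARGEST_GENOME))  # roughly 0.14
```

Both expected counts are well below one, which is why a second exact match by chance is unlikely even in the largest genome mentioned. Real genomes are not random, so these figures are only an order-of-magnitude guide.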
Thus we should be able to target sites
for gene editing very specifically using
the CRISPR-Cas9 system. Now in
practice, the specificity of CRISPR is
reduced from what is theoretically
possible, because the system is tolerant
of a few mismatches between the guide
RNA and the target site. After Cas9
has made a double-strand break in the
genome, the organism's double-strand break
repair machinery will try to fix the
break. If it does so by non-homologous
end joining, then it is likely that bases
will be lost or inserted at the repair site. If the site
is toward the beginning of a gene,
there's a good chance that the gene will
be inactivated or knocked out. Otherwise,
the organism might repair the double-
strand break by homologous recombination.
If the researcher has provided some
replacement DNA, there's a chance the
organism will copy that
during its repair of the double-strand
break, in which case an altered version
of the gene of interest could be created.
Alternatively, an entire gene could be
added at the site of repair. CRISPR-Cas9
has made it much easier to create
transgenic organisms, enabling us to
study gene function in the whole
organism. But for this to be useful, we need to
have a lot of information about the
genome of the organism under study. In
the next series of videos, I'll go over
some basics of sequencing and
annotating genomes.
