In 1977 Fred Sanger’s lab developed a method for determining the DNA sequence of short fragments. I touched on this briefly in a prior post published here at the time of Dr. Sanger’s death.
Over the subsequent decades the technique was refined and eventually transformed into a single-tube automated reaction; however, the basic method remains the same. Three basic principles underlie the Sanger dideoxy DNA sequencing method.
The first principle is that DNA sequencing is a modified version of the replication reaction that occurs whenever a cell divides. Replication is accomplished by stringing nucleotides together according to the original DNA molecule, which serves as a template. For a brief review of this replication reaction, see the animation below from HHMI.
The second principle is gel electrophoresis, the use of acrylamide gels to separate DNA strands based on their length. Acrylamide forms a weblike polymer sieve through which molecules (like DNA) can move. Because larger molecules get hung up on the threads of this web more often than smaller molecules do, the larger ones cover less distance in the same amount of time. Also, because DNA has a uniform negative charge spread out along its length, when an electrical current is run through the gel, the DNA will migrate toward the positive pole. If the acrylamide is made at just the right density, the DNA fragments can be separated with such precision that fragments differing in length by a single base are distinguishable.
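To make the sieving idea concrete, here is a small Python sketch of a gel. It is only a toy: the `migration_distance` formula (distance falling off with the logarithm of fragment length), the gel length and the size limit are my own illustrative assumptions, not measured values. The one thing it is meant to show is the ordering, with shorter fragments ending up farther from the wells.

```python
import math

def migration_distance(length_nt, gel_length_cm=20.0, max_length_nt=500):
    """Toy model (an assumption, not real physics): distance travelled
    shrinks with the log of the fragment length, so a 1-base fragment
    runs the whole gel and a max_length_nt fragment stays in the well."""
    distance = gel_length_cm * (1 - math.log(length_nt) / math.log(max_length_nt))
    return max(0.0, distance)

def run_gel(fragments):
    """Order fragments from the positive pole (farthest travelled)
    back toward the wells, i.e. shortest first."""
    return sorted(fragments, key=len)

# Hypothetical fragments loaded into one lane
loaded = ["ACGTAC", "A", "ACGTACGTAC", "ACG"]
for fragment in run_gel(loaded):
    print(len(fragment), round(migration_distance(len(fragment)), 1), fragment)
```

Even in this crude model, two fragments that differ by a single base land at slightly different distances, which is the property sequencing depends on.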
In the animation below, four tubes are prepared, each with fragments of one size. These are loaded into ‘wells’ in an acrylamide gel and then subjected to an electrical current.
The third principle comes from the nature of DNA itself and the chemistry of the nucleotides that make it up.
DNA is a long polymer made up of many nucleotides. The name, DNA, stands for deoxyribonucleic acid, which describes the molecule chemically. The prefix ‘de-’ means that DNA nucleotides lack something that standard ribonucleotides have. The ‘oxy’ part tells us what is missing: a hydroxyl (-OH) group. (See figure below.) This first hydroxyl group, at the 2′ position of the sugar, is the one that determines the difference between DNA and RNA.
A ‘dideoxy’ molecule lacks an additional hydroxyl group (dideoxy = two hydroxyls missing).
This second hydroxyl is removed from the 3′ position, which forms part of the backbone of the molecule and is required for the next nucleotide to attach in a polymerization reaction. Without it, DNA replication comes to a screeching halt. If a sequencing reaction, which is a form of polymerization reaction, includes a portion of these dideoxynucleotides, then the incorporation of one of them will terminate the reaction at a known base.
Because DNA is composed of the four bases, (A)denine, (T)hymine, (C)ytosine and (G)uanine, deoxynucleotides with each of these four bases are required for DNA synthesis. If a synthesis reaction is supplied with all four of these in ample supply, then synthesis will proceed smoothly. If one of these is omitted and replaced with only the dideoxynucleotide version, then synthesis will proceed until that dideoxynucleotide is incorporated. Because this nucleotide lacks the hydroxyl group required to attach a subsequent nucleotide, the reaction stops.
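As a rough illustration of that last case, the sketch below copies a template one base at a time and halts the moment the chosen dideoxy base goes in. The template sequence and the function name `synthesize_with_only_dideoxy` are made up for the example, and I am assuming the template is written 3′→5′ so that the new strand can be built left to right.

```python
# Watson-Crick pairing: the base added opposite each template base
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesize_with_only_dideoxy(template, dd_base):
    """Build the new strand along `template`. Every copy of `dd_base`
    in the tube is a dideoxy terminator (no normal version is supplied),
    so synthesis stops at the first place that base is needed."""
    new_strand = []
    for template_base in template:
        incoming = COMPLEMENT[template_base]
        new_strand.append(incoming)
        if incoming == dd_base:
            break  # no 3' hydroxyl, so nothing more can be attached
    return "".join(new_strand)

template = "GCATGCTAG"  # hypothetical template, written 3'->5'
for dd_base in "ACGT":
    print(dd_base, synthesize_with_only_dideoxy(template, dd_base))
```

Each of the four runs stops at the first occurrence of its base and says nothing about any later occurrence.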
This doesn’t give us much information, however, because we can only read up to the first occurrence of each type (A, T, C or G). Instead, all four deoxynucleotides are supplied to each of four tubes, and a small proportion of one dideoxynucleotide is added to each tube. In this way, each synthesis reaction proceeds until a dideoxynucleotide happens to be added, and this may occur at a different occurrence of that nucleotide in each instance of synthesis.
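A short simulation shows why a small proportion of dideoxynucleotide is enough. This is a sketch under my own assumptions: the template, the 20% dideoxy fraction, the number of molecules and the function name `sanger_reaction` are all invented for illustration, and the template is again taken to be written 3′→5′. A real reaction involves far more molecules and carefully tuned ratios.

```python
import random
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sanger_reaction(template, dd_base, dd_fraction=0.2, n_molecules=1000, seed=0):
    """Simulate one tube: many copies of the template are replicated, and
    each time `dd_base` is needed there is a `dd_fraction` chance that the
    dideoxy version is picked up, which ends that strand."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    fragments = []
    for _ in range(n_molecules):
        strand = []
        for template_base in template:
            incoming = COMPLEMENT[template_base]
            strand.append(incoming)
            if incoming == dd_base and rng.random() < dd_fraction:
                break  # dideoxy incorporated, chain terminated
        fragments.append("".join(strand))
    return fragments

template = "GCATGCTAGTCATG"  # hypothetical template, written 3'->5'
fragments = sanger_reaction(template, "A")
band_counts = Counter(len(fragment) for fragment in fragments)
for length in sorted(band_counts):
    print(length, band_counts[length])
```

The only fragment lengths that turn up are those ending at an A in the new strand, plus the full-length product from strands that never picked up a dideoxy-A.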
Consider the template sequence below in black. Replicative strands are made using deoxyribonucleotides (in black) and dideoxy-A (in red).
When these fragments are run on a gel, we can visualize a band at positions corresponding to the occurrence of each ‘A’ nucleotide in the sequence.
In the same way, three additional reactions are run, one for each of the other dideoxynucleotides, and each is run on a separate lane of the gel. Altogether, these four lanes provide a complete account of the original DNA sequence.
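Reading the gel can be sketched in a few lines as well. Here each lane is reduced to the set of fragment lengths at which a band appears, and the sequence is read from the shortest band to the longest. The strand, `lane_bands` and `read_gel` are hypothetical names and data for illustration, and the sketch assumes an idealized gel with exactly one band per position.

```python
def lane_bands(new_strand, dd_base):
    """Band positions for one lane: a fragment of length i exists
    wherever the i-th base of the new strand is `dd_base`."""
    return {i + 1 for i, base in enumerate(new_strand) if base == dd_base}

def read_gel(lanes):
    """`lanes` maps each base to its set of band lengths. Walking from
    the shortest fragment to the longest, the lane holding each band
    tells us which base sits at that position."""
    base_at_length = {
        length: base for base, bands in lanes.items() for length in bands
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))

new_strand = "CGTACGATCAGTAC"  # hypothetical newly synthesized strand, 5'->3'
lanes = {base: lane_bands(new_strand, base) for base in "ACGT"}
for base in "ACGT":
    print(base, sorted(lanes[base]))
print(read_gel(lanes))
```

The sequence read off the gel is that of the newly made strand; the template is its complement.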