16 September 2009

History of genomics. Part 2: DNA technologies

Alexander Panchin, M.Sc. of the A.A. Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences, postgraduate student at the Faculty of Bioengineering and Bioinformatics of Moscow State University.

The first part of the article described how the first methods of reading genetic sequences appeared, what they were, and how genomics moved from reading individual genes to reading complete genomes, including the complete genomes of specific people.

Before telling the story of how new sequencing methods emerged, I would like to mention several extraordinary DNA-related technologies that appeared quite recently, at the peak of the development of genomics. Figure 1 shows images obtained by the self-assembly of DNA. A test tube can hold millions of such pictures, and they form by themselves under the right conditions [1]. This work appeared in 2006 and was called "DNA origami", or, more generally, "DNA nanotechnology".


Fig. 1. Images of ordered structures made up of DNA molecules under an electron microscope

But another type of technology plays the key role in the development of genomics: the technology of sequencing (reading) genomes. We have already discussed the Gilbert-Maxam method of 1973, followed by the Sanger chain termination method of 1975 and Hood's improved Sanger method of 1985, which made the first automatic sequencers possible. As the history of the Human Genome Project showed, these methods were still very expensive and slow: sequencing the first human genome cost more than $3 billion and took 13 years. By the beginning of the 21st century, chain-termination sequencers had reached their limit: one capillary can read 1000 nucleotides in 1 hour for $1, and one machine holds up to 100 capillaries. The price of $1 is in fact the minimum: in Russia, the same 1000 "letters" cost $15-20 to read.
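
The rates quoted above allow a rough estimate of why chain-termination sequencing could not scale to whole genomes. The sketch below assumes a genome of about 3 billion nucleotides (a round figure for the human genome that is not taken from this article) and a single pass with no repeated reads, so a real project would need considerably more time.

```python
# Rates quoted in the text for a chain-termination sequencer.
NUCLEOTIDES_PER_CAPILLARY_HOUR = 1000
CAPILLARIES_PER_MACHINE = 100

# Assumption: a round figure for the human genome, not taken from the article.
GENOME_SIZE = 3_000_000_000


def hours_for_single_pass(genome_size, machines=1):
    """Hours needed to read every nucleotide once on the given number of machines."""
    per_hour = NUCLEOTIDES_PER_CAPILLARY_HOUR * CAPILLARIES_PER_MACHINE * machines
    return genome_size / per_hour


hours = hours_for_single_pass(GENOME_SIZE)
years = hours / (24 * 365)
print(f"One machine: {hours:,.0f} hours, about {years:.1f} years")
```

Under these assumptions a single machine needs 30,000 hours of continuous operation, or roughly three and a half years, for one pass over the genome.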

Today, more and more advanced sequencing methods are appearing, and they can be divided into two groups. The older methods rely on various ingenious ways of obtaining a large number of identical DNA molecules on a carrier in order to amplify the signal produced when the DNA is read. The newest methods rely on ultra-precise devices capable of analyzing single molecules.

Before discussing the first group of devices, we need to look at the polymerase chain reaction (PCR), which can multiply a single DNA molecule into a million or even a billion copies. The technology was developed in 1984-1986 by Kary Mullis [2]. At a high temperature (about 95 °C), the DNA molecule denatures: the double-stranded molecule separates into two single strands. At a lower temperature (usually 40-60 °C), a primer can attach to a single-stranded DNA molecule; a primer is a short fragment whose sequence is complementary to about 20 nucleotides of the long DNA molecule. At about 72 °C, a special enzyme, a DNA polymerase isolated from bacteria that live in hot springs, can extend the primer and turn the single-stranded molecule back into a double-stranded one. If we know the sequence of 20 nucleotides at the left end and at the right end of a region of a large DNA molecule, then with two primers (left and right) we can multiply the DNA fragment between them exponentially. To do this, the reagent mixture is cycled: heated to 95 °C (denaturation), cooled to 60 °C (primer annealing), heated to 72 °C (chain extension) and then denatured again (the temperatures are approximate and depend on the specific primers and polymerase). This is shown schematically in Fig. 2. Remarkably, a successful PCR theoretically needs only a single template molecule, so one can take the DNA from one cell and multiply any fragment of it any number of times.


Fig. 2. Three cycles of polymerase chain reaction. Primers are shown in dark red and dark green.
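
The logic of PCR can be illustrated with a toy model on text strings. This is only a sketch: the function names are invented for the example, the primers are shortened to 6 letters, and the yield is idealized (every cycle is assumed to double the number of fragments, which real reactions only approximate).

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Sequence of the opposite strand, read in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon(template, left_primer, right_primer):
    """Fragment of `template` bounded by the two primers, or None.

    The left primer has the same sequence as the start of the fragment on
    the given strand; the right primer anneals to this strand, so its
    reverse complement is what appears in `template`.
    """
    start = template.find(left_primer)
    if start == -1:
        return None
    right_site = reverse_complement(right_primer)
    end = template.find(right_site, start + len(left_primer))
    if end == -1:
        return None
    return template[start:end + len(right_site)]


def copies_after(cycles, initial=1):
    """Idealized yield: every cycle doubles the number of fragments."""
    return initial * 2 ** cycles


# Toy 6-letter primers; real primers are about 20 nucleotides long.
template = "TTACGATCGGATTACAGGCATCGAAGTC"
fragment = amplicon(template, "ACGATC", "ACTTCG")
print(fragment, copies_after(20), copies_after(30))
```

In this idealized model, twenty cycles turn one template molecule into about a million copies and thirty cycles into about a billion.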

Fig. 3. Technology 454. 44nm balls are used, on which DNA molecules multiply.

In the same year, 2005, the laboratory of George Church [4] proposed another method that uses the same kind of balls, only smaller. These balls, about a micron in size and each carrying many identical copies of a DNA molecule, are attached to a special piece of glass. Next, a mixture of primers of the form XNNNNNNNN is added, where N is any nucleotide and X is a known nucleotide with a dye molecule of a particular color attached (four colors, one for each "letter"). Without going into details, the color of the primers that stick to a given ball tells us which nucleotide occupies the corresponding position in the DNA on that ball. Primers of the form XNNNNNNNN reveal the first "letter", NXNNNNNNN the second, and so on. The price is about 9 times lower than that of Sanger sequencing.

The third original method of reading DNA was proposed by Illumina in 2005 and implemented as the Solexa platform [5], a step toward the goal of the "$1,000 genome". The DNA molecule is cut into fragments, and two different adapters are ligated to the two ends of each fragment. The very same adapters, which serve as primers, are attached to a special piece of glass. A small number of DNA molecules is applied, and they stick to the glass at random places, after which the enzymes needed for PCR are added. Since all the primers are fixed to the glass, new copies of a DNA molecule form right next to it. In this way a whole "forest" of identical molecules grows around each initial DNA molecule (Fig. 4), and millions of such clusters form. Next, free-floating primers and terminating nucleotides of all four types, labeled with fluorescent dyes of different colors, are added, so that each DNA molecule gains a primer and exactly one nucleotide, whose color can be determined using a laser. The nucleotide is then chemically modified so that the next terminating nucleotide can attach to it, and so on.


Fig. 4. Illumina Solexa - DNA clusters.
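
The cycle described above (one labeled terminating nucleotide per cycle, its color recorded, then the next) can be sketched in a few lines. The dye colors and cluster names are invented for the example, and the template is written in the order in which the polymerase meets its bases; the real chemistry and image processing are far more involved.

```python
# Hypothetical dye colours; the article only says that each of the four
# terminating nucleotides carries a dye of a different colour.
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def colours_per_cycle(template, cycles):
    """Colour emitted by one cluster in each cycle.

    Each cycle adds exactly one terminating nucleotide, the one that pairs
    with the next template base, and its dye is then imaged.
    """
    colours = []
    for base in template[:cycles]:
        incorporated = COMPLEMENT[base]
        colours.append(DYE_FOR_BASE[incorporated])
        # In the real chemistry the nucleotide is now unblocked so that the
        # next cycle can extend the chain by one more position.
    return colours


def call_read(colours):
    """Translate the recorded colours back into the synthesized strand."""
    return "".join(BASE_FOR_DYE[colour] for colour in colours)


# All clusters on the glass are imaged in the same cycle, so reads grow in parallel.
clusters = {"cluster_1": "TACGGA", "cluster_2": "GGCATT"}
reads = {name: call_read(colours_per_cycle(t, 4)) for name, t in clusters.items()}
print(reads)
```

Each read is the strand complementary to the template, one letter per cycle, which is why the read length of such a device is set by the number of cycles it runs.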

But a few years after these sequencers appeared, much had changed. Modern detectors can register signals from individual molecules, which removes the need to strengthen the signal by amplifying DNA (as Solexa and 454 do). Since 2008, the question "how is the sequence of a DNA molecule read?" can be answered: "we take a molecule and read it." Helicos has developed a sequencer of a new generation [6]. In this method, DNA is cut into pieces, a special enzyme attaches a long run of adenine nucleotides ("letters A") to one end of each fragment, and a glowing label is placed at the end. DNA molecules consisting of many "letters T" are attached to a special piece of glass. When the sample is poured onto the glass, the fragments stick by their "A" sections to the "T" molecules on the glass, following the principle of complementarity. Next, a special microscope scans the glass and finds the position of each attached fragment by the glow of its label (Fig. 5). The label is washed off, and labeled nucleotides of each of the four types are added in turn. After each addition the glow of the molecules is recorded, the label is washed off again, and the next nucleotide is attached. A computer records the positions of millions of flashes after each reaction. Such a system can read billions of nucleotides per day.


Fig. 5. Helicos technology. Each glowing dot is a separate nucleotide!
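
Adding one type of labeled nucleotide at a time and recording which molecules flash can be modeled for a single molecule as follows. The sketch assumes a hypothetical order of additions (C, T, A, G, repeated) and at most one nucleotide incorporated per addition; neither detail is taken from the article.

```python
from itertools import cycle, islice

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
FLOW_ORDER = "CTAG"  # hypothetical order in which nucleotide types are added


def simulate_flashes(template, n_flows):
    """For each addition, record whether the molecule lit up.

    `template` is written in the order in which the polymerase meets its
    bases. A flash means the added nucleotide paired with the next
    template base and was incorporated.
    """
    flashes = []
    position = 0
    for base in islice(cycle(FLOW_ORDER), n_flows):
        incorporated = (
            position < len(template) and COMPLEMENT[template[position]] == base
        )
        flashes.append(incorporated)
        if incorporated:
            position += 1
    return flashes


def decode(flashes):
    """Rebuild the synthesized strand from the sequence of flashes."""
    return "".join(base for base, lit in zip(cycle(FLOW_ORDER), flashes) if lit)


flashes = simulate_flashes("TTGCA", 16)
print(decode(flashes))
```

The computer only needs to know the order of additions and, for every position on the glass, the additions after which a flash was seen.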




But this, it turned out, was not the limit. In 2009, a revolutionary new method of reading DNA appeared that uses the same tool a cell uses when it divides: DNA polymerase, the enzyme that duplicates DNA. The device developed in the laboratory of Stephen Turner [7] is truly unique. Single DNA polymerase molecules are attached to the bottom of special cells on transparent glass (Fig. 6). A special laser illuminates a tiny area immediately around the fixed enzyme. Nucleotides of all four types, each marked with a different glowing label, are added to the cell. The area analyzed by the laser is so small that a free nucleotide normally does not stay in it for long: it floats away and leaves no signal. If, however, the polymerase captures a nucleotide while duplicating DNA, the nucleotide is held in the scanned area and produces a signal. Once the nucleotide has been attached, the glowing label falls off and floats away. The reading speed of such a device is therefore equal to the speed of the polymerase.


Fig. 6. Pacific Biosciences SMRT technology: reading single nucleotides as DNA polymerase attaches them to the growing DNA chain.
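
The key idea here, that only a nucleotide held by the polymerase stays in the illuminated area long enough to be seen, amounts to filtering signals by their duration. The sketch below uses invented durations and a hypothetical 1 ms threshold purely for illustration.

```python
from typing import NamedTuple


class Pulse(NamedTuple):
    base: str           # nucleotide identified by the colour of the label
    duration_ms: float  # how long the label stayed in the illuminated area


def call_bases(pulses, min_duration_ms=1.0):
    """Keep only pulses long enough to indicate capture by the polymerase.

    The threshold is a hypothetical value chosen for this toy example.
    """
    return "".join(p.base for p in pulses if p.duration_ms >= min_duration_ms)


# Invented trace: short pulses are nucleotides drifting past, long ones are
# nucleotides held by the polymerase while being attached.
trace = [
    Pulse("A", 0.01),
    Pulse("G", 12.0),
    Pulse("T", 0.02),
    Pulse("T", 9.5),
    Pulse("C", 15.0),
    Pulse("A", 0.005),
    Pulse("A", 11.0),
]
read = call_bases(trace)
print(read)
```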

The Pacific Biosciences video can be viewed on the company's website.

Thus, over the past 10 years, there has been a revolution in our ability to read genomes. The price of sequencing a thousand nucleotides fell from about $1 with the Sanger method in the 1990s to 5 cents with 454 technology in 2005, then in the same year to 0.2 cents with Illumina technology, and finally to 0.05 cents with Helicos technology. The price of reading DNA with the latest sequencers from Complete Genomics and Pacific Biosciences, which are likely to reach the market this year and next, is still unknown, but the forecasts are very optimistic.
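
To see what these prices mean for a whole genome, they can be scaled up from a thousand nucleotides. The calculation below assumes a genome of about 3 billion nucleotides (a round figure for the human genome, not taken from this article) read exactly once; real sequencing reads every position many times, so actual costs are higher.

```python
from decimal import Decimal

# Prices per 1000 nucleotides in US dollars, as quoted in the text
# (Sanger taken as $1; 5 cents, 0.2 cents and 0.05 cents for the others).
PRICE_PER_1000 = {
    "Sanger": Decimal("1"),
    "454": Decimal("0.05"),
    "Illumina": Decimal("0.002"),
    "Helicos": Decimal("0.0005"),
}

# Assumption: a round figure for the human genome, not taken from the article.
GENOME_SIZE = 3_000_000_000


def single_pass_cost(method, genome_size=GENOME_SIZE):
    """Cost in dollars of reading every nucleotide of the genome once."""
    return PRICE_PER_1000[method] * genome_size / 1000


for method in PRICE_PER_1000:
    print(f"{method}: ${single_pass_cost(method):,.0f}")
```

Under these assumptions a single pass costs $3,000,000 with the Sanger method, $150,000 with 454, $6,000 with Illumina and $1,500 with Helicos.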

The question arises: what has genomics given people, and what can it still give? The answer is simple: a lot.

First, sequenced genomes provide the basis for constructing evolutionary trees, which will help us understand exactly how the evolution of living organisms on our planet proceeded. When the cost of reading an animal genome approaches the threshold of $1,000, it will become possible to read the genomes of all living organisms, to investigate the kinship between any groups of them, and to develop a new taxonomy.

Second, we are learning more and more about the role of our genes. By comparing the human and ape genomes, we learned that what distinguishes us from other animals is not the number of genes but their quality, their particular features. By comparing the genomes of different animals, we can find out what each gene is responsible for, and perhaps even improve a gene to produce a new, better adapted organism. Genetic data is actively used in the creation of medicines and new crops.

Recent studies of single mutations in genes show a link between certain genetic variations and various diseases and features of the human body and psyche. There are genes that determine whether we will go bald, what eye color and height we have, how high our risk of diabetes or of a certain form of cancer is, whether we have a predisposition to sports, and whether coffee, alcohol or aspirin is good for us. Since 2007, companies such as 23andMe (whose $399 genetic test was the best invention of 2008 according to Time magazine's rating), Navigenics and deCODEme have appeared, which use so-called DNA chips to determine genetic characteristics of this kind. The ability to read entire genomes will greatly improve the predictive power of such projects: the reliability, detail and accuracy of their predictions and the adequacy of their recommendations (Fig. 7).

Fig. 7. Our knowledge about man in the "pre-genomic era"

Fig. 8. Our knowledge of man at the beginning of the genomics era

Today we still do not know how to change the genes of adults, but we can prevent many genetic diseases if we know the cause and begin prevention or treatment in advance. For example, studies of a gene associated with a progressive form of myopathy, and of the processes that occur when it is read in the cell, have made it possible to develop drugs that save people who would otherwise be doomed. The study of gene functions opens the way to personal pharmacogenomics, an individual approach to treatment. People with different genes should be treated differently: one medicine will help one person, another will help someone else, and it is important to know which.

Of course, such opportunities will give rise to new social, ethical and philosophical problems. Health insurance companies may want information about their customers' genomes in order to decide whether insuring them is profitable: should they give a discount to a person whose genes promise good health, or make a potential patient pay more for insurance? Is that right? Should athletes compete in different categories according to the alleles of genes related to physical fitness, just as lightweight boxers do not fight heavyweights? Is it acceptable to use genetic information to prefer one employee over another in hiring or promotion? In the US, such discrimination is already prohibited, but the US Army conducts its own genetic tests. Already today, knowledge of potentially health-threatening genes helps to choose the healthiest embryos during in vitro fertilization, but is it permissible to change the genes of future people in a gamete (sperm or egg) or in a fertilized egg?

In any case, as the history of recent decades shows, the development of these technologies is inevitable, and it is just as inevitable that they will someday become available (if not mandatory) for everyone. It remains for us to prepare for the coming changes and to take part in them.

Literature:

  1. Rothemund PW: Folding DNA to create nanoscale shapes and patterns. Nature 2006, 440(7082):297-302.
  2. Mullis K, Faloona F, Scharf S, Saiki R, Horn G, Erlich H: Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction. Cold Spring Harb Symp Quant Biol 1986, 51 Pt 1:263-273.
  3. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z et al: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437(7057):376-380.
  4. Shendure J, Porreca GJ, Reppas NB, Lin X, McCutcheon JP, Rosenbaum AM, Wang MD, Zhang K, Mitra RD, Church GM: Accurate multiplex polony sequencing of an evolved bacterial genome. Science 2005, 309(5741):1728-1732.
  5. Bennett ST, Barnes C, Cox A, Davies L, Brown C: Toward the 1,000 dollars human genome. Pharmacogenomics 2005, 6(4):373-382.
  6. Harris TD, Buzby PR, Babcock H, Beer E, Bowers J, Braslavsky I, Causey M, Colonell J, Dimeo J, Efcavitch JW et al: Single-molecule DNA sequencing of a viral genome. Science 2008, 320(5872):106-109.
  7. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B et al: Real-time DNA sequencing from single polymerase molecules. Science 2009, 323(5910):133-138.

Portal "Eternal youth" www.vechnayamolodost.ru
