19 October 2009

Personal genomics: it's time to put things in order

A selection of materials on the state of the genetic analysis market has been published in the October issue of the journal “Nature”.

The personal genetic testing industry is developing so rapidly that regulators cannot keep pace with the companies working in this field. To date, more than a dozen companies offer inexpensive test kits for home use. Having your genome analyzed is extremely simple: you order a sample collection kit online and mail back a test tube with a saliva sample or a cheek swab. A few weeks later, on the company's website, you can review your personal list of risks of developing various diseases, compiled by scanning between five hundred thousand and a million genetic markers and analyzing their combinations.

However, the value of such testing is still controversial. That is why the industry is in dire need of strict regulations that will serve not only the health of customers, but also the reputation of the companies themselves.

In September, the UK Human Genetics Commission issued a set of principles aimed at helping clients of companies engaged in genetic testing and improving the quality of personal genomics services. However, these principles relate primarily to advertising, which should contain reliable information about the limited possibilities of genetic testing, including the fact that the results obtained cannot be equated with medical diagnoses.

Sooner or later, the government will still have to intervene in the situation, since the self-regulation of the industry does not provide sufficient protection of consumer rights. Given the incredible speed of the industry's development, government representatives have a difficult job ahead, and in the meantime, companies offering DNA testing should accept the rules of the game and provide customers with only clinically valuable information, accompanied by a detailed explanation of unclear points.

Over time, the cost of such services will inevitably fall, and their popularity will grow accordingly. However, as the number of people seeking reliable information about aspects of their health that need attention increases, the accuracy of the data provided is decreasing. The reason is that ongoing genome-wide association studies keep revealing new markers of disease risk, which increasingly complicates the interpretation of the data obtained. In particular, researchers often pay too much attention to markers that are not the most indicative of complex traits.

To date, more than 1,000 gene variants associated with various traits and diseases have been identified. Based on these data, companies offer DNA testing to anyone who wishes to uncover the secrets of their genotype. Such information is of great value for making disease prevention more effective, but its interpretation is still far from perfect.

As a result, the information consumers receive will eventually be reduced to a vague statement about an uncertain possibility of a slightly increased risk of some obscure disease. In this case, most clients will simply pay more attention to regular medical checkups. However, if an ill-defined, slightly increased risk concerns, for example, breast cancer, some particularly anxious individuals may take radical and possibly unnecessary measures, such as preventive mastectomy.

The Agenda for Personalized Medicine

Some experts doubt the accuracy of the results of genetic testing.

However, Craig Venter and his colleagues, comparing the results of testing for predisposition to 13 diseases provided to five individuals by two DNA analysis laboratories, concluded that the reproducibility of the primary data obtained by 23andMe and Navigenics specialists is very high and corresponds to the 99.7% claimed by the companies.

There are two more controversial issues: do the predicted risks have clinical value and how accurate can the correlation between a genetic variant and a certain disease be? Some of the people who have used genetic testing services claim that different companies have provided them with different forecasts for the same disease.

A study of this question showed that the estimates of absolute risk, that is, the probability that an individual will develop a particular disease, differ quite significantly between 23andMe and Navigenics. This parameter is calculated from two other indicators: the relative risk and the average population risk. The relative risk depends solely on the individual genotype, whereas the average population risk depends heavily on how the population is defined. Navigenics separates average population risks for men and women (for example, heart attacks occur more often in men than in women), whereas 23andMe mainly takes age into account (for example, the likelihood of developing rheumatoid arthritis increases with age). Discrepancies in how the population is defined reduce the quality of interpretation of the absolute risks of developing diseases.
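To see why the definition of the population matters, here is a minimal sketch. The article does not give the formula either company uses, so the multiplication of relative risk by average population risk is an assumed simplification, and all numbers and names (`absolute_risk`, the baseline values) are hypothetical.

```python
def absolute_risk(relative_risk: float, population_risk: float) -> float:
    """Assumed simplification: absolute risk = relative risk x average population risk."""
    if relative_risk < 0 or not 0.0 <= population_risk <= 1.0:
        raise ValueError("relative_risk must be >= 0 and population_risk within [0, 1]")
    # A probability cannot exceed 1, so cap the product.
    return min(1.0, relative_risk * population_risk)


# Hypothetical inputs for illustration only; not figures from either company.
relative_risk = 1.3
baselines = {
    "men only (sex-specific baseline)": 0.42,
    "pooled population (hypothetical)": 0.25,
}

for label, population_risk in baselines.items():
    risk = absolute_risk(relative_risk, population_risk)
    print(f"{label}: absolute risk = {risk:.1%}")
```

The same genotype, and therefore the same relative risk, yields noticeably different absolute risks once the baseline population changes, which is the kind of discrepancy described above.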

Even with the exclusion of the influence of the average population risk variable, the authors found that, on average, in the results provided by 23andMe and Navigenics for 5 individuals, only about 60% of the forecasts of relative morbidity risks quantitatively coincided. Moreover, for some diseases, the reproducibility of the results was higher than for others.

The main factor causing discrepancies in the results is the set of markers chosen by the company to calculate the relative risk. Risk markers are identified in genome-wide association studies, in which hundreds of thousands to millions of markers in the genomes of healthy and sick individuals are analyzed. Each marker is represented by different alleles. Alleles that are more common in patients with a clear diagnosis are classified as risk alleles, whose risk coefficient (odds ratio) is higher than 1. For example, in 38% of patients with Alzheimer's disease, the ApoE gene is represented by the ApoE4 risk allele, whereas in the control group this allele occurs in only 14% of cases. The risk coefficient for ApoE4 in this case is calculated as the ratio of the odds of carrying the allele among patients to the odds in the control group, (0.38/0.62)/(0.14/0.86), which is approximately 3.7. The greater the difference in the occurrence of the allele in sick and healthy people, the more pronounced its association with the disease. Conversely, alleles that provide protection from a particular ailment are less common in people suffering from it and have a coefficient lower than one.
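The ApoE4 calculation above can be reproduced in a few lines. This is an illustrative sketch: the function names `odds` and `odds_ratio` are chosen here and are not taken from the article or from either company.

```python
def odds(frequency: float) -> float:
    """Convert an allele frequency (a proportion strictly between 0 and 1) into odds."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    return frequency / (1.0 - frequency)


def odds_ratio(freq_in_patients: float, freq_in_controls: float) -> float:
    """Risk coefficient: odds of the allele in patients divided by odds in controls."""
    return odds(freq_in_patients) / odds(freq_in_controls)


# Frequencies quoted in the text: 38% of patients, 14% of controls.
apoe4 = odds_ratio(0.38, 0.14)
print(f"ApoE4 risk coefficient: {apoe4:.3f}")

# Swapping the two frequencies models a protective allele: the ratio drops below 1.
protective = odds_ratio(0.14, 0.38)
print(f"Protective allele coefficient: {protective:.3f}")
```

Computed this way, the ApoE4 coefficient comes out at roughly 3.76, in line with the figure of about 3.7 quoted in the text.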

The companies offering genetic testing draw on the results of the same studies, but today 23andMe and Navigenics work with different marker panels for the same diseases. A number of markers are recognized by both companies. Moreover, the companies, as a rule, assess the predictive value of such shared markers equally.

One of the diseases whose prediction accuracy turned out to be the same for both companies is celiac disease (hereditary intestinal disease). For this disease, both companies used one strong marker found in 90% of patients with this disease. The risk coefficient for this marker is 7. In addition to this marker, Navigenics uses seven other weak markers that have virtually no effect on the final result of the relative risk assessment. These data indicate that, in general, the number of markers does not affect the reproducibility of the forecast, which largely depends on the existence of a single strong marker.

The absence of such universally recognized strong markers leads to significant discrepancies in the results provided by different companies. For example, according to the results obtained by 23andMe, the relative risk of developing psoriasis in one of the study participants is 4.02, whereas according to Navigenics calculations, this indicator is only 1.25.

Another problem is the use of markers with uncertain risk coefficients. For example, one of the markers of type 2 diabetes used by Navigenics, according to the specialized literature, has the highest risk coefficient of all the markers used by Navigenics for this disease. However, the company warns that the effect of this marker is statistically unreliable and its presence in the genome may not affect the likelihood of developing the disease. It is obvious that the average client will not pay enough attention to such a result.

The results obtained during the analysis formed the basis of the recommendations below for improving the work of companies providing genetic testing services.

• Indicate the genetic contribution of the tested markers.
To date, markers identified in genome-wide association studies do not fully explain the heritability of diseases. For example, according to current data, about 60-65% of the heritability of celiac disease cannot be explained by known genetic factors, and the existence of unknown markers of the disease leads to false negative results of genetic testing. To avoid misunderstandings, companies are advised to state what percentage of the heritability of a disease is accounted for by the markers they use, as well as the proportion attributable to factors still unknown at the time of the analysis. Currently, companies usually indicate only the ratio of the genetic contribution to the contribution of environmental factors.

• Pay more attention to forecasting high risks.
The majority of genetic forecasts provided to date (about 80%) report only a slight change in the relative risk (ranging from 0.5 to 1.5) of developing a particular disease compared to the average population risk. In this regard, companies are advised to focus customers' attention on diseases whose risk of development can be predicted with a high degree of probability. This will allow clients to choose the best methods of prevention. At the same time, information about a slight increase in the risk of developing diseases should not be neglected.

• Genotype risk markers directly.
If a company is unable to genotype a marker described in the literature, it usually relies on linkage disequilibrium (the non-random association of alleles) to select a substitute marker. According to the authors, about 1% of individual markers do not follow the linkage disequilibrium rule. Obviously, in such cases, the use of a substitute marker is likely to give false results. Given the huge number of markers tested when analyzing a genome, even such a small deviation from the rule will almost inevitably lead to at least one error. Direct genotyping of specific key disease markers will help to avoid this.

• Test pharmacogenomic markers.
According to current estimates, about 100,000 people die every year in the United States from the side effects of various drugs. Identification of variants of genes responsible for drug metabolism can significantly improve the situation. In this regard, the authors recommend that companies test the maximum possible number of markers of this class.

• Use the same strong markers.
The companies have agreed to use clinically confirmed markers; however, the markers themselves and their number are chosen at the discretion of each company's employees. This lack of consistency leads to discrepancies in the results provided to clients. Incorporating the results of new genome-wide association studies will eventually eliminate such inconsistencies. Today, however, the optimal solution is to switch to a common set of strong markers.
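The recommendation on direct genotyping above rests on simple arithmetic: if about 1% of substitute markers are unreliable, the chance of at least one error grows quickly with the number of substitutes used. The sketch below assumes that failures are independent and uses hypothetical panel sizes; neither assumption comes from the article.

```python
def prob_at_least_one_error(n_substitute_markers: int, failure_rate: float = 0.01) -> float:
    """Probability that at least one of n independent substitute markers misleads.

    Assumes each substitute marker fails independently with the same
    probability (an idealization made for this illustration).
    """
    if n_substitute_markers < 0 or not 0.0 <= failure_rate <= 1.0:
        raise ValueError("need n >= 0 and failure_rate within [0, 1]")
    return 1.0 - (1.0 - failure_rate) ** n_substitute_markers


# Hypothetical numbers of substitute markers in a test panel.
for n in (10, 100, 500):
    print(f"{n:>4} substitute markers: P(at least one error) = {prob_at_least_one_error(n):.3f}")
```

Under these assumptions, a panel relying on a few hundred substitute markers is very likely to contain at least one misleading result.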

Recommendations of the scientific community

• Monitor the results of lifestyle changes.
One of the main questions about the usefulness of genetic testing is the long-term impact of its results on clients' lifestyles and, accordingly, on their health. This question is already being actively studied, and the findings will be used to make testing more effective and to improve the reputation of the industry as a whole.

• Conduct prospective research.
The coincidence of forecasts provided by different companies is not equivalent to their accuracy and significance. To date, it is impossible to confidently state that the forecasts of any of the companies are more accurate. To effectively assess the clinical significance of prognoses and, accordingly, the prognostic value of markers, it is necessary to conduct prospective studies with the participation of tens or even hundreds of thousands of clients of companies.

• Confirm the informative value of markers in the genomes of different ethnic groups.
Genome-wide association studies are conducted mainly on populations of European origin. However, disease-associated markers may differ between populations because of differences in allele frequencies and in patterns of linkage disequilibrium. Therefore, specialists should confirm the informativeness of the markers used and of the genetic sequences linked to them.

• Switch from genotyping to sequencing.
Sooner or later, personal genome sequencing will become economically feasible. The advantage of sequencing over genotyping is that it reveals the full range of an individual's genetic variation, whereas genotyping only allows inferences to be drawn. However, it should not be forgotten that, in addition to identifying genetic markers, interpreting the results is extremely important, so sequencing does not solve the problem of prediction accuracy either.

Human genetics: hit or miss?

In genome-wide association studies, scientists have identified hundreds of genetic indicators of various diseases.

Kelly Rae Chi examines three of them in order to assess the "accuracy" of this approach.

The technology that appeared five years ago, which makes it possible to compare the genomes of individuals by analyzing tens of thousands of known single-nucleotide differences scattered throughout the genome, attracted great attention not only in the scientific community but also among the general public. These differences, known as single nucleotide polymorphisms, or SNPs (pronounced "snips"), serve as reference points in the study of genomes. The approach is based on the idea that certain variations of the genotype underlie the predisposition to various diseases.

In genome-wide association studies, scientists scan the SNPs of thousands of people. If such a scan reveals an association between a DNA variant and a high risk of developing a particular disease, this points to a part of the genome that is probably at least partly responsible for the underlying mechanisms of the disease.

Many researchers readily accepted this hypothesis; however, despite a fairly large amount of evidence, it does not work in all cases.

Below are three examples. One is an almost flawless proof of the effectiveness of the approach. The second demonstrates the difficulties of interpreting the results outside the biological context. And the third shows that at the current level of development, genetic testing in some cases cannot provide adequate information.

"Direct hit" in hemoglobin

In 2007, researchers conducted a genome-wide scan of healthy adults in search of SNPs associated with very high or very low levels of fetal hemoglobin.

Among the findings, special attention was drawn to variants of the BCL11A gene, located on chromosome 2 and found in many populations.

In most people, fetal hemoglobin is almost completely replaced by the adult version of this protein shortly after birth. In some individuals, relatively high expression of fetal hemoglobin persists throughout life. Usually this does not affect health in any way; however, in hereditary blood disorders such as sickle cell anemia and beta-thalassemia, the presence of fetal hemoglobin significantly eases the course of the disease.

SNPs are often located outside gene sequences, so in such cases the findings are only reference points that give approximate coordinates of the gene being sought. In the case described, by contrast, the hit landed directly in the gene. The protein encoded by the BCL11A gene was already known to control the expression of other genes and to be associated with the progression of malignant tumors. Knockout mice lacking the BCL11A gene had even been created, but no one suspected its role in hematopoiesis.

Experiments on cultures of human hematopoietic progenitor cells have shown that suppression of the expression of the BCL11A gene increases the production of fetal hemoglobin. And in the course of subsequent work on mice, scientists found that this gene regulates the suppression of fetal hemoglobin production during the development of the body.

However, the exact mechanism of functioning of this gene has not yet been deciphered. Scientists working on this task hope that their results will eventually help people suffering from the above-mentioned hereditary blood diseases.

Thus, the BCL11A gene is a find demonstrating the value of the results of genome-wide scanning, which in this case even pessimists who do not believe in the expediency of the approach cannot disagree with.

Around and around schizophrenia

The study of the genetics of schizophrenia has come to a dead end many times.

Numerous association studies identified promising target genes that did not withstand further testing. Naturally, the scientific community had high hopes for the newly fashionable genome-wide studies. However, the first four studies did not reveal reliable associations. In a study whose results were published last year, scientists analyzed the genomes of 500 patients with schizophrenia and 3,000 healthy people. After that, the genomes of another 16,000 people were tested for the presence of 12 identified candidate sequences. As a result, three variants were found that were statistically significantly associated with schizophrenia, but only one of them lies in a gene: ZNF804A, which encodes a protein of unknown function.

To confirm the significance of this marker, scientists used functional magnetic resonance imaging to compare the brain activity of 115 healthy people, almost half of whom had at least one copy of the ZNF804A variant associated with a high probability of developing schizophrenia. It turned out that people with this variant of the gene show increased interaction between certain areas of the brain, similar to that observed in schizophrenia.

In part, the difficulty of finding genes associated with schizophrenia stems from the vagueness and subjectivity of the diagnostic criteria for the disease, as well as the wide range and varying severity of its symptoms. In this case, magnetic resonance imaging turned out to be a very useful aid. Researchers are now studying the functions of ZNF804A and how the various variants of this gene work.

Despite the seemingly good results, there are many doubters in the scientific and medical community. They argue that the images obtained by magnetic resonance imaging alone are not sufficiently indicative, and that the changes observed in schizophrenia are very slight. The authors of the work disagree, and the dispute over the informative value of ZNF804A variants in the diagnosis of schizophrenia continues.

At full height

The results obtained in the study of genes responsible for height are clearer than in the case of schizophrenia, but no more informative.

In 2007, an analysis of the genomes of 5,000 people showed that a variant of the HMGA2 gene explains part of the variability in height, about 0.3%. Since then, more than 40 loci that influence height have been discovered. Taken together, all identified genetic markers explain "as much as" 5% of the variability of this trait.

It is believed that 60-80% of the variation in human height is due to genotype. Clearly, even knowing more than 40 weakly informative markers gives us practically no information, and only optimists can say that today we know much more about the genetics of height than we did 3 years ago.
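The gap between these figures can be made explicit with a short calculation. The sketch below uses only the percentages quoted above; the function name `share_of_heritability_explained` is introduced here for illustration.

```python
def share_of_heritability_explained(variance_explained: float, heritability: float) -> float:
    """Fraction of the heritable variation accounted for by known markers."""
    if not 0.0 < heritability <= 1.0 or not 0.0 <= variance_explained <= heritability:
        raise ValueError("expected 0 <= variance_explained <= heritability <= 1")
    return variance_explained / heritability


# Figures quoted in the text: all known markers explain about 5% of the
# variability in height, while genotype accounts for 60-80% of it.
for heritability in (0.60, 0.80):
    share = share_of_heritability_explained(0.05, heritability)
    print(f"heritability {heritability:.0%}: known markers cover {share:.1%} of it")
```

By this estimate, the known markers account for only about 6-8% of the heritable variation in height, leaving more than 90% unexplained.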

To date, scientists know almost nothing about the role of the known loci in determining human height. As in a number of other cases, most of the SNPs associated with height are located between genes or within genes of unknown function. Because of meager funding and a lack of biological information, research in this area has practically come to a standstill.

Some of the identified loci are involved in molecular mechanisms whose role in the growth and development of the organism has long been known. A 1995 study showed that a gene related to HMGA2 has an effect on growth: mice without this gene had a short body length, and animals with a truncated version of HMGA2 developed gigantism.

The influence of loci located outside genes is the most difficult to predict. Such loci do not always affect nearby genes; they can change the activity of DNA sites that are millions of nucleotide bases away.

Scientists call height a "model trait" because it is easy to measure and is relatively stable compared to traits such as blood pressure or blood glucose levels. However, in association studies of the genetic basis of height, the genomes of tens of thousands of people had to be analyzed to detect even weak associations. This indicates that studying even such a seemingly simple trait as height can be an exceptionally difficult task.

Literature:
Putting DNA to the test // Nature 461, 697-698 (2009)
Pauline C. Ng, Sarah S. Murray, Samuel Levy & J. Craig Venter An agenda for personalized medicine // Nature 461, 724-726 (2009)
Kelly Rae Chi Human genetics: Hit or miss? // Nature 461, 712-714 (2009)

Evgeniya Ryabtseva
Portal "Eternal youth" http://vechnayamolodost.ru
19.10.2009
