06 April 2015

The complete Encyclopedia of oncogenes

On the way to a detailed catalog of cancer genes

Vyacheslav Kalinin, "Elements"

Creating a detailed catalog of cancer genes is an important task; completing it will make it possible to select the optimal cancer therapy for each patient.

To compile a catalog of cancer genes mutated at high (>20%) and intermediate (2-20%) frequency, an average of about 2,000 tumor/normal pairs must be analyzed for each type of cancer; for the 50 most common types, that is about 100,000 pairs. This is no longer an unsolvable problem, since over the past 10 years the cost of DNA sequencing has fallen a million-fold and continues to decrease.

Cancer is currently the second leading cause of death worldwide, behind only cardiovascular disease (approximately 8 million people die from cancer every year), and in some developed countries, for example Denmark, it has already moved into first place.

Cancer is a complex, dynamically developing disease represented by more than 200 known types and forms. Each of them requires an individual approach, an individual treatment strategy. At the genetic level, different cancers are characterized by different "architecture" – sets of somatic mutations, rearrangements of chromosomes, as well as epigenetic anomalies, such as changes in the profile of gene methylation. The consequence of these events is a change in the activity of genes and (or) their products.

Detailed information about the anomalies associated with the occurrence and development of cancerous tumors is required for the diagnosis and effective treatment of cancer, to determine the optimal therapy, as well as for the development of new anti-cancer drugs. The objects of research are anomalies of the molecular structure of DNA, RNA, proteins and epigenetic anomalies (in particular, methylation). This approach has been adopted as a general strategy implemented by several national and international consortia that unite dozens of institutes, universities and clinics. Hundreds of researchers are involved in the work. The greatest successes and the most valuable data were obtained as a result of the search for genes associated with the initiation, development and maintenance of malignant cell transformation.

Oncogenes and anti-oncogenes

A delicate balance must be maintained in the body between the activity of genes and their products, which, on the one hand, ensure the growth and division of cells, and on the other hand, prevent unlimited growth and division.

Excessive activity of the former or suppressed function of the latter leads to uncontrolled cell growth and to the emergence and development of malignant neoplasms (cancerous tumors).

Cancer-related genes can be divided into two types: oncogenes and anti-oncogenes (tumor suppressors), whose products can, respectively, stimulate or suppress the formation and development of tumors. A special place is occupied by microRNAs (miRNAs), short (on average ~22 nucleotides) non-coding RNAs. To date, approximately 2,000 different microRNAs have been identified. They are able to suppress the translation of mRNAs transcribed from 30-60% of human genes. Some microRNAs (oncomiRs) contribute to the malignant transformation of cells, while others can act as anti-oncogenes. A normal gene, a proto-oncogene, can turn into an oncogene (Fig. 1), which stimulates cell growth constantly or at certain stages of the body's development.

Fig. 1. Under the action of a carcinogen, ionizing radiation or spontaneous mutation, a proto-oncogene can be activated and turn into an oncogene that induces cancer. Drawing from the website en.wikipedia.org

The transformation of a proto-oncogene into an oncogene occurs as a result of a relatively minor modification of its natural function.

The main pathways of proto-oncogene activation are:

  1. A mutation inside a proto-oncogene or in its regulatory elements that changes the structure of a protein and increases the activity of the protein (enzyme) encoded by it or enhances the expression of the corresponding gene.
  2. An increase in protein concentration due to greater stability of the protein in the cell: a longer half-life and, accordingly, higher activity.
  3. Gene duplication (increase in the number of copies), resulting in increased protein concentration in the cell.
  4. Translocation of a gene that causes an increase in its expression or the appearance of an aggressive hybrid gene.

Examples of oncogenes include the genes of the Ras family (short for 'Rat sarcoma'), which encode GTPases involved in transmitting signals that stimulate cell division.

The function of anti-oncogenes is the opposite of the function of proto-oncogenes. Anti-oncogenes control various processes that prevent malignant transformation of cells:

  1. Suppression of overexpression of genes that ensure cell proliferation.
  2. Implementation of DNA repair (DNA damage during suppression of repair enhances mutagenesis and, as a consequence, activation of proto-oncogenes and inactivation of anti-oncogenes).
  3. Coordination of cell proliferation with DNA repair. If DNA repair is suppressed, they inhibit cell division and initiate apoptosis.
  4. Control of adhesion and mechanisms of contact inhibition of dividing cells.

In general, anti-oncogenes form a barrier to unlimited cell growth, and loss of anti-oncogene function destroys this barrier. The best-known anti-oncogene, and the one most often mutated in tumors of many types, is TP53. Its product is a phosphoprotein that regulates the transcription of a number of different genes. In a normal cell it is inactive. Under stress it is activated and acts as a "guardian of the genome", performing various anti-cancer functions (Fig. 2):

  1. Activation of the DNA repair system.
  2. If the DNA is damaged, TP53 delays the mitosis of dividing cells, blocking the transition from the G1 phase to the S phase and giving the repair system time to repair the damage.
  3. If the DNA damage cannot be repaired, TP53 switches on a program of cell death, apoptosis.

Fig. 2. In normal cells, p53 is inactivated by the negative regulator mdm2. With DNA damage or other stresses, the p53-mdm2 complex breaks down. Activated p53 inhibits cell division until the damage is corrected, or induces an apoptosis program. Drawing from the website en.wikipedia.org

If cancer is caused by activation of a proto-oncogene, activating it on just one of the two paired chromosomes is usually sufficient.

But if cancer has arisen due to the loss of the anti-oncogene effect, then, as a rule, mutations or the loss of both copies of it are required.

Comparative sequencing

To date, approximately 300 oncogenes and tumor suppressors have been identified.

Cancer genes are sought by comparative sequencing: the nucleotide sequences of tumor DNA and normal tissue DNA are compared to identify somatic mutations that are absent from the normal tissue and that occur more often than would be expected by chance.
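The comparison step can be illustrated with a minimal Python sketch. It assumes that variant calls for the tumor and for the matched normal sample are already available as sets; the `Variant` type, the function name `somatic_candidates` and the example calls are hypothetical and invented for illustration, not taken from any of the projects described here. Real pipelines also have to deal with sequencing errors, tumor purity and coverage, which this sketch ignores.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A single difference from the reference genome (hypothetical record type)."""
    chrom: str  # chromosome name
    pos: int    # 1-based position
    ref: str    # reference allele
    alt: str    # observed allele


def somatic_candidates(tumor: Set[Variant], normal: Set[Variant]) -> Set[Variant]:
    """Return variants present in the tumor but absent from the matched normal sample."""
    return tumor - normal


# Made-up calls for one hypothetical patient (not real data).
normal_calls = {Variant("chr1", 1000, "A", "G")}
tumor_calls = {Variant("chr1", 1000, "A", "G"), Variant("chr2", 2500, "C", "T")}

for variant in sorted(somatic_candidates(tumor_calls, normal_calls)):
    print(variant)
```

The variant shared by both samples is treated as inherited and dropped; only the tumor-specific one remains as a somatic candidate. Deciding whether such candidates recur more often than chance is a separate statistical step.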

This strategy is implemented in several stages. First, tumor tissue samples must be obtained from patients with a reliable diagnosis and a detailed record of the course of the disease. The samples should be as free of normal cells as possible. Samples of the patient's normal tissue or blood are used for comparison. DNA is isolated from the tumor tissue and the normal tissue and sequenced. In recent years, sequencing has been carried out on next-generation platforms that allow a complete human genome to be sequenced quickly and relatively inexpensively. The results of comparative sequencing are then analyzed using specially developed, highly complex mathematical and bioinformatic techniques.

The general goal of these projects is to create a detailed catalog of genome structure anomalies associated with the initiation, progression and maintenance of oncological neoplasms. Such a catalog will allow not only to obtain a lot of new data on the molecular biology of cancer, but also to improve methods of diagnosis, treatment and prevention of cancer, to identify new targets for the development of anti-cancer drugs. Systematic research in this direction has already made it possible to identify many new cancer genes and even entire classes of cancer genes.

Atlas of Cancer Genomes

To date, data from large sets of tumor/normal tissue pairs have been analyzed for more than 30 forms of cancer.

The most notable successes have been achieved by The Cancer Genome Atlas consortium, run mainly by institutes and universities in the USA. Its abbreviation, TCGA, is telling: the same letters denote the four nucleotides that make up DNA. Founded in 2005, TCGA regularly publishes its results in leading scientific journals. It is not possible to cover all of the consortium's publications here, so we give only the results of its latest article, on squamous cell cancers of the head and neck. This heterogeneous group of cancers is the sixth most common and accounts for ~5% of all cancer cases in the world.

The study involved 348 authors and analyzed 279 tumor/normal pairs. Most tumors associated with human papillomavirus carried mutations in the helical domain of the PIK3CA oncogene. New anomalies were found, including loss of TRAF3 and amplification of E2F1, a gene involved in cell cycle control. Tumors associated with smoking almost always carried inactivating mutations of the TP53 and CDKN2A genes, along with amplifications of chromosome regions 3q26/28 and 11q13/22. Tumors of the oral cavity with a relatively favorable outlook for treatment and recovery contained activating mutations of the HRAS or PIK3CA genes in combination with inactivating mutations of the CASP8, NOTCH1 and TP53 genes. Other subgroups of this cancer carried inactivating mutations of NSD1, whose product is involved in chromatin remodeling, inactivating mutations of AJUBA and FAT1, genes that regulate the Wnt signaling pathway, and mutations activating the oxidative stress factor NFE2L2.

"Moderate" and "rare" oncogenes

Some cancer-related genes are mutated quite often, in many or at least several types of cancer.

It is therefore not surprising that they (TP53 in particular) were the first to be characterized. But most cancer genes are mutated at intermediate frequency (2-20%) or lower. The main difficulties arise in identifying rare cancer genes. For example, a recent study of 183 lung adenocarcinomas showed that 15% of patients had no mutation affecting any of the 10 known hallmarks of cancer, and 38% had three or fewer such mutations (M. Imielinski et al., 2012. Mapping the Hallmarks of Lung Adenocarcinoma with Massively Parallel Sequencing).

Due to damage to DNA repair systems in cancer cells, mutagenesis is more or less enhanced. The frequency of these tumor-induced somatic mutations for different tumors may differ by several orders of magnitude. Therefore, when identifying "moderate" and especially "rare" genes, a significant problem arises: how to distinguish cancer-related genes and mutations from the background, from numerous random mutations unrelated to cancer?

A Harvard group led by G. Getz developed the MutSig software package (short for 'Mutation Significance'), which analyzes the frequency and spectrum of mutations in different regions of the genome in different forms of cancer, along with a number of other factors, and makes it possible to single out genes that are significantly mutated in cancer (Fig. 3).
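The underlying idea of separating a cancer gene from background noise can be sketched with a simple one-sided binomial test. This is not the MutSig algorithm, which also accounts for the mutation spectrum and other factors; it is a deliberately simplified illustration that assumes a single uniform background rate. The function names and all input numbers below are hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def gene_p_value(mutated_samples: int, total_samples: int,
                 gene_length_bp: int, background_per_mb: float) -> float:
    """Chance of seeing at least `mutated_samples` mutated tumors from background alone.

    Simplifying assumption: every base mutates independently at the same
    background rate, given in mutations per million bases.
    """
    rate_per_bp = background_per_mb / 1_000_000
    # Probability that one tumor carries at least one background mutation in the gene.
    p_hit = 1 - (1 - rate_per_bp) ** gene_length_bp
    return binomial_tail(mutated_samples, total_samples, p_hit)


# Hypothetical gene of 1,500 coding bases, mutated in 10 of 200 tumors.
low_background = gene_p_value(10, 200, 1500, background_per_mb=1.0)
high_background = gene_p_value(10, 200, 1500, background_per_mb=100.0)
print(f"background 1/Mb:   p = {low_background:.2e}")
print(f"background 100/Mb: p = {high_background:.4f}")
```

With a low background rate, ten mutated tumors out of 200 are very unlikely to be a coincidence; with a background a hundred times higher, the same observation is entirely expected. This is why the background mutation frequency of each tumor type matters so much when searching for rarely mutated genes.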

Fig. 3. Different parts of the genome mutate with different frequency in various forms of cancer. The MutSig algorithm takes into account the frequency and spectrum of mutations, along with a number of other factors, and makes it possible to find genes that are significantly mutated in cancer. Figure from the article by L. Ding and M. Wendl, 2013. Differences that matter in cancer genomics

Mutations reliably associated with cancer

Two articles (M. S. Lawrence et al., 2013. Mutational heterogeneity in cancer and the search for new cancer-associated genes and M. S. Lawrence et al., 2014. Discovery and saturation analysis of cancer genes across 21 tumour types) analyzed exome sequencing data collected from various databases for 4,742 tumor/normal tissue pairs representing 21 types of cancer. (An exome comprises the coding regions of genes, the exons, together with adjacent stretches of the non-coding sequences between them, the introns.)

The number of samples for individual types varied from 35 to 892.

Compared with normal tissue, the tumors contained 3,078,483 single-nucleotide substitutions, 77,270 single-nucleotide deletions or insertions, and 29,837 di-, tri- or oligonucleotide deletions or insertions. Of the single-nucleotide substitutions, the vast majority (2,294,935) did not affect coding sequences. Of the rest, 540,831 were missense mutations (leading to amino acid substitutions in proteins), 207,144 were synonymous substitutions (not leading to amino acid substitutions), 46,264 were nonsense mutations (leading to premature termination of protein synthesis), and 33,673 disrupted mRNA splicing (the joining of an mRNA's coding segments). Data on sequencing depth and tumor sample purity put the estimated sensitivity of the analysis at more than 90% (Fig. 4). The mutation frequency per unit of genome length differed among cancer types by more than 5 orders of magnitude (from 0.03 to 7,000 per million DNA nucleotides), and the mutation spectra also differed greatly.
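The "more than 5 orders of magnitude" figure follows directly from the two extreme frequencies quoted above. A short Python check, using only those two numbers (the variable names are chosen here for illustration):

```python
import math

lowest_per_mb = 0.03     # lowest mutation frequency quoted, per million nucleotides
highest_per_mb = 7000.0  # highest mutation frequency quoted, per million nucleotides

fold_difference = highest_per_mb / lowest_per_mb
orders_of_magnitude = math.log10(fold_difference)

print(f"fold difference: {fold_difference:,.0f}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio is a little over 200,000, that is, between five and six powers of ten.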

Fig. 4. The number of tumor/normal pairs (ordinate) required to detect, with 90% probability, 90% of the genes that are significantly mutated in cancer. Estimates depend on the type of tumor, the average frequency of background mutations (abscissa), and the excess of the mutation frequency of cancer genes over the background (colored lines). The black dots show the number of samples already analyzed. For most types of tumors, the number of samples analyzed is still insufficient to detect genes mutated at 5% or less above background. Figure from the article M. S. Lawrence et al., 2014. Discovery and saturation analysis of cancer genes across 21 tumour types.

Mutations significantly associated with cancer were found in 224 different genes.

The number of significantly mutated genes varied greatly among cancer types (from 1 to 58). For 7 types it was fewer than 10, and for two (breast and endometrial cancers) more than 30. Only 22 genes were significantly associated with more than three types of cancer. The analysis recovered almost all previously known genes associated with carcinogenesis. It also found 33 genes whose mutations had not previously been associated with cancer. These genes are involved in cell division, apoptosis, genome stability, regulation of chromatin activity, immune response, RNA processing and protein homeostasis. A further 81 genes are also expected to include cancer-related genes.

Based on these results, the authors calculated how many tumor/normal tissue pairs must be analyzed to find the genes significantly mutated in cancer, depending on the type of cancer, its mutation frequency per unit of genome length, and the mutation frequency of the gene in question in that type of cancer. The calculations show that for 17 of the 21 cancer types analyzed there are still not enough data to identify genes mutated at 5% or less above background, and for 7 types not enough even at 10% or less. They also determined that compiling a catalog of cancer genes covering 90% of cases of the disease requires about 650 tumor/normal pairs if the average mutation frequency is ~0.5 per million DNA base pairs (as in neuroblastoma), and about 5,300 pairs if it is ~12.9 per million (as in melanoma). Overall, compiling a catalog of somatic mutations for genes mutated at high (>20%) and intermediate (2-20%) frequency across ~50 known types of cancer requires an average of about 2,000 tumor/normal pairs per type, or about 100,000 pairs in total.
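The arithmetic behind these totals is easy to verify. The short Python sketch below uses only the figures quoted in this paragraph; the variable names are chosen here for illustration.

```python
# Figures quoted in the text.
cancer_types = 50             # approximate number of cancer types considered
pairs_per_type = 2_000        # average tumor/normal pairs needed per type
pairs_neuroblastoma = 650     # background of ~0.5 mutations per million base pairs
pairs_melanoma = 5_300        # background of ~12.9 mutations per million base pairs

total_pairs = cancer_types * pairs_per_type
sample_ratio = pairs_melanoma / pairs_neuroblastoma
background_ratio = 12.9 / 0.5

print(f"total pairs needed: {total_pairs:,}")
print(f"melanoma needs {sample_ratio:.1f}x the samples of neuroblastoma")
print(f"for a background mutation frequency {background_ratio:.1f}x higher")
```

The required sample size grows with the background mutation frequency, but more slowly than the background itself: a roughly 26-fold higher background corresponds to roughly 8 times as many samples.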

In general, the creation of a detailed catalog of cancer genes is an important task, the implementation of which will allow selecting the optimal cancer therapy for each patient: effects on certain signaling pathways or other processes damaged in each case. Such a catalog is also needed to select targets for the development of anti-cancer drugs, to create new experimental animal models and cell lines for cancer research, to test new drugs and treatments.

Source: Cancer Genome Atlas Network. Collaborators (348). Comprehensive genomic characterization of head and neck squamous cell carcinomas // Nature. 2015. V. 517. P. 576–582.

See also:

  1. Marcin Imielinski et al. Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing // Cell. 2012. V. 150. P. 1107–1120.
  2. Michael S. Lawrence et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes // Nature. 2013. V. 499. P. 214–218.
  3. Li Ding & Michael C. Wendl. Differences that matter in cancer genomics // Nat Biotechnol. 2013. V. 31. P. 892–893.
  4. Michael S. Lawrence et al. Discovery and saturation analysis of cancer genes across 21 tumour types // Nature. 2014. V. 505. P. 495–501.

Portal "Eternal youth" http://vechnayamolodost.ru, 06.04.2015
