19 January 2022

Non-random mutations

The more important a gene is, the less likely it is to mutate

Alexander Markov, "Elements"

Fig. 1. Collection of mutagenesis data and the correlations found between the rate of mutagenesis and epigenetic marks. From each of 107 self-pollinating plants, one randomly selected seed was taken and a descendant was grown from it, then a random seed was taken from that descendant, and so on. After 24 generations, the genomes were sequenced and searched for mutations absent from the original plant. This experimental design minimizes selection: only mutations that are incompatible with life or cause complete sterility are eliminated. At the next stage, several statistical tests showed that the collected mutation data do reflect mutagenesis and were not greatly distorted by selection. The top right panel shows the results of two such tests (see the explanation in the text). The bottom panels show the correlations found between mutation frequency and various epigenetic marks and sequence features (GC content: the proportion of G and C nucleotides; H3K...: different variants of methylation and acetylation of histone H3; CGm, CHHm, CHGm: cytosine methylation in different nucleotide contexts, where H denotes any nucleotide except G; Access.: chromatin accessibility, determined by nucleosome density; Expression: expression level). Figure from the discussed article in Nature.

A study of a large body of mutagenesis data in the model plant Arabidopsis thaliana has shown that mutations arise at different frequencies in different parts of the genome. The rate of mutagenesis can be predicted from epigenetic marks such as the level of DNA methylation, chromatin accessibility and histone modifications. The distribution of these epigenetic marks depends, in turn, on the functional load of the DNA regions. As a result, the frequency with which new mutations arise is inversely related to the functional importance of a given part of the genome and to the strength of the purifying selection acting on it. In other words, in the most important regions new mutations are not only purged more actively by selection but also arise less often. In particular, the rate of mutagenesis is lower within genes than in the flanking (non-transcribed) regions, and lower in vital genes that work constantly than in genes that are used only sporadically (for example, those switched on in response to external stimuli). Apparently, in the course of evolution some organisms have developed, under the influence of selection, molecular mechanisms that reduce the frequency of mutations in the most important parts of the genome. These mechanisms work by recruiting repair enzymes and other DNA-protecting factors to particular epigenetic marks. The study showed that non-random mutagenesis plays an important role in the evolution of genomes. Some characteristic features of molecular evolution that are usually explained by selection (for example, the faster accumulation of differences in less important parts of the genome) are in fact largely explained by non-random mutagenesis, which, however, is itself the result of evolution under the influence of selection.

1. Evolution of mutation rate

The postulate that mutations are random appears in every textbook on evolution and is unlikely ever to be removed from them, although it has long been known that it is not absolute and requires many reservations and clarifications, whose number keeps growing as science advances.

Mutations really are random in the sense that living beings have no mechanisms that would allow them to "calculate" which mutation would be useful under given conditions and then deliberately introduce that mutation into their genome. At the same time, genomes mutate in all organisms without exception, nobody calculates the phenotypic effects of mutations in advance, and it is impossible to predict exactly which mutations will occur in a given chromosome during its next replication. In this respect mutations are random; in many other respects they are not. In the course of evolution, many organisms have developed mechanisms that to some extent regulate and optimize the mutation process. Several illustrative examples are discussed in my book "The Birth of Complexity", where one of the chapters is entirely devoted to this topic (it is called "Controlled mutations").

Nor can such an important characteristic of the mutation process as its rate be called random. The rate of mutagenesis depends, in particular, on the enzymes that carry out DNA replication and repair, and it evolves under the influence of selection. Since most non-neutral (fitness-affecting) mutations are harmful, selection, as a rule, works to minimize the rate of mutagenesis. There are exceptions, however: some viruses, for example, need a high rate of mutagenesis for everyday survival, so mutations that reduce it below the permissible limit are eliminated by selection.

But even organisms that do not need to mutate constantly in order to survive here and now still cannot reduce the rate of mutagenesis to zero, because other evolutionary forces oppose the selection that lowers it. Two of them are considered the main ones. Firstly, ultra-precise replication and repair systems are likely to be too costly: cumbersome, energy-hungry, and so on. Therefore, at some point the "cost" of further improving these systems is no longer repaid by the benefit of a further decrease in the rate of mutagenesis (in other words, selection to slow down mutagenesis is balanced by selection to simplify and cheapen the molecular systems that safeguard the genetic material).

The second reason is related to genetic drift. Beneficial mutations (including mutations that reduce the rate of mutagenesis) can be supported by selection only if their benefit exceeds a certain threshold that depends on the effective population size (Ne). For selection to help a mutation spread, its beneficial effect (the amount by which the mutation increases reproductive success) should preferably be greater than 4/Ne, and certainly not less than 1/Ne. Otherwise the mutation will be at the mercy of drift rather than selection: it will behave not as a beneficial mutation but as a neutral one, and it will have very little chance of becoming fixed (reaching a frequency of 100%). The lower the rate of mutagenesis, the less harm it causes, and the weaker the beneficial effect of mutations that slow mutagenesis down even further. Therefore, at some point such mutations cease to be supported by selection. It is assumed that this level of mutagenesis, corresponding to the equilibrium between selection and drift, is the final outcome of the evolution of the mutagenesis rate in many organisms (M. Lynch et al., 2016. Genetic drift, selection and the evolution of the mutation rate).
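The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the two thresholds named in the text (4/Ne and 1/Ne); the function name selection_regime, the label "borderline" for the zone between the thresholds, and the example numbers are assumptions made here, not values from Lynch et al.

```python
def selection_regime(s: float, ne: float) -> str:
    """Classify a beneficial mutation by the rule of thumb in the text.

    s  -- selective advantage of the mutation (relative gain in reproductive success)
    ne -- effective population size (Ne)
    """
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    if s > 4 / ne:
        return "selection"   # benefit is large enough for selection to act on it
    if s < 1 / ne:
        return "drift"       # effectively neutral: fate is decided by drift
    return "borderline"      # between 1/Ne and 4/Ne (label is an assumption)


if __name__ == "__main__":
    # Hypothetical advantages in a hypothetical population with Ne = 10,000.
    for s in (1e-3, 2e-4, 5e-5):
        print(s, selection_regime(s, 10_000))
```

The same advantage can fall on different sides of the threshold in populations of different size, which is why the equilibrium rate of mutagenesis is expected to depend on Ne.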

2. Is it possible to selectively control the rate of mutagenesis?

In recent years, data have begun to appear indicating that the mutation rate can vary greatly between different parts of the genome and between different genes, and that these differences may be related, firstly, to the activity (expression level) of the gene and, secondly, to the epigenetic marks that largely determine this activity, such as methylation and acetylation of lysines in histone H3 or "chromatin openness", which is determined by how densely nucleosomes are packed (see, for example: F. Supek, B. Lehner, 2017. Clustered Mutation Signatures Reveal that Error-Prone DNA Repair Targets Mutations to Active Genes; X. Chen et al., 2012. Nucleosomes Suppress Spontaneous Mutations Base-Specifically in Eukaryotes).

In this regard, the question arises: could any organisms have developed special adaptations during evolution that reduce the mutation rate not in the entire genome at once but selectively, for example only in the most important genes, where random mutations are the most dangerous?

Theoretically, this seems to be possible under certain conditions, although it is not entirely clear how often these conditions are met (I. Martincorena, N. M. Luscombe, 2013. Non-random mutation: The evolution of targeted hypermutation and hypomutation). Of the two limitations mentioned above, in the case of selective slowing of mutagenesis one becomes weaker and the other stronger. The cost constraint weakens: in theory, it should be easier and cheaper to protect individual genes from mutations with special care than to protect the entire genome. The constraint associated with drift becomes stronger. After all, mutations occur thousands of times less often in a single gene than in the genome as a whole (simply because a gene is thousands of times shorter than the genome). This means that an additional decrease in the rate of mutagenesis in this one gene alone will increase fitness only very slightly, even if the gene is a very important one.

It would be nice, of course, to have a mechanism that reduces the rate of mutagenesis in all important genes at once. Drift should interfere with this much less, since the total length of the important genes is quite large. But is such a mechanism possible? Does it exist in any real organisms?
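The argument about target size can be made concrete with a rough numerical sketch. It rests on assumptions that are not in the source: the benefit of a protective mechanism is taken to be simply the number of harmful mutations it prevents per generation in the protected DNA, and every number below (rate of harmful mutations per base pair, gene length, number of important genes, Ne) is hypothetical and chosen only to show the orders of magnitude involved.

```python
def modifier_benefit(target_bp: float, harmful_rate_per_bp: float, reduction: float) -> float:
    """Harmful mutations prevented per generation in the protected target.

    Used here as a crude stand-in for the fitness gain (an assumption).
    reduction is the fraction by which the rate of mutagenesis is lowered.
    """
    return target_bp * harmful_rate_per_bp * reduction


def visible_to_selection(benefit: float, ne: float) -> bool:
    """True if the benefit exceeds the lower drift threshold 1/Ne from the text."""
    return benefit > 1 / ne


if __name__ == "__main__":
    NE = 100_000            # hypothetical effective population size
    HARMFUL_RATE = 1e-9     # hypothetical harmful mutations per bp per generation
    REDUCTION = 0.5         # hypothetical: mutagenesis halved in the protected DNA

    one_gene = modifier_benefit(2_000, HARMFUL_RATE, REDUCTION)             # one 2 kb gene
    all_important = modifier_benefit(10_000_000, HARMFUL_RATE, REDUCTION)   # 5,000 such genes

    print("single gene:", one_gene, visible_to_selection(one_gene, NE))
    print("all important genes:", all_important, visible_to_selection(all_important, NE))
```

With these invented numbers, protecting a single gene falls below the drift threshold, while protecting all important genes at once clears it by a wide margin; real values would have to come from measurements.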

3. Collection of mutagenesis data in Arabidopsis thaliana

An article by a group of researchers from the USA, Germany, France and Sweden, published on January 12 in the journal Nature, shows that at least one well-studied organism, the model plant Arabidopsis thaliana, has the desired ability. Arabidopsis is almost certainly not unique in this, but let's not get ahead of ourselves: rigorous evidence has so far been obtained only for this species.

The researchers collected a large body of data on mutagenesis in Arabidopsis. One of the main difficulties here is that the effects of mutagenesis must be carefully separated from the effects of selection. For example, if we simply sequence the genomes of 1000 plants, compare them with each other and identify all polymorphisms (variations in the nucleotide sequence), we will get a picture that reflects not mutagenesis in its pure form but the combined effect of mutagenesis and selection. In particular, we will not see a significant proportion of the harmful mutations that constantly arise but are diligently purged from the gene pool by selection.

To minimize the impact of selection, the researchers used several approaches. One of them is shown in Fig. 1 (top left). One randomly selected seed was taken from each plant, a descendant plant was grown from it, a random seed was taken from that plant in turn, and so on. After 24 generations, the genomes of the descendants were sequenced and compared with the genome of the original plant. With this approach selection, although not eliminated entirely, is radically weakened: only lethal mutations and those that cause complete sterility fail to reach the final sample. Another approach relies on detecting somatic mutations by sequencing different cells of the same plant. In this case it is even possible to catch mutations that, at the level of the whole plant, would be incompatible with life or reproduction.

The resulting list of mutations was then checked with various statistical tests to make sure that it reflects the process of mutagenesis itself (that is, that it was not greatly distorted by selection). The results of two such tests are shown in Fig. 1, top right. Two indicators were used: the ratio of nonsynonymous to synonymous substitutions, and the ratio of substitutions that create a premature stop codon to synonymous substitutions. The idea is that synonymous substitutions are usually neutral, so selection does not eliminate them. Nonsynonymous substitutions, by contrast, often turn out to be harmful, and premature stop codons even more so. Even in their wildest fantasies, biologists cannot imagine mechanisms that would allow a cell to selectively reduce the frequency of nonsynonymous substitutions or of mutations that create stop codons, or even to distinguish such substitutions from synonymous ones at the DNA level. Such mechanisms almost certainly do not exist. Therefore mutagenesis should produce far more nonsynonymous substitutions and premature stop codons than will later remain in the gene pool of the population after selection has acted.

Following this logic, the mutations identified in the protein-coding regions of the genome (labeled "De novo" on the graphs) were compared, firstly, with the natural genetic diversity of A. thaliana (from the 1001 Genomes project, 1001G) and, secondly, with the theoretically expected spectrum of newly arising mutations (Null).

It turned out that both indicators were significantly higher for the identified mutations (De novo) than for natural populations of Arabidopsis (1001G), and only marginally lower than for the theoretically expected mutations (Null). Consequently, selection did not significantly affect the collected data, which means that they can be used to study the patterns of mutagenesis. Other checks led to the same conclusion.
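The logic of the two tests can be sketched as follows. This is not the authors' statistical procedure: the counts are hypothetical, the names Counts, selection_ratios and looks_unfiltered are introduced here for illustration, and the 20% tolerance for "close to the null expectation" is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Numbers of coding substitutions of each type in one data set."""
    synonymous: int
    nonsynonymous: int
    stop_gained: int


def selection_ratios(c: Counts) -> tuple[float, float]:
    """Return (nonsynonymous/synonymous, premature stop/synonymous)."""
    if c.synonymous == 0:
        raise ValueError("need at least one synonymous substitution")
    return c.nonsynonymous / c.synonymous, c.stop_gained / c.synonymous


def looks_unfiltered(de_novo: Counts, natural: Counts, null: Counts,
                     tolerance: float = 0.2) -> bool:
    """True if both de novo ratios exceed those of natural polymorphism
    and lie within `tolerance` (relative) of the null expectation."""
    ratios = zip(selection_ratios(de_novo), selection_ratios(natural), selection_ratios(null))
    for observed, in_nature, expected in ratios:
        if observed <= in_nature:
            return False  # as depleted in harmful changes as natural variation
        if abs(observed - expected) > tolerance * expected:
            return False  # too far from what unfiltered mutagenesis would give
    return True


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    de_novo = Counts(synonymous=100, nonsynonymous=240, stop_gained=12)
    natural = Counts(synonymous=100, nonsynonymous=120, stop_gained=2)
    null = Counts(synonymous=100, nonsynonymous=260, stop_gained=14)
    print(looks_unfiltered(de_novo, natural, null))
```

A data set that selection had already filtered would show ratios close to those of natural polymorphism and would fail the first comparison.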

4. The rate of mutagenesis can be predicted by epigenetic labels

Having made sure that the initial data were adequate, the authors proceeded to analyze them. They compared the frequency of mutations in different parts of the genome with epigenetic characteristics such as chromatin openness, DNA methylation and histone modifications. It turned out that many epigenetic marks correlate significantly with the rate of mutagenesis (Fig. 1, lower graphs). For example, in regions with a high level of methylation of the fourth lysine of histone H3 (H3K4me1), the rate of mutagenesis is lower than the genome average, whereas in regions with a high level of acetylation of the ninth lysine of the same histone (H3K9ac) it is, on the contrary, increased.

It turned out that the rate of mutagenesis in different parts of the genome can be predicted quite accurately from the data on epigenetic marks. For example, Fig. 2, a shows the average rate of mutagenesis predicted from epigenomic data for regions adjacent to the transcription start (TSS) and end (TTS) sites (Upstream, Downstream: non-transcribed regions before and after the gene; Gene body: the "body" of the gene). It can be seen that the rate of mutagenesis is increased in non-transcribed regions, especially in the immediate vicinity of the gene boundaries (probably because these contain many stretches of open chromatin to which various regulatory proteins must attach). Moreover, the predicted picture (Fig. 2, a) is indeed very similar to the one actually observed (Fig. 2, b).

Fig. 2. The rate of mutagenesis in the vicinity of the transcription start (TSS) and end (TTS) sites (data averaged over all genes). a: the rate of mutagenesis predicted from epigenomic data (taking into account the correlations shown in Fig. 1); b: the observed mutagenesis data; c: genetic polymorphism in natural populations (based on 1135 genomes). Figure from the discussed article in Nature.
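The general idea of predicting the rate of mutagenesis from an epigenetic mark can be illustrated with a deliberately simplified sketch. It is not the authors' model, which combines many epigenomic features: here a single hypothetical predictor (the H3K4me1 level of a region) is fitted by ordinary least squares, and all data values and function names are invented for the example.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x: float, slope: float, intercept: float) -> float:
    """Predicted rate of mutagenesis for a region with mark level x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical regions: relative H3K4me1 level and observed mutations
    # per region (arbitrary units).
    h3k4me1 = [0.1, 0.3, 0.5, 0.7, 0.9]
    mutations = [9, 8, 6, 5, 3]
    slope, intercept = fit_line(h3k4me1, mutations)
    print("slope:", slope)  # negative: more H3K4me1, fewer mutations
    print("predicted at 0.5:", predict(0.5, slope, intercept))
```

A fit of this kind, made on one set of regions, can then be compared with the mutations actually observed elsewhere, which is the sense in which panels a and b of Fig. 2 are compared.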

Moreover, the distribution of mutations in the vicinity of the TSS and TTS coincides quite closely with the distribution of genetic variability (polymorphisms) in natural populations (Fig. 2, c). It follows from this (as well as from a number of additional verification tests) that the observed distribution of polymorphisms is determined more by mutagenesis than by selection. The conclusion is quite sensational, because until now such distributions have been explained "by default" by selection.

The authors also found that different parts of genes mutate at different rates, that these differences can also be predicted by the epigenome, and that natural polymorphisms are distributed in approximately the same way. In particular, of all the exons that make up the gene, the first and the last mutate most often, and polymorphisms in nature are most often found in them. The rate of mutation of exons is negatively related to the length of the untranslated regions of the gene, as well as to the number and length of introns. The longer the untranslated regions and introns are, the less often the protein-coding regions of the gene mutate.

5. The weaker the purifying selection, the stronger the mutagenesis

Does all this mean that purifying selection, which has traditionally been used to explain the distribution of genetic differences across genomes (where there were fewer differences, stronger selection was assumed), in fact has nothing to do with it at all? At first glance it seems so, but in fact it does not.

On the one hand, the study showed that the rate of accumulation of genetic differences (for example, between individuals of the same species or between close species) in certain parts of the genome is determined by mutagenesis to a greater extent than selection. Mutagenesis, in turn, is regulated by epigenetic labels.

On the other hand, the authors have shown that the lowest rate of mutagenesis (and the corresponding epigenetic marks) is found precisely in those parts of the genome that are under the strongest purifying selection. Conversely, in regions where purifying selection is weaker, the epigenetic characteristics favor accelerated mutagenesis.

In particular, it turned out that the genes with the lowest rates of mutagenesis are mainly genes with the most conserved functions (constant, changing little in the course of evolution), such as translation. Such genes, as a rule, work constantly in most cells and tissues. They are necessary for everyday, basic survival (that is why they are also called "housekeeping genes"). Such genes are usually under the strongest purifying selection. In other words, mutations in them most often turn out to be harmful and are eliminated. These genes were optimized long ago, there is no need to change them, and they do indeed change very slowly during evolution.

On the contrary, in genes whose functions are associated with the body's response to changeable environmental conditions, the rate of mutagenesis turned out to be increased. Purifying selection has a weaker effect on such genes, and mutations in them are slightly more likely to be useful. Therefore, such genes change faster in the course of evolution.

It also turned out that the rate of mutagenesis is lowered in vital genes (disabling which is incompatible with life), as well as in genes expressed in many cells and tissues compared to genes with narrow expression profiles. In both cases, differences in mutagenesis rates correlate with epigenetic characteristics.

The effect of purifying selection on a particular gene and the limits on its evolutionary change (evolutionary constraint) can be judged from a number of quantitative indicators, such as the heritable and environmental variability of the gene's expression level, or the ratio of nonsynonymous to synonymous differences when the sequences of this gene are compared between individuals of the same species or between closely related species (Dn/Ds). The authors performed many such tests, and the results were all consistent: the more important the gene, the more fundamental its function and the stronger the evolutionary constraints on its change, the lower the mutation rate of the gene, both empirically measured and predicted from epigenetic marks.

6. What does it all mean?

The results obtained show that Arabidopsis (and most likely many other organisms, although this has yet to be proven) apparently developed special mechanisms during evolution to reduce the rate of mutation of the most important parts of the genome - those areas in which mutations are most often harmful. As a result, among the newly emerging mutations, the proportion of harmful ones is significantly reduced, and the overall negative impact of mutagenesis on fitness decreases.

The proposed principle of operation of this mechanism is shown in the most general terms in Fig. 3. Its details have yet to be deciphered.

Fig. 3. "Conceptual diagrams" reflecting the authors' interpretation of the results obtained. Figure from the discussed article in Nature.

The study showed that genes differ from intergenic regions in their epigenetic characteristics, and that important genes differ from less important ones. Moreover, these differences are not haphazard but regular, predictable and similar in both cases. For example, the level of methylation of the fourth lysine of histone H3 (H3K4me1) is increased both in the "gene body" (Gene body in Fig. 3) compared with the adjacent non-transcribed regions (Upstream, Downstream) and in vital (Essential) genes compared with less important (Non-essential) ones. Why the H3K4me1 mark is distributed in this way and not otherwise, and which molecular mechanisms (Epigenome regulators in Fig. 3) are responsible for this distribution, is a separate question that is still very far from being resolved. Nevertheless, we know that this mark makes it possible to distinguish important parts of the genome from less important ones.

In addition, it is known that there are specialized molecular systems in the cell responsible for DNA repair and protection from damage (DNA protection/repair factors in Fig. 3).

The authors suggest that in the course of evolution the "DNA protection and repair factors" (or some regulatory systems that control their activity) have acquired a useful property: these factors work more actively, more often or more carefully on DNA regions that carry certain epigenetic characteristics, for example an increased level of H3K4me1. This property is nothing other than an evolutionary adaptation aimed at optimizing the mutation process and reducing the harm it causes.

An important practical consequence is the need to revise our understanding of the mechanisms by which nucleotide sequences evolve. Much of what has been attributed to the action of selection actually appears to be the result of not entirely random mutagenesis. Much, but of course not everything. For example, the lowered values of Dn/Ds in vital genes cannot be explained by mutagenesis: they are a hard-to-fake signature of purifying selection. In addition, it should not be forgotten that the very "non-randomness" of mutagenesis discovered by the authors is probably itself the result of evolution under the influence of selection.

Source: Monroe et al., Mutation bias reflects natural selection in Arabidopsis thaliana // Nature. 2022.

Portal "Eternal youth" http://vechnayamolodost.ru

