29 May 2017

About the benefits of harmful mutations

Harmful mutations in the genome enhance each other's influence

Olga Vakhrusheva, "Elements"

An international group including scientists from Russia, the USA and the Netherlands has shown that the effect of harmful mutations on fitness depends on the presence of other harmful mutations in the genome. It turned out that the more harmful mutations are already present in the genome, the more harmful the subsequent mutations. This interaction between mutations allows negative selection to more effectively remove harmful alleles from the population. These results may partly explain why populations of living creatures are not dying out, despite the high rate of occurrence of harmful genetic changes.

Mutations, that is, changes in the DNA sequence, constantly occur in the genomes of living beings. Some of them are harmful: they reduce the fitness of the individual that carries them. Fitness is the contribution that a given individual makes to the gene pool of the next generation; in other words, it describes how successful an organism is from the point of view of natural selection. For this reason, evolutionary biologists often use the number of offspring an individual leaves as a measure of its fitness. The effect of a harmful mutation on fitness can thus be represented as a decrease in the probability that its carrier will leave offspring.

The paradox of "mutational load"

According to recent data, on average, each newborn carries about 70 new mutations in the genome and at least 10% of them are harmful (see S. Besenbacher et al., 2015. Novel variation and de novo mutation rates in population-wide de novo assembled Danish trios). But theoretical estimates of how much the mutations accumulating with each new generation should reduce fitness are incompatible with the continued existence of the human population – this is the so-called "mutational load" paradox.

For the long-term survival of a species, natural selection must increase the average fitness of individuals in the population at least as fast as it is lowered by the mutations appearing in each generation. A species is evolutionarily "stable" if there is a balance between the emergence of new harmful mutations and their removal by natural selection. If this balance is not maintained for some reason, harmful mutations accumulate quickly, which leads to the extinction of the species.

The more harmful mutations occur in the genome in one generation, the more "genetic deaths" must occur in order to restore the average fitness of the population. "Genetic death" occurs if an individual does not leave viable descendants, and thus does not pass on its genes to the next generation. As a result, harmful mutations that have appeared in the genome of this individual will also not be inherited by descendants.

Knowing how many harmful alleles appear on average in the genome each generation, we can estimate how much they reduce the average fitness of the population. This estimate can be related to the proportion of the population that must fail to leave descendants as a result of negative selection.

The predictions obtained in this way suggest that if negative selection acted on each harmful mutation separately, more than 80% of people would have to leave no viable offspring (see A. Eyre-Walker, P. D. Keightley, 1999. High genomic deleterious mutation rates in hominids).
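The link between the mutation rate and the required share of "genetic deaths" can be illustrated with a minimal sketch. It assumes the classical textbook model in which mutations act independently and multiplicatively, so that mean fitness at equilibrium is e^(-U), where U is the number of new harmful mutations per genome per generation. The function name and the values of U below are illustrative assumptions, not figures taken from the cited papers.

```python
from math import exp

def genetic_death_fraction(u):
    """Share of individuals that must leave no offspring if harmful
    mutations act independently (assumed model: mean fitness = exp(-u))."""
    return 1.0 - exp(-u)

# Illustrative values of U (new harmful mutations per genome per generation).
# U = 7 corresponds to 10% of 70 new mutations being harmful.
for u in (0.5, 1.6, 3.0, 7.0):
    print(f"U = {u:>3}: {genetic_death_fraction(u):.3f} of the population")
```

Even moderate values of U push the required share of "genetic deaths" close to one, which is the essence of the paradox.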

Obviously, this does not correspond to what we see. Such a discrepancy between theory and reality may indicate the existence of additional mechanisms that allow natural selection to more effectively remove harmful mutations from populations.

Possible types of interactions between mutations

Theoretical predictions, from which the "mutational load" paradox follows, are based on the assumption that mutations affect fitness independently of each other. But this is not necessarily the case. In recent years, many examples of interaction between different mutations have been described. A situation in which the influence of one mutation depends on the presence of another mutation is called epistasis.

If we talk about harmful mutations in terms of their impact on fitness, we can imagine three scenarios (see Fig. 1):
    1) harmful mutations do not interact with each other and individually contribute to a decrease in fitness;
    2) harmful mutations weaken each other's influence; 
    3) mutations enhance each other's influence.


Fig. 1. A graph showing the dependence of the fitness drop on the number of harmful mutations in the genome in the absence of interactions between mutations (a) and in the case of different scenarios of interaction between mutations: mutations enhance each other's influence (b, c); mutations weaken each other's influence (d). Figure from the article by A. S. Kondrashov, 1988. Deleterious mutations and the evolution of sexual reproduction.

The last two scenarios correspond to different types of epistasis. If harmful mutations weaken each other (scenario 2), then their cumulative effect on fitness should be less than what we expect to see based on their individual effects. In this case, fitness decreases more slowly with an increase in the number of harmful variants in the genome than in the absence of epistasis (line d in Fig. 1).

Conversely, if a mutation turns out to be more harmful in the context of other mutations (scenario 3), then the decline in fitness accelerates with an increase in the number of harmful variants (lines b and c in Fig. 1). In this scenario, individuals carrying a large number of harmful alleles would be disproportionately less fit and would be quickly removed from the population, so that a single "genetic death" would get rid of a significant number of harmful mutations at once.

Thus, mutual amplification of the effects of harmful mutations could resolve the paradox of "mutational load". However, until recently it was not clear how common interactions between individual mutations are at the genomic level and which type of interaction prevails among harmful alleles.
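The three scenarios can be made concrete with a small sketch. The functional form used here, log-fitness equal to -s·n^k, is a hypothetical choice made only for illustration (it is not the model from Kondrashov's figure): k = 1 means no interaction, k > 1 means mutations enhance each other, k < 1 means they weaken each other. The function names and parameter values are assumptions.

```python
from math import exp

def fitness(n, s=0.02, k=1.0):
    """Relative fitness of a genome carrying n harmful mutations
    (hypothetical form: log-fitness = -s * n**k)."""
    return exp(-s * n ** k)

def cost_of_next_mutation(n, s=0.02, k=1.0):
    """Factor by which fitness drops when mutation n + 1 is added."""
    return fitness(n + 1, s, k) / fitness(n, s, k)

for label, k in (("independent", 1.0), ("enhancing", 2.0), ("weakening", 0.5)):
    first = cost_of_next_mutation(0, k=k)
    later = cost_of_next_mutation(10, k=k)
    print(f"{label:>11}: 1st mutation x{first:.3f}, 11th mutation x{later:.3f}")
```

With independent mutations every new mutation costs the same factor; with enhancing interactions the eleventh mutation costs far more than the first, and with weakening interactions it costs less.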

Search for possible interactions between harmful mutations by studying the distribution of the number of harmful mutations in the population

To find out whether harmful mutations interact and what the nature of these interactions is, the authors of a new paper recently published in the journal Science (Sohail et al., Negative selection in humans and fruit flies involves synergistic epistasis) analyzed the distribution of the number of harmful alleles per genome in populations of humans and of the fruit fly Drosophila melanogaster.

The point is that the shape of the distribution of the number of harmful mutations per genome in a population should depend on whether mutations interact and how. From the shape of this distribution, it is therefore possible to judge which type of interaction occurs most often.

One way to describe a distribution is to compare its variance with its mean. The variance is a measure of how widely the values of a distribution are spread around the mean. With the same mean, a distribution with higher variance is "wider" than a distribution with low variance. The lower the variance, the narrower the distribution and the more tightly its values are concentrated around the mean.

In the simplest case, when there is no epistasis, the number of harmful mutations per genome in the population should follow a Poisson distribution. To understand why, imagine scattering mutations independently over random positions in the genome. The probability that a mutation falls at any specific position is very small, and mutations occur independently of each other. Then the probability that a given genome (a given "series of trials") ends up with N mutations is given by the Poisson formula with the parameter λ equal to the average number of mutations per genome. It follows from the properties of the Poisson distribution that its variance is equal to its mean.
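The "scattering" argument can be checked with a short simulation. This is a toy sketch: the number of genomes, the number of sites and the per-site mutation probability are arbitrary assumptions, chosen only so that the code runs quickly.

```python
import random
from statistics import mean, pvariance

def scatter_mutations(n_genomes, n_sites, p_mut, seed=1):
    """Mutate each site of each genome independently with probability p_mut
    and return the number of mutations in every genome."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_mut for _ in range(n_sites))
            for _ in range(n_genomes)]

# Toy parameters: 500 sites, 2% chance per site, so about 10 mutations per genome.
counts = scatter_mutations(n_genomes=2000, n_sites=500, p_mut=0.02)
m, v = mean(counts), pvariance(counts)
print(f"mean = {m:.2f}, variance = {v:.2f}, variance/mean = {v / m:.2f}")
```

With many sites and a small per-site probability, the variance-to-mean ratio stays close to one, as expected for a Poisson-like distribution.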

Now imagine that mutations enhance each other's harmful effects. In this case, negative selection will more quickly clear the population of individuals carrying a large number of harmful mutations, so the population will be depleted of genomes with many harmful alleles. If we compare the resulting distribution of the number of harmful mutations per genome with the "null" distribution expected in the absence of epistasis, it should be cut off on the right (Fig. 2, compare the gray and red histograms). In statistical terms, this means that the variance of such a distribution should be less than its mean. This idea is the basis of the article under discussion.
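How selection reshapes a Poisson distribution can be computed directly: weight each class of genomes by its fitness and renormalize. The sketch below does this for one round of selection. The fitness functions and their coefficients are hypothetical and serve only to contrast the three cases; this is not the authors' analysis.

```python
from math import exp, lgamma, log, sqrt

def poisson_pmf(n, lam):
    """Poisson probability of n events, computed in log space."""
    return exp(n * log(lam) - lam - lgamma(n + 1))

def after_selection(lam, log_fitness, n_max=200):
    """Distribution of mutation counts after one round of selection,
    starting from Poisson(lam) and weighting class n by exp(log_fitness(n))."""
    weights = [poisson_pmf(n, lam) * exp(log_fitness(n)) for n in range(n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def variance_to_mean(pmf):
    m = sum(n * p for n, p in enumerate(pmf))
    v = sum((n - m) ** 2 * p for n, p in enumerate(pmf))
    return v / m

LAM = 10.0  # assumed mean number of harmful mutations before selection
independent = variance_to_mean(after_selection(LAM, lambda n: -0.05 * n))
enhancing = variance_to_mean(after_selection(LAM, lambda n: -0.005 * n ** 2))
weakening = variance_to_mean(after_selection(LAM, lambda n: -0.5 * sqrt(n)))
print(f"independent: {independent:.3f}")
print(f"enhancing:   {enhancing:.3f}")
print(f"weakening:   {weakening:.3f}")
```

Under independent effects the distribution remains Poisson and the ratio stays at one; enhancing interactions push it below one, and weakening interactions push it above one.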


Fig. 2. Comparison of the expected form of distribution of the number of harmful mutations per genome in the population in the absence of interaction between harmful mutations (mutations are independent of each other – null model, gray), in the case of "reinforcing interactions" (red) and in the case of weakening interactions or the existence of other causes that increase the variance of the distribution (blue). In the case of reinforcing interactions, the proportion of individuals with a large number of harmful mutations is less than in the null model. The "underrepresentation" of individuals with a large number of harmful mutations is expressed in the fact that a smaller part of the population is in the right part ("tail") of the distribution. As a result, the distribution in the case of reinforcing interactions is shifted to the left relative to the distribution in the absence of such interactions. Note the heavier tail of the gray distribution compared to the red one. A drawing from the discussed article in Science.

"Reinforcing" interactions between harmful mutations in human and fruit fly populations

The authors analyzed the distribution of the number of harmful mutations per genome in several populations of humans and of the fruit fly D. melanogaster. For this purpose, they used whole-genome sequencing data (see Whole genome sequencing) from individuals in these populations.

Only mutations affecting protein-coding genes were considered in the work, since such mutations can be divided relatively simply into classes according to their effect on protein function. Mutations that fall in protein-coding regions of the genome are divided into synonymous, non-synonymous and nonsense mutations. Synonymous mutations do not lead to an amino acid substitution and are considered the most neutral class of mutations. Non-synonymous mutations, which replace an amino acid residue in a protein, are on average much more harmful. The most harmful class, and the rarest in this classification, is nonsense mutations. They create a premature stop codon in the gene sequence, which terminates protein translation early and results in the absence of a functional protein product.

To focus on the most harmful alleles, the authors approximated the number of harmful mutations in the genome of each individual by the number of nonsense mutations. Then, for each sample, they obtained the distribution of the number of nonsense alleles per genome and calculated its mean and variance. In all the samples considered, the variance of the number of nonsense alleles per genome turned out to be less than its mean (Fig. 3, 4). That is, the distribution of the number of nonsense alleles per genome is narrower than a Poisson distribution with the same mean. Simply put, there are fewer individuals with many nonsense alleles than would be expected under the null model of no interactions between mutations. Thus, for the most harmful mutations, which are highly likely to destroy the gene, the distribution in human and fruit fly populations matches the picture expected in the case of "reinforcing" interactions between individual mutations.


Fig. 3. Distribution of the number of nonsense mutations, synonymous and non-synonymous mutations per genome in the population of the fruit fly D. melanogaster, superimposed on the Poisson distribution (black line) with the corresponding mean. It can be seen that the distribution of the number of nonsense mutations is narrower than the expected Poisson distribution (the variance for nonsense mutations is lower than expected). At the same time, the distributions of non-synonymous and synonymous mutations are characterized by overdispersion (note the heavy right tail of these distributions). Modified figure from the supplementary materials to the discussed article in Science.

The narrowing of the distribution of the number of nonsense mutations per genome is the result of the presence of reinforcing interactions between nonsense mutations

To make sure that this observation is not the result of technical artifacts, subsamples of synonymous and non-synonymous mutations with the same population frequencies as nonsense mutations were considered as a control.

To a first approximation, nonsense mutations are under strong negative selection, while synonymous mutations do not cause significant harm and are, for the most part, neutral. Technical noise should make a comparable contribution to the variance of both harmful nonsense mutations and neutral synonymous mutations. At the same time, negative selection should affect the variance of the distribution of harmful nonsense alleles, but not the variance of harmless synonymous mutations.

Therefore, if the "narrowing" of the distribution of the number of nonsense mutations is caused by more effective selection against individuals carrying a large number of harmful alleles, and not by something else, we expect to see a contrast between nonsense mutations and synonymous mutations. That is, the population is expected to be "depleted" of individuals with a large number of nonsense mutations, but not of individuals with a large number of synonymous mutations.

Since synonymous mutations are much more common than nonsense mutations, each individual carries many more synonymous than nonsense mutations. As a result, the mean number of synonymous alleles per genome in the population is much higher than the mean number of nonsense alleles. Because the variance of a distribution depends on its mean, the variance for nonsense alleles cannot be compared directly with the variance for synonymous variants. To get around this problem, the authors generated random samples of synonymous mutations with the same population frequencies as the nonsense mutations. Such a random sample of synonymous mutations has the same mean as the distribution of nonsense alleles. For each population, 1000 samples of synonymous mutations were generated with the same mean as the observed distribution of nonsense mutations in that population. The variance was calculated for each such sample, which gave the authors a distribution of expected variance values for nonsense mutations. Then the observed variance for nonsense mutations was compared with this expected distribution of variances obtained from the control samples of synonymous mutations.

This procedure makes it possible to assess how surprising the variance value that we observe for nonsense mutations is, and to understand with what probability we expect to see such a low variance (and such a narrowing of the distribution) for random reasons.
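The logic of this control can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the authors' pipeline: genomes are treated as haploid 0/1 vectors, "the same population frequency" is taken to mean the same allele count, synonymous variants are drawn with replacement, and all names and the toy data are hypothetical.

```python
import random
from collections import defaultdict
from statistics import pvariance

def per_genome_counts(variants, n_individuals):
    """Number of mutant alleles carried by each individual."""
    counts = [0] * n_individuals
    for carriers in variants:
        for i, allele in enumerate(carriers):
            counts[i] += allele
    return counts

def matched_resample_test(nonsense, synonymous, n_resamples=1000, seed=0):
    """Compare the observed variance for nonsense variants with variances of
    random sets of synonymous variants matched by allele count."""
    rng = random.Random(seed)
    n_individuals = len(nonsense[0])
    pool = defaultdict(list)  # allele count -> synonymous variants with that count
    for variant in synonymous:
        pool[sum(variant)].append(variant)
    observed = pvariance(per_genome_counts(nonsense, n_individuals))
    null_variances = []
    for _ in range(n_resamples):
        sample = []
        for variant in nonsense:
            candidates = pool.get(sum(variant))
            if not candidates:
                raise ValueError("no synonymous variant with a matching allele count")
            sample.append(rng.choice(candidates))
        null_variances.append(pvariance(per_genome_counts(sample, n_individuals)))
    # Share of control samples whose variance is as low as the observed one.
    p_value = sum(v <= observed for v in null_variances) / n_resamples
    return observed, null_variances, p_value

# Toy data: 6 individuals; each carries exactly one nonsense singleton,
# and the synonymous pool holds one singleton per individual.
N = 6
singletons = [tuple(1 if i == j else 0 for i in range(N)) for j in range(N)]
observed, null_variances, p_value = matched_resample_test(singletons, singletons)
print(f"observed variance = {observed}, empirical p = {p_value:.3f}")
```

In this toy example the nonsense alleles are spread perfectly evenly (variance zero), whereas randomly drawn synonymous variants tend to clump in some individuals, so only a small share of the control samples reach a variance that low.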

Using this analysis, it was shown that the distribution of the number of nonsense mutations is significantly narrower than the control distributions of synonymous and non-synonymous mutations (for which a similar analysis was performed) (Fig. 4). This confirmed the "selective" nature of the observed phenomenon.


Fig. 4. The ratio of variance to mean for the distribution of the number of nonsense alleles per genome in human populations (GoNL, ADNI, MinE) and the fruit fly D. melanogaster (DPGP3). The red line corresponds to the observed value of the variance-to-mean ratio for the distribution of nonsense alleles in this population. As a control, the expected distributions for the ratio of variance to the mean are given, obtained by generating random samples of synonymous (blue) and non-synonymous (green) mutations with the same distribution of population frequencies as nonsense mutations. It can be seen that in most cases the variance for nonsense mutations is significantly lower than expected for random reasons. Figure from the discussed article in Science.

Reinforcing interactions also exist between non-synonymous mutations

In addition, it turned out that the variance of the distribution of the number of non-synonymous and synonymous mutations per genome is higher than expected in the null model (Fig. 3, 4). Among the reasons causing an increase in variance may be, for example, the presence of a population structure (see Population stratification) in the data or technical noise. In simulations and with the help of various statistical tests, it was confirmed that the increase in variance for synonymous and non-synonymous mutations can indeed be explained by the population structure and various technical artifacts.

In simulations, population structure and technical noise always increased the variance, and in none of the scenarios considered did they decrease it. That is, the narrowing of the distribution of the number of nonsense mutations most likely cannot be explained by technical artifacts.

At the same time, if the average number of mutations is controlled, the variance for non-synonymous mutations turns out to be higher than for nonsense mutations, but lower than for synonymous ones (Fig. 4). That is, we observe a decrease in variance (narrowing of the distribution) with an increase in the "harmfulness" of mutations.

Such an observation points to the likely presence of "reinforcing" interactions not only among the most harmful nonsense alleles, but also among the much more common non-synonymous mutations. However, because the distribution of the number of non-synonymous mutations is "overdispersed" relative to the null expectation, this conclusion could not be drawn from that comparison alone.

The authors suggested that if epistatic interactions between non-synonymous mutations exist, they should be most pronounced among the most harmful of them. As a set of non-synonymous mutations likely to have a significant impact on fitness, they selected non-synonymous mutations in the genes most important for the organism's survival. And indeed, if only non-synonymous mutations in such essential genes are considered, then in populations of humans and fruit flies there are fewer individuals carrying a large number of these mutations than expected in the absence of "reinforcing" epistasis.

In addition, the following, more general analysis was carried out for the fruit fly. All the genes were divided into several groups according to the rate of protein evolution. The most slowly evolving proteins are affected by the strongest negative selection. This means that changes in the amino acid sequence of such proteins probably contribute greatly to a decrease in fitness. It turned out that the lower the rate of evolution of a group of genes (that is, the more important the genes are), the smaller the variance in the number of non-synonymous mutations in this group. At the same time, there is no such dependence on the "degree of necessity" of genes for synonymous mutations. Apparently, this result indicates that "reinforcing" interactions are more pronounced among more harmful mutations.

Thus, the work under discussion showed that individuals with a large number of harmful mutations are underrepresented in human and fruit fly populations. The observed phenomenon is most likely a consequence of "reinforcing" epistatic interactions between harmful mutations. The existence of such interactions is a likely mechanism that allows negative selection to purge harmful alleles from populations more effectively, and it can serve as an explanation for the "mutational load" paradox.

Synergistic epistasis and sexual reproduction

In addition to the fact that the described results help to better understand how populations of living beings resist the constant influx of harmful genetic changes, they are also important in the light of another question in evolutionary biology – the question of the reasons for the existence of sexual reproduction.

Despite the fact that sexual reproduction prevails among eukaryotes, and transitions to asexual reproduction usually lead to the rapid extinction of a group of organisms, evolutionary biologists are still arguing about why sexual reproduction is necessary for the long-term evolutionary success of the species.

Among the widely supported theories is the hypothesis that, in sexual reproduction, recombination makes selection against harmful mutations more effective. In asexual reproduction, a mutation always remains in the genomic context in which it arose. Recombination, which occurs during sexual reproduction, breaks the linkage between mutations and creates new combinations of them. This makes it possible to "collect" many harmful mutations in one genome and to significantly reduce the "load of harmful mutations" in the population with a single "genetic death".

Compared with the sexual population, in which recombination each generation expands the distribution of the number of harmful mutations per genome, there are fewer individuals in the asexual population with both a very small and a very large number of mutations. That is, recombination increases the variance of the distribution of the number of mutations per genome, as a result of which both individuals carrying very few harmful mutations and individuals carrying many harmful mutations appear in the population (A. S. Kondrashov, 1988. Deleterious mutations and the evolution of sexual reproduction). The average number of harmful mutations does not change.
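The effect of recombination on the variance can be computed exactly in a toy model. The sketch assumes random mating, free recombination and rare mutations that each sit at their own site in the heterozygous state, so that every mutation of a parent is passed on with probability 1/2. Under these assumptions, which are a simplification and not part of the article, the mean stays the same and the new variance is halfway between the old variance and the mean.

```python
from math import comb

def transmit_half(pmf):
    """Distribution of mutation counts in a gamete when each parental
    mutation is transmitted independently with probability 1/2."""
    out = [0.0] * len(pmf)
    for n, p in enumerate(pmf):
        for k in range(n + 1):
            out[k] += p * comb(n, k) * 0.5 ** n
    return out

def convolve(a, b):
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def recombine(pmf):
    """One generation of random mating: offspring = gamete + gamete."""
    gamete = transmit_half(pmf)
    return convolve(gamete, gamete)

def mean_and_variance(pmf):
    m = sum(n * p for n, p in enumerate(pmf))
    v = sum((n - m) ** 2 * p for n, p in enumerate(pmf))
    return m, v

# Extreme toy case: selection has left only genomes with exactly 10 mutations.
narrow = [0.0] * 10 + [1.0]
m0, v0 = mean_and_variance(narrow)
m1, v1 = mean_and_variance(recombine(narrow))
print(f"before: mean = {m0:.1f}, variance = {v0:.1f}")
print(f"after:  mean = {m1:.1f}, variance = {v1:.1f}")
```

A distribution that selection has narrowed below the Poisson width is thus widened again by recombination, while the average number of mutations is unchanged.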

"Reinforcing" interactions between harmful mutations ensure the most effective selection against individuals with the most "contaminated" genomes, and recombination creates an "excess" of such individuals. In other words, each generation recombination "expands" the distribution of the number of harmful mutations per genome, which then allows negative selection to "narrow" it more effectively.

Thanks to recombination, every generation produces individuals in whose genomes a large number of harmful mutations are "concentrated". In the presence of "reinforcing" interactions, this makes it possible to remove the same number of harmful mutations with fewer "genetic deaths" than in an asexual population, which makes recombination advantageous.

Thus, the presence of "reinforcing" interactions between harmful mutations, which was shown in the work under discussion, can be considered as a weighty argument in support of the hypothesis that sexual reproduction is necessary for effective selection against harmful mutations.

Portal "Eternal youth" http://vechnayamolodost.ru  29.05.2017

