16 May 2016

Genes and education

The level of education received partly depends on genes

Alexander Markov, "Elements"

Pooling genotype data from hundreds of thousands of people of European descent, a large international team of geneticists and bioinformaticians has identified 74 regions of the genome in which variation reliably correlates with a person's level of education (traditionally measured as the number of years of study). A substantial share of the identified genes are active in the brain, especially during prenatal development, and mutations in many of them affect cognitive abilities. It is incorrect to call these genes "education genes", since their connection with the level of education is usually weak and indirect. Nevertheless, the study confirmed that applying modern analytical methods to very large samples makes it possible to detect a reliable influence of specific genes even on traits that are determined mainly by the environment and are traditionally considered not innate.


Fig. 1. The average level of education of US residents by year of birth. The vertical axis shows the level of education (in years of study), the horizontal axis the year of birth. The green line is for women, the brown line for men. This indicator has grown steadily over the past century. The growth is obviously due to social, cultural and economic changes, and not at all to genetics. Nevertheless, within each cohort (a group of people of the same age) the level of education varies, and at least 20% of that variability is determined by genes. Image from the website whitehouse.gov

At first glance, the level of education (educational attainment), traditionally measured by the number of years a person has spent studying, may seem a typical example of a non-hereditary trait determined solely by environmental factors (such as the family's financial security, the state of the country's education system, and so on). But this impression is deceptive. Even on general grounds it is clear that the level of education may well be influenced by traits that depend largely on genes, such as cognitive abilities or openness to new experience.

Back in the 1980s, analysis of large samples of twins and their parents showed that the level of education is highly heritable. At least 20% of the variability in this trait is determined by genetic differences between people, and for some samples heritability estimates of around 70% were obtained (A. C. Heath et al., 1985. Education policy and the heritability of educational attainment).

These figures, however, reflect only the overall scale of the genetic contribution to variability in the level of education. It is much more difficult to find the specific genes that affect such a trait. The point is that the number of years of study is a "high-level" trait, in the sense that genes can influence it only very indirectly, through many intermediate stages (unlike, for example, the ability to perceive a certain smell or to distinguish green from red, where the path from gene to trait is extremely simple and short).

Hundreds, if not thousands of different genes (polymorphic loci) can simultaneously influence "high-level" behavioral and psychological traits, and the contribution of each individual gene can be vanishingly small. In this case, twin analysis and comparison of parents with children will show a high heritability of the trait, but all attempts to detect associations between the trait and specific genetic variants will be unsuccessful. That is, we will know that the trait strongly depends on the genes, but we will not be able to figure out which ones.

To overcome this difficulty, it is necessary to analyze huge samples of tens or hundreds of thousands of people. The larger the sample, the weaker the genetic influences it can detect. Such studies have become possible only in recent years, thanks to the sequencing of the human genome, the rapid development and falling cost of molecular and bioinformatic methods, and the steady accumulation of genotypic and phenotypic data on different human populations collected by standardized methods.
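The link between sample size and the weakest detectable effect can be illustrated with a rough power calculation. The sketch below is not taken from the study: it uses a standard normal approximation and assumes a significance threshold of 5e-8 (a conventional choice for genome-wide scans) and 80% power. The function name and both defaults are illustrative assumptions.

```python
from statistics import NormalDist

def required_sample_size(variance_explained, alpha=5e-8, power=0.8):
    """Approximate sample size needed to detect a variant that explains the
    given fraction of trait variance (two-sided test, normal approximation).
    alpha and power are assumed defaults, not values from the article."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # quantile for the significance threshold
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return (z_alpha + z_beta) ** 2 / variance_explained

# Weaker effects (smaller share of variance explained) need larger samples.
for r2 in (0.01, 0.001, 0.0002):
    print(f"variance explained {r2:.2%}: n ~ {required_sample_size(r2):,.0f}")
```

Under these assumptions, a variant explaining about 0.02% of the variance requires a sample on the order of 200,000 people, which is the scale of the samples discussed here.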

In 2013, the results of the first successful attempt to find specific genes in the human genome related to the level of education were published. A sample of 126,559 individuals revealed three loci, each of which is reliably, although very weakly, associated with the duration of study (C. A. Rietveld et al., 2013. GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment).

In a new article published in the journal Nature (Okbay et al., 2016. Genome-wide association study identifies 74 loci associated with educational attainment), a huge team of researchers from around the world (the list of institutes and universities whose staff took part in the study alone runs to 186 entries) reported the results of studying a sample of 293,723 people of European descent. In essence this was a meta-analysis, that is, a pooling of data obtained with a common methodology by many independent research teams in different countries. The average duration of study across the whole sample was 14.3 years (standard deviation 3.6).

For everyone whose data were used, the level of education was recorded at an age of at least 30, and genotyping covered 9.3 million single-nucleotide polymorphisms (SNPs, or "snips"), which capture almost all of the genetic variability of humankind. Since the vast majority of the 3 billion nucleotides in the human genome are conserved, that is, identical in all people, there is no need to analyze every nucleotide, and the analysis can be limited to the variable ones.

The study was a so-called genome-wide association study (GWAS). The technique has many subtleties and pitfalls, but, fortunately, reliable statistical methods have been developed to deal with them. One of the main problems is "population stratification": the population under study may be divided into parts (subpopulations) that differ in the frequencies of some genes and traits. This creates a danger of detecting false associations. The problem is often called the "chopsticks gene problem". The name comes from the following parable. Allegedly, a geneticist decided to find out which genes influence the tendency to eat with chopsticks. He asked his students to report how often they use chopsticks when eating, then genotyped them and searched for associations. A locus was found that correlated strongly with the use of chopsticks. The geneticist immediately published an article announcing the discovery of the "successful-use-of-selected-hand-instruments gene" (abbreviated SUSHI). A couple of years later it turned out that SUSHI is actually one of the genes of the histocompatibility complex, and that one of its variants (alleles) is much more common in Asians than in Europeans. Of course, this gene has nothing to do with the use of chopsticks. But since this behavior is more widespread in Asian culture than in European culture, the GWAS showed a strong and reliable association that has no biological meaning (D. Hamer, L. Sirota, 2000. Beware the chopsticks gene).
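The parable can be reproduced with a small simulation. The sketch below is purely illustrative: the two subpopulations, their allele frequencies and the frequencies of the habit are hypothetical numbers, and within each group the allele and the habit are drawn independently, so any association in the pooled data arises from stratification alone.

```python
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_group(rng, n, allele_freq, habit_freq):
    """Draw allele counts (0, 1 or 2) and a habit flag (0 or 1) for n people.
    The two are generated independently of each other."""
    alleles = [(rng.random() < allele_freq) + (rng.random() < allele_freq)
               for _ in range(n)]
    habit = [int(rng.random() < habit_freq) for _ in range(n)]
    return alleles, habit

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical frequencies, chosen only for illustration.
a1, h1 = simulate_group(rng, 5000, allele_freq=0.8, habit_freq=0.9)
a2, h2 = simulate_group(rng, 5000, allele_freq=0.2, habit_freq=0.1)

r_group1 = correlation(a1, h1)
r_group2 = correlation(a2, h2)
r_pooled = correlation(a1 + a2, h1 + h2)

print(f"within group 1: {r_group1:+.3f}")
print(f"within group 2: {r_group2:+.3f}")
print(f"pooled sample:  {r_pooled:+.3f}")
```

Within each group the correlation stays close to zero, while in the pooled sample it is large, even though the allele has no effect on the habit in this simulation.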

Two groups of methods have been developed to cope with the "chopsticks gene" problem. The first is based on analyzing data within individual families: parents and children, or siblings, are compared to check whether the association persists within a single family. "Chopsticks genes" do not stand up to such a test. The second group of methods relies on the analysis of linkage disequilibrium: if the allele of interest occurs together with particular alleles of other, randomly selected genes more often than would be expected under independent inheritance, there are grounds to suspect that we are dealing with a "chopsticks gene".
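The comparison with "independent inheritance" described above is conventionally expressed through the linkage disequilibrium coefficient D, the difference between the observed frequency of two alleles occurring together and the frequency expected if they combined independently. The sketch below shows only this textbook definition, with hypothetical frequencies. It is not the authors' actual procedure, which is considerably more elaborate.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of haplotypes carrying both allele A and allele B
    p_a:  frequency of allele A
    p_b:  frequency of allele B
    """
    d = p_ab - p_a * p_b  # excess over the independent expectation
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))  # normalized to [0, 1]
    return d, r2

# Hypothetical frequencies: the first pair combines independently,
# the second occurs together more often than expected.
d_indep, r2_indep = linkage_disequilibrium(p_ab=0.12, p_a=0.3, p_b=0.4)
d_linked, r2_linked = linkage_disequilibrium(p_ab=0.25, p_a=0.3, p_b=0.4)
print(f"independent: D = {d_indep:.3f}, r2 = {r2_indep:.3f}")
print(f"associated:  D = {d_linked:.3f}, r2 = {r2_linked:.3f}")
```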

The authors used these and a number of other methods to clear the results of false positive signals. As a result, they identified 74 polymorphic loci, each of which, with high probability, really does affect the duration of study, albeit indirectly. Judging by the linkage disequilibrium analysis, population stratification accounts for no more than 8% of the detected genetic effects. The three loci identified in the previous study were identified again in the new one.

As expected, all these genes affect the level of education rather weakly. The effect of each gene individually corresponds to three to nine weeks of study and explains from 0.01 to 0.035% of the variability in the duration of study. At the same time, the total effect of all 74 genes is less than the sum of their individual effects.
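The relation between an effect measured in weeks of study and the share of variability it explains can be sketched with the standard additive-model formula R² = 2p(1 − p)β²/σ², where p is the allele frequency, β the effect per allele and σ the standard deviation of the trait (3.6 years, as reported for this sample). The allele frequencies below are hypothetical and serve only to show the order of magnitude. This is not the authors' own calculation.

```python
WEEKS_PER_YEAR = 52
TRAIT_SD_YEARS = 3.6  # standard deviation of years of study given in the article

def variance_explained(effect_weeks, allele_freq):
    """Share of trait variance explained by one variant under an additive
    model: R^2 = 2 p (1 - p) beta^2 / sigma^2."""
    beta_years = effect_weeks / WEEKS_PER_YEAR
    genotype_variance = 2 * allele_freq * (1 - allele_freq)
    return genotype_variance * beta_years ** 2 / TRAIT_SD_YEARS ** 2

# Hypothetical allele frequencies, for illustration only.
common_weak = variance_explained(effect_weeks=3, allele_freq=0.5)
rare_strong = variance_explained(effect_weeks=9, allele_freq=0.08)
print(f"3 weeks, p = 0.50: {common_weak:.4%} of variance")
print(f"9 weeks, p = 0.08: {rare_strong:.4%} of variance")
```

With these inputs both values fall within the hundredths-of-a-percent range quoted above. The formula also shows why a variant with a larger per-allele effect need not explain much more of the variability: a rare allele is carried by few people.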

The authors had information not only on the duration of study of the individuals tested, but also on a number of other phenotypic traits. This made it possible to estimate genetic correlations, that is, the extent to which genes that affect the level of education also affect other traits (Fig. 2).


Fig. 2. Genetic correlations between the duration of study (EduYears) and other characteristics: the size of individual subcortical structures and of the brain as a whole (Brain volume), neuropsychiatric diseases (Neuropsychiatric), mental abilities (Cognitive performance), neuroticism (Neuroticism), body mass index (BMI) and height (Height). The figure shows that alleles that increase the likelihood that an individual will receive a good education are also associated with higher intelligence, a larger brain volume, a lower likelihood of Alzheimer's disease and emotional stability (low neuroticism). Figure from the article under discussion in Nature

It turned out that genetic variants that contribute to longer study also correlate with good mental abilities (cognitive performance), a larger brain volume (intracranial volume) and, for some reason, an increased risk of bipolar disorder. Extremely weak but nevertheless reliable positive correlations with the risk of schizophrenia and with height were also found. Significant negative correlations were found for neuroticism, the risk of Alzheimer's disease and, judging by Figure 2, body mass index (although this is not directly mentioned in the text). The 74 genes under consideration do not affect the size of subcortical structures. These results are preliminary and need additional verification.

The functional spectrum of the 74 genes turned out to be very characteristic. The proportion of genes involved in the development and functioning of the brain is sharply elevated among them. Mutations in these genes often lead to intellectual disability, reduced brain volume, and an increased risk of Alzheimer's disease and other disorders (a typical example is the TBR1 gene). Some of them are expressed in the developing brain of embryos, where they regulate the division of neuronal precursor cells, the migration of young neurons and the growth of the neocortex; others play a role in synaptic plasticity (including the formation of dendritic spines and new synapses) throughout life. Of the 37 adult body tissues covered by the Genotype-Tissue Expression Project (GTEx) database, 13 are components of the central nervous system, and in all of these tissues, and only in them, the expression level of the 74 genes associated with the duration of study was significantly elevated. Moreover, it is 1.36 times higher in the embryonic brain than in the adult brain.

These results look logical: it is quite possible to assume that individuals with a well-developed brain will, other things being equal, show more inclination to study.

The authors emphasize that it is incorrect to call the genes they identified "education genes", for several reasons. Firstly, the level of education is determined by environmental factors to a greater extent than by genetic ones (this is true of most human psychological and behavioral traits). Secondly, the influence of individual genes on the trait in question is small. This, however, is also a commonplace in behavioral genetics. Thirdly, allelic variants of genes may influence the duration of study not directly but through a number of intermediate phenotypes. Of course, this too is true of most connections between genes and behavioral traits in primates. To illustrate the last point, the authors calculated, using data from two of the cohorts they studied, that 23-42% of the revealed relationship between the genes and the duration of study can be explained by the influence of these genes on mental abilities, and another 7% can be attributed to their influence on the personality trait "openness to new experience". To what extent these arguments are nontrivial, and to what extent it really follows from them that these genes should not be called "education genes" even in a popular account, the reader can judge. Out of respect for the authors' point of view, though with difficulty, I refrained from heading this news item "Scientists have found the genes of education."

The authors also note that their results do not at all imply that the influence of the 74 genes on the duration of study is a fixed, unchangeable quantity. On the contrary, the available data make it possible to state definitely that the degree of influence of these and other genes on the duration of study depends strongly on environmental conditions. It differs between countries and changes over time within the same country. For example, in Swedes born in the 1930s, the cumulative effect of the 74 genes on the duration of study corresponded to almost an entire year of study. It then gradually decreased, and for Swedes born in the late 1950s it barely reached 7-8 months. Among the possible reasons, the authors mention the education reform, which made education more accessible, and a significant reduction in the average distance from a person's home to the nearest secondary school and higher education institution. This is not surprising: the heritability of a trait is not a constant. It can vary greatly depending on socio-economic conditions.

Anyway, the study showed that if you try very hard and collect more data, you can find specific genes that affect a polygenic trait, even if the contribution of each individual gene is negligible. In addition, it has once again demonstrated that even complex behavioral traits, which are determined mainly by the environment and are traditionally considered non-hereditary ("acquired"), can actually have a significant genetic component.

Portal "Eternal youth" http://vechnayamolodost.ru  16.05.2016
