16 July 2015

History with genography

World History in four letters

Elena Kleshchenko, "Chemistry and Life" No. 4-2015. Published on the website "Elements of Science"

Population genetics, genetic genealogy – like all sorts of fashionable topics, they are overgrown with myths. So what's really going on with the "Slavs haplogroup" and the "Genghis Khan chromosome"? Let's figure it out.

From father to son

Our Y chromosome is remarkable in many ways.

All other chromosomes (the 22 pairs of autosomes, that is, non-sex chromosomes, as well as the two X chromosomes in women) exchange fragments with their partners during the formation of germ cells; this process is called crossing over. But a man's cells contain one X chromosome and one Y chromosome, so the Y has no true partner and almost all of it stays out of this exchange. It follows that a son receives the Y chromosome from his father unchanged, apart from random mutations, and passes it on to his own son in the same form.

That is why the Y chromosome is convenient for building genealogies and for establishing near and distant kinship in the male line. By examining the Y chromosomes of living people, or DNA fragments extracted from remains, one can uncover family secrets and solve criminal cases. This is how, for example, it was established that US President Thomas Jefferson was indeed the father of at least one of the sons of his black slave Sally Hemings. The president's defenders tried to shift the responsibility to his brother or nephews, who of course carry the same Y chromosome, but this did not agree with the historical facts. In the same way, it was found that the Duke of Monmouth, who led an unsuccessful uprising against King James II in 1685 (remember "Captain Blood: His Odyssey"?), may not have been lying when he called himself the son of Charles II. Comparison of Y chromosomes also played an important role in identifying the remains of the members of the royal family shot in 1918 (see "Chemistry and Life", 2009, No. 6).

A special topic is "historical detective work" over long periods of time: the Y chromosome helps to clarify issues such as the migration of peoples and the origin and kinship of modern ethnic groups. Here not only the stability of the Y chromosome is useful, but also its variability. Suppose researchers compare the nucleotide sequences of Y chromosomes in members of different peoples and find one-letter differences between them. Each such difference is called a single nucleotide polymorphism (SNP, pronounced "snip"). Sometimes small insertions and deletions are counted as SNPs as well. SNPs arise as point mutations, but the two concepts must be distinguished: a SNP is, say, the replacement of A by T at a certain position that is found in many people, whereas a mutation is a one-time event, and not every mutation becomes a SNP.

Studying the sets of SNPs in the genomes of many people makes it possible to work out who descended from whom. If nine people out of ten carry one SNP and only five carry another, we can assume that the first is older and the second newer.

The older the substitution, the more diverse the peoples in which it occurs. Similar sets of SNPs hint at kinship between populations, and in the same way it is possible to work out whether one population is ancestral to another or whether the two share a common ancestor. Of course, such comparisons across large data sets are not made "by eye"; mathematics and computer science are indispensable.
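The frequency argument can be sketched in a few lines of Python. The sample is invented purely for illustration: ten hypothetical people, and the marker names `snp_A` and `snp_B` do not correspond to real SNPs. The sketch only ranks markers by the number of carriers, which is the naive "more widespread means older" heuristic described here, not a real phylogenetic method.

```python
from collections import Counter

# Hypothetical sample: each person is the set of SNPs he carries.
# Marker names and data are invented for illustration only.
people = [
    {"snp_A", "snp_B"}, {"snp_A", "snp_B"}, {"snp_A", "snp_B"},
    {"snp_A", "snp_B"}, {"snp_A", "snp_B"}, {"snp_A"}, {"snp_A"},
    {"snp_A"}, {"snp_A"}, set(),
]


def rank_snps_by_frequency(samples):
    """Return (snp, carrier_count) pairs, most widespread first."""
    counts = Counter(snp for person in samples for snp in person)
    return counts.most_common()


ranking = rank_snps_by_frequency(people)
for snp, carriers in ranking:
    print(f"{snp}: {carriers} of {len(people)} people")
```

Under this heuristic the marker at the top of the ranking is presumed to be the older one; real studies replace the simple count with statistical models.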

When this issue was being prepared for publication, Nature published an article on a large-scale genetic study of the peoples of Great Britain, based precisely on SNPs (Stephen Leslie et al., "The fine-scale genetic structure of the British population", Nature, 2015, 519, 7543, 309-314, DOI:10.1038/nature14230). The authors compared polymorphisms in the genomes of more than 2,000 citizens of the United Kingdom, giving preference to native inhabitants of rural areas whose grandparents had all been born nearby. The study revealed a lot of interesting things: for example, that peoples migrated from the continent to the British Isles after the Mesolithic but before the Romans, and that the "Celts" are not a single population but several subgroups.

Not only SNPs but also short tandem repeats (STRs) can serve as markers that help to trace the connections between people and between nations. These are DNA sequences that are repeated many times in a row (up to dozens), and the number of repeats at a particular site differs between individuals. STRs are used to determine family ties when establishing paternity, investigating crimes and so on. They also make it possible to judge the kinship of ethnic groups.


The number of repeats in an STR changes more often than the point mutations that give rise to SNPs occur. Therefore STRs are usually used to study present-day family ties and not-too-ancient ones, within roughly the last 4000 years, while over longer time spans it is more convenient to work with SNPs.

It is important that population genetics works with markers that do not affect the activity of any important genes and change nothing in the phenotype, so selection does not act on them. Changes in their frequencies depend only on the mutation rate and on random events such as genetic drift and the founder effect; in other words, on which part of the tribe set off to conquer new lands, which part suffered most in an internecine conflict, and which was conquered peacefully and mixed with the conquerors. That is, on exactly the events that interest historians. By this criterion the Y chromosome is also well suited: it contains relatively few genes, and selection on them is not very strong.

Let's start from Adam

We will have to introduce two more terms: "haplotype" and "haplogroup".

A haplotype is a set of gene variants (alleles) in a section of a chromosome, in an entire chromosome, or in all the maternal (or paternal) chromosomes. In the case that interests us, the haplotype is the particular set of SNPs and/or STRs characteristic of a given individual. Strictly speaking, a person, like any diploid organism, has two haplotypes, since chromosomes come in pairs: the chromosome received from the mother may carry different alleles from the one received from the father. But the Y chromosome is unpaired, so it carries only one haplotype.

A haplogroup is a group of similar haplotypes that share a common ancestor. It is clear from this definition that a haplogroup can be broad or narrow, depending on which markers are used to define it. If we consider only ancient, widespread markers, the resulting haplogroup may cover an entire nation or even members of several nations. If newer markers are added, the large haplogroup splits into several smaller ones. For example, haplogroup R, which will be discussed later, is divided into R1 and R2, and R1 includes R1a and R1b.
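The nesting of haplogroups is simply a tree, and it can be written down as such. The sketch below contains only the branches named in the example (R, R1, R2, R1a, R1b); the function names `lineage` and `is_subclade` are made up for this illustration and do not come from any genetics library.

```python
# Parent of each haplogroup; only the branches named in the text are listed.
PARENT = {"R1": "R", "R2": "R", "R1a": "R1", "R1b": "R1"}


def lineage(haplogroup):
    """Return the chain from a haplogroup up to its oldest listed ancestor."""
    chain = [haplogroup]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_subclade(child, ancestor):
    """True if `child` is the same as, or descends from, `ancestor`."""
    return ancestor in lineage(child)


print(lineage("R1a"))            # chain from the narrow group to the broad one
print(is_subclade("R1b", "R"))   # R1b sits inside R
print(is_subclade("R2", "R1"))   # R2 and R1 are sister branches
```

A man assigned to R1a therefore also belongs to R1 and to R: the narrower group adds newer markers on top of the older ones.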

In practice, when the Y chromosome is examined, an individual's haplotype is determined by STR testing and the haplogroup by SNPs. The reason is clear: markers that change faster give a more individual profile. However, the STR haplotype allows the haplogroup to be predicted fairly accurately, that is, to guess which SNPs a given person should have.

The haplotype of a single person, or a haplogroup (the features shared by a group of people), can be described simply by listing these features. In the case of STRs, this is a two-row table: the top row gives the names of the repeat-containing sites, and the numbers in the bottom row (something like "11 11 16 15 13 ...") show how many repeats there are at each site. This series of numbers is sometimes called the barcode of an individual or family, and there is indeed a resemblance. Such tables have already begun to appear, to the alarm of the uninitiated, in publications on history and genealogy. The tables can be compared: the more matches in the bottom rows, the closer the relationship.
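Comparing two such "barcodes" is easy to sketch. In the example below the marker labels `m1`...`m5` are placeholders rather than real STR names, the first row of values echoes the "11 11 16 15 13" example, and the other two rows are invented. Counting exact matches is the simplest possible comparison; commercial tests use many more markers and more careful scoring.

```python
# Placeholder marker labels; real tests use named STR sites.
MARKERS = ["m1", "m2", "m3", "m4", "m5"]

# Repeat counts per marker. person_1 echoes the example in the text,
# person_2 and person_3 are invented for illustration.
haplotypes = {
    "person_1": [11, 11, 16, 15, 13],
    "person_2": [11, 11, 16, 15, 14],
    "person_3": [12, 10, 16, 14, 13],
}


def count_matches(a, b):
    """Number of markers at which two haplotypes have the same repeat count."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(x == y for x, y in zip(a, b))


close = count_matches(haplotypes["person_1"], haplotypes["person_2"])
distant = count_matches(haplotypes["person_1"], haplotypes["person_3"])
print(f"person_1 vs person_2: {close} of {len(MARKERS)} markers match")
print(f"person_1 vs person_3: {distant} of {len(MARKERS)} markers match")
```

In this toy data set person_2 shares more markers with person_1 than person_3 does, so he would be read as the closer male-line relative.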


The nomenclature for haplogroups was developed by the Y Chromosome Consortium (YCC), a team of specialists who systematize information on the evolution and diversity of this chromosome. The main haplogroups are denoted by capital letters from A to T (Fig. 1), and the subclades into which they are divided by numbers and lowercase letters. (A clade in biology is a group of organisms that includes all the descendants of one ancestor: a large or small branch of the tree of life. Accordingly, a subclade is one of the smaller branches of a large branch. A younger haplogroup is a subclade of the older one from which it arose.) Another convention is also used: the alphanumeric name of the SNP that defines the subclade is added to the haplogroup letter after a hyphen. A mutation appeared and spread, and the hitherto unified group split in two...

The ancestor of all haplogroups is the so-called Y-chromosomal Adam, a hypothetical man from whom all living men received their Y chromosomes. His formal name is the "Y-chromosomal most recent common ancestor". Unlike the biblical Adam, he was neither the first man nor the only one: it is more accurate to imagine him as a member of a small tribe whose other men also contributed to the gene pool of mankind but left no male-line descendants surviving to our time. This lucky man lived hundreds of thousands of years ago; estimates vary greatly between studies, but 200-300 thousand years is now considered the most likely. Anatomically modern humans appeared at about the same time, and "mitochondrial Eve", the woman from whom we all descend on the maternal side, is dated to roughly the same period. With mitochondrial DNA (mtDNA) the story is much the same as with the Y chromosome: a person receives it only from the mother, in the egg, so it can be used to trace the female line. It was mtDNA, for example, that identified the remains of Richard III, found in 2012; there was some ambiguity with the paternal line.

Is there a "Slavic haplogroup"?

Type the word "haplogroup" into a search engine, and Google will suggest: "Russian haplogroup", "Armenian haplogroup", "Jewish haplogroup"...

Alas (or hooray), it is not that simple. Haplogroups A and B do occur mainly among African peoples: A is characteristic of Ethiopians, the San (Bushmen), the Khoikhoi (Hottentots) and Nilotes, and B of Pygmies and the Hadza; haplogroups M and S are found in New Guinea, Melanesia and Eastern Indonesia. But in the melting pot of Eurasia and America everything is mixed up: for many centuries people have been traveling, emigrating in groups and singly, starting families and dying far from where they were born... One hears, for example, that Y-chromosome haplogroup R1a, also known as R-M420, is "Russian", "Slavic" and so on. However, it is found not only in Russia, Belarus and Ukraine, but also in Estonia, Hungary, Norway, India and Pakistan, among East Germans, among the native population of the Faroe Islands... Incidentally, it arose long before the Slavs appeared. Its frequency in Russian populations is indeed quite high, from a third to a half. But it is still more accurate to say that this haplogroup marks certain routes of human migration.

So is it possible to establish nationality by haplogroup? For members of small peoples that do not mix with others and have been well studied by geneticists, apparently yes. As for the citizens of large European and Asian states, belonging to haplogroup R1a tells us little: such a person could, with fairly high probability, be either Russian or German... Research on a broader set of markers may suggest that the person's ancestors most likely lived in a certain region of Eastern Europe. But it is clear that in the era of globalization it is impossible to say with full confidence that he himself was born or lives there.

Today, Y-chromosome research is a common commercial service. Family trees are reconstructed from genetic data by, among others, the company "Family Tree DNA" (Houston, Texas).

The Y chromosome makes it possible to establish the degree of male-line kinship between two individuals, and there is a chance of finding distant relatives (if they have also used the service and agreed to share information about themselves with potential relatives). It is convenient for genealogical research that in most countries the surname is passed on in the same way as the Y chromosome, from father to son. Women who want to learn something about their paternal ancestors can "borrow" the Y chromosome of a relative: a father, brother, uncle...

It was at the request of a relative that, in 2012, two African Americans sent their DNA to "Family Tree DNA". The result of the analysis amazed everyone: the Y chromosome was utterly unlike any previously known and had no place on the family tree. It was neither an ancestor nor a descendant of the others, but an independent line that had split off from the common trunk before all the rest. "The chromosome of another Adam has been discovered," the journalists wrote, although "the chromosome of an unknown son of Adam" would be more accurate. The new haplogroup was named A00 (Fig. 1, on the left), and it has kept the unofficial name "Perry's Y chromosome". (Albert Perry, a common ancestor of the study participants, lived in the early 19th century.) Of course, the Perrys of South Carolina are not descendants of aliens or Atlanteans; their haplotype, though rare and ancient, is entirely human. Later the same haplogroup was found in 11 men of the Mbo people (Western Cameroon). This, of course, forced Adam's estimated age upward: the more diverse the descendants, the earlier the ancestor must have lived. So "citizen science" helped fundamental science, and family genealogical research redrew the family tree of mankind. This may not be the last such case: there are billions of people on Earth whose genomes no one has read yet.

Returning to the topic: the Y chromosome is important for genealogy, but to find out which regions your ancestors on the various lines came from, you will also need to study mtDNA and the non-sex chromosomes, because the Y chromosome traces only one line. A carrier of Y-chromosome haplogroup R1a may well be a Black African if he had a Russian great-great-grandfather in the male line and all his other ancestors were Ethiopians.
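How little "one line" is can be shown with simple arithmetic. The sketch below assumes an idealized pedigree in which no ancestor appears twice (real pedigrees overlap), and the function names are made up for this illustration.

```python
def ancestors_in_generation(n):
    """Number of ancestor slots n generations back (parents are n = 1).

    Assumes an idealized pedigree in which no ancestor appears twice.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2 ** n


def y_line_share(n):
    """Fraction of generation-n ancestors who lie on the direct male line."""
    return 1 / ancestors_in_generation(n)


# A great-great-grandfather is 4 generations back.
print(ancestors_in_generation(4))  # 16
print(y_line_share(4))             # 0.0625
```

So the great-great-grandfather in the example is one of sixteen ancestors of his generation, and the Y chromosome says nothing about the other fifteen.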

In general, with a sufficiently large set of markers (let's forget about the cost of such a study), ethnic origin can be established from DNA, but this is done out of personal or scientific curiosity, not in order to fill in the "nationality" field correctly. It is better to tell one's own from strangers by cultural traits: what insignia a person wears, what gods he believes in, whether he can pronounce words that a stranger cannot say cleanly. And when biomaterial from a crime scene is studied, it is more useful to focus on the genes that determine individual traits: blood type, skin and hair color, facial features (see "Chemistry and Life", 2014, No. 6). Although haplogroups, perhaps, will soon become one of the distinguishing marks too.

Scandinavian roots of Russian princes

Among the genealogical investigations of "Family Tree DNA", of particular interest to us is the study of the Y-chromosomal haplotypes of Rurik's descendants (the "Rurikid Dynasty DNA Project" and the "Russian Nobility DNA Project").

On our side, in addition to genealogy specialists, the project was supported by Russian Newsweek: it all started when the staff of this magazine managed to persuade two descendants of Vladimir Monomakh to take part in the study. The first was Dmitry Mikhailovich Shakhovskoy, who lives in Paris. He turned out to have haplogroup N1c1, more precisely subclade N1c1d1, which is called "Finno-Ugric", or, even more precisely, its "Scandinavian" branch; similar haplotypes are most often found in Norway, Sweden and Finland. The same haplogroup was found in A. P. Gagarin and his cousin G. G. Gagarin.

But then it turned out that many Rurikids have a different haplogroup, the "Slavic" R1a1. And a member of the Svyatopolk-Chetvertinsky family turned out to carry haplogroup I2a2, typical of Western Ukraine and the Belarusian Eastern Polesie. It was suggested that the founder of this family was not himself a Rurikid but had married a woman of the Rurik line, perhaps from the family of Izyaslav Yaroslavich. The Svyatopolk-Chetvertinskys are considered descendants of the princes of Turov, that is, of Izyaslav, but the chronicles also mention a certain semi-legendary Prince Tur...


Fig. 2. Yaroslav the Wise, his sons and their descendants (according to the illustration from the report of V. G. Volkov). The diagram does not show all of Yaroslav's children, only those whose descendants participated in the study.

In general, a single I2a2 does not matter much, but what is to be done with R1a1?

The study involved descendants of many princely families, at least one from each family shown in the diagram (Fig. 2). The picture was as follows: the Vadbolsky, Lobanov-Rostovsky, Khilkov, Gagarin, Shakhovsky, Kropotkin, Rzhevsky, Putyatin and Myshetsky families and the Polish Puzyna and Massalsky (Mosalsky, Masalsky) families have N1c1, while the Baryatinsky, Volkonsky, Obolensky, Beloselsky-Belozersky, Shuisky, Karpov (perhaps the Karpovs are descendants of the Fominsky princes who lost their princely title) and Drutsky-Sokolinsky families have R1a1. So which haplogroup did Rurik have, or at least his great-great-grandson Yaroslav the Wise (978-1054)? Let's put the question bluntly: "Scandinavian" or "Slavic"?

There was no shortage of theories. For example, it was suggested that the true "Rurik" haplogroup was not N1c1 but the native Russian R1a1, and that it was carried by Yaroslav the Wise, his father Vladimir the Red Sun and, presumably, his great-great-grandfather Rurik, whoever he was. As for the descendants of Monomakh who carry N1c1: Monomakh was the son of Vsevolod Yaroslavich, whose parentage can be disputed. Yaroslav's wife Ingigerd, daughter of the king of the Swedes, had been betrothed to the Norwegian king Olav the Stout, later known as Olav the Saint, and according to the sagas she very much wanted to marry him. Her father, however, could not stand Olav and in the end married his daughter to "Yaritsleiv the king". Incidentally, it follows from this historical fact that all the Yaroslavichi (and the Yaroslavnas too, of course) were at least half Swedish, and Yaroslav's own mother, the unfortunate Polotsk princess Rogneda, was, according to some historians, of Scandinavian descent. But mitochondrial DNA does not trouble supporters of the hypothesis of the Rurikids' Slavic roots; what matters is the Y chromosome.

So, shortly before Vsevolod was born, Olav, fleeing from his enemies, came to Rus and lived with Yaroslav in Novgorod (by that time Olav had married Ingigerd's half-sister Astrid, so they were related). It was then, the theory goes, that the "Scandinavian" haplogroup entered the Rurikid genealogy... But this romantic version is demolished by the presence of haplogroup N1c1 in another branch of the dynasty, in the Puzyna and Massalsky families, who descend from a different son, Svyatoslav Yaroslavich.

A phylogenetic tree was built from the haplotypes of present-day Rurikids to check whether the actual genetic links match the official genealogy. They did not match everywhere: for example, Puzyna and Massalsky for some reason turned out to be closer to Shakhovsky and Kropotkin than Rzhevsky and Putyatin are. But this "closeness" may disappear if the study is repeated with a larger number of markers.

Be that as it may, almost all the N1c1 carriers shown in the diagram are descendants of one male ancestor who lived about 1000 years ago (so Yaroslav fits perfectly). The Vadbolskys and Lobanov-Rostovskys, as well as the Shakhovskys and Kropotkins, are more closely related to each other than to the other Rurikids with the same haplogroup. And "almost all" because the ancestry of the Myshetskys looks doubtful. According to the official version they descend from Yuri Tarussky, as do the Baryatinskys, Volkonskys and Obolenskys, but those families have haplogroup R1a1! The Myshetskys do have N1c1, but it differs greatly from the rest of the "Rurik" N1c1; their common ancestor lived about 1900 years ago.
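The article does not say how the dates of the common ancestors were calculated. Purely as an illustration of the principle, here is a very rough sketch: if two men differ at `differences` out of `markers` STR sites, and each site mutates with probability `mutation_rate` per generation, then the two lines together have accumulated about 2 x rate x markers x generations mutations. The default mutation rate (0.002 per marker per generation) and generation length (25 years) are assumptions chosen for the example, not values from the study, and the formula ignores back mutations and multi-step changes.

```python
def estimate_tmrca_years(differences, markers,
                         mutation_rate=0.002, generation_years=25):
    """Rough time to the most recent common ancestor of two STR haplotypes.

    differences      -- number of markers at which the haplotypes differ
    markers          -- total number of markers compared
    mutation_rate    -- assumed mutations per marker per generation
    generation_years -- assumed length of one generation in years
    """
    if markers <= 0 or mutation_rate <= 0:
        raise ValueError("markers and mutation_rate must be positive")
    # Both lines have been mutating since the split, hence the factor of 2.
    generations = differences / (2 * mutation_rate * markers)
    return generations * generation_years


# Hypothetical pair: 4 mismatches on 50 markers.
print(estimate_tmrca_years(4, 50))
```

With these assumed parameters, four mismatches on fifty markers correspond to about 20 generations, or roughly 500 years. Real estimates use marker-specific rates and come with wide confidence intervals.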

Things are less smooth with the R1a1 haplotypes: for these Rurikids, no common ancestor living at the right time can be inferred. Their haplotypes belong to different subclades; only the Volkonskys, Obolenskys and Baryatinskys are genetically related to each other, with a common ancestor who lived about 800 years ago. According to the pedigrees, this ancestor is Yuri Tarussky, who reigned in the first half of the 14th century and was presumably the son of Mikhail Vsevolodovich, Prince of Chernigov. Thus the genetic data show that Yuri Tarussky could have been the ancestor of the Volkonskys, Obolenskys and Baryatinskys, but that he was not a male-line descendant of Yaroslav the Wise (or of Rurik).

To upset the opponents of the "Scandinavian Rurik" once and for all: the subclade of haplogroup R1a1 to which these three families belong, R1a1a1g2* (R-L260), is atypical of the Baltic coast, from where a "Slavic Rurik" could have come, but typical of Greater Poland and Silesia. The head of the Rurikid Dynasty DNA Project, the Polish researcher Andrzej Bayor, suggested in 2008 that the genetic line of these Rurikids could have been broken by the Polish king Boleslav II the Bold. Unfortunately, haplotypes of members of the Piast dynasty have not yet been obtained. But, as V. G. Volkov, the administrator of the Russian Nobility DNA Project, notes, the black eagle in the coats of arms of Yuri Tarussky's descendants is very similar to the eagle in the coat of arms of the Silesian Piasts.

"Now it can be considered proven that Vladimir Monomakh belonged to haplogroup N1c1, and there are no fewer grounds to consider Yaroslav the Wise a member of this haplogroup, but the question arises whether Rurik is his ancestor, and if the answer is yes, then the genetic origin of Vladimir Monomakh will clarify the origin of Rurik," writes V. G. Volkov. According to current data, the ancestor of Yaroslav the Wise and Vladimir Monomakh could have come from Sweden or coastal Finland (the latter version has been put forward by Finnish scientists).

Genghis Star Cluster

"Scientists have found the Genghis Khan chromosome!" – the authors of the news rejoiced at the beginning of 2003.

One wanted to ask, like Hella in Bulgakov's novel: is it really him? However, the journalists were only quoting the scientists and added nothing of their own. (The drawing shows Genghis Khan as depicted by a Chinese artist.)

This is how the article of the international team of researchers began: "We have identified a Y-chromosome line with several unusual properties. It was found in 16 populations in a vast region of Asia, from the Pacific Ocean to the Caspian Sea, and was present in them with a high frequency – its carriers were about 8% of men in this region, or about 0.5% of the world's population [16 million people]. Patterns of variations in this line show that it originated in Mongolia about 1000 years ago. Such a rapid spread could not have happened by chance, it must be the result of selection. Genghis Khan's probable male descendants have the same lineage, so we assume that it spread due to a new form of social selection..." (Zerjal et al., "The Genetic Legacy of the Mongols", American Journal of Human Genetics, 2003; 72, 3, 717-721, DOI:10.1086/367774).

One of the lead authors of this article, Spencer Wells, together with National Geographic magazine, organized an ambitious non-profit project, "Genographic"; the tests for this project are performed by "Family Tree DNA". Its goal is to collect and study DNA samples from indigenous peoples on all continents in order to learn more about the migrations of peoples and the history of mankind in general.

Members of civilized peoples, who have complicated the study of their own genealogy by moving from Dublin to Arizona or from the Trans-Urals to Moscow, can also take part, but for a fee: they pay for the kit for taking a DNA sample (a scraping from the inside of the cheek), for postage and for the analysis, plus a small surcharge that goes toward further research on indigenous peoples. The participants have the pleasure of contributing to great science and, at the same time, of learning their "deep" ancestry: by which route their own ancestors came from Africa to Europe or to China. The project even promises to check your genome for Neanderthal and "Denisovan" markers! The results of the analyses posted publicly are anonymous. By and large, everyone involved in this venture is driven by curiosity. "The greatest book on history is the one that is hidden in our DNA," says Dr. Wells.

And as for Genghis Khan, the authors' bold assumption seems to be correct. Genghis Khan's haplogroup is C-M217, the most common branch of haplogroup C-M130; it is also designated C3*. It should have arisen about a thousand years ago: calculations by two methods give intervals of 700-1300 and 590-1300 years. This matches Genghis Khan's lifetime (about 1162-1227), although it cannot be ruled out that modern carriers of the haplogroup descend not only from him but also from his close and distant male-line relatives. The haplogroup is found among peoples who, according to legend, descend from Genghis Khan, and it is widespread across the territory of the Mongol Empire; moreover, it spread too fast for this to be due to chance, as mathematical models confirm. Those who are still wary of naming Genghis Khan without irrefutable evidence speak of the "star cluster chromosome". (The basic version of the haplotype was found most often, but many rare variants differing from it at one or two points were also found; this can be drawn as a figure resembling a large star with short rays.)
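The "star" shape can be made concrete with a toy example. The haplotypes below are invented (repeat counts at five placeholder markers) and are not data from the 2003 paper. The sketch finds the most frequent haplotype, treats it as the centre of the star, and counts how many samples lie zero, one or two steps away from it; measuring distance as the summed difference in repeat counts is an assumption made for this illustration.

```python
from collections import Counter

# Invented haplotypes: repeat counts at five placeholder STR markers.
sample = [
    (10, 16, 25, 10, 11),
    (10, 16, 25, 10, 11),
    (10, 16, 25, 10, 11),
    (10, 16, 25, 10, 11),
    (10, 17, 25, 10, 11),
    (10, 16, 24, 10, 11),
    (11, 16, 25, 10, 12),
]


def modal_haplotype(haplotypes):
    """The most frequent haplotype: the centre of the 'star'."""
    return Counter(haplotypes).most_common(1)[0][0]


def step_distance(a, b):
    """Total difference in repeat counts, summed over all markers."""
    return sum(abs(x - y) for x, y in zip(a, b))


centre = modal_haplotype(sample)
rays = Counter(step_distance(centre, h) for h in sample)
for distance in sorted(rays):
    print(f"{distance} step(s) from the centre: {rays[distance]} sample(s)")
```

A large count at distance zero with a few samples one or two steps away is the pattern expected when a lineage has expanded recently from a single founder.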


Genghis Khan chromosome (from the article by Zerjal et al., 2003). The diameter of each circle corresponds to the number of members of the given people included in the study, and the shaded sector to the frequency of the chromosome in question. The colored area is the territory of Genghis Khan's empire at the time of his death (1227).

What about the non-random spread? Genghis Khan's first and beloved wife Borte bore him four sons who inherited supreme power (Jochi, Chagatai, Ogedai and Tolui), and all of them gave the great conqueror grandchildren. He had at least four other wives, as well as hundreds of concubines. We remember that natural selection on the Y chromosome is not very strong. But here we see a situation in which the members of one clan obtain excellent opportunities both to produce offspring and to destroy members of other clans. Increased reproductive fitness, transmitted socially... Even after the empire collapsed, the Genghisids remained the rulers of its parts, and it was hard for a man of different descent to reach supreme power. As the authors of the article noted, is this not group selection in the human population, the very group selection that Richard Dawkins, author of the "selfish gene" concept, so vehemently denies?
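To get a feel for how strong such social selection would have to be, here is a back-of-the-envelope calculation. It takes the figures quoted from the 2003 paper (about 16 million carriers, about 1000 years) and adds two assumptions of its own: a generation length of 25 years and a constant growth rate in every generation. The function name is made up for this sketch.

```python
def sons_per_generation(descendants, years, generation_years):
    """Average number of reproducing sons per man needed for one founder's
    male line to reach `descendants` after `years`, at constant growth."""
    generations = years / generation_years
    return descendants ** (1 / generations)


# 16 million carriers and 1000 years are quoted in the text;
# the 25-year generation is an assumption.
factor = sons_per_generation(16_000_000, 1000, 25)
print(round(factor, 2))
```

The result is roughly one and a half surviving, reproducing sons per man in every generation for forty generations, whereas in a stable population the average is one. The advantage per generation looks modest, but it had to be sustained for a thousand years.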

More than ten years have passed since then, and the search for Genghis Khan's chromosome continues. The same article mentioned the Hazaras of Pakistan, who according to legend are also his descendants, and indeed they were the only people outside the territory of the empire in whom the chromosome was found. It has also been found in Russia: such studies are carried out, for example, by staff of the Institute of General Genetics of the Russian Academy of Sciences, as described in the book by I. A. Zakharov-Gezekhus, "In the Footsteps of Genghis Khan. A Geneticist in the Center of Asia" (Moscow-Izhevsk: Institute of Computer Research, 2013). The Genghis Khan chromosome is found among the native inhabitants of Altai, Altai Kazakhs, Buryats, Kalmyks, Nogais and Tuvans, that is, precisely among the peoples who lived on the territory of the Mongol Empire. I. A. Zakharov-Gezekhus managed to find in Kazakhstan several people who trace their ancestry to Jochi, the son of Genghis Khan. However, their haplogroups differed greatly not only from the "star cluster" (that would be only half the trouble, since the parentage of Jochi himself was questioned: he was born after his mother had been held captive) but also from one another. Yet haplogroup C3* occurs at a very high frequency in some Kazakh clans: Zhalayyr, 38%; Tore, 35%; Kerey, 65%! It looks as though we still have a lot to learn. Spencer Wells is right: the history of mankind is stored not only in chronicles and legends but also in the four-letter DNA code. And we are only beginning to read this book.

Portal "Eternal youth" http://vechnayamolodost.ru
16.07.2015