The Jats and Indo-Aryan expansion in South Asia

Many people believe that the high steppe ancestry in Jats derives somehow from later steppe migrations into the region, but there is no proof of this. The association of Jats with Central Asian migrants, and more specifically the Indo-Scythians, is a myth created in the 19th century and has no foundation whatsoever. However, some people hold onto this myth and feel a vague sense of pride in it.

Nevertheless, there is a very easy and straightforward explanation for why the Jats have such high steppe ancestry. Here are a few things to keep in mind.

  1. The Haryanvi and Western UP Jats have apparently the highest ‘steppe’ ancestry among South Asians.
  2. This ‘steppe’ ancestry is associated with the spread of IE languages in South Asia, with Brahmins and Kshatriyas in any region having a higher share of this ancestry than the other groups within that region.
  3. The Vedic homeland was in Haryana and Western UP, the Kuru heartland from where the Vedic cultural influence spread into interior South Asia.

Let me quote Michael Witzel, who is an avowed AMT proponent:

Kuruksetra, the sacred land of Manu where even the gods perform their sacrifices, is the area between the two small rivers Sarsuti and Chautang, situated about a hundred miles north-west of Delhi. It is here that the Mahabharata battle took place. Why has Kuruksetra been regarded so highly ever since the early Vedic period?…

…It can be said that the Bharata/Kaurava/Pariksita dynasty of the Kurus successfully carried out and institutionalized a large scale re-organization of the old Rgvedic society. Many aspects of the new ritual, of the learned speech, of the texts and their formation reflect the wish of the royal Kuru lineage and their Brahmins to be more archaic than much of the texts and rites they inherited. In this fashion, the new Pariksita kings of the Kurus betray themselves as typical newcomers and upstarts who wanted to enhance their position in society through the well-known process of “Sanskritization.” In fact, to use this modern term out of its usual context, the establishment of the Kuru realm was accompanied by the First Sanskritization. Incipient state formation can only be aided if it is not combined with the overthrow of all inherited institutions, rituals, customs, and beliefs. The process is much more successful if one rather tries to bend them to one’s goals or tries to introduce smaller or larger modifications resulting in a totally new set-up. The new orthopraxy (and its accompanying belief system, “Kuru orthodoxy”) quickly expanded all over Northern India, and subsequently, across the Vindhya, to South India and later to S.E. Asia, up to Bali.

This procedure is visible in the Bharata/Kaurava dynasty’s large scale collection of older and more recent religious texts: In all aspects of ritual, language and text collection, these texts tend to be more archaic than much of the inherited older texts and rites. On the other hand, the new dynasty was effective in re-shaping society and its structure by stratification into the four classes (varna), with an internal opposition between arya and sudra which effectively camouflaged the really existing social conflict between brahma-ksatra and the rest, the vaisya and sudra; further, the Bharata/Pariksita dynasty was successful in reorganizing much of the traditional ritual and the texts concerned with it. (It must not be forgotten that public ritual included many of the functions of our modern administration, providing exchanges of goods, forging unity and underlining the power of the elite.)

The small tribal chieftainships of the Rgvedic period with their shifting alliances and their history of constant warfare, though often not more than cattle rustling expeditions, were united in the single “large chiefdom” of the Kuru realm. With some justification, we may now call the great chief (raja) of the Kurus “the Kuru king”. His power no longer depended simply on ritual relationships such as exchange of goods (vidatha) but on the extraction of tribute (bali) from an increasingly suppressed third estate (vis) and from dependent subtribes and weak neighbors; this was often camouflaged as ritual tribute, such as in the asvamedha.

In view of the data presented in this paper, we are, I believe, entitled to call the Kuru realm the first state in India.

Witzel also states elsewhere in the text,

The famous Videgha Mathava legend of SB 1.4.1.10 sqq. tells the story of the “civilization process of the East” in terms of its Brahmanical authors, and not, as usually termed, as the tale of “the Aryan move eastwards.” For it is not only Videgha Mathava, a king living on the Sarasvati, but also his priest Gotama Rahugana who move towards the east. Not only is the starting point of this “expedition” the holy land of Kuruksetra; the royal priest, Gotama Rahugana, is a well known poet of Rgvedic poems as well, and thus, completely anachronistic. Further, the story expressively mentions the role of Agni Vaisvanara, the ritual fire, in making the marshy country of the East arable and acceptable for Brahmins. All of this points to Sanskritization (or rather, Brahmanization) and Ksatriyazation rather than to military expansion.

The Mathavas, about whom nothing is known outside the SB, may be identical with the Mathai of Megasthenes (c. 300 B.C.), who places them East of the Pazalai (Pancala), at the confluence of the Erennesis (Son) with the Ganges. The movement of some clans, with their king Videgha and his Purohita, eastwards from the River Sarasvati in Kuruksetra towards Bihar thus represents the ‘ritual occupation’ of Kosala(-Videha) by the bearers of orthoprax (and orthodox) Kuru culture, but it does not represent an account of the first settlement of the East by Indo-Aryan speaking tribes which must have taken place much earlier as the (still scanty) materials of archaeology indeed indicate.

According to Talageri,

…the geographical area of the Rigveda extends from westernmost U.P. and adjoining parts of Uttarakhand in the east to southern and eastern Afghanistan in the west. Strictly speaking, in present-day political-geographical terms, this includes the whole of northern Pakistan, adjoining areas of southern and eastern Afghanistan, but, within present-day India, only the state of Haryana with adjoining peripheral areas of western U.P and Uttarakhand..

…The descriptions in the Puranas about the locations of the Five Aila tribes in northern India clearly place the Purus as the inhabitants of the Central Area (Haryana and adjacent areas of western U.P.), the Anus to their North (Kashmir, etc.), the Druhyus to their West (present-day northern Pakistan), and the Yadus and Turvasus to their South-West (Rajasthan, Gujarat, western M.P.) and South-East (eastern M.P. and Chhattisgarh?) respectively. The Solar race of the Ikshvakus are placed to their East (eastern U.P, northern Bihar). This clearly shows that the Purus were the inhabitants of the core Rigvedic area of the Oldest Books (6, 3, 7): Haryana and adjacent areas, and they, and in particular their sub-tribe the Bharatas, were the “Vedic Aryans”. Their neighboring tribes and people in all directions were also other non-Vedic (i.e. non-Puru) but “Aryan” or Indo-European language speaking tribes. The Puru expansions described in the Puranas explain all the known historical phenomena associated with the “Aryans”: the expansion of Puru kingdoms eastwards explains the phenomenon which Western scholars interpreted as an “Aryan movement from west to east” (the area of the Rigveda extends eastwards to Haryana and westernmost U.P., the area of the Yajurveda covers the whole of U.P., and the area of the Atharvaveda extends eastwards up to Bengal), and their expansion westwards described in the Puranas and the Rigveda explains the migration of Indo-European language speakers from the Anu and Druhyu tribes (whose dialects later developed into the other 11 branches of Indo-European languages) from India..

The evidence is unequivocal. Quite clearly, the Vedic culture spread into the Gangetic plains and later on elsewhere from its central locus of the Kuru realm which was in Haryana and Western UP.

So is it so outrageous that the dominant community presently living in the traditional Vedic heartland, from where Vedic culture, ritual, language and religion are supposed to have spread across inner South Asia, also has the highest share of the ancestry that is today usually associated with the spread of IE or Indo-Aryan languages and culture in South Asia?

So why hold onto unsubstantiated 19th century colonial myths when the evidence is so clear and straightforward? As Razib has pointed out, a later steppe admixture into the Jats from groups like the Scythians is also difficult to argue, because the Jats lack the East Eurasian component which is present in very significant proportion in steppe groups from the Iron Age onwards.

In fact, the close ancestry sharing between the Kalash, Pashtuns, Pamiris and Jats indicates, as I have argued earlier in greater detail, that this shared ancestry with its high ‘steppe’ component goes back to the days of Indo-Iranian unity within the northwest of the subcontinent. The Jats are Indo-Aryan speakers and the Pashtuns are Iranian speakers, while the Kalash are representative of the Nuristani branch, which is often taken as the 3rd branch of Indo-Iranian.

One question that is often asked is: why are Jats not at the top of the caste hierarchy?

There is also a good explanation for this. The Indo-Aryan expansion from its Haryana-Western UP heartland is a roughly 4,000-year-old phenomenon. A lot of water has flowed under the bridge since then. Mahapadma Nanda, who established the first major South Asian empire, is stated in the Puranas to have destroyed the Kshatriyas and attained undisputed sovereignty. The Kshatriyas said to have been exterminated by him include Maithalas, Kasheyas, Ikshvakus, Panchalas, Shurasenas, Kurus, Haihayas, Vitihotras, Kalingas, and Ashmakas.

As you can see, the Kshatriyas among the Kurus, along with those of other kingdoms, were already exterminated in the time of Mahapadma Nanda, long ago. So it is no surprise that present-day Jats don’t hold any special position in the caste hierarchy.

I end here by taking a detour with the beautiful story of Pururavas, who is the ancestral figure of all Vedic tribes and is most likely an Indo-Iranian ancestor from the remote past. Noticeable aspects of the story include the fact that Kurukshetra, Haryana is mentioned as a place of action, and that sheep herding appears to have been a feature of this early, nascent Indo-Aryan/Indo-Iranian period.

Pururava was a good king who performed many yajnas. He ruled the earth well. Urvashi was a beautiful apsara. Pururava met Urvashi and fell in love with her.

“Please marry me,” he requested.

“I will,” replied Urvashi, “But there is a condition. I love these two sheep and they will always have to stay by my bedside. If I ever lose them, I will remain your wife no longer and will return to heaven. Moreover, I shall live only on clarified butter.”

Pururava agreed to these rather strange conditions and the two were married. They lived happily for sixty-four years.

But the gandharvas who were in heaven felt despondent. Heaven seemed to be a dismal place in Urvashi’s absence. They therefore hatched a conspiracy to get her back. On an appropriate occasion, a gandharva named Vishvavasu stole the two sheep. As soon as this happened, Urvashi vanished and returned to heaven.

Pururava pursued Vishvavasu and managed to retrieve the sheep, but by then, Urvashi had disappeared. The miserable king searched throughout the world for her. But in vain. Eventually, Pururava came across Urvashi near a pond in Kurukshetra.

“Why have you forsaken me?” asked Pururava. “You are my wife. Come and live with me.”

“I was your wife,” replied Urvashi. “I no longer am, since the condition was violated. However, I agree to spend a day with you.”

When one year had passed, Urvashi returned to Pururava and presented him with the son she had borne him. She spent a day with him and vanished again. This happened several times and, in this fashion, Urvashi bore Pururava six sons. They were named Ayu, Amavasu, Vishvayu, Shatayu, Gatayu and Dridayu.

The Jat Gene!

About 10 years ago there was a now-defunct blog called the “Jat Gene.” Standard stuff. Nothing super amazing was discovered, but the Jats do seem to sit at one end of the pole. I happen to have half a dozen Jat samples which cluster together. You can see where they are on the PCA plots above.

– no surprise that the Jat are on the ANI end of the ANI-ASI cline

– Please note that Jat and Ror and other such groups are distinct from Pathans and especially Baloch in that the latter groups seem to have more and later gene flow/contact from West Asian groups. Perhaps this is the Islamic period? Or perhaps this is just contact due to proximity. The Baloch and Brahui in particular are distinct because they have very little AASI. The Pathan are arguably an Iranian group with South Asian inflection, but the Baloch are just plain West Asian.

– You can see this in the admixture plot below. The Jat are less (marginally) European-like than the Ror, but the Treemix indicates the Ror may actually be a mix of a very European-like group with native Indian (ANI-ASI mix). The Jat are probably the same but I don’t have the samples.

The Arctic home of the Aryans


The Fatyanovo culture flourished between 2800 and 1900 BC. It seems they were part of a Central European “reflux” migration. That is, their forebears were related Yamna agro-pastoralists who migrated west out of the steppe and mixed with Central European farmers. Eventually, some of these people moved back east along the edge of the forest-steppe boundary.

The Fatyanovo is the name for a group of people who seem to have introduced agro-pastoralism to the region nearly up to the Urals in northeastern European Russia. A new preprint, Genetic ancestry changes in Stone to Bronze Age transition in the East European plain, confirms what we assumed:

Transition from the Stone to the Bronze Age in Central and Western Europe was a period of major population movements originating from the Ponto-Caspian Steppe. Here, we report new genome-wide sequence data from 28 individuals from the territory north of this source area – from the under-studied Western part of present-day Russia, including Stone Age hunter-gatherers (10,800-4,250 cal BC) and Bronze Age farmers from the Corded Ware complex called Fatyanovo Culture (2,900-2,050 cal BC). We show that Eastern hunter-gatherer ancestry was present in Northwestern Russia already from around 10,000 BC. Furthermore, we see a clear change in ancestry with the arrival of farming – the Fatyanovo Culture individuals were genetically similar to other Corded Ware cultures, carrying a mixture of Steppe and European early farmer ancestry and thus likely originating from a fast migration towards the northeast from somewhere in the vicinity of modern-day Ukraine, which is the closest area where these ancestries coexisted from around 3,000 BC.

The Fatyanovo culture seems to have given rise to the rival and later successor Abashevo culture, which flourished a bit further east (beyond the Urals in part). The Abashevo in their turn gave rise to the Sintashta culture, which flourished even further east, and somewhat south.

There are two things I want to highlight. First, the Y chromosome:

Then, we turned to the Bronze Age Fatyanovo Culture individuals and determined that their maternal (subclades of mtDNA hg U5, U4, U2e, H, T, W, J, K, I and N1a) and paternal (chrY hg R1a-M417) lineages…were ones characteristic of CWC individuals elsewhere in Europe…Interestingly, in all individuals for which the chrY hg could be determined with more depth (n=6), it was R1a2-Z93…a lineage now spread in Central and South Asia, rather than the R1a1-Z283 lineage that is common in Europe.

Here is the modern distribution of Z93:

The reason Z283 is found where in ancient times Z93 was found is that over the past 500 years ethnic Russians have expanded eastward, retracing the biogeographic route of the earlier peoples along the forest-steppe frontier.

The steppe people seem to have been highly patriarchal. Though there are some non-modal lineages, samples from a specific location are often dominated by a single haplogroup, indicative of a broader kinship-based society focused around descent from a common ancestor. In contrast, the origins of females, as evidenced by mtDNA diversity, seem to be rather catholic. Some of the mtDNA lineages above, and later in the Sintashta, seem to derive from farmer populations in Europe whose ultimate origins were in Anatolia.

Let me define gotra from Wikipedia:

In Hindu culture, the term gotra (Sanskrit: गोत्र) is considered to be equivalent to lineage. It broadly refers to people who are descendants in an unbroken male line from a common male ancestor or patriline. Generally the gotra forms an exogamous unit, with the marriage within the same gotra being prohibited by custom, being regarded as incest. The name of the gotra can be used as a surname, but it is different from a surname and is strictly maintained because of its importance in marriages among Hindus, especially among the higher castes.

The second point is to show this table:

This group has been assembling a lot of data on phenotypic SNPs over time transects in Northeast Europe. One has to take these results with a grain of salt because the predictions are trained on modern samples. I do not think, for example, that European hunter-gatherers had “black skin.” I suspect that the Mesolithic populations were genetically different enough that their “light alleles” may not be in our panels, though my suspicion is that they’d be of darker hue as Inuit people are. That being said, selection work aligns with these results that Europeans, in particular, seem to have been getting lighter in many areas down to the present.

The eye color prediction I somewhat trust since it’s quasi-Mendelian (~75% of the variance is due to one genetic location in Europeans). For the pigmentation, I would focus on the trend, not the absolute value. Anyone who has been to the Northeast Baltic (I have) knows that these are amongst the fairest people in the world. It is very unsurprising that these people have been getting paler over time.

There have been various arguments on this blog and elsewhere as to what the Sintashta people would have looked like. I’ve posted the Narasimhan et al. data before. The results are broadly similar to the ones above for the Fatyanovo.

The Fatyanovo do not have a pigmentation genetic architecture similar to that of Nordic people. But neither are they out of keeping with some European peoples. The Sintashta would be ~25% blue-eyed according to Narasimhan et al.’s data. In the 1000 Genomes about 10% of the alleles in Punjabis, Gujaratis, and Bengalis are the derived variant so common in Northern Europe, giving a recessive frequency of ~1% or so blue-eyed, which is too high since other genes have an influence in these cases (though this allele is found in West Asia at appreciable frequencies, including in very old ancient DNA).

On the whole, these results confirm that the Aryans, when they arrived in India, were fair-skinned people. But they were likely not as rosy-cheeked as the English who arrived thousands of years later, nor were their eyes as often pale.

A risk factor for COVID-19 in South Asians

The major genetic risk factor for severe COVID-19 is inherited from Neandertals:

A recent genetic association study (Ellinghaus et al. 2020) identified a gene cluster on chromosome 3 as a risk locus for respiratory failure in SARS-CoV-2. Recent data comprising 3,199 hospitalized COVID-19 patients and controls reproduce this and find that it is the major genetic risk factor for severe SARS-CoV-2 infection and hospitalization (COVID-19 Host Genetics Initiative). Here, we show that the risk is conferred by a genomic segment of ~50 kb that is inherited from Neandertals and occurs at a frequency of ~30% in south Asia and ~8% in Europe.

The highest frequency, 60%, is in the 1000 Genomes Bangladesh sample. In a study of Europeans, all things equal, the risk allele at this locus increases the odds of respiratory failure by a factor of 1.75. This isn’t really the major factor; age and hypertension, all the things you know, matter more. But a 1.75-fold increase in odds is not trivial either.
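An odds ratio is easy to misread as a multiplier on probability, so here is a minimal sketch of what a factor of 1.75 on the odds would do to an individual's risk. The baseline risks below are hypothetical values I picked for illustration, not figures from the study, and the function name is my own.

```python
def risk_with_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Return the probability implied by applying an odds ratio to a baseline probability."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)  # probability -> odds
    new_odds = baseline_odds * odds_ratio                   # apply the odds ratio
    return new_odds / (1.0 + new_odds)                      # odds -> probability


ODDS_RATIO = 1.75  # the per-allele figure quoted in the post

# Hypothetical baseline risks of respiratory failure, for illustration only.
for baseline in (0.01, 0.05, 0.20):
    carrier = risk_with_odds_ratio(baseline, ODDS_RATIO)
    print(f"baseline {baseline:.0%} -> with risk allele {carrier:.1%}")
```

When the baseline risk is low, the resulting risk is close to 1.75 times the baseline; as the baseline rises, the increase in probability becomes proportionally smaller than 1.75.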

If you are on 23andMe and got tested before the summer of 2017, the older chips have a marker for the locus that’s informative (in LD with the haplotype). This link should take you there. I’m a TT homozygote. Modern human. A C is for Neanderthals.

Maharashtra genetics

Novel insights on demographic history of tribal and caste groups from West Maharashtra (India) using genome-wide data (OA):

The South Asian subcontinent is characterized by a complex history of human migrations and population interactions. In this study, we used genome-wide data to provide novel insights on the demographic history and population relationships of six Indo-European populations from the Indian State of West Maharashtra. The samples correspond to two castes (Deshastha Brahmins and Kunbi Marathas) and four tribal groups (Kokana, Warli, Bhil and Pawara). We show that tribal groups have had much smaller effective population sizes than castes, and that genetic drift has had a higher impact in tribal populations. We also show clear affinities between the Bhil and Pawara tribes, and to a lesser extent, between the Warli and Kokana tribes. Our comparisons with available modern and ancient DNA datasets from South Asia indicate that the Brahmin caste has higher Ancient Iranian and Steppe pastoralist contributions than the Kunbi Marathas caste. Additionally, in contrast to the two castes, tribal groups have very high Ancient Ancestral South Indian (AASI) contributions. Indo-European tribal groups tend to have higher Steppe contributions than Dravidian tribal groups, providing further support for the hypothesis that Steppe pastoralists were the source of Indo-European languages in South Asia, as well as Europe.

The Indo-Iranians go west!

I’ve long been curious about the Indo-Iranians who “went west”. I’ve tried to run some qpAdmin with Iranians, and the results are erratic. I think the main issue is the reference populations are quite different from the “simple” situation in India. But, I think it is plausible to say that Sintashta ancestry is lower in much of Iran than among Afghan and Pakistani Iranians, and Indo-Aryans in Northwest South Asia and upper-caste groups in South Asia. The frequencies of Indo-Iranian (Sintashta) ancestry seem closer to North Indian peasant groups, at best.

This is quite perplexing.

Additionally, looking closely at the data in regards to the well-known split between “European” and “Asian” R1a1a:

– In Turkey and the Levant, there is a mix between the two. I think this is indicative of Balkan migration during the Ottoman period. A small number of Bedouin, for example, have “European” R1a1a, while the single Druze has the “Asian” lineage.

– In Iran and the Caucasus, it’s mostly the “Asian” variant, except for cases where it looks like there is Slavic admixture (then it’s “European”).

– In Iran, the frequency of R1a1a seems highest in Kerman in their samples. It is, of course, the “Asian” variant.

Haber et al. found “steppe” ancestry arrived in the Levant after 1800 BC. We know from Mitanni that Indo-Iranians were part of the mediation of this.

I’ve put the “Asian” mutations and their frequencies below the fold, but look in the supplements of The phylogenetic and geographic structure of Y-chromosome haplogroup R1a.


The memes reflected in our genes

One of the major findings from Narasimhan et al. is that when it comes to total ancestry, Brahmin groups are overrepresented among the groups which have more “steppe” ancestry than you’d expect (West Eurasian ancestry is a function of steppe + IVC). That being said, Narasimhan et al. could not find evidence that Brahmins are a monophyletic clade. What this means is that Brahmins do not descend from a common group of founders, but from a heterogeneous ancestral population.

How can we reconcile the consistently higher steppe ancestry with the fact that Brahmins seem to have diverse origins?

I think the answer has to do with the social ecology of India and the Brahmin role within that ecology.

In the period between 2,000 and 3,500 years ago, there was considerable genetic and cultural heterogeneity within India. This heterogeneity and population structure were “broken” and reconfigured through significant admixture. For example, where Brahmins in Uttar Pradesh have 25-30% steppe ancestry, Dalits in Uttar Pradesh are closer to 5-10%. In South India castes such as Reddys also have steppe ancestry, in the range of 5% or so. This is indicative of the spread and admixture of steppe-enriched people all across the subcontinent.

But the flip side of the spread of steppe ancestry is that steppe people themselves mixed with local groups. ~25% of the ancestry of Uttar Pradesh Brahmins is from indigenous “Ancient Ancestral South Indians.” This is above and beyond the AASI ancestry from the Indus Valley population (in contrast, the Jat Rors are ~10% AASI, and well above ~30% steppe). Brahmins in Bengal and Tamil Nadu are very distinct from non-Brahmin populations, and in their overall genome more like Uttar Pradesh Brahmins, but both populations clearly have ancestry from local groups (~25% of the ancestry).

The reasons why populations lose their distinctiveness are straightforward. Endogamy is not perfect. But I would hold that the cultural customs of endogamy are going to be more persistent and strict among ritual priestly castes. My hypothesis is that the original Indo-Aryan populations were invariant in terms of ancestry fraction (steppe, IVC, AASI). But the non-priestly castes would not enforce endogamy so strongly, because their status was accrued and obtained through other means than ritual purity. For the Kshatriyas, for example, status is obtained through power and domination. For Vaishyas, it is through primary and secondary production. Both these groups intermarried with local people who were militarily and economically of high status. In contrast, there were no equivalents for the Brahmins, who were spreading a particular ideological self-conception.

This is not a universal explanation. That is one reason I allude to Jat Rors. But, I think it gets at why Brahmins stand out as being steppe enriched.

AASI Y chromosomal lineage: haplogroup C


There was a conversation in the comments about which Y chromosomal lineages clearly descend from “Ancient Ancestral South Indians,” the people who have strong affinities to the eastern wave out of Africa. Though Y chromosomal lineage H is strongly localized to South Asia, it seems to have deep Pleistocene connections to West Asia, so that is not a clear candidate. Many “eastern” Y haplogroups have connections to East Asians, so it is not often clear which of the others might be AASI.

Reading a paper on Australian Aboriginal genetics clarified things. Many South Asian groups with no East Asian ancestry carry Y haplogroup C (e.g., Patels), which diversified 50,000 years ago between Australian/Papuans and Indians. This is clearly a reflection of deep-time connections across southern Eurasia and into Oceania.

A North Indian in Uzbekistan at 1550 B.C.

I was rereading the supplements for Narasimhan et al. for the purposes of trying to adduce the best model to calculate “steppe” proportions in Iranians (someone asked me to do this). In the process, I noticed this passage again:

Third, we find that one of the outliers, Bustan_BA_o2, is consistent with being admixed between an individual related to people on the Indus Periphery Cline and Middle to Late Bronze Age Steppe pastoralists, a type of admixture event we also observe in the Late Bronze-Iron Age Swat Valley that we will examine later, suggesting that the admixture events that led to the formation of the SPGT in Pakistan also occurred between outlier individuals at the BMAC and Steppe pastoralists who arrived at the end of the 2nd millennium.

Here is some detail on the site of the sample: UZ-BST-015, Site 4, Grave 4, 57-27 (I11520): Date of 1613-1509 calBCE (3280±20 BP, PSUAMS-4605). The earliest date possible on the Swat samples is 1200 BC (though 1100 BC is more likely). That means that this outlier individual is the earliest example of the genetic mix that would come to characterize much of northern India: a mix of steppe, Iranian-farmer-related, and Ancient Ancestral South Indian (AASI) ancestry.

The text of the supplements seems to imply that this individual is sui generis, a mix of Indus Periphery and steppe, which prefigures what was to come later in South Asia. But I will offer another hypothesis: this individual is a migrant, or the child of migrants, from the earliest phase of the ethnogenesis of the Indo-Aryan matrix of Northwest India.

Why physical appearance is an imperfect individual proxy for ancestry

Kalash children

Pictured above are some Kalash children. You notice in the foreground and center a child who could easily pass as European and draw no notice on the streets of Gdansk, Poland. But look at the child right behind her: I would guess she’d draw no notice on the streets of New Delhi!

Though the Kalash are noted for their fair features, most of them look more West Asian than anything else, and from what I can tell as many have a “northwest Indian” phenotype as a “European” one. Genetically we know that they are good proxies for “Ancestral North Indians” (ANI). About 30% of their ancestry can be modeled as deriving from steppe peoples such as the Sintashta, that is, Indo-Aryans. The other ~70% of their ancestry is similar to that of the Indus Valley Civilization (IVC) people, which itself can be decomposed as mostly ancient Southwest Eurasian-adjacent (i.e., derived after the Last Glacial Maximum from the ancestors of Zagros farmers) and a minority of ancestry that is more like that of Andaman Islanders and pre-Neolithic Southeast Asians (“Ancient Ancestral South Indians,” or AASI).

Another thing to note about the Kalash is that they are genetically very homogeneous. This is due to the fact that they live in an isolated region, and their non-Muslim religion means that they have not intermarried with nearby Muslim people. What does this imply? It means that the Indian-looking girl is exactly the same ancestrally as the European-looking girl. Both have the same proportion of AASI and Indo-Aryan ancestry. That being said, the Indian-looking girl exhibits features more like those of the AASI than the European-looking girl does. Why?

The simple reason is that the genes which vary and encode salient physical features are a much smaller subset than the total genome. Therefore, they are subject to much higher variance from individual to individual (lower N in the denominator).

Here’s a concrete example. Compare inferring eye color to inferring your total ancestry. Modern SNP-array ancestry inference relies on 100,000 to 1 million genomic positions. It is pretty good as a proxy for the 10 to 100 million SNPs out of your 3 billion base pairs that define your variable ancestry. For eye color, there are a few dozen genes at most, and more honestly a handful that really impact variation. For Europeans, 75% of the variation of blue vs. non-blue eye color is due to variation around one genetic region, the HERC2-OCA2 locus. This means that just because someone has blue eyes, one can’t be sure that they have much European ancestry at all!
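The "lower N in the denominator" point can be put in rough numbers. The sketch below treats each locus as an independent draw that comes from a given ancestry with probability equal to the true genome-wide fraction, then computes the standard error of the fraction observed across N loci. Independence is a simplifying assumption of mine (real loci are linked and have unequal effects), the 30% fraction is just the Kalash steppe figure used as an example, and the locus counts are illustrative.

```python
import math


def ancestry_standard_error(true_fraction: float, n_loci: int) -> float:
    """Standard error of an ancestry fraction observed over n_loci independent loci."""
    if not 0.0 <= true_fraction <= 1.0:
        raise ValueError("true_fraction must be between 0 and 1")
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    # Binomial standard error: sqrt(p * (1 - p) / N)
    return math.sqrt(true_fraction * (1.0 - true_fraction) / n_loci)


TRUE_FRACTION = 0.30  # example genome-wide ancestry fraction

# A handful of appearance-related loci versus a SNP-array-sized panel (illustrative counts).
for n_loci in (10, 100_000):
    se = ancestry_standard_error(TRUE_FRACTION, n_loci)
    print(f"{n_loci:>7} loci: standard error of about {se:.4f}")
```

Under these assumptions the handful of loci gives a spread of roughly plus or minus 14 percentage points around 30%, while the array-sized panel is tight to a fraction of a percentage point. That is why two people with identical genome-wide ancestry can differ noticeably in the few loci that drive appearance.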

In the 1000 Genomes South Asian populations the SNPs for “blue eyes” are at 2 to 10% frequency, depending on the population. Since the expression is recessive (you need both copies of the “blue eye” variant), assuming just this SNP you’d expect 0.04% to 1% manifestation of the characteristic in Indian-origin populations. The people with blue eyes have no more or less European ancestry than anyone else in their family.
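The arithmetic behind the recessive estimate is just squaring the allele frequency. Here is a minimal sketch, assuming Hardy-Weinberg proportions (random mating with respect to this locus) and a single fully recessive SNP. Both are simplifications: endogamy would push homozygosity somewhat higher, and other genes also affect eye color. The function name is my own.

```python
def recessive_phenotype_frequency(allele_frequency: float) -> float:
    """Expected share of people carrying two copies of a variant under Hardy-Weinberg."""
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    # Two independent draws of the same allele: q * q
    return allele_frequency ** 2


# The range of "blue eye" allele frequencies mentioned for South Asian samples.
for q in (0.02, 0.10):
    share = recessive_phenotype_frequency(q)
    print(f"allele frequency {q:.0%} -> expected homozygotes {share:.2%}")
```

Because the relationship is quadratic, a fivefold difference in allele frequency between populations turns into a twenty-five-fold difference in how often the trait actually shows up.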

Where does this leave us? You should understand from this that within a given family or ethnic group there is going to be a range of appearances, and a range is normal within many groups without exotic ancestry. Most Bengalis have 5-20% East Asian ancestry (closer to 5 in West Bengal, closer to 20 in Comilla and Chittagong). This means most of their ancestry is South Asian, and most Bengalis look just like other Indian-origin people. But a substantial minority look somewhat East Asian, to varying degrees. This is exactly what you expect when you have a minority quantum of ancestry.

Finally, many of the commenters here made a lot of assumptions about vloggers talking about their ancestry and were quite rude. I wish you wouldn’t do that. As a matter of fact, many of the inferences may actually be correct, but you don’t know for sure, and you don’t know the whole story. I’m pretty liberal on the comments of this weblog, but if you exhibit a serial pattern of rudeness I’m going to start randomly deleting your comments (if you complain about this I will immediately ban your IP).

Brown Pundits