Are Haryana Jats the closest living descendants of our Vedic forefathers?

Recently, there was a paper on some communities of Northwestern India such as Rors, Jats, Kambojs, Gujjars & Khatris. The primary focus of the paper was the community of cattle herders from Haryana known as Rors.

This is part 1 of my review of the paper. In part 2 I shall focus on whether the evidence furnished in the paper proves a steppe migration into South Asia.

Let me first quote the abstract in full:

The Indus Valley has been the backdrop for several historic and prehistoric population movements between South Asia and West Eurasia. However, the genetic structure of present-day populations from Northwest India is poorly characterized. Here we report new genomewide genotype data for 45 modern individuals from four Northwest Indian populations, including the Ror, whose long-term occupation of the region can be traced back to the early Vedic scriptures. Our results suggest that although the genetic architecture of most Northwest Indian populations fits well on the broader North-South Indian genetic cline, culturally distinct groups such as the Ror stand out by being genetically more akin to populations living west of India; such populations include prehistorical and early historical ancient individuals from the Swat Valley near the Indus Valley. We argue that this affinity is more likely a result of genetic continuity since the Bronze Age migrations from the Steppe Belt than a result of recent admixture. The observed patterns of genetic relationships both with modern and ancient West Eurasians suggest that the Ror can be used as a proxy for a population descended from the Ancestral North Indian (ANI) population. Collectively, our results show that the Indus Valley populations are characterized by considerable genetic heterogeneity that has persisted over thousands of years.

Pay attention to the bolded part. As per the pre-print by Narasimhan et al., the ANI is the likely population that spread Steppe ancestry, and hence Indo-Aryan ancestry, among South Asians by mixing with the ASI group. Now this paper on the Rors says that the Rors (and, by corollary, the Jats) are the population closest to this hypothetical ANI population. Please note: it is not Brahmins but a herder group from Haryana, which is the very heartland of Vedic India. This is very significant because it clearly establishes the veracity of our Vedic tradition.

Let us look at this in more detail.

The ancestors of Rors and Jats from Haryana spread the Vedic civilization

As many of you here might be aware, the Vedic homeland was situated on the banks of the river Saraswati in a region which encompassed today’s Haryana and Western UP from where it eventually spread further into Northern India, principally in the Gangetic plains and beyond.

In terms of genetics, therefore, one may argue that if there is a genetic signature of the Vedic people, it should be found most strongly in the original Vedic homeland and gradually reduce as one moves away from this homeland. Of course, the caveat is that this holds only if the modern people residing in the Vedic homeland did not completely replace the original inhabitants of Haryana who spread the Vedic culture.

Ancient DNA research has now shown that, in terms of autosomal ancestry, there is a link between the modern presence of Indo-European speakers across Eurasia and the ‘steppe’ ancestry component.

In South Asia, it is argued that the ‘steppe’ component is highest among the Brahmins and decreases as one moves down the caste hierarchy, and this is said to be one of the principal pieces of evidence that a movement of steppe people into South Asia spread the Indo-European language and culture. In fact, the recent Narasimhan et al. paper even went so far as to suggest,

Although the enrichment for Steppe ancestry is not found in the southern Indian groups, the Steppe enrichment in the northern groups is striking as Brahmins and Bhumihars are among the traditional custodians of texts written in early Sanskrit. A possible explanation is that the influx of Steppe_MLBA ancestry into South Asia in the mid-2nd millennium BCE created a meta-population of groups with different proportions of Steppe ancestry, with ones having relatively more Steppe ancestry having a central role in spreading early Vedic culture.

However, it has been known for many years that the population with the highest ‘steppe’ ancestry in South Asia is not the Brahmins but the Jats, more specifically the Haryanvi Jats. This was also noted by Razib in one of his earlier blog posts.

The present study focuses on this elevated steppe-related component in Jats and, more specifically, in a related group from Haryana known as the Rors. It is titled “The Genetic Ancestry of Modern Indus Valley Populations from Northwest India”. This study has the advantage that it incorporates the aDNA data from Narasimhan et al. and other recent papers.

The following is the admixture graph from the study,

As can be seen in the selected enlarged portion of the graph, the ‘steppe’ like light blue component, which is highest in some of the Northern European groups closest to the steppe, like the Latvians, Lithuanians, Russians etc., is far higher in Rors than it is in the Brahmins or any other South Asian group.

As per the authors themselves,

Outgroup f3 analysis in the form of (PNWI, X; Yoruba) showed that the Ror (and Jat) have distinct, high genetic similarity to modern Europeans (Figures 1C, 1D, and S5), far higher than the similarity observed in other NWI populations, such as the Gujjar (Figures 1D and S5). Among an extended set of South Asians, this pattern was repeated only in the Pathan population from Pakistan (Figure S5).

And,

Refined IBD analysis highlights the general trend whereby the sharing of IBD segments declines as one moves along the cline from PNWI and NI_IE toward Dravidian and Indian Austroasiatic (IN_AA) groups (Figure 2A). Strikingly, among all PNWI groups studied, the Ror demonstrate the highest number of IBD segments shared with Europeans and Central Asians, whereas the Gujjar share a higher number of IBD segments with local Indian Indo-Europeans and Dravidians than do other PNWI groups (Figure 2A).

In CHROMOPAINTER analysis, as expected, the Ror (and Jat) exhibited a significantly higher number of chunks received from Europeans than do other NWI populations studied (t test, p value < 0.01).

They also state further,

A higher level of European ancestry in the Ror and Jat compared to other South Asians (Figures 1, 2, S2, S5, and S13 and Tables S5–S8) makes these two populations outliers within the broader Northwest South Asian landscape. This could be indicative of either a possible recent gene flow from a population related to Europe or to ancient West-Eurasian-related influx, which would agree with previous studies on adaptation, wherein the Ror and Jat have stood out for their high frequency of the lactase persistence allele (LCT-13910T) and the light-skin-color gene variant (SLC24A5).

The Rors and Jats also have higher frequencies of the lactase persistence allele and the light-skin-color gene variant, which strengthens the case that they share more recent ancestry with Northern Europeans or steppe groups than other South Asians do.

Also,

We also report that, relative to other South Asians, the Ror group has high shared drift with the EHG and Steppe_EMBA groups, higher allele sharing with the Steppe_MLBA group, and higher affinity with the Iron Age (prehistorical) and early historical first South Asian ancient sources (Figures S6A, S6B, S7, S8A, S8D, and S9 and Tables S9 and S16).

Finally the authors argue that the Rors are the best proxy for the ANI ancestry in South Asians,

In summary, we demonstrate a higher proportion of genomic sharing between PNWI populations and ancient EHG and Steppe-related populations than we observe in other South Asians. We report that the Ror are the modern population that is closest to the first prehistorical and early historical South Asian ancient samples near the Indus Valley, and they also harbor the highest Steppe-related, EHG, and Neolithic Anatolian ancestry. However, compared to other adjoining groups, the Ror show less affinity with the Neolithic Iranians. The Ror population can plausibly be used as an alternative proxy for ANI in future demographic modeling of South Asian populations.

The bar graph below explains it very well, where it can be seen that the proportion of the steppe orange component is higher among Rors and Jats than either the Pathans, the Brahmins or any other South Asian group.

The admixture proportions as per qpAdm are given in Supplementary Table 11, and it is instructive to observe that the Steppe_EMBA proportion for Rors is estimated at 57 % of total ancestry, while for Jats it is 61 %. The same proportions for Brahmins from UP, Gujarat & Bengal are 46 %, 45 % & 44 % respectively. Even for Pashtuns from Afghanistan it is 52 %, and for Kalash it is 58 %. Only the Yaghnobis and Pamiris from Central Asia are estimated to have a higher proportion of Steppe_EMBA, at 62 % & 67 % respectively.
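To make the comparison easier to scan, the Steppe_EMBA figures quoted above can be sorted in a few lines of Python. This is only a sketch: the numbers are the ones cited in this post from Supplementary Table 11, and the dictionary labels are informal names chosen here, not the paper's population identifiers.

```python
# Steppe_EMBA proportions (%) as quoted in this post from Supplementary Table 11.
# The keys are informal labels made up for this sketch, not the paper's population IDs.
steppe_emba = {
    "Ror": 57,
    "Jat": 61,
    "Brahmin_UP": 46,
    "Brahmin_Gujarat": 45,
    "Brahmin_Bengal": 44,
    "Pashtun_Afghanistan": 52,
    "Kalash": 58,
    "Yaghnobi": 62,
    "Pamiri": 67,
}

# Rank the groups from highest to lowest estimated proportion.
ranked = sorted(steppe_emba.items(), key=lambda kv: kv[1], reverse=True)

# Groups whose estimate exceeds that of the Jats.
above_jat = [name for name, pct in ranked if pct > steppe_emba["Jat"]]

for name, pct in ranked:
    print(f"{name:20s} {pct:3d} %")
print("Higher than Jat:", above_jat)
```

Sorted this way, only the two Central Asian groups sit above the Jats, which is the point made in the paragraph above.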

Before moving forward, it is necessary to point out that the light blue component observed in the admixture graph, which is highest among the Northern Europeans, is not the same as the Steppe_EMBA or Steppe_MLBA ancestry. Steppe_EMBA & Steppe_MLBA are an amalgamation of the light blue, the dark blue (Anatolian-Farmer related) and the light green (Iran_N/CHG) components you see in the admixture graphs. So while the light blue component, which peaks in Northern Europe, is significantly lower among South Asians, the light green component, which correlates well with Iran Neolithic type ancestry, peaks in South Asia but is present at a much lower proportion among the Northern Europeans.

In fact, the authors even stress that,

The Ror and Jat peoples stand out for having the highest proportion of Steppe_MLBA ancestry (~63%). The proportion of Steppe ancestry in the Ror is similar to that observed in present day Northern Europeans.

Therefore, the predominance of the light blue component in Northern Europeans is not alone an indication that their ‘steppe’ ancestry is far higher than among South Asians.

Now, if steppe-related ancestry correlates with presence and spread of Indo-European languages, the above data clearly implies that the highest steppe-related and therefore IE ancestry among South Asians is among the Jats & Rors, significantly higher than in other NW groups as well as Brahmins and Kshatriyas. Jats and Rors sampled for the study, live in Haryana & Western UP, which is the Vedic homeland.

It therefore supports the ancient Indian tradition according to which the region of Haryana & Western UP was the homeland of the Vedic people, from where they spread out across Northern India. It can therefore be argued perfectly well that the Brahmins and Kshatriyas in other regions have a higher proportion of ‘steppe’ ancestry than the lower classes around them precisely because a greater percentage of their ancestry derives from the ‘steppe’-rich people of the Vedic homeland. It has long been argued that ‘steppe’ ancestry is higher among the Brahmins and Kshatriyas than among the lower castes across all regions of India, and that this is evidence of IE culture spreading in South Asia with the ‘steppe’ ancestry. But the example of the Jats and Rors in Haryana casts doubt on all such claims. Instead, we can argue that the higher ‘steppe’-related ancestry in Upper Castes across India is a function of them having a greater portion of their ancestry from their Vedic forefathers who lived in Haryana & Western UP, just as is suggested by the Vedic tradition.

I may finally add that there is a closely related group, based on close Fst distances and similar admixture proportions, that likely descends from the core group that was responsible for the spread of this ancestry into the Caucasus and the steppe. This group consists of Rors, Jats, Kalash, Pashtun, Pathan, Tajik & Pamiri. They have broadly similar levels of Iran_N (15 to 30 %), Steppe_EMBA (49 to 67 %) & Onge (15 to 25 %) as per the qpAdm modelling in Table S11. Fst distances also indicate that they are quite closely related. For example, the Fst distance between Rors and Pamiris (0.0069), Pashtuns (0.0057) & Tajiks (0.0058) is similar to the Fst distances of Rors with neighbouring groups like Kamboj (0.0088), Gujjar (0.0064), Khatri (0.0056), Brahmins (0.0052) & Kshatriyas (0.0062). Considering that Rors (& perhaps Jats) have probably not admixed with Pamiris, Tajiks or Pashtuns for millennia, their Fst distances would have been even smaller initially. The other modern Indus Valley populations are also not very far from each other in terms of Fst distances, but the above groups seem to form a subset among them.

https://i.imgur.com/TrNPI1r.jpg
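As a rough check on the claim that the two sets of Fst distances are similar, one can simply average the values quoted above. This is a sketch under obvious assumptions: it uses only the figures cited in this post, the grouping into "western" and "neighbouring" sets is mine, and a plain mean is not a substitute for a proper statistical test.

```python
from statistics import mean

# Fst between the Ror and other groups, as quoted in this post.
# The split into two sets is an informal grouping made for this sketch.
fst_west = {"Pamiri": 0.0069, "Pashtun": 0.0057, "Tajik": 0.0058}
fst_neighbours = {
    "Kamboj": 0.0088,
    "Gujjar": 0.0064,
    "Khatri": 0.0056,
    "Brahmin": 0.0052,
    "Kshatriya": 0.0062,
}

mean_west = mean(list(fst_west.values()))
mean_neighbours = mean(list(fst_neighbours.values()))

print(f"Mean Fst, Ror vs Pamiri/Pashtun/Tajik: {mean_west:.5f}")
print(f"Mean Fst, Ror vs neighbouring groups:  {mean_neighbours:.5f}")
```

On these quoted figures the two averages differ only in the fourth decimal place, which is all the argument above relies on.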

It is conceivable that an ancestral group related to these populations, with ancestry proportions similar to theirs (but perhaps with lower levels of AASI, since BMAC has only 5 % in comparison to Pamiris, who have 15 %), spread out from North India to Central Asia, with those from Central Asia venturing further towards the Caucasus and from there onto the steppe.

South Asian Y chromosomes in Southeast Asia

Citation: Contrasting paternal and maternal genetic histories of Thai and Lao populations

Look at the R and H frequencies. Peach and violet respectively. Compare to South Asian haplogroup frequencies.

Pathans between Hind and Iran

There was a comment below on the positions of Pathans genetically in relation to South Asians and Iranians. The “Pathan” samples are from Pakistan, while “Pashtun” are from Afghanistan. What you can see is that the “Pathan” samples are more like Punjabis, while Pashtuns are like Tajiks. The Iranian samples are from western Iran. You can see that the Pakistani Pathans are definitely a little closer to UP Brahmins than to Iranians. The Afghans are a bit closer to Iranians than UP Brahmins.

Here’s Treemix:

The Jatts do descend from Scythians

A new paper, The Genetic Ancestry of Modern Indus Valley Populations from Northwest India.

Is it time for Asian Americans and Latino Americans to ask to be considered “white”? (a)

This is the next article in the series “Is it time for Asian Americans and Latino Americans to ask to be considered ‘white’?” Please also read Razib’s “Hasan Minhaj’s Patriot Act on Affirmative Action.”

This panel brought up the issue of affirmative action benefiting caucasians at the expense of people of Asian heritage. According to a 2004 analysis of 1990s data, Asians on average needed 140 points more on the SAT (out of 1600) than caucasians, all else being equal, to have the same probability of admission to elite universities.

Do any readers support race-based affirmative action that benefits caucasians at the expense of people of Asian ancestry? If so, can you please share why? I have rarely met Asians who give a strong intellectual case for race-based affirmative action that benefits caucasians at the expense of people of Asian ancestry, other than the following arguments:

We don’t want to be personally called fascist, nazi, a supporter of the patriarchy, racist, bigoted, prejudiced, imperialist, colonialist, oppressor, hegemonic, exploiter, white supremacist (not joking, Asians are frequently called white supremacist . . . I don’t understand why) etc.
We don’t want Asians as a group being called fascist, nazi, supporter of the patriarchy, racist, bigoted, prejudiced, imperialist, colonialist, oppressor, hegemonic, exploiter, white supremacist etc.
We want to reduce the “evil eye” or jealousy towards Asians
We are guilty because of Asian privilege and Asian oppression of blacks and poor people (never met Asians over 22 who say this, but many rich K-12 Asian children believe this now)
This is our punishment because Asians are very fascist, nazi, supportive of the patriarchy, racist, bigoted, prejudiced, imperialist, colonialist, oppressive, hegemonic, exploitative, white supremacist etc. (never met Asians over 22 who say this, but many rich K-12 Asian children believe this now)
Xenophobic caucasians might attack us if we don’t support affirmative action.
Blacks might attack us if we don’t support affirmative action.

In the above discussion Asian Americans seemed afraid to share their actual views. Why are Asian Americans so scared?

To repeat, please share any other reasons you might have for supporting race-based affirmative action that discriminates against Asians.

Podcast on South Asian genetics this week

As some of you know I co-host a podcast on genetics and history with Spencer Wells. The very first podcast we recorded in late June of 2017 was about India, but we were still getting the hang of it to be honest, and we didn’t cover much territory.

A lot has happened between then and now, and so it’s time for an “update,” which is going to cover many more topics. That being said, we haven’t recorded yet and so I’m open to “questions from the audience” that we might integrate. So please use this post to leave comments about specific topics…. (please note we have only ~1 hour or so so might not get to everything)

Update: Podcast recorded.

The Munda as upland rice cultivators

I’m reading Ben Kiernan’s Viet Nam: A History from Earliest Times to the Present. I picked it up mostly because over half the book does not consist of the history of the Vietnam War (a major failing I’ve noticed with books which are histories of Vietnam, as opposed to histories of Vietnamese-American relations).

The section on Austro-Asiatic languages (Vietnamese is one) has something of relevance to the “Munda question”. But before that, let me review a few things.

Until very recently many historians and prehistorians of India have suggested that the Munda people, who speak very distinctive dialects related to the Austro-Asiatic languages of Southeast Asia, are the primal people. That is, they are the aboriginals. The original adivasis.

I do not believe that this case is tenable. Because I am a geneticist, I make this judgment on genetic grounds. Chaubey et al., Population Genetic Structure in Indian Austroasiatic Speakers: The Role of Landscape Barriers and Sex-Specific Admixture, reveals what we know about the genome-wide patterns in the Munda.

1) They are highly enriched for East Asian ancestry compared to other South Asians.

2) Many Munda males carry a haplogroup, O-K18 (once O2a), that is very common in Southeast Asia, especially Austro-Asiatic groups. Additionally, it is more diverse in Southeast Asia. The Munda O-K18 branch seems to be a side shoot from the broader Southeast Asian tree.

3) The Munda mtDNA, defining the maternal line, is uniformly South Asian. This is in contrast to the situation with Bengalis, who have East Asia Y and mtDNA. This indicates that the Munda migration was heavily male-mediated.

4) The Munda carry mutations in genes that are associated with recent selective sweeps in East Asians (e.g., on the EDAR locus). Though this may be a parallelism, it’s unlikely. Rather, it is through shared common descent that this occurs.

The Genomic Formation of South and Central Asia has a graph which shows population relationships and gene flow that illustrates important aspects of the Munda ethnogenesis (Juang below):

AASI in this model = Ancient Ancestral South Indians. These are very distantly related to Andaman Islanders, Australo-Melanesian Southeast Asians, and more distantly to eastern Eurasians generally. They are likely aboriginal people to South Asia, with no West Eurasian ancestry.

The model above indicates that an East Asian (Austro-Asiatic) population encountered an AASI population and produced a daughter population. Then, that daughter population mixed with an ASI population, ASI being an old and stable mix of West Eurasian Iranian farmer (~25%) and AASI (75%).

This means two things for the Munda. First, they are very AASI enriched. This is obvious in any analysis. And, their West Eurasian ancestry is almost all Iranian farmer and not steppe. This is totally not surprising either. Using more naive model-based clustering Munda samples always seem to lack the components which are most easily adduced to be Indo-Aryan. They have very low frequencies of Y haplogroup R1a1a-Z93.
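The two-step model can be sketched as simple mixture arithmetic. This is an illustration only: the ASI composition (~25% farmer, ~75% AASI) comes from the text above, but the two 50/50 mixing fractions are hypothetical placeholders, since the graph's actual proportions are not quoted here. The helper `mix` is also just a name made up for this sketch.

```python
def mix(pop_a, pop_b, frac_a):
    """Ancestry of a population that is frac_a pop_a and (1 - frac_a) pop_b."""
    components = set(pop_a) | set(pop_b)
    return {
        c: frac_a * pop_a.get(c, 0.0) + (1 - frac_a) * pop_b.get(c, 0.0)
        for c in components
    }

# Unmixed source populations.
AASI = {"AASI": 1.0}
EAST_ASIAN = {"East Asian": 1.0}
FARMER = {"Iranian farmer": 1.0}

# ASI as described above: roughly 25% Iranian farmer, 75% AASI.
ASI = mix(FARMER, AASI, 0.25)

# HYPOTHETICAL fractions, chosen only to show the arithmetic.
daughter = mix(EAST_ASIAN, AASI, 0.5)   # East Asian males meet unmixed AASI
munda_like = mix(daughter, ASI, 0.5)    # the daughter population then mixes with ASI

for component, share in sorted(munda_like.items()):
    print(f"{component:15s} {share:.3f}")
```

Whatever fractions are plugged in, the outcome has the same shape: AASI is enriched from two directions, the only West Eurasian ancestry is the farmer share inherited through ASI, and no steppe component appears because none of the sources carries one.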

Let’s take a step back now. The fact that the Austro-Asiatic males arrived when there were unmixed AASI indicates that this was somewhat early. There are no unmixed AASI on the Indian subcontinent today. When we reach the Iron Age, by 500 BCE it is clear that Indo-Aryan society had pushed at least to Bihar. This component would bring steppe ancestry, as well as mixing into any remnant AASI.

So when could the Austro-Asiatics have arrived at the earliest? Two papers with extensive ancient DNA, Ancient genomes document multiple waves of migration in Southeast Asian prehistory and The prehistoric peopling of Southeast Asia, give us a good sense. It seems that the expansion of Austro-Asiatic farmers dates to about 4,000 years ago. That is when the transition seems to occur in northern Vietnam.

One thing that is also evident: the East Asian gene flow into the Munda seems to come from northern Austro-Asiatic groups in Thailand, not the southern branch which resulted in the people of the Nicobar Islands and was eventually submerged by Austronesians. On a final note, a site in northern Burma yielded an individual who was clearly Tibeto-Burman, and not Austro-Asiatic, 3,000 years ago. So even at that date mainland Southeast Asia was heterogeneous.

But, considering that there is no evidence of Tibeto-Burman ancestry in the Munda, whose Austro-Asiatic ancestry seems to have come through Burma via a mainland route (as opposed to up from maritime Southeast Asia), I think one should push the date of their arrival before 1000 BCE. With the expansion of farming in mainland Southeast Asia around 4,000 years ago, that puts the arrival of a distinctive Munda culture in South Asia between 2000 BCE and 1000 BCE. It is entirely reasonable that during this period there were unmixed AASI in eastern South Asia, though the admixture graph may also be picking up assimilation of Austro-Melanesian ancestry in southern China/Southeast Asia.

This is where Viet Nam: A History from Earliest Times to the Present comes in: the author suggests that the early Austro-Asiatic farmers were dry-land rice farmers who occupied uplands. The reason is that the reconstructed common Austro-Asiatic words for rice culture are indicative of dry-land practices, with later wet-rice terminology often being borrowed from Tai and Austronesian languages.

I don’t know enough Indian archaeology and agricultural history to comment further, but, a visual inspection of where Munda are concentrated does suggest upland farming….

American Caste (a)

Caucasian Intelligentsia

Some at Brown Pundits have expressed dismay at my using the phrase “caucasian intelligentsia”, maybe because they see this as a criticism of white people. To be clear, I am not criticizing people of European ancestry or European-influenced culture, but rather a very subtle, pernicious and dangerous colonization of the mind. Some caucasians and non caucasians are at the epicenter of this imperial oppressive hegemonic system. And many good caucasians and non caucasians are fighting against the caucasian intelligentsia. Malcolm X and Ali ably describe this caucasian intelligentsia. They call this phenomenon the “white liberal.” Since many caucasian liberals fight against the caucasian intelligentsia, I am uncomfortable with the term “white liberal.” I also do not completely agree with every aspect of what Malcolm X and Ali say. With that caveat, please listen to the whole thing. The clip is only six minutes long. Some great quotes:

“The white liberals [caucasian intelligentsia] from both parties cross party lines to work together toward the same goal”
“The white liberal differs from the white conservative in only one way . . . the liberal [caucasian intelligentsia] is more deceitful, more hypocritical than the conservative”
“Both want power but the white liberal [caucasian intelligentsia] is the one who has perfected the art of posing as the negro’s friend and benefactor and by winning the friendship and support of the negro the white liberal [caucasian intelligentsia] is able to use the negro as a pawn or weapon”
Ali says that we should love our race and culture

Most but not all of the caucasian intelligentsia comes from post modernism. A sizable minority of the caucasian intelligentsia comes from other caucasian pathologies that can be elaborated on in another post. To massively oversimplify, post modernism seeks to negate all metanarratives and universalist norms to observe the world as it truly is. In practice, however, modern post modernists see the world through a very narrow, incomplete, misleading and biased western ethnocentric filter. In practice they dispute the core of European Enlightenment liberalism and Eastern philosophy, three features of which are:

All humans are created equal and endowed with inalienable rights, including the right of life, liberty and pursuit of happiness (eastern philosophy conforms)
All humans have the right to freedom of art, speech and thought (eastern philosophy conforms but adds freedom of intuition and feeling)
All human beings are potentially powerful, potentially wise and sovereign (eastern philosophy conforms but adds “divine”)

Post modernists disagree with all of these, seeing them as tools of oppression, hegemony, exploitation, colonialism, imperialism, sectarianism, bigotry, prejudice, racism. Post modernists see free expression, liberty or the concept of everyone being potentially wise, powerful and sovereign; as potentially violent and potentially oppressive.

Post modernists are trying to damage the self confidence (Atma Vishwaasa in Sanskrit) of black Americans of African ancestry similar to what they did to former European colonies in Africa and Asia during the 1800s and 1900s. To quote from a previous Brown Pundit post:

Ancient Indian Genetics At ASHG

At ASHG next Monday Niraj Rai will be presenting this poster, Reconstructing the peopling of old world south Asia: From modern to ancient genomes.

South Asia was one of the first geographic regions to be peopled by modern humans after their African exodus. Today, the diverse ethnic groups of South Asia comprise an array of tribes, castes, and religious groups, who are largely endogamous and have hence developed complex, multi-layered genetic differentiation. From such a complex structure, several questions have stood out from the research of our group and others that are only beginning to be resolved using modern sequencing techniques and targeted sampling of populations and archaeological specimens. Here, for the first time we have used ancient genomics approach to understand the deep population ancestry of Indian Subcontinent. Despite the rich sources available of modern Indian populations, success from ancient DNA specimens in the subcontinent have been limited. We have successfully analysed several museum samples and fresh excavation from the different part of India which provides us a wonderful opportunity to be able to relate these modern populations genetically with those in the past and build complex models of population mixture and migration in India. Using ancient genomics data from the human remains who have lived about 4-5 thousand years before present in North West and South of India, we are trying to understand the population history of Iron age people and their genetic relation with the North West of Indians and Iranian Farmers. Furthermore, we are providing a solid Genetic evidence that substantiates archaeological and linguistic evidence for the origins of Dravidian languages and the language of the Indus valley people.

I’ll probably be trying to make sure I catch Rai at the poster. I’m most interested in the South Indian samples. If they date to more than 4,000 years before the present, it will be quite interesting.

Below the fold is my response to a comment on The Roots of Indo-Iranian cultural genesis. My response is in bold. JR’s responses to my original comment are in italics.

Takeaways from the golden age of Indian population genetics

There are lots of strange takes on the India Today piece, 4500-year-old DNA from Rakhigarhi reveals evidence that will unsettle Hindutva nationalists. I’m friendly with the author and saw an early draft. So I’m going to address a few things.

The genetic results are becoming more and more clear. A scaffold is building and becoming very firm. In the 2020s there will be a lot of medical genomics in India. But before that, there will be population genetics. Ancient DNA will be the cherry on the cake.

Here’s what genetics tells us. First, a component of South Asian ancestry, especially in North India, and especially in North Indian upper caste groups, seems to be the same as ancient agro-pastoralists who ranged between modern Ukraine and modern Tajikistan. Genetically, these people are very similar to certain peoples of Central and Eastern Europe of this time, though there is a varied dynamic of uptake of local Central Eurasian elements as they ranged eastward.

This ancestral component is often called “steppe.” This ancestral component is a synthesis of ancient European hunter-gatherer, Siberian, and West Asian. The steppe component seems to arrive in Central and South Asia after 2000 BC.

Second, another component of South Asian ancestry is very distinctive to the region. It is deeply but distantly related to branches of humanity which dominate Melanesia and eastern Eurasia, up into Siberia. The magnitude of the distance probably dates to ~50 thousand years ago, when the dominant element of modern humans expanded outward from West Asia, east, north, and west. These people are called “Ancient Ancestral South Indians,” or AASI. Their closest relatives today may be the natives of the Andaman Islands, but this is a very distant relationship.

AASI is the dominant component of what was once called “Ancestral South Indians,” or ASI. It turns out that “ASI” themselves were a compound synthetic population. This was long suspected by many (e.g., David W.). What was ASI a compound of? About ~75 percent of its ancestry was AASI, but the balance seems to have been a West Eurasian component related to farmers from western Iran. We can call this group “farmers.”

With a few samples from outside of the IVC region, and one (or two) samples from within the IVC region, geneticists are converging upon the likelihood that the profile in the greater IVC region before 2000 BC was a compound of these farmers with the AASI. But even within the IVC region, there seems to have been a range of variation in ancestry. The IVC was a huge zone. It may not have been dominated by a single ethnolinguistic group (even today there is the Burusho linguistic isolate in northern Pakistan). Note that the much smaller Mesopotamian civilization was multiethnic, with a non-Semitic south and a Semitic north (Sumer and Akkad).

The key point is that it is very likely the IVC lacked the steppe ancestral component. That it did have AASI component. And, it did have a farmer component with likely ultimate provenance in western Iran. Additionally, there were smaller components derived from pre-steppe Central Eurasian people.

While the steppe people arrived in the last 4,000 years, and at least some of the ancestors of the AASI are likely to have been in South Asia for 40,000 years, the presence of the AASI-farmer synthesis genetically is conditional on when a massive presence of western farmers came to affect the northwestern quarter of South Asia. It seems unlikely to have been before Mehrgarh was settled 8,500 years ago. The genetic inferences to estimate the time of admixture between AASI and farmer are currently imprecise, but it seems likely to have begun at least a few thousand years before 2000 BC. A range of between 8,500 and 6,000 years ago seems reasonable.

So 4,000 years ago the expanse of the IVC was dominated by a variable mix of farmer and AASI. One can call this “Indus Valley Indian” (IVI).

Just like ASI, there was an earlier abstract construct, “Ancestral North Indian” (ANI). Today it seems that that too was a compound. To be concise, ANI is a synthesis of steppe with IVI. The Kalash of northern Pakistan are very close genetically to ANI. This means that ASI had West Eurasian ancestry, albeit to a minor extent, and ANI had AASI ancestry, albeit to a minor extent. The main qualitative difference is that ANI had a substantial minority of steppe ancestry.
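
The nesting of these constructs can be written out as simple mixture arithmetic. What follows is a minimal sketch, not a method taken from any of the papers. The 75/25 split for ASI comes from the figure quoted above; the steppe share of ANI, the AASI share of IVI, and the half-and-half individual are hypothetical placeholders chosen only to make the arithmetic visible, not estimates.

```python
# Sketch of the nested mixtures described above. Only the ASI split (about
# 75% AASI, balance farmer) comes from the text; every other number below
# is a placeholder, not an estimate.

def mix(a, b, share_a):
    """Blend two ancestry profiles, taking share_a from a and the rest from b."""
    keys = set(a) | set(b)
    return {k: share_a * a.get(k, 0.0) + (1 - share_a) * b.get(k, 0.0) for k in keys}

AASI = {"AASI": 1.0}
FARMER = {"farmer": 1.0}
STEPPE = {"steppe": 1.0}

# ASI: about 75 percent AASI, the balance farmer (from the text).
ASI = mix(AASI, FARMER, 0.75)

def ani_profile(steppe_share, ivi_aasi_share):
    """ANI = steppe + IVI, where IVI = AASI + farmer. Both shares are hypothetical inputs."""
    ivi = mix(AASI, FARMER, ivi_aasi_share)
    return mix(STEPPE, ivi, steppe_share)

# Placeholder values purely for illustration.
ANI = ani_profile(steppe_share=0.3, ivi_aasi_share=0.2)

# A hypothetical individual drawing half of their ancestry from each construct.
person = mix(ANI, ASI, 0.5)
print({k: round(v, 3) for k, v in sorted(person.items())})
```

Whatever shares are plugged in, the result always reduces to the same three components (farmer, steppe, AASI), and they sum to one. That is the sense in which both ANI and ASI are compounds rather than primary populations.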

To a great extent, the algebra of genetic composition across South Asia can be thought of as modulating these three components, farmer, steppe, and AASI.* Consider:

Bhumihar people in Bihar tend to have more steppe than typical, but not more farmer than typical, and average amounts of AASI.
Sindhi people in Pakistan tend to have lots of farmer, some steppe, and not much AASI.
Reddy people in South India have lots of farmer, very little steppe, and average amounts of AASI.
Kallar people in South India have some farmer, very little steppe, and lots of AASI.

For details of where I’m getting this, you can look at The Genomic Formation of South and Central Asia for quantities. But as a stylized fact farmer ancestry tends to peak around Sindh. In Pakistan steppe ancestry increases as you go north. As you go east and south AASI increases pretty steadily, but there are groups further east, such as Jatts and Brahmins, who have a lot of steppe, almost as much as northern Pakistani groups. And curiously you get a pattern where some groups have more steppe and AASI, and less farmer, than is the case to the west (you see this in the Swat valley transect, as steppe & AASI increase in concert).

Going back to the history, by the time the steppe people arrived in South Asia, in the period between 2000 BC and 1000 BC, it may be that the IVI ancestry is what they mixed with predominantly. Though it is likely that the southern and eastern peripheries had “pure” AASI, by the time steppe people spread their culture to these fringes they were already thoroughly mixed with IVI populations, and so already had some AASI ancestry.

In contrast, the farmer populations likely mixed extensively with AASI in situations where the two populations were initially quite distinct.

Please note I have not used the words “Aryan” or “Dravidian.” The reason is that these are modern ethnolinguistic terms. Genetics is arriving at certain truths about population changes and connections, but we don’t have a time machine to go back to the past and determine what language people were speaking 4,000 years ago.

Our inferences rest on supposition, and a shaky synthesis of historical linguistics and archaeology and genetic demography, a synthesis which is unlikely to ever be brought together in one person due to the vast chasm of disciplinary method and means.

It is highly likely that the steppe component is associated with Indo-European speaking peoples. Probably Indo-Aryan speaking peoples. The reason is that by historical time, the period after 1000 BC, Iran and Turan seem to already have been dominated by Indo-Iranian peoples. But, in the period around 2000 BC, western Iran was not Indo-Iranian. People like the Guti and the Elamites were not Indo-European, and they were not Semitic. We have some genetic transects which show that steppe ancestry did arrive in parts of Turan and Iran in the period after 2000 BC.

Where did the Dravidian languages come from? We don’t know. They could have been spoken by an AASI group. Or, they could be associated with farmers from the west. We don’t know. Ultimately, we may never know. Unlike Indo-European languages, there are no Dravidian languages outside of South Asia.

Various toponymic evidence indicates that Dravidian languages were spoken at least as far north and west as Gujarat. And Brahui exists today in Balochistan. Though I don’t have strong opinions, I think Dravidian languages probably are descended from a group of extinct languages that were present in Neolithic Iran.

Unlike the Indo-Aryan languages, though, Dravidian exploded onto the scene after a long period of incubation within South Asia, as part of at least one of the language groups dominant within the IVC and pre-IVC societies.

At least that’s my general assessment. I have strong opinions about the genetics. But I am much more curious about what others have to say about linguistics and archaeology.

* Some groups, such as Munda and Indo-Aryan groups in Northeast India, have East Asian ancestry. Some groups in coastal Pakistan have African ancestry.