India Today published my review of the current state of the genetics and genomics of the Indian subcontinent, and what it can tell us about the ethnogenesis of South Asians generally. In the piece I tried to be very circumspect and stick to what we know with a high, if not perfect, degree of certainty. Here I will add some comments where I reduce the threshold of certainty somewhat. That is, I’m going to include here my beliefs where I think I’m right, but in some details wouldn’t be surprised if I was wrong.
First, the title is Aryan wars: Controversy over new study claiming they came from the west 4,000 years ago. Writers don’t get to choose titles, and this is not one I would have chosen. But I am not in a position to care or know what draws clicks. Let’s note that this “controversy” is restricted mostly to India. Outside of India it’s not controversial, but a matter of the science, because people don’t have any political or social investment in the topic. It reminds me of debates about genetics and intelligence in the West, where emotions get overwrought and lies fly wildly with abandon.*
Second, there is a reference in the figures to an “Out of India” (OIT) model. That is, the Aryans migrated out of India, and implicitly the Indo-European languages derive from South Asia. I don’t think this theory has any support at all. That is, I think it is rather clear that proto-Indo-European probably emerged neither in Europe proper, nor in South Asia, but in the Inner Eurasian spaces between. But for an Indian audience ignoring OIT would seem a peculiar lacuna, so there was a reference added to the figure on that account (I pushed back against this, but do not make ultimate decisions on figures).
But I do think it was plausible up until 2009’s Reconstructing Indian Population History to suggest that most modern South Asian ancestry dates to the Pleistocene. In this framework the Indo-Europeanization of the subcontinent was primarily a cultural one, where small groups of Central Asians imposed their language on the native population. What the genome-wide work has shown is that South Asians are the product of a large-scale mixing process between a population very distant from West Eurasians (“Ancestral South Indians”, ASI) and a population which was indistinguishable from other West Eurasians (“Ancestral North Indians”, ANI).
Since ANI is indistinguishable from West Eurasians I hold it is clearly a West Eurasian population in provenance. Those who reject this position from a scientific perspective believe that there could have been some sort of continuous zone of “ANI-like” habitation from northwestern South Asia up into northern Inner Eurasia (and perhaps toward West Asia as well) dating from the late Pleistocene. I do not believe that this is plausible, and I will tell you that prominent researchers to whom I have brought up this idea are somewhat incredulous.**
Third, there are major unresolved issues genetically in relation to the dates and the total number of mixing populations. I am quite confident saying around half of the total South Asian genomic ancestry today derives from populations who were living outside of South Asia on the Holocene-Pleistocene boundary 11,700 years ago. Much of that ancestry probably flourished between the Caucasus and Zagros mountains. The remainder probably flourished somewhere in the vast swath of territory between the Baltic and Siberia (perhaps further south, toward the Pamirs?).
But I am not confident of the relative balances of contribution to the ANI. It does seem that the northern component, which is derived in part from the southern component, is much more prominent in upper castes and northwestern populations. In contrast the southern component is found throughout the subcontinent.
In Genomic insights into the origin of farming in the Near East there is analysis of South Asia in the supplements. The author concludes that ANI cannot be modeled as a single population (Zack Ajmal and I were saying this in 2010). The top hits for the sources of ANI tend to be the genomic sample from the Zagros, in western Iran (before subsequent admixture with Levantine farmers), and a population similar to the Yamna culture of the steppe. The issue seems to be that later steppe populations which harbor a fair amount of “Early European Farmer” ancestry (e.g., LBK in Central Europe), likely due to back migration, aren’t good model fits.
Below are two plots, one showing a scatter of South Asian groups with their Iran_N (a sample from ~10,000 years ago) vs. Yamna (from ~5,000 years ago), and another with the ratios.
DO NOT TAKE THE PROPORTIONS LITERALLY. My intuition is that these models are overestimating the proportion of steppe ancestry, but my confidence in my intuition is low.
There are two groups enriched for Iran_N ancestry:
- Lower caste groups, especially from South India.
- Populations in southern Pakistan.
The reasons differ. If you have done genetic analysis of the Pakistani populations it seems quite obvious that, unlike other groups in South Asia, Pakistani groups facing the Arabian Sea across from Oman have genuine Near Eastern ancestry. This affinity declines rather rapidly as you go north in Pakistan. Notice though one South Indian group: Jews from Cochin. This population clearly has recent Near Eastern ancestry.
The Kharia are an Austro-Asiatic Munda group. For whatever reason Austro-Asiatic groups seem to consistently have very little steppe ancestry. The Mala are Dalits from South India. The further up you go on the modal Iran_N-Yamna cline you see the populations are either upper caste, or, they are from the far northwest of the subcontinent.
The conclusion I derive from this is that first there was an early migration of West Eurasian populations consisting of Iranian farmers. This group mixed with the ASI element. The Indo-Aryans, who probably correlate with the Yamna-like component, arrived later as an overlay (and nearly half of their ancestry was derived from Iranian farmers). Then many South Asian populations have modifications on this base model of compound ANI + ASI; Munda and Bengali have later East Asian ancestry, while populations on the Arabian Sea have Near Eastern ancestry.
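To make the arithmetic of the compound ANI concrete, here is a minimal sketch in Python. It is not how the published models are fit. The function name, its parameters, and the input proportions are all hypothetical, and the 0.5 default is only my reading of "nearly half" above. All it shows is that if the Yamna-like component itself carries Iranian farmer ancestry, then the total Iranian-farmer-related share in a population is the direct part plus a fraction of the steppe part.

```python
def iran_farmer_related_share(iran_n_direct, yamna_like, iran_fraction_of_yamna=0.5):
    """Total Iranian-farmer-related ancestry under the two-layer sketch.

    iran_n_direct: share of ancestry modeled directly as Iran_N (0 to 1).
    yamna_like: share of ancestry modeled as Yamna-like (0 to 1).
    iran_fraction_of_yamna: ASSUMPTION, the part of Yamna-like ancestry that
        itself traces to Iranian farmers ("nearly half" in the text, so 0.5).
    All argument values a caller passes in are illustrative, not estimates.
    """
    inputs = {
        "iran_n_direct": iran_n_direct,
        "yamna_like": yamna_like,
        "iran_fraction_of_yamna": iran_fraction_of_yamna,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    if iran_n_direct + yamna_like > 1.0:
        raise ValueError("iran_n_direct + yamna_like cannot exceed 1")
    # Direct Iran_N share plus the Iranian-farmer part carried in via the steppe layer.
    return iran_n_direct + yamna_like * iran_fraction_of_yamna
```

Calling `iran_farmer_related_share(0.4, 0.2)` with those made-up inputs adds 0.4 directly and 0.1 by way of the steppe layer. The same caveat applies as with the plots: do not take any such proportions literally.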
Fourth, the story in India Today leans heavily on the Y chromosomal R1a1a lineage. It is true we are Lords of the Steppe and destined to drive our enemies before us. But it is not the primary story. Y chromosomal phylogenies are easy for the public to understand, yet they only make sense in light of the above framework. R1a1a is found in South Indian tribal populations. It seems likely that Indo-Aryan paternal lineages were highly invasive across the subcontinent, just as they were in Europe. In many cases they likely extended far beyond domains where Indo-European acculturation occurred.
I’m probably wrong on some of the details. But I suspect the final story will not be so different from this.
Finally, I will mention the cultural element here. There is a fair amount of discussion of the form “so you are saying the ancestors of Indians are Europeans?” or “does this mean Hinduism is not Indian?”
The piece was about genetics and demography, not my opinions about culture. So I will say this:
- The “West” as an entity is no older than Classical Greece, circa 500 BC. My own personal position, strongly held, is that the West should indicate cultures and societies which descend from the European societies which adhered to the Western Church around 1000 AD (some nations, like Lithuania, became absorbed into this cultural complex hundreds of years later). So Russia is not the West. And Merovingian Francia is not the West.
- Indian civilization of what we term the Hindu variety coalesced in the period between 500 BC and 500 AD, from before the Mauryas, up to the Guptas. Obviously the period before 1000 BC was important in setting the groundwork, but I do not believe it was Indian as we’d understand it in anything but the geographical sense, nor was it Hindu in any way we’d recognize it today (similarly, Shang dynasty China was not China as we’d understand it, which came into being after 500 BC).
These positions mean that I think nationalist passions are in the “not even wrong” category. Indian Hindu civilization is indigenous by definition, since it was synthesized in situ on the edge of historical perception and attestation (for the record, I think Adi Shankara was critical in the completion of a crystalized self-conception of Hindu religio-philosophical thought, but its origins predate him). Similarly, Indian civilization was not seeded by white Europeans because white Europeans were only coming into being in Europe when the Indus Valley civilization was collapsing.
That is all (for now).
Addendum: The first tranche of ancient DNA should be out in a few months. Also, there is another paper on Indian genetics in the works from the usual suspects. There won’t be anything totally surprising (or so I’ve been told).
* By lies, I mean the contention that intelligence is an “invalid” instrument in relation to predictiveness, or, if it is valid, it is not genetically heritable. People routinely lie about these facts in discussion or spread lies because there are socially preferred positions which they conform to. Similarly, many questions about Indian history seem to hinge on widely promoted lies.
** This model needs to also confront the massive mixing of the last 4,000 years. If it is true then it is ASI which is most likely intrusive, because it is not credible that these two populations were in close proximity for tens of thousands of years without exchanging genes.