The Peopling of South Asia and the New Genomic Evidence

Prabir Purkayastha

SLOWLY, but surely, the story of the peopling of South Asia is being unravelled through genetic data and their analysis. The latest in this series is a preprint of a paper by David Reich, Vagheesh Narasimhan and others, The Genomic Formation of South and Central Asia, posted on the biology preprint archive at www.biorxiv.org. Biologists, following mathematicians and physicists, are now uploading their papers before they are refereed and published in journals, making important results available much earlier.

The major finding of the Reich-Narasimhan paper is that it confirms Eurasian steppe ancestry as a component in the Indian population, dating its arrival to about 2,000-1,200 BCE (Before Current Era). This is very much in line with other evidence, both archaeological and linguistic, which posited similar dates. What is striking and new in this paper is the importance of the Iranian farmer genetic component in South Asia, which predates the steppe ancestry considerably. This should not have been unexpected. We know that the first evidence of agriculture in South Asia has been found in Mehrgarh, dating back 9,000 years, a clear indication of its coming across the Bolan Pass from Iran. Obviously, agriculturalists and agriculture moved together, as they did in different parts of the world.

The paper deals with both Central and South Asia, and there is indeed a continuum that we need to look at, to get the complete picture of Eurasian migrations. However, we are restricting the discussion here to its implications for South Asia.

The paper postulates that the Indus Valley Civilisation would have consisted of a mixture of an ancient South Asian hunter-gatherer population and Neolithic farmers coming from Iran. It is this population that acted as a bridge in creating the Ancestral North Indian and the Ancestral South Indian populations, with the North Indian population having a greater steppe component than the South Indian population, but both with a significant Iranian farmer ancestral genetic component.

This large international group of scholars, spanning the continents, looked at both ancient DNA and DNA from current populations. The ancient DNA has been extracted from a number of burial sites in three broad regions: Iran and the southern part of Central Asia (Turkmenistan, Uzbekistan, and Tajikistan, which the authors call Turan); the western-central steppes and northern forest zone encompassing present-day Kazakhstan and Russia; and the Swat Valley in northern Pakistan (“South Asia”). Unfortunately, the ancient DNA from Rakhigarhi, which has been dated to around 4,600 years ago, has yet to be published; therefore we do not have ancient DNA of the Indus Valley people in these samples. The Swat Valley samples date from after the entry of the steppe population and therefore carry its signature as well.

The evidence of the ancient DNA has been co-analysed with genome-wide data from present-day individuals from 246 ethnographically distinct groups in South Asia. This analysis consists of both statistical analysis and creating models of population mixing – admixture models – using older DNA, possible mutation rates, etc. The researchers then choose the models of mixing that most closely approximate the population distribution that we see today.
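
To make the idea of choosing between admixture models concrete, here is a minimal sketch in Python. It is not the method used in the paper, whose statistical machinery is far more elaborate; it only illustrates the logic of fitting mixing proportions and keeping the model that fits best. It treats a present-day population as a weighted mix of two candidate source populations, finds the weight that best reproduces the observed allele frequencies by least squares, and then picks the pair of sources with the smallest error. The names source_1, source_2, source_3 and target, and all the numbers, are invented for illustration.

```python
from itertools import combinations


def fit_two_way(target, src_a, src_b):
    """Least-squares weight alpha so that target ~ alpha*src_a + (1-alpha)*src_b.

    Each argument is a list of allele frequencies at the same genetic markers.
    Returns (alpha, residual), with alpha clamped to the range [0, 1].
    """
    num = sum((t - b) * (a - b) for t, a, b in zip(target, src_a, src_b))
    den = sum((a - b) ** 2 for a, b in zip(src_a, src_b))
    if den == 0:
        raise ValueError("the two sources are identical at every marker")
    alpha = min(1.0, max(0.0, num / den))
    residual = sum(
        (t - (alpha * a + (1 - alpha) * b)) ** 2
        for t, a, b in zip(target, src_a, src_b)
    )
    return alpha, residual


def best_model(target, sources):
    """Try every pair of candidate sources and keep the best-fitting mix."""
    best = None
    for (name_a, freq_a), (name_b, freq_b) in combinations(sources.items(), 2):
        alpha, residual = fit_two_way(target, freq_a, freq_b)
        if best is None or residual < best[3]:
            best = (name_a, name_b, alpha, residual)
    return best


# Invented allele frequencies at four markers for three hypothetical sources.
sources = {
    "source_1": [0.10, 0.80, 0.30, 0.60],
    "source_2": [0.70, 0.20, 0.90, 0.10],
    "source_3": [0.40, 0.40, 0.50, 0.95],
}

# A hypothetical present-day population built as 70% source_1, 30% source_2.
target = [
    0.7 * a + 0.3 * b
    for a, b in zip(sources["source_1"], sources["source_2"])
]

name_a, name_b, alpha, residual = best_model(target, sources)
print(f"best fit: {alpha:.2f} {name_a} + {1 - alpha:.2f} {name_b}")
```

Real data are noisy, involve hundreds of thousands of markers and more than two sources, and require formal tests of whether a model can be rejected; the sketch leaves all of that out.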

What the Reich-Narasimhan paper shows is that the postulate of Ancestral North Indian (ANI) and Ancestral South Indian (ASI) populations, put forward in the earlier Reich paper (so called here because he was its lead author), needs to be modified. What we have instead is a South Asian hunter-gatherer population, which Reich-Narasimhan call the Ancient Ancestral South Asian (AASI) population, and then two pulsed migrations. The first pulse originates around 5,000 years ago from Iran: the Iranian farming population. This creates what Reich-Narasimhan call the Indus Periphery population in North West South Asia. This probably, though it is not explicitly stated, is the Indus Valley civilisation people. At any rate, the authors seem to use this population group as a proxy for the Indus Valley people. This Indus Periphery population then mixes with the AASI population over the next 1,000 to 1,500 years to create the Ancestral South Indian (ASI) population.

The second pulse is the entry of the Eurasian steppe people from Central Asia into India, carrying Central Asian genetic markers. This population mixes with the existing Ancestral South Indian population, creating the Ancestral North Indian (ANI) population. Its markers are found in higher proportion in the North Indian Brahmin and Bhumihar populations than in other communities. The present-day South Asian population is the result of mixing in various degrees between the ANI and ASI populations.

Of course, different research groups could come up with different models of how the people carrying these genetic markers – of the ancient hunter-gatherers, the Iranian farmers, and the steppe people – have migrated within India and have mixed together. What is unlikely to be disproven is that the current Indian population carries these distinct genetic components of the Iranian farmers and the Eurasian steppe people. Nor is anyone likely to prove the hypothesis that there was a large dispersal from India to Central and West Asia, and then to Europe (the Out of India hypothesis), as the Hindutva ideologues would have us believe. The picture of Anatolian and Iranian farmers spreading slowly across Eurasia, followed by the Eurasian steppe people spreading west towards Europe, and south towards West Asia and South Asia, is far too complete to be overturned.

The story of the spread of the steppe people from the Kurgan/Yamna cultures, in the area between the Caspian and Black Seas, has drawn far more attention than that of the farmers from West Asia. Part of this is clearly the fascination of both Europeans and North Indians with the origin of their languages. It is the Yamna people who were the original Proto-Indo-European language speakers and carried the language west towards Europe, and south towards Central, West and South Asia.

The story of the spread of agriculturalists, however, is equally compelling. It shows that the postulate of demic expansion – agriculturists marching across the globe along with agriculture, while mixing with the original hunter-gatherer populations – was not wrong. This is the march of wheat and barley from Anatolia and Iran, across Europe, Central Asia and South Asia. The proponents of this postulate were wrong only about language. The demic expansion, which happened much earlier than the expansion of the steppe people, did not carry the Indo-European languages with it. That was done by the steppe people.

This of course leaves out the other demic expansion, the one linked to rice. In India, it has impacted eastern India. The Reich-Narasimhan paper has relatively little to say about this, a gap which needs to be filled to have a more complete picture of the peopling of South Asia.

One important conclusion that can be drawn from this paper is that the study of history has to contend with pulsed events – events accompanying rapid change – as well as slower changes. History is not simply a story of gradual change; periods of slow change are interspersed with rapid ones.

The other conclusion is that the language of conquest and the language of civilisation can be different. The Indo-European speakers spread their language through conquest, a conquest made possible by their domestication of the horse and their mobility. It is their mobility that allowed them to dominate such large areas. Influxes of pastoral people, with new weapons and mobility, have overturned many settled agricultural communities in history. We saw this with the rise of the Turkic tribes who built the Ottoman Empire, and with the Mongols, who ruled over most of the Eurasian land mass. Both took over much bigger populations, with far more advanced civilisations. We should not therefore be surprised by similar events happening in more ancient times.

For the historians, there should be a sigh of relief. The picture that they have painstakingly built from archaeological and textual evidence is very close to the one drawn by the new genomic evidence. And that evidence may yet provide pointers for deciphering the Indus Valley script, as we now know which possible relatives we should look at.

Science is very cruel to myths and illusions. Those who confuse civilisation with “Aryans” have to live with the historical truth that the “Aryan” Vedic speakers were pastoral people and did not build the Indus Valley civilisation. Nor were they the original inhabitants of South Asia. They were just one among many of the late movers from the North West, who came to India using either the Bolan or the Khyber Pass. Genomics should now finally settle the question of who the Indo-Aryan language speakers were, and whether they were just one branch of the Eurasian steppe population.