HarappaWorld Ancestral South Indian

Using the same method as I used for reference 3 admixture, I decided to guesstimate the Ancestral South Indian proportions, as given by Reich et al, for my HarappaWorld admixture run.

Basically, I used 92 of the 96 samples Reich et al used to find population averages for the South Indian component. Then, I ran a linear regression of Reich et al's Ancestral South Indian (ASI) estimates on those South Indian component averages. Since Reich et al actually list Ancestral North Indian (ANI) percentages in their paper, but their model is a two-ancestry ANI+ASI one, I simply calculated the ASI percentages as 100% minus ANI.
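For anyone who wants to try the same kind of fit, here is a rough sketch in Python. It assumes an ordinary least-squares fit of ASI on the South Indian component average, which is my reading of the step above. The function names (`asi_from_ani`, `fit_line`) and the numbers in the example are made up for illustration; they are not the actual population averages or Reich et al's figures.

```python
from math import sqrt


def asi_from_ani(ani_pct):
    """Two-ancestry model: ASI is whatever is not ANI (both in percent)."""
    return 100.0 - ani_pct


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (intercept, slope, pearson_r).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


# Placeholder values only (NOT real data): one entry per population.
s_indian_avg = [10.0, 20.0, 30.0, 40.0]  # South Indian component average, %
ani_reich = [90.0, 81.0, 75.0, 66.0]     # stand-in ANI estimates, %

asi_reich = [asi_from_ani(a) for a in ani_reich]
intercept, slope, r = fit_line(s_indian_avg, asi_reich)
print(f"ASI = {intercept:.4f} + {slope:.4f} * S_INDIAN (r = {r:.4f})")
```

With the real population averages in place of the placeholders, the same calculation should give the intercept, slope and correlation for the fit.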

The correlation between Reich et al ASI and my HarappaWorld South Indian component for the relevant populations turns out to be 0.99277086.

And the linear regression fit for the data is:

ASI = 2.5218942 + 0.8104836 * S_INDIAN

where both ASI (Reich et al) and S_INDIAN (HarappaWorld) are given in percentages.
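As a quick illustration, the fitted line can be applied to any South Indian percentage like this. The coefficients are the ones given above; the function name `estimate_asi` is just something I picked for the sketch.

```python
# Coefficients from the regression fit quoted in the post.
INTERCEPT = 2.5218942
SLOPE = 0.8104836


def estimate_asi(s_indian_pct):
    """Estimated ASI (%) from a HarappaWorld South Indian component (%)."""
    return INTERCEPT + SLOPE * s_indian_pct


# A South Indian component of 50% maps to roughly 43% ASI.
print(round(estimate_asi(50.0), 2))  # 43.05
```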

Of the individuals in HarappaWorld, I kept only those who had a South Indian component of at least 20% for computing the ASI proportions.

The resulting ASI percentages can be seen in a spreadsheet.

Please note that in the Group sheet, the averages are based only on the samples which met the 20% South Indian component threshold. For example, the 20% ASI in the Romanians is the average of the two Romanians who met the threshold out of a total of 16 Romanian samples.
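To make the group averaging concrete, here is a small sketch of how the threshold and the per-group means could be computed. It is an assumption about the bookkeeping, not the actual spreadsheet logic, and the sample values and the name `group_asi_averages` are invented for the example.

```python
from collections import defaultdict

# Coefficients from the regression fit quoted in the post.
INTERCEPT = 2.5218942
SLOPE = 0.8104836
THRESHOLD = 20.0  # minimum South Indian component (%) to be included


def group_asi_averages(samples, threshold=THRESHOLD):
    """Average estimated ASI per group over samples meeting the threshold.

    samples: iterable of (group_name, s_indian_pct) pairs.
    Returns {group: (mean_asi, n_kept, n_total)}; groups with no sample
    at or above the threshold are left out.
    """
    kept = defaultdict(list)
    totals = defaultdict(int)
    for group, s_indian in samples:
        totals[group] += 1
        if s_indian >= threshold:
            kept[group].append(INTERCEPT + SLOPE * s_indian)
    return {g: (sum(v) / len(v), len(v), totals[g]) for g, v in kept.items()}


# Invented values for illustration only.
samples = [
    ("GroupA", 20.0),
    ("GroupA", 30.0),
    ("GroupA", 5.0),   # below threshold, excluded from the average
    ("GroupB", 1.0),   # GroupB has no qualifying samples at all
]
for group, (mean_asi, n_kept, n_total) in group_asi_averages(samples).items():
    print(f"{group}: {mean_asi:.1f}% ASI from {n_kept} of {n_total} samples")
```

Reporting the kept and total counts next to each average makes it obvious when a group mean rests on only a few of its samples, as with the Romanians.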

The individual results are available in the Individual sheet. These results are a little different from the estimates using reference 3. Thus, I would point out that these should be taken only as a rough estimate.

12 Comments.

  1. Zack,

    Great work!

    What is the relationship of South East Asian to ASI?

    Do you think that the trace amounts of South East Asian often found in Western Europe are really ASI?

    I still think that Y-DNA R1b and mtDNA T1a1a may have migrated together out of the Mehrgarh Cultural area as Indo-Europeans.

    http://en.wikipedia.org/wiki/Mehrgarh

    Thoughts anyone?

    • The South Indian component is about equally far from the Caucasian, Baloch, NE Asian, NE European and SE Asian components. Since the South Indian component is a mixture of ASI (major) and ANI (minor), I would say that ASI is probably fairly different from SE Asian.

      However, there does seem to be some South Indian in Cambodians, Burmese and Malay. This could indicate some affinity to ASI.

    • Are you saying that the area around Mehrgarh could be the PIE homeland? I'm not sure, but if you are then you are not alone! An Italian Indologist named Giacomo Benedetti has argued exactly that on academic grounds. You can also see his blog 'New Indology'.
      P.S. I'm also quite sure that the spread of the Gedrosian component suggests some kind of similar scenario rather than any putative/assumed "West Asian Bottleneck Theory".

      • NOTE: The above reply is to Pconroy; it somehow ended up under Zack's reply instead of in the Reply section of Pconroy's post!

      • @Nirjhar,

        Yes, that is what I'm suggesting, as it seems that Pastoralism and R1b haplotypes go together. One of the earliest pastoralist sites is Mehrgarh, and it so happens it's also the only place where R1b, R1a and importantly R2 are found in close proximity.

        From the recent paper on mtDNA J-T by Maria Pala,


        "70% of the samples in T1a1 fall within T1a1a1 (Table S2). The geographic distribution of T1 is extraordinary—lineages are distributed, albeit at varying frequencies, across its range throughout the tree, from northwestern Africa throughout Europe, the Caucasus, and the Near East, into western India, and across central Asia into Siberia. The South Asian lineages tend to cluster with or match Near Eastern ones in the HVS-I network, but common HVS-I types frequently match across an extremely wide range. Indeed, the root type of T1a1a1, dating to ~7 ka ago, is very unusual among whole-genome mtDNA types in that it is shared between multiple geographically distant individuals from Scandinavia, the Baltic, the North Caucasus, Anatolia, and Morocco. The distribution of T1a is both widespread and patchy, although at low frequencies overall, the values rise to ~5% in the South Caucasus, ~6% in northeastern Iran, ~8% in Tunisia, and almost 9% in Romania (Table S3). Curiously, despite the age of T1a1a1, it has not been seen in any Neolithic remains to date."

  2. The Gujarati populations are very interesting. I feel like there is close kinship between Gujaratis and South Indians, compared to other Indian populations.

    • There is no "close kinship" between Gujaratis and South Indians - that is laughable.

      If at all, it's the Tamil Brahmins that have recent origins from Gujarat, and given that the great bulk of historically attested Western Eurasian human migrations into India came from the Northwestern corridor i.e. Gujarat-Rajasthan, and Punjab, the fact is that the distant ancestors of South Indians came from the NW.

      Gujaratis logically have more genetic similarities with their neighbour populations (that is Maharashtra, Sindh, Rajasthan and Madhya Pradesh) than to South Indians. However Tamil Brahmins might have genetic affinities with Gujaratis which supports their self-reported claims of having migrated from that region - that is all. Other than that, South Indians might really have more affinity to Eastern Indian populations, such as Bengalis.

      Autosomal ancestry generally follows geography, but two Harappa participants from the polar opposite sides of South Asia having similar admixtures does not necessarily mean that those two participants share a similar ancestral history. There are a number of reasons why a Rajasthani Muslim might have similar admixtures to a UP Brahmin, without there actually being any common ancestral history between the two.

  3. This question may have been asked before, but can a relationship between ANI and Baloch be derived?

  4. Is there a similar regression model for %ANI?