HarappaWorld Admixture

Here is a new admixture calculator. It uses populations from all over the world, and I got the best results (i.e., lowest cross-validation error) at K=16.
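As a rough illustration of what "lowest cross-validation error" means in practice, here is a minimal sketch that picks K from a set of run logs. It assumes the logs contain lines of the form `CV error (K=16): 0.5302`, which is what the ADMIXTURE program prints when run with its `--cv` option. The function name `best_k` and the error values are made up for the example and are not the numbers from this run.

```python
import re

# Matches lines such as "CV error (K=16): 0.5302"
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_lines):
    """Return (K, error) for the lowest cross-validation error found."""
    errors = {}
    for line in log_lines:
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))] = float(match.group(2))
    if not errors:
        raise ValueError("no CV error lines found")
    k = min(errors, key=errors.get)
    return k, errors[k]


# Hypothetical values for illustration only; not the errors from this run.
example_log = [
    "CV error (K=15): 0.5310",
    "CV error (K=16): 0.5302",
    "CV error (K=17): 0.5307",
]
print(best_k(example_log))
```

The differences between neighbouring values of K are often small, so it is worth looking at the whole curve rather than only the minimum.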

You can see the admixture results for different ethnic groups as well as results for individual (founder-only) project participants.

UPDATE: The population results have been calculated using weighted means.
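For readers unfamiliar with the idea, a weighted mean of admixture proportions can be sketched as follows. The note above does not say what the weights are, so the weights here are purely hypothetical, as are the function name `weighted_mean_admixture` and the two-component example data.

```python
def weighted_mean_admixture(individuals):
    """Weighted mean of per-individual admixture proportions.

    individuals: list of (weight, proportions) pairs, where proportions
    is a list with one value per ancestral component.
    """
    total_weight = sum(weight for weight, _ in individuals)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    n_components = len(individuals[0][1])
    return [
        sum(weight * props[i] for weight, props in individuals) / total_weight
        for i in range(n_components)
    ]


# Hypothetical group of two individuals with two components each.
# The weights (1 and 3) are an assumption made only for this example.
group = [
    (1, [0.5, 0.5]),
    (3, [0.25, 0.75]),
]
print(weighted_mean_admixture(group))
```

With equal weights this reduces to the ordinary mean, so the two approaches only differ when individuals in a group are weighted differently.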

The group results are also shown in the usual interactive bar chart below. You can click on the component labels to sort by that ancestral component.

Do note that the admixture components do not necessarily represent real ancestral populations. Also, the names I have chosen for the components should be thought of as mnemonics to ease discussion. I chose them based on which populations in my data these components peaked in. They do not tell us anything directly about ancestral populations. The best way to look at these admixture results is by comparing individuals and populations.

I used 188,173 SNPs for this run. However, the results reported above for the Henn2011 (181,223 SNPs for Hadza, Sandawe and San; 26,494 SNPs for other groups), Henn2012 (26,494 SNPs), Reich (48,967 SNPs) and Xing (18,986 SNPs) datasets were calculated using a lower number of common SNPs. Hence caution should be exercised in interpreting those results.
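The "common SNPs" for a dataset are the markers it shares with the set used for the run. A minimal sketch of that overlap, assuming SNPs are identified by rsID strings (the function name `common_snps` and the IDs below are placeholders, not real data):

```python
def common_snps(reference_snps, dataset_snps):
    """Return the SNP identifiers present in both collections."""
    return set(reference_snps) & set(dataset_snps)


# Placeholder identifiers for illustration only.
reference = ["rs1", "rs2", "rs3", "rs4"]
smaller_dataset = ["rs2", "rs3", "rs9"]

shared = common_snps(reference, smaller_dataset)
print(len(shared), sorted(shared))
```

The fewer markers a dataset shares with the run, the noisier its estimated proportions tend to be, which is why the smaller overlaps listed above call for caution.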

You can also see the Fst distances between the ancestral components.

I should have HarappaWorldOracle and DIYHarappaWorld calculators out in the next few days.

Also, I am working on another calculator which will focus more closely on South Asia.


  1. This is a very comprehensive analysis. Thanks a ton, Zack! What were the results like at K=12 or K=13?

    Also, have you come to any conclusion as to what these Brahui-centered components might represent? Is this component a legitimate component in the variation of West-Eurasians or is it simply the bottlenecking of a component (in an inbred group) with a West-Asian (Anatolian and/or Iranian plateau) source? The earliest Neolithic site in South Asia is Mehrgarh, dated to 7500 BC, in the Kachi plain of Balochistan, Pakistan. This site has evidence of farming (wheat and barley) and herding (cattle, sheep and goats). This Balochistan component's peak and its prevalence among South Asians as their main West-Eurasian component agree very well with that.

    • I am not quite sure about these Brahui/Baloch-centered components. If I were to guess, it seems to be sort of a mixed component. It's closest to the Caucasian (Fst=0.046) and is actually a little more distant from S Indian (0.080) than Caucasian is (0.077). I need to run a few more experiments.

      As for it being real, I don't think we can consider any of the components to be real.

  2. Looking forward to the new oracle :)

  3. I am surprised at how closely the Kalash resemble the Pashtuns, Kashmiris and Punjabis.

    This is very interesting Zack, thank you again for your work.

