Ref1 South Asian + Harappa Admixture

Since I was working on this dataset consisting of South Asians (minus Kalash and Hazara) from Reference I and some Harappa participants, I thought I would run Admixture on it.

The optimum value for the number of ancestral populations K is 3 in this case. Roughly, the three ancestral components correspond to South India, Balochistan and Gujarat.
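The post does not say how the optimum K was determined. One common approach with Admixture is to run it with its cross-validation option (`--cv`) for several values of K and keep the K with the lowest cross-validation error. A minimal sketch of that selection step, assuming that is the criterion; the function name and the error values are made up for illustration and are not results from this run:

```python
def best_k(cv_errors):
    """Return the K with the lowest cross-validation error.

    cv_errors maps each tried K to its CV error. If two K values tie,
    the one that appears first in the mapping is returned.
    """
    if not cv_errors:
        raise ValueError("no cross-validation errors supplied")
    return min(cv_errors, key=cv_errors.get)


# Illustrative numbers only (hypothetical, not from this analysis).
example_errors = {2: 0.560, 3: 0.520, 4: 0.530, 5: 0.545}
chosen_k = best_k(example_errors)
```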

The spreadsheet showing the admixture results is here. The first sheet shows the individual results for reference samples as well as project participants.

The 2nd sheet shows the average (and standard deviation) for the reference populations.
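As a rough illustration of how such a per-population summary could be computed, here is a small sketch. The input layout (a population label plus a tuple of component proportions per sample) and the function name are assumptions, and it uses the sample standard deviation, which may or may not match what the spreadsheet uses:

```python
from collections import defaultdict
from statistics import mean, stdev


def population_summary(rows):
    """Summarise admixture proportions per population.

    rows: iterable of (population, proportions) pairs, where proportions
    is a tuple with one value per ancestral component.
    Returns {population: (means, standard_deviations)}.
    """
    grouped = defaultdict(list)
    for population, proportions in rows:
        grouped[population].append(proportions)

    summary = {}
    for population, samples in grouped.items():
        # Transpose so each column holds one component across all samples.
        columns = list(zip(*samples))
        means = tuple(mean(col) for col in columns)
        # A single sample has no spread to estimate; report 0.0 instead.
        sds = tuple(stdev(col) if len(col) > 1 else 0.0 for col in columns)
        summary[population] = (means, sds)
    return summary
```

Populations with only one sample get a standard deviation of 0.0 here, which is a convenience choice rather than a statistical statement.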

The 3rd sheet shows the average and standard deviation for each cluster computed by MClust. I included only the samples which had at least 90% probability of belonging to a cluster.
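The 90% filter can be sketched as follows. This assumes the per-sample cluster membership probabilities from MClust have been exported alongside each sample's admixture proportions; the data layout and the function name are hypothetical:

```python
from collections import defaultdict
from statistics import mean, stdev


def confident_cluster_summary(samples, threshold=0.90):
    """Average admixture proportions per cluster, confident samples only.

    samples: iterable of (proportions, cluster_probs) pairs, where
    proportions is a tuple of component values and cluster_probs maps
    cluster name to membership probability.
    Returns {cluster: (sample_count, means, standard_deviations)}.
    """
    members = defaultdict(list)
    for proportions, cluster_probs in samples:
        # Assign each sample to its most probable cluster.
        best_cluster = max(cluster_probs, key=cluster_probs.get)
        # Keep the sample only if that assignment is confident enough.
        if cluster_probs[best_cluster] >= threshold:
            members[best_cluster].append(proportions)

    summary = {}
    for cluster, kept in members.items():
        columns = list(zip(*kept))
        means = tuple(mean(col) for col in columns)
        sds = tuple(stdev(col) if len(col) > 1 else 0.0 for col in columns)
        summary[cluster] = (len(kept), means, sds)
    return summary
```

Samples whose best cluster falls below the threshold are dropped entirely, so a cluster with no confident members simply does not appear in the result.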

Note how clusters CL8, CL9 and CL13 have a lot more variation than the others. Of course, I am in CL9 along with some fairly eclectic samples.


  1. namaste and hi. why im never included into the admixture runs and plots neways? im 0015, also on this sheet here i miss my project number.

    • This analysis was South Asian only. I excluded admixed individuals so they would not skew the results since no European references were included.

      I do plan to do a supervised admixture run for admixed individuals to get some idea of their ancestry contributions. Un-admixed people will be excluded there. 😉

  2. My proportions of the three elements are almost even.