Ref1 South Asian + Harappa MDS MClust

Now I am going nuts on this dataset consisting of South Asians (minus Kalash and Hazara) from Reference I and some Harappa participants, but I promise this is the last post on this specific data. I will, however, do similar analyses some time after integrating all the new South Asian samples I have gotten (via project participation as well as from research data).

I ran MDS on the data in Plink and then, retaining various numbers of MDS dimensions, ran Mclust on them. This is what Dienekes calls Clusters Galore.
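For anyone who wants to try something similar, here is a rough sketch of the idea in Python. It is not the analysis behind this post, which used Mclust in R. It swaps in scikit-learn's `GaussianMixture` and picks the number of clusters by lowest BIC, trying only the full-covariance model, whereas Mclust also searches over several covariance structures. The function names, the seed, and the `max_clusters` cap are all made up for illustration. `load_mds` assumes the usual Plink `.mds` layout (columns FID, IID, SOL, then C1, C2, ...).

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def load_mds(path):
    """Read a Plink .mds file, assuming columns FID IID SOL C1 C2 ..."""
    with open(path) as fh:
        header = fh.readline().split()
        first = header.index("C1")  # first MDS coordinate column
        rows = [line.split() for line in fh if line.strip()]
    ids = [(r[0], r[1]) for r in rows]
    coords = np.array([[float(v) for v in r[first:]] for r in rows])
    return ids, coords


def best_cluster_count(coords, max_clusters, seed=0):
    """Return the number of mixture components with the lowest BIC.

    Stand-in for Mclust: only full-covariance Gaussians are tried.
    """
    best_k, best_bic = 1, np.inf
    for k in range(1, max_clusters + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              n_init=3, random_state=seed)
        gmm.fit(coords)
        bic = gmm.bic(coords)  # scikit-learn convention: lower is better
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k


def clusters_by_dimension(coords, dims_to_try, max_clusters):
    """Map each number of retained MDS dimensions to its best cluster count."""
    return {d: best_cluster_count(coords[:, :d], max_clusters)
            for d in dims_to_try}
```

With a real `.mds` file, something like `ids, coords = load_mds("plink.mds")` followed by `clusters_by_dimension(coords, range(2, 21), 30)` would give the kind of dimensions-versus-clusters table described here, though the numbers will not match Mclust's exactly.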

Here are the plots of the MDS, two dimensions at a time.

The graph of number of MDS dimensions retained versus optimum number of clusters computed by Mclust is as follows:

The maximum number of clusters (28) is inferred with 8 MDS dimensions, so I posted the clustering results for 8 MDS dimensions and 28 clusters.

Some observations on the clusters:

  1. 56 of the 62 Gujaratis are in cluster CL1 and the remaining 6 are in CL5. Both are Gujarati-only clusters. Let's see where the Harappa Gujaratis fall next time I do this analysis.
  2. CL2 has an Andhra Reddy, Caribbean Indians, a Keralan, a few Gujaratis-B, and a third of the Singapore Indians.
  3. Gujaratis-B are a varied lot spread out into CL3, CL7, CL2, CL8, CL4, CL6, and CL15, but half are in CL3.
  4. CL6 has a lot of the South Indian Brahmins.
  5. Burusho are isolated.
  6. Punjabis from the project seem to be divided among CL7, CL8 and CL15.
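Observations like these come from cross-tabulating population labels against cluster assignments. A minimal sketch of that tabulation in Python follows, using only the Gujarati counts quoted above as input. The `tabulate` helper and the `(population, cluster)` pair format are assumptions for illustration, not the format of the posted results.

```python
from collections import Counter, defaultdict


def tabulate(assignments):
    """Count samples per cluster within each population.

    assignments: iterable of (population, cluster) pairs.
    """
    table = defaultdict(Counter)
    for population, cluster in assignments:
        table[population][cluster] += 1
    return table


# The Gujarati split reported above: 56 samples in CL1, 6 in CL5.
assignments = [("Gujarati", "CL1")] * 56 + [("Gujarati", "CL5")] * 6
table = tabulate(assignments)

for population, counts in table.items():
    total = sum(counts.values())
    for cluster, n in counts.most_common():
        print(f"{population}\t{cluster}\t{n}/{total}")
```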

I also posted the results for 20 MDS dimensions resulting in 21 clusters.


  1. Zack, from this it does look like your Punjabi ancestry is more specifically Pashtun Punjabi.

    • That is a possibility and part of the family myth states so. But the quarter Egyptian is also what's pulling me west. Need to somehow separately analyze South Asian and Egyptian ancestries.

  2. Zack,

    CL7 appears to be a sort of northern Brahmin cluster. It has Bihar Brahmin, Punjab Brahmin, Punjabi, UP Brahmin, Rajasthani Brahmin (0.49), Kashmiri, Gujarati-B 8, Pathan 2, Singapore Indian 5.

    I think the Pathan in your case must be due to the closeness of Peshawari Pathans and Northern Punjabis (the old Gandhara).

    Do you see any reason why the 1/4 Egyptian should put you closer to Pathan than say the Makrani (unless there is some truth to the Pathan-Gupta-Copt tradition*!)?

    *Firishtah, Tarikh-i Firishtah: “I have read in the Mutla-ool-Anwar, a work written by a respectable author, and which I procured at Boorhanpoor, a town of Kandeish in the Dekkan, that the Afghans are Copts of the race of the Pharaohs; and that when the prophet Moses got the better of that infidel who was overwhelmed in the Red Sea, many of the Copts became converts to the Jewish faith; but others, stubborn and self-willed, refusing to embrace the true faith, leaving their country, came to India, and eventually settled in the Soolimany mountains, where they bore the name of Afghans.”