Harappa Admixture Dendrogram

Using the ancestral component percentages from the Admixture run at K=12 for Harappa Project participants, we can calculate the Euclidean distance between every pair of participants. These distances can then be fed into complete-linkage (i.e. furthest-neighbor) hierarchical clustering, which produces the dendrogram you see below.

Note that this is not a phylogeny. It only visualizes how close your admixture results are to those of other participants.
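As a rough illustration of the two steps described above (pairwise Euclidean distances, then complete-linkage clustering), here is a minimal sketch in plain Python. The sample IDs and component values are made up, only three components are shown instead of twelve, and the proportions are assumed to be expressed as fractions rather than percentages. The helper names `pairwise_distances` and `complete_linkage` are hypothetical, and this is not necessarily how the dendrogram in this post was produced.

```python
import math
from itertools import combinations

# Hypothetical admixture proportions (fractions summing to 1) for made-up IDs.
# Only 3 components are shown for brevity; the real run used K=12.
admixture = {
    "A": [0.60, 0.30, 0.10],
    "B": [0.55, 0.35, 0.10],
    "C": [0.20, 0.30, 0.50],
    "D": [0.30, 0.25, 0.45],
}

def pairwise_distances(samples):
    """Euclidean distance between every pair of samples."""
    return {
        frozenset((a, b)): math.dist(samples[a], samples[b])
        for a, b in combinations(samples, 2)
    }

def complete_linkage(samples):
    """Naive agglomerative clustering; returns merges as (left, right, height)."""
    dist = pairwise_distances(samples)
    clusters = [frozenset([name]) for name in samples]
    merges = []

    def cluster_distance(c1, c2):
        # Complete linkage: distance between the two furthest members.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Merge the pair of clusters whose furthest members are closest.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        height = cluster_distance(c1, c2)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), height))
    return merges

merges = complete_linkage(admixture)
for left, right, height in merges:
    print(left, "+", right, "at height", round(height, 3))
```

Each merge height is the largest distance between any two members of the joined clusters, which is what the vertical axis of a complete-linkage dendrogram shows. For more than a handful of samples, a library routine would be a better choice than this quadratic search.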

Thus in terms of admixture results, the Punjabis mostly cluster together along with the Rajasthani (HRP0033), except for my family (HRP0001 and HRP0035) who cluster (not so closely) with the Sindhi-Balochi guy (HRP0039) likely due to the Southwest Asian and African components.

Interestingly, the Bihari Brahmin (HRP0003) is very different from the Bihari Kayastha participant (HRP0032). The Caribbean Indian samples (HRP0027 & HRP0028) cluster with the Bihari Kayastha, so we can't really say for sure where in India their ancestors originated.

The South Indian Brahmin samples seem to differ consistently from the non-Brahmin ones.

The Iranians cluster closely except for the Khorasanian HRP0034 and Assyrian HRP0010. The Assyrian Iranian sample is actually closer to the Iraqi/Egyptian Jewish sample (HRP0037) than to other Iranians.

The participants with recent European admixture cluster very loosely with each other. Other techniques will need to be used to pinpoint their specific South Asian origins.

If we make a cut at about 0.3 on this tree, we get 3 South Asian clusters:

  • the Northwest of South Asia
  • South Indian Brahmins, Bihari Brahmin, UP Brahmin
  • South Indian non-Brahmin, Bihari non-Brahmin, Bengalis, Caribbean Indians
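Cutting a dendrogram at a given height can be sketched as follows: apply only the merges whose height is at or below the cut, and whatever groups remain are the flat clusters. The merge history, labels, and heights below are invented for illustration and are not the Harappa results. The function name `cut_tree` is hypothetical.

```python
# Hypothetical merge history from a complete-linkage run, listed in the order
# the merges happened: (left members, right members, merge height).
# Labels and heights are invented and are not Harappa results.
merges = [
    (["P1"], ["P2"], 0.08),
    (["S1"], ["S2"], 0.12),
    (["B1"], ["S1", "S2"], 0.27),
    (["P1", "P2"], ["B1", "S1", "S2"], 0.55),
]

def cut_tree(merges, height):
    """Return the flat clusters formed by merges at or below `height`."""
    clusters = {}  # member name -> list shared by everyone in its cluster
    for left, right, h in merges:
        for name in left + right:
            clusters.setdefault(name, [name])
        if h <= height:
            merged = clusters[left[0]] + clusters[right[0]]
            for name in merged:
                clusters[name] = merged
    # Members of one cluster share the same list object, so deduplicate by id.
    unique = {id(group): sorted(group) for group in clusters.values()}
    return sorted(unique.values())

print(cut_tree(merges, 0.3))
```

This works because complete-linkage merge heights never decrease as the tree is built, so skipping every merge above the cut is equivalent to slicing the tree horizontally at that height.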

I wish I had a thousand South Asian samples to play with. I wonder how this dendrogram would look in that case.


  1. The closest person I'm 'gene sharing' with on 23andMe is a Bihari Kayastha (aside from my family). Parts of my family carry the surname Sarkar, which is often associated with Bengali Kayasthas, so it is likely that this was their caste before they became Muslim. But many ethnographers presume that Kayasthas are simply a class of non-Brahmins who were literate (or became so), and emphasized cultural flexibility in service to various non-Hindu elites.

  2. The Khorasani Iranian differs from the others by having less Southwest Asian and more South Asian. The other components (Siberian, European, Caucasian-Pakistani, Kalash) are all in the same ballpark.

    I am interested in learning the geographic origin of HRP0040, the Iranian who was closest to me.

  3. Would it be possible to do one with the reference populations as well? At least the South Asian reference ones.

    • Coming up later today.

      • I am looking forward to that. Do you have any interest in splitting the Gujarati reference samples in the future, in the manner Razib has described in his posts?

        • I'll split a few of the reference populations when I start analyzing the Eurasian dataset. Off the top of my head, Gujaratis and Armenians need to be split.

          • Great. I am looking forward to it. I am assuming that for the Armenians, you want to split out the Northern European admixed samples? Or was that a different project I am thinking of?

          • Yes, those Armenians with a lot more European than the rest.

  4. Re: "Interestingly, the Bihari Brahmin (HRP0003) is very different from the Bihari Kayastha participant (HRP0032)."

    This is consistent with a previous study: "distance between the Brahmin and Kayasth caste groups was found to be large." http://www.ncbi.nlm.nih.gov/pubmed/12959898

    The Kayasthas were an elite group with a higher level of education than the Brahman agriculturists who constituted the bulk of the Brahmans of Bihar. Some of the Baro Bhumiyas of Bengal were Kayasthas, and they were the dominant power center in eastern India prior to Islamic rule, so much so that Abul Fazl declared most of the Bengal rulers to be "Kayeth."

    There is a theory that some Kayasthas were from Karnataka. http://books.google.com/books?id=A0i94Z5C8HMC&pg=PA33
    They brought about a literary renaissance of sorts to Bengal and Mithila.

    There were a number of Brahma-Kshatra migrants from Karnataka too ( http://www.banglapedia.org/httpdocs/HT/S_0199.HTM http://himalaya.socanth.cam.ac.uk/collections/journals/ancientnepal/pdf/ancient_nepal_159_01.pdf ) who had in a prior period migrated to the south, if we are to believe the vanshavalis and Chalukyan inscriptions.

  5. There are apparently also some recent Northern European admixed Georgians among the reference Georgians. Will you split them too?

    • I plan to look at all the population groups which have a high variance in their ancestral component results and see if it makes sense to split them.

  6. Interesting that South Indian/Cow Belt Brahmins cluster together, while Punjabi Brahmins are closer to Punjabis.

    I can understand the first clustering, assuming that Southern Brahmin communities are a spinoff of northern communities and have maintained relative genetic isolation; and the source Northern Brahmin population differed in original origin from other Cow Belt populations.

    But how do both Brahmin communities differ equally from Punjabi/Rajasthani Brahmins; and why is that community closer to other Punjabi populations?
