Reference 3 Admixture K=11

Continuing the admixture analysis with our new Reference 3 dataset.

Here's the results spreadsheet for K=11.

You can click on the legend to the right of the bar chart to sort by different ancestral components.

You don't know how excited I am to see the Onge (C2) component. Let's compare the Onge component with Reich et al.'s ASI (Ancestral South Indian):

Population        Reich ASI %   Onge Component %
Mala              61.2          39.9
Madiga            59.4          37.9
Chenchu           59.3          38.6
Bhil              57.1          37.5
Satnami           57.0          36.4
Kurumba           56.8          39.5
Kamsali           55.5          35.5
Vysya             53.8          34.4
Lodi              50.1          31.8
Naidu             49.9          32.1
Tharu             49.0          32.2
Velama            45.3          28.9
Srivastava        43.6          27.8
Meghawal          39.7          25.4
Vaish             37.4          23.8
Kashmiri-Pandit   29.4          17.6
Sindhi            26.3          13.4
Pathan            23.1          10.6

Let's plot that with a linear regression:

How do you like that?

Now let's take all the reference populations with an Onge component between 10% and 50% and use the equation above to calculate their ASI percentage. The results are in a spreadsheet. Several populations have an even higher Ancestral South Indian percentage than any of the Reich et al. groups, with Paniya being the highest at 67.4%.
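For anyone who wants to redo the regression step, here is a minimal sketch. It assumes an ordinary least-squares fit with ASI % as the dependent variable and Onge % as the predictor; the post does not say exactly how the fit was done, so treat that as an assumption. The function names `fit_line` and `estimate_asi` are made up for illustration, and the 30.0 input at the end is a placeholder, not a real population value.

```python
# Reich et al. ASI % and Onge (C2) component %, copied from the table above.
data = {
    "Mala": (61.2, 39.9), "Madiga": (59.4, 37.9), "Chenchu": (59.3, 38.6),
    "Bhil": (57.1, 37.5), "Satnami": (57.0, 36.4), "Kurumba": (56.8, 39.5),
    "Kamsali": (55.5, 35.5), "Vysya": (53.8, 34.4), "Lodi": (50.1, 31.8),
    "Naidu": (49.9, 32.1), "Tharu": (49.0, 32.2), "Velama": (45.3, 28.9),
    "Srivastava": (43.6, 27.8), "Meghawal": (39.7, 25.4), "Vaish": (37.4, 23.8),
    "Kashmiri-Pandit": (29.4, 17.6), "Sindhi": (26.3, 13.4), "Pathan": (23.1, 10.6),
}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


asi = [pair[0] for pair in data.values()]
onge = [pair[1] for pair in data.values()]
slope, intercept, r_squared = fit_line(onge, asi)


def estimate_asi(onge_pct):
    """Apply the fitted line only inside the 10%-50% Onge range used in the post."""
    if not 10.0 <= onge_pct <= 50.0:
        return None  # outside the range the line is being applied to
    return slope * onge_pct + intercept


print(f"ASI % = {slope:.3f} * Onge % + {intercept:.2f} (R^2 = {r_squared:.3f})")
print(estimate_asi(30.0))  # 30.0 is a placeholder input, not a real population
```

Returning `None` outside 10% to 50% keeps the line from being extrapolated to populations like the Onge themselves, where a linear fit through the Indian groups has no reason to hold.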

Fst divergences between estimated populations for K=11 in the form of an MDS plot.

I guess you might want to see the Fst dendrogram too. Just remember it's not a phylogeny.

And the numbers:

       C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
C2     0.165
C3     0.121  0.122
C4     0.090  0.161  0.152
C5     0.071  0.152  0.137  0.048
C6     0.134  0.144  0.067  0.163  0.143
C7     0.184  0.224  0.216  0.179  0.186  0.232
C8     0.210  0.209  0.205  0.235  0.223  0.228  0.286
C9     0.175  0.207  0.139  0.208  0.178  0.141  0.281  0.290
C10    0.261  0.304  0.294  0.257  0.261  0.311  0.123  0.367  0.364
C11    0.150  0.195  0.187  0.143  0.148  0.203  0.059  0.260  0.252  0.133
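If you want to play with these numbers yourself, the sketch below rebuilds the full symmetric matrix from the lower triangle and runs a simple average-linkage (UPGMA-style) clustering on it. This is an assumption about one reasonable way to get a dendrogram from Fst values, not necessarily the method behind the dendrogram mentioned above, and the helper names `full_matrix` and `average_linkage` are invented for the example. As with the dendrogram, the merge order is a summary of distances, not a phylogeny.

```python
labels = [f"C{i}" for i in range(1, 12)]

# Lower triangle of the Fst table above; row k holds C(k+2) against C1..C(k+1).
lower = [
    [0.165],
    [0.121, 0.122],
    [0.090, 0.161, 0.152],
    [0.071, 0.152, 0.137, 0.048],
    [0.134, 0.144, 0.067, 0.163, 0.143],
    [0.184, 0.224, 0.216, 0.179, 0.186, 0.232],
    [0.210, 0.209, 0.205, 0.235, 0.223, 0.228, 0.286],
    [0.175, 0.207, 0.139, 0.208, 0.178, 0.141, 0.281, 0.290],
    [0.261, 0.304, 0.294, 0.257, 0.261, 0.311, 0.123, 0.367, 0.364],
    [0.150, 0.195, 0.187, 0.143, 0.148, 0.203, 0.059, 0.260, 0.252, 0.133],
]


def full_matrix(lower_rows):
    """Expand a lower triangle (no diagonal) into a symmetric square matrix."""
    n = len(lower_rows) + 1
    matrix = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower_rows, start=1):
        for j, value in enumerate(row):
            matrix[i][j] = matrix[j][i] = value
    return matrix


def average_linkage(names, dist):
    """Repeatedly merge the two clusters with the smallest mean pairwise distance."""
    clusters = [(i,) for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        avg, a, b = best
        left, right = clusters[b], clusters[a]
        merges.append((tuple(names[i] for i in left),
                       tuple(names[i] for i in right), avg))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(left + right)
    return merges


fst = full_matrix(lower)
merges = average_linkage(labels, fst)
for left, right, height in merges:
    print(f"{'+'.join(left)}  |  {'+'.join(right)}  at {height:.3f}")
```

The same `fst` matrix could be handed to an MDS routine from a numerical library instead; the plain-Python clustering is used here only to keep the example free of dependencies.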


  1. The Kalash appear to have 0 Onge.

    • I think this is a very interesting result. The Kalash are a small isolated non-Islamic group in Pakistan. I suspect that ASI percentage in Pakistan as a whole was much lower before the coming of Islam. After the coming of Islam, women from the rest of the subcontinent must have contributed significantly to the genetic heritage of present-day Pakistanis. Many Pakistanis even have some African ancestry brought about by Islamic connections.

      • These are only a minority of Sindhis. Punjabis and Pashtuns almost never have African admixture. Some ASI admix should have always been in Pakistan, albeit far less so than in India. Pakistanis are generally more West Eurasian than generic Indians.

        • The African ancestry in the Sindhi sample is due to certain individuals in the Sindhi reference sample who are Siddi.

        • The African ancestry of certain of the Pakistani Reference 3 groups is as follows: Balochi 3%, Brahui 3%, Makrani 7% and Sindhi 3%. This ancestry is from Africans transported by sea to Pakistani ports in the South. It would have been much easier to transport people from India to Pakistan by camel caravans across the Thar desert, leading to an increase in the ASI percentage in Pakistan. This largely female-mediated gene flow from India to Pakistan during the several centuries of Muslim rule over the subcontinent could very well account for much of the 20 to 25% ASI found in Pakistan today.

      • That may be the case. It is possible that ASI is so comparatively recent that it has not reached the Kalash due to their isolation. Plus the Kalash are off the Indian Cline too, as are many of the populations with high Onge (Onge, Great Andamanese, Kharia, Santhal, Sahariya, Hallaki are not on Reich's Indian Cline).
        In eastern India, IE speakers are close to 100% R (R1a1+R2) on their Y-dna while their close non-IE speaking neighbors are 0% R - the Bhumij, Birhor, Ho, Kharia, Munda, and Santhal (Austro Asiatic) collectively show 0/56 R1a1 and 0/56 R2 (Sahoo data). The Austrics are predominantly O2a which in turn is 0% among IE speakers. Almost no overlap on Y. There is some overlap on mtDNA but not much. Which means one of these populations is very recent.

    • Regarding the Indus people, there are some stats on how they compare to modern people. Cranially they were diverse but fit within South Asian variation, and they cluster with the Vedda of Sri Lanka more than with anyone else. For this reason I think they were mixed ANI/ASI rather than one or the other.

  2. Would it be safe to say that the Onge component represents a layer of early Negritos in South Asia?

    • Without getting into size differences, I think it just means aboriginal component in South Asians. For example the Mesolithic People of the Ganga Plains measured 5 inches taller than modern South Asians and what happened to the Onge was probably the result of a size reduction.

      • Thanks. The mean calculated height of the Nahar Rai females is 6'2.8" !

        "mean stature (188.9 ± 1.6 cm; calculated from data presented in Kennedy et al. 1986: 71, Table 6; using stature estimates from bone lengths and formulae with the lowest standard error). Mean stature for both Damdama and Mahadaha is significantly taller than European Mesolithic skeletal samples. Recently published stature data for Mesolithic Europe reveals a clinal pattern from Western European sites with short stature, to Eastern groups with intermediate stature (Formicola and Giannecchini 1999). Figure 3 shows that individuals from both Damdama and Mahadaha are significantly taller than Mesolithic Europeans."

  3. Tight enough correlation to assert this component is comparable with the ASI.

    Whether or not Reich's ASI was the best representative for a part of India's prehistoric demographic is another question.

    Nevertheless, it's exciting to see this. I wonder whether we'll soon see a component that's specific to the ANI?

    • For South Asian groups in this analysis here (excluding the Brahui, Balochi and Makrani), if you add up South Asian (C1), Southwest Asian (C4) and European (C5) that approximates ANI, though not as well as Onge (C2) does for ASI.

  4. Interesting results!

  5. This is definitely interesting. I've done up an isopleth just for the C2 at K=11. Take a look.

    • Looks more like eastern/south eastern. Very puzzling that many of the populations in which C2 peaks are Austric - Santhal, Ho, Kharia, Bonda, Savara, Juang, Asur. Cf. Reich: "Santhal ... have a relatively high proportion of ancestry from ASI compared with most of the Indian Cline groups."

  6. Reference 3 Admixture K=12 | Harappa Ancestry Project - pingback on April 22, 2011 at 5:48 am
  7. Admixture Onge Component Map | Harappa Ancestry Project - pingback on April 22, 2011 at 1:57 pm
  8. Indian Cline | Harappa Ancestry Project - pingback on May 26, 2011 at 7:23 am
  9. Pathan parahistory | Gene Expression | Discover Magazine - pingback on July 3, 2011 at 3:59 pm
  10. Well, the Brahui are not Aryan; their tongue is Dravidian, but they are most likely from the Elamites, who were an Afro-Asiatic people. The Brahui have 28% J Y-DNA!

  11. Elamo-Harappan origins for Haplogroup J2 in India?

    The presence of Haplogroup J2 in India, including the subclades M410 and M241 has been an often overlooked clue to the origins of M172. Sengupta et al, in 2005 worked to explain the presence of M172 in India. Their paper provides an immediate acknowledgement of the proposed spread of proto-Elamo-Dravidian speaking peoples into India originating from the Indus Valley and southwest Persia. The idea that M172 may have been carried into India with proto-Elamo-Dravidian groups is supported by the frequencies of Haplogroup J in one of the only remaining Dravidian Speaking ethnic groups in the Iranian Plateau, the Brahui. 28% of the Brahui, an ethnic Dravidian speaking group from Western Pakistan were found to carry the mutation defining Haplogroup J. Overall Haplogroup J2 in India represented 9.1% of this very populous nation. In Pakistan, M172 accounted for 11.9% of the Y-Chromosomes typed. Sengupta's paper broke down the frequencies of Haplogroup J2 into various caste and language groups. J2 was found to be significantly higher among Dravidian castes at 19% than among Indo-European castes at 11%. J2a-M410 in particular may be a strong candidate for a proposed migration of proto-Dravidian peoples from the Iranian Plateau or the Indus Valley since J2a M410 is a very high component of the haplogroup J2 chromosomes found in Pakistan. Over 71% of the M172 found in Pakistan was M410+.

    Another interesting characteristic in the distribution of M172 and more specifically, M410, in India was its higher frequencies in Upper Caste Dravidians. M410+ chromosomes were found in 13% of Upper Caste Dravidians. Sengupta goes on to suggest an Indian origin of Dravidian speakers but from a Y chromosome perspective, the paper seems to acknowledge M172 arriving in India from Middle Eastern and Indus Valley Civilizations.

  12. Dravidians are L Y-DNA.
    The subclades of Haplogroup L with their defining mutation(s), according to the 2011 ISOGG tree:

    L (M11, M20, M22, M61/Page43, M185)
    L* Found only in Europe from Ireland to Eastern Europe[26]
    L1 (M295) Found from Western Europe to South Asia [27]
    L1a (M27, M76, P329) Found frequently in Indians, Sri Lankans, and Balochs, with a moderate distribution in other populations of Pakistan, southern Iran, and Arabia but also in European populations
    L1b (M317) Found at low frequency in Central Asia, Southwest Asia, and Central Europe
    L1b1 (M349) Principally found in Europe
    L1b2 (M274)
    L1c (M357) Found frequently among Burushos, Kalashas, Chechens and Pashtuns, with a moderate distribution among other populations in Pakistan, Georgia, northern Iran, India, the UAE, and Saudi Arabia
    L1c1 (PK3) Found frequently among Kalash

  13. Info below is from Wikipedia!
    L was found in 51% of Syrians from Al-Raqqah, a northern Syrian city in which its previous inhabitants have been wiped out by the Mongols and repopulated in recent times by local Bedouin populations and Chechen war refugees.[4] In a small sample of Israeli Druze haplogroup L was found in 7 out of 20 (35%). However, studies done on bigger samples showed that L-M20 averages 5% in Israeli Druze,[5] 8% in Lebanese Druze,[6] and it was not found in a sample of 59 Syrian Druze. Haplogroup L has been found in 2.0% (1/50)[7] to 5.25% (48/914)[8] of Lebanese. (Wikipedia)
    L Y-DNA by region:
    Syria: 51.0% (33/65) of Syrians in Al-Raqqah, 31.0% of Eastern Syrians. Source: Mirvat El-Sibai et al. 2009[4]
    Iran: 3.4% L1-M76 (4/117) and 2.6% L2-M317 (3/117), for a total of 6.0% (7/117) haplogroup L in southern Iran; 3.0% (1/33) L3-M357 in northern Iran. Source: Regueiro et al. 2006
    Turkey: 57% in Afshar village, 12% (10/83) in Black Sea Region, 4.2% (1/523 L-M349 and 21/523 L-M11(xM27, M349)). Source: Cinnioğlu et al. 2004, Gokcumen (2008)

  14. Mari (modern Tell Hariri, Syria) was an ancient (Sumerian and Amorite) city, located 11 kilometers north-west of the modern town of Abu Kamal on the western bank of the Euphrates river, some 120 km southeast of Deir ez-Zor, Syria. It is thought to have been inhabited since the 5th millennium BC, although it flourished with a series of superimposed palaces that span a thousand years, from 2900 BC until 1759 BC, when it was sacked by Hammurabi.[1]

    Abu Kamal (Arabic: أبو كمال‎, Turkish: Ebu Kemal or Kışla) is a city in eastern Syria on the Euphrates River near the border with Iraq. The Euphrates divides Abu Kamal into two areas: Shamiyya (belonging to the Levant) and Jazira (belonging to Mesopotamia) Al-Jazira, a plains region consisting of northeastern Syria and northwestern Iraq, quite distinct from the Syrian Desert and lower-lying central Mesopotamia. Abu Kamal is an economically prosperous farming region with cattle-breeding, cereals, and cotton crops. It is also home to the historical site Dura-Europos and the ancient kingdom of Mari.

    I have seen the Prophet 8 times in dream form.
    I have also dreamt 2 times that the Sumerians are the Dravidians and 1 time that they are from Canaan, and I dreamt that the Elamites are Afro-Asiatic, not Caucasian! Neither Aryan!

  15. I have also dreamt an explanation of the Dravidians. I was told they were Negroid (Hamitic) in type with straight hair and were also Egyptians. Before, I never wanted to accept this; I was corrected.
    Also I dreamt the Dravidians came from Syria in the Fertile Crescent, close to the sea. After this I dreamt they came from the northern part of Africa, close to the sea, and I saw an arrow showing their travel route. The arrow goes up into Syria, and I saw another arrow going east through southern Iraq and southern Iran until the arrow stops at northwest India.

  16. The Mediterranean Peoples (Dravidians)
    (Extracts from ‘The Original Indians – An Enquiry’ by Dr. A. Desai)
    How the Mediterranean people came to be called Dravidians makes an interesting story. The Pre-Hellenistic Lycians of Asia Minor, who were probably of the Mediterranean stock, called themselves Trimmili. Another tribe of this branch in the island of Crete was known by the name Dr(a)mil or Dr(a)miz. In ancient Sanskrit writings we find the terms Dramili and Dravidi, and then Dravida, which referred to the southern portion of India.
    South India was known to the ancient Greek and Roman geographers as Damirica or Limurike. Periplus Maris Erythraei (Periplus of the Erythraean Sea) in the second or third century AD described the maritime route followed by Greek ships sailing to the South Indian ports: “Then follow Naoura and Tundis, the first marts of Limurike and after these Mouziris and Nelkunda, the seats of government.”
    Dramila, Dravida and Damirica indicated the territory. Then it was applied to the people living in the territory and the language they spoke, in the local parlance Tamil and Tamil Nadu or Tamilakam.
    I came across this article a long time after what I dreamt of the Dravidians.

  17. Admixture (Ref3 K=11) HRP0181-HRP0190 | Harappa Ancestry Project - pingback on November 7, 2011 at 9:21 am
  18. Metspalu Ref3 Admixture Results | Harappa Ancestry Project - pingback on December 13, 2011 at 10:07 am
  19. Admixture (Ref3 K=11) HRP0201-HRP0210 | Harappa Ancestry Project - pingback on January 17, 2012 at 6:35 pm
  20. Hodoglugil Dataset | Harappa Ancestry Project - pingback on February 24, 2012 at 5:44 am
  21. Simonson Tibet Dataset | Harappa Ancestry Project - pingback on March 7, 2012 at 6:22 am
  22. Henn Ref3 K=11 Admixture | Harappa Ancestry Project - pingback on March 16, 2012 at 6:51 am
  23. Ref3 Admixture Dendrograms | Harappa Ancestry Project - pingback on March 19, 2012 at 6:04 am