Burusho Kalash HarappaWorld Admixture

Someone asked for the individual HarappaWorld Admixture results for the Burusho and Kalash from HGDP.

In the chart below as well as in the spreadsheet, the IDs starting with "b" belong to the Burusho and those starting with "k" belong to the Kalash individuals.

You can check the spreadsheet too.


  1. So Zack, in your opinion, do the Kalash actually show European admixture?

    Based on my findings, I'd say they received gene flow from near the Volga, possibly around present day Chuvashia.


    This more or less fits the latest data in regards to the expansion of R1a-Z93, which probably moved from the southern Urals into Central Asia, and then Pakistan and India.

    • I think it is not due to R1a-Z93 but to their elevated L and H (and of course mtDNA N). That is an old European connection. The Kalash have some of the lowest proportions of R1a1 in the region (I believe only the Hazara have lower).
      The NE Euro component may indeed correlate to Z93.

      • Why worry? Just wait for the aDNA from Farmana...

        • I would not bet on any ancient DNA from Farmana or from anywhere else in the Indo-Gangetic plains. The conditions are not favorable to obtain sufficient endogenous DNA material.

        • We've been waiting for forever and the results are still not in. I guess we'll have to continue waiting.

  2. Thank you very much Zack! I really appreciate this! I have been looking forward to these individual results for a very long time. Very interesting, rather homogeneous populations. I expected this for the Kalash, but even the Burusho are pretty uniform. But I do have a quick question. Did HGDP00376 belong to the Burusho cluster in your last big ChromoPainter/FineStructure analysis?

    @ Davidski - I think your idea is very plausible. People on the northwestern fringe of South Asia and the eastern fringe of the Iranian plateau display a clear affinity to eastern/northeastern Europeans lacking amongst West Asians and peninsular South Asians. Nevertheless, the Kalash aren't really more eastern/northeastern European shifted than Pashtuns, at least based on these results. I mean, compare them to the Jats of greater Punjab, especially the Haryanvi Jats. One Jat participant is 20% NE European. In their case, admixture with people of ultimately European origin is far more likely. The Kalash seem to be very similar to the HGDP Pashtuns, whenever the Kalash-modal component is avoided. I think the physical appearance of the Kalash gives people a cognitive bias towards finding a European connection.

    • Right, but I'm just interested in the Kalash, because they're an extreme genetic and cultural Indo-Aryan isolate. In fact, they're something of a relic in that regard. So if they show some European admixture, then that leaves little doubt that European admixture entered South Central Asia with the early Indo-Iranians, rather than later, like with the Scythians.

      • Very interesting. I think you're right in that case.
        I wonder how Nuristanis and Pashayi west of Durand Line stack up genetically. This is why the new Afghanistan genetics paper was so disappointing. They neglected to sample dwindling populations. On top of that, their autosomal sample size was a meager 4-5 individuals, and their Pashtun samples came from isolated northern communities far from the Pashtun heartland of Nangarhar, Paktia, and Kandahar. I guess they wanted haplogroup information more than anything.

      • Can you please tell me the age of the European component in these populations? It is important to know whether it is 3000 YBP or 6000 YBP... Unfortunately, for 3000 YBP there is not even an atom of evidence.

  3. It seems to me that Kalash are just an isolated group with affinity to the Pashtuns.
    Anyway, is there a difference between the NE European in Jatts vs. the NE European in, say, Brahmins? I am getting more West European affinity, whereas some Jatts I have talked to have more of the Eastern European. We need more data from Brahmins to know with certainty. This could be a difference between the old line (Indo-Aryan) and the new line (Scythian).

    • I think this would be difficult to realize with the small number of SNPs, just because there is not much difference between West and East European in an autosomal sense. All of Europe, the Levant, Anatolia and the Caucasus seem to be bunched up in every PCA plot I have seen so far.

    • Have you tried Dodecad V2 K15? The Dodecad Jatts and the individual results I've seen show a Northeastern/Eastern European affinity with the Eastern Euro and Baltic components but they do show somewhat significant North Sea as well. However, the Euro scores of a Nepali Brahmin and Pakistani Punjabi Jatt seemed much more Northwestern/Southwestern European shifted such as British or French with the primary components being North Sea and Atlantic components with some Eastern Euro. Although, it seems the Atlantic component shows some affinity to Mediterranean populations as well.

      • My sample with Eurogenes was showing a Volga affinity initially (6%), but then switched to a North Sea and Caucasus affinity during the next update by David. The point being that it is essentially a statistical analysis maximizing goodness of fit, and hence the basis functions only approximate the sample. For example, to represent someone with Pashtun ethnicity, if you pick the two basis functions as South Indian and NW European, you might end up with the same percentage (50-50?) as if you had picked Mediterranean and East Asian as the two basis functions instead. Hence I strongly believe admixture analysis must only be used to assign an individual to a known reference population based on known breakdown and affinity, rather than determining the true ancestry of a population. Putting it simply, I would only use admixture if someone gave me a random sample whose ethnicity I did not know, and I wanted to guess which population they might be sampled from, based on my existing database.
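The point in the comment above, that very different pairs of "basis" populations can return the same percentages, can be illustrated with a toy least-squares fit. This is only a sketch under loud assumptions: the allele frequencies are made-up numbers, the function name `fit_two_way` is hypothetical, and real ADMIXTURE-style calculators use a likelihood model over many thousands of SNPs rather than this closed-form two-way fit.

```python
def fit_two_way(target, ref_a, ref_b):
    """Return the share of ref_a (0..1) that best fits target as a mix of ref_a and ref_b.

    Minimises the squared error between target and alpha*ref_a + (1-alpha)*ref_b.
    """
    num = sum((t - b) * (a - b) for t, a, b in zip(target, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference populations are identical")
    # Clip to a valid mixing proportion.
    return min(1.0, max(0.0, num / den))


# Invented allele frequencies at four SNPs (illustration only, not real data).
sample = [0.30, 0.50, 0.70, 0.40]

# Two unrelated pairs of hypothetical reference populations.
pair_1 = ([0.10, 0.60, 0.90, 0.20], [0.50, 0.40, 0.50, 0.60])
pair_2 = ([0.40, 0.30, 0.80, 0.50], [0.20, 0.70, 0.60, 0.30])

share_1 = fit_two_way(sample, *pair_1)
share_2 = fit_two_way(sample, *pair_2)
print(round(share_1, 3), round(share_2, 3))
```

Both pairs fit the invented sample as an even split, even though the reference populations differ, which is why the percentages by themselves say little about which populations actually contributed.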

    • I had compiled some Eurogenes results of a few family and friends on this spreadsheet:


      There's one Brahmin. My father is partly related to my mother (who had a North Sea-heavy breakdown) but his father emigrated from Hoshiarpur in India and brought some strange genes with him as you can tell by our breakdowns (my father's results are phased so are to be taken with a grain of salt, but my mother and I had 23andMe tests).

      Though the exact grouping seems to vary, there is a consistent West European grouping that, instead of showing up in NE-Euro (Lithuania in HAP), shows up as Mediterranean if you look at the Harappa sheet. So those people with substantial Mediterranean might actually have more affinity for Western European admixture that just isn't being caught by the NE-Euro component.

      In my McDonald plot, he put my two dots in North India and France, near the northwestern coast. My mom's K36 breakdown had actual French show up, along with North Sea. My dad's phased results had North Sea and French Basque. My results came out as almost all North Sea (no other West European component), even more than both of their original North Sea amounts.

      In K36 and K15, the exact West European varies across North Sea (Germanic) and Basque (pre-Germanic), and other West European components. I think the pattern basically describes age of ancestral components. The people whose European admixture is 2000-3000 years old are probably getting the West European as opposed to the people who are getting mostly East European, whose admixture is probably dating to a migration around 2000 years or less. So it matters less where in Western Europe the admixture is coming from, just that it is Western.

      For example, in the K15 sheet, my dad had almost all Atlantic (Basque) and mom and the Brahmin had all North Sea (and I had half and half). But in K36, that resolves to more North Sea with a hint of older admixture. In K36 only the Brahmin and the Tajik had the Atlantic (Cornwall) in significant numbers which makes more sense, so I'm leaning towards the K36 being the more accurate breakdown with so many Western European components.

      Another thing to note is that some of the Eastern European components are actually Turkic/Tatar in nature ("Eastern_Euro" and "Volga-Ural") and some are going to get more of that than other Eastern European components. These can all be interpreted as geographic signals pointing to possible origins of the respective European components (though in most cases it's probably going to be a little of everything after all this time).

      • There's also something interesting going on with West Caucasian admixture, which seems to correspond with the West European a little.

        The thing about K36 is that it uses the entire HGDP Pakistan group (Burusho, Brahui, Baloch, Pathan, Sindhi) for its South Central Asian component so it catches a lot of the admixture which leaves its European numbers totaling something reasonable and expected (not too far off the NE-Euro numbers in HAP). The other Eurogenes calculators wound up having inflated European numbers without the more effective Caucasian/South Asian components.

      • "The people whose European admixture is 2000-3000 years old are probably getting the West European as opposed to the people who are getting mostly East European, whose admixture is probably dating to a migration around 2000 years or less."

        How are you getting these timelines?

        • History. The people from before 2000 years ago merged into the substratum of northwest India, but the ones who came around then or afterwards came recently enough to be somewhat distinct (like the Hephthalites and all the post-Islamic people).

          The last group I would consider contributed to the Northwest Indian groups (like Jatts) were the Indo-Scythians or Sakas. The earliest were probably the first Indo-Europeans who came south from the Andronovo culture around 3000 BC.

          We don't know specifically where the people who came during this period (3500-3000 BC to 1 AD) wound up but chances are a lot of different groups of them contributed to the old ANI.

          I would guess the Indo-Scythians, Indo-Greeks and other people who came in the first millennium BC or more recently have the admixture pointing to Eastern European or Turkic/Tatar signals, and the really old ones (Brahmin) correspond to the older European signals (i.e., Western and Northern Europe).

          The really old traces from the Neolithic era or earlier are probably the signals related to the Western Mediterranean (Basque?), which might be related to the first ANI contributions. Though West Asian, which might be the original ANI, still shows up in a pretty high % in Western Europe.

          • The bulk of the Jats are R1a1, and among these R1a1 most are L657+. The L657 lineage is rare outside South Asia and Arabia, with Arabia likely being the recipient.

            So the influx if it happened would be at the Z93 level. From that level, there are about 100 mutations (from Full Genomes testing) downstream to the present. At approximately 1 mutation per 3 generations that would be about 300 generations, which at 25 years per generation would be 7500 years and at 30 years per generation, 9000 years.

            There is undoubtedly a European connection though.
            One L1029 sample has been tested by Full Genomes, and we are getting about 6000 years before present to Z645.

            It appears that R1a1-M417 was a small clan at that time. The branching dates for Z283, CTS4385, and Z93 are so close that it may have been just a family living perhaps in Europe or Siberia 7000 ybp.
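The back-of-the-envelope dating in the comment above (about 100 mutations downstream of Z93, roughly one mutation per 3 generations, 25 or 30 years per generation) can be written out as a small calculation. This is just the commenter's arithmetic restated; the function name is hypothetical, and the rate and generation lengths are the commenter's rough figures, not calibrated values.

```python
def age_in_years(mutations, generations_per_mutation, years_per_generation):
    """Convert a count of downstream mutations into an approximate age in years."""
    generations = mutations * generations_per_mutation
    return generations * years_per_generation


# Figures quoted in the comment: 100 mutations, 1 mutation per 3 generations.
for years_per_generation in (25, 30):
    print(years_per_generation, age_in_years(100, 3, years_per_generation))
```

With those inputs the estimate is 7500 years at 25 years per generation and 9000 years at 30, matching the range given in the comment.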

          • @Parasar

            Yes, the bulk of Jatts are R1a1a, but a substantial number are also L1c-M357, and I've seen some J's as well.

          • @ Paul, I am the J1 you see on the Harappa project, J1-Z1853+, and my closest matches at 37 markers are at 9GD, all Europeans. Z1853* is a mutation about 8000 years old, and we keep alive the name and Y-DNA of the origin of the P58* mutation. Probably the Gutian connection to Jatts through Scythians.

          • @Paul,
            Yes no doubt there is some L1-M357 present, but the diversity is so low that it looks to be just a few hundred years old in the Jats who have the 15,12,16,13,22, 10, 14,12,12, 10 (DYS19 DYS388 DYS389AB DYS389CD DYS390 DYS391 DYS392 DYS393 DYS439).
            It is also present among Pashtuns, Rajputs and a few others, but all of them have markedly low diversity, showing drift and recent expansion.

            @Paul Gill,
            I have not investigated your J1 line. In general, I had always thought of J1 as being of Arab origin in India. The European matches that you have - are they all confirmed Z1853? If that is the case then they would be genuinely close matches, otherwise it would be an artifact of the higher presence of Europeans on public databases. Even at a much higher resolution of 111 markers, some of my closer GDs are Europeans even though SNPs show thousands of years of separation.

          • @ Parasar, J1 is not of Arabian origin. All expert opinions say that the origin of J1 is the Caucasus, and the P58* mutation happened in the Lake Van area of Eastern Anatolia. To my knowledge no one calls Eastern Anatolia or the Caucasus parts of Arabia. At Geno 2.0 there are about 26 of us who are Z1853*, and only two others have written their story. One is a German and the other is a Norwegian. I don't know anything about the rest of them. 9GD at 37 markers is basically considered not even remotely related. My 9GD matches are from Poland, Belarus, Slovakia, Romania and Ukraine in origin. We carry the name and DNA of Eastern Anatolia, so any contamination is totally ruled out. The age of Z1853* tells us that the split from other P58* is prehistoric.

          • @ Parasar, Jatts are not children of one father. They are related tribes, not always on the paternal side, and have always been a confederation. They are mainly R1a1a, but other haplogroups have been part of them to a minor degree for thousands of years.

          • @Paul Gill,

            Yes I understand that Jats are a mix.
            As far as J1 I was limiting myself to the spread of J1 in India.

            Are you L862-?

          • Parasar: I'm J2b2* as tested by 23andMe and recently had the 12 marker Y-STR test from FTDNA done, results of which are DYS393=12, DYS390=23, DYS19=15, DYS391=11, DYS385=12-18, DYS426=11, DYS388=15, DYS439=11, DYS389I=12, DYS392=11, DYS389II=29, DYS437=14, DYS438=9

            There are J2b2* hotspots in Gujarat/Sindh, North India, East/Southeast India, and a few in NW India.

            Do you know anything about the J2b2* presence in India?

            Closest genetic match on FTDNA was a J2b1 from Turkey with a genetic distance of 1. Then on ysearch.org (my ID there is DHQ7Q), my closest match was from England (3AW2S) with a genetic distance of 3 (but haplogroup unknown).

            I'm kit B6225 on FTDNA:


            FTDNA predicts J-M172 for me, still deciding whether to buy SNP tests from them or via Geno 2.0, but I assume 23andMe's positive test for the M241 SNP confirms I am indeed J2b2* and not J2b1 or J2b.
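The "genetic distance" figures mentioned in this thread can be sketched as a count of repeat-number differences between two Y-STR haplotypes. The sketch below is a simplification under stated assumptions: it sums absolute differences marker by marker (a plain stepwise count), it ignores the special handling that testing companies apply to some markers (for example multi-copy markers, or DYS389II being reported on top of DYS389I), and the comparison haplotype is hypothetical, invented only to show a distance of one step.

```python
# Marker values quoted in the comment above; DYS385 has two copies.
QUOTED = {
    "DYS393": (12,), "DYS390": (23,), "DYS19": (15,), "DYS391": (11,),
    "DYS385": (12, 18), "DYS426": (11,), "DYS388": (15,), "DYS439": (11,),
    "DYS389I": (12,), "DYS392": (11,), "DYS389II": (29,),
    "DYS437": (14,), "DYS438": (9,),
}


def genetic_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over the markers both haplotypes share."""
    shared = hap_a.keys() & hap_b.keys()
    return sum(
        abs(x - y)
        for marker in shared
        for x, y in zip(hap_a[marker], hap_b[marker])
    )


# Hypothetical comparison haplotype: identical except for one repeat at DYS390.
HYPOTHETICAL = dict(QUOTED, DYS390=(24,))

print(genetic_distance(QUOTED, HYPOTHETICAL))  # one-step difference at DYS390
```

A small distance on a handful of markers is weak evidence by itself, which is the caution raised elsewhere in the thread about matches that are not confirmed by SNPs.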

          • This was the TiP result with the J2b1 match from Turkey:

            4 generations 6.81%
            8 generations 19.28%
            12 generations 33.15%
            16 generations 46.30%
            20 generations 57.79%
            24 generations 67.37%
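Assuming the TiP percentages above are cumulative (the chance that the common ancestor lived within that many generations), they can be turned into per-interval chances by taking differences. This is a minimal sketch on the quoted numbers only; the cumulative reading is an assumption, and the function name is hypothetical.

```python
# (generations, cumulative probability in percent), as quoted above.
TIP_CUMULATIVE = [
    (4, 6.81), (8, 19.28), (12, 33.15),
    (16, 46.30), (20, 57.79), (24, 67.37),
]


def interval_probabilities(cumulative):
    """Return (first_gen, last_gen, percent) for each interval between table rows."""
    intervals = []
    prev_gen, prev_pct = 0, 0.0
    for gen, pct in cumulative:
        intervals.append((prev_gen + 1, gen, round(pct - prev_pct, 2)))
        prev_gen, prev_pct = gen, pct
    return intervals


for first, last, pct in interval_probabilities(TIP_CUMULATIVE):
    print(f"generations {first}-{last}: {pct}%")

# Chance left over for a common ancestor more than 24 generations back.
print("beyond 24 generations:", round(100 - TIP_CUMULATIVE[-1][1], 2), "%")
```

On this reading, about a third of the probability still lies beyond 24 generations, so the match does not point to a recent shared paternal ancestor.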

          • @ Parasar, J1 in India that comes from Arabs is found among Muslims, but even if a Jatt is a Muslim and J1 he is unlikely to be of Arabian origin. J1 among Jatts is of Central Asian origin. There is one kit from Azerbaijan and one from Turkey which may end up close to me, but we don't know yet. The Turkish one is actually from Central Asia, it is called Yetis Family, so he may originally be a Gutian who went from Eastern Anatolia to Central Asia and then back to his place of origin, Eastern Anatolia. Parsis have J1 but it is of Iranian origin.

            Yes I am Z1853*+ and L862-.

          • There are at least two possible routes for J1 into South Asia. There's the ancestral link to the Caucasus, some kind of migration or invasion from thousands of years ago where some J1 people came with more common R haplogroup people. The other is post-Islamic Arabs who brought more J2 (no subclades) and J1 into Afghanistan and Northwestern India. I think any J1 in non-Muslim South Asians is likely ancestral and in Muslim groups like Pashtun or Arain Punjabi could possibly be of Arab origin.

          • @ AD, what do you mean by ancestral? There is no back door entry into the Jatt ethnic group. We carry a name with us that is the name of the origin of the P58* mutation, so there is nil chance of any contamination by Arabs or others. Ours, as far as I can figure out, went from Eastern Anatolia to either Ukraine or Northwest Iran, then to Central Asia, back to Iran, Afghanistan and then into India. Mine might have come with the Dehae, the Dahia clan of Jatts who ruled Persia and beyond at different times as Parthians and under other names. From my Geno 2.0 results the split with Arabians is 8000 years old.

          • Paul Gill, cool coincidence, you're on my and my mother's GEDmatch one-to-many results with a 5.6 cM shared segment (and an additional 6-9 cM shared on the X chromosome with my mom). My dad's the Gill though; my mom is Pansota and has no Gills in her recent ancestry. You're also on the generated/phased paternal kit representing my dad.

            My mom shows up as a cousin for my dad, 3.5 generations back, which was interesting because they're 2nd cousins (my dad's mother was my mom's father's first cousin).

  4. I still don't get how the Kalash, Burusho and even Pashtuns for that matter show so much South Indian in them, considering their looks are totally Caucasian and in many cases they look Eastern European. Is it possible that in different people different components may show more in their looks? For example, in all Indian populations the South Indian is more visible in their phenotypes, while in these South/Central Asian populations the Caucasian or even Northern European is more visible. How do looks correlate with genetics?

  5. Is it just me that's noticed the significant (6-10%) NE Asian component in Balochs that is totally missing in the Kalash? I don't understand this based on the geographical location of the two groups.

    Another categorising criterion is obvious if you sort by the Caucasian component: the Kalash are uniformly higher than the Baloch.

  6. My mistake, these are Burusho not Baloch . . . that explains it. I'll read more closely next time!

  7. Well there is some news on Phenotype....
    The Light Skin Allele of SLC24A5 in South Asians and Europeans Shares Identity by Descent

    Chandana Basu Mallick et al.
    ''Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex, but most importantly, reflects strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22–28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India. ''
    What do you guys think?
    I think this is not good news at all for the Invasionists!....

  8. I'm a J1 (Pashtun origin) as well and am waiting for FTDNA 67-marker test results.

    Parasar said that it indicates Arabian origin in another thread so it will be interesting to see if he is correct or if I am more like the Jats tested here.

    My closest match on 23andMe is half Tajik, half Russian, and my closest South Asian match is a Jatt or Rajput (not exactly sure).

  9. G2a2a PF3146, double value at DYS19. Jat Punjabi, Pakistan. FTDNA kit 283231. I am told this is unusual, but there is another Jat whose markers just about match mine.

    • @283231,
      Jatts are not children of one father but are from related tribes, not always related on the paternal side, so there is more than one lineage among Jatts. The majority of Jatts are certainly R1a1, but other lineages, though small in numbers, have been part of the Jatt ethnic group for thousands of years.