Admixture K=12, HRP0041-HRP0050

Here are their ethnic backgrounds and the results spreadsheet. Also relevant are the reference I admixture results.

If you can't see the interactive bar chart above, here's a static image.

PS. This was run using Admixture version 1.04.
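
For readers wondering how a component ends up with a label like Pak/Cauc, here is a minimal sketch of the bookkeeping: average each individual's ancestry fractions within their population, then see which populations score highest for each component. The function names, the population names (PopA etc.) and the numbers are all made up for illustration; this is not the project's actual pipeline or data. It only assumes the usual Admixture Q output layout of one row per individual and one column per component.

```python
from collections import defaultdict


def population_means(q_rows, populations):
    """Average each individual's K ancestry fractions within their population."""
    sums = {}
    counts = defaultdict(int)
    for row, pop in zip(q_rows, populations):
        if pop not in sums:
            sums[pop] = [0.0] * len(row)
        for k, value in enumerate(row):
            sums[pop][k] += value
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in totals] for pop, totals in sums.items()}


def modal_populations(means, top=2):
    """For each component, list the populations with the highest mean fraction."""
    k = len(next(iter(means.values())))
    return [
        sorted(means, key=lambda pop: means[pop][c], reverse=True)[:top]
        for c in range(k)
    ]


# Toy, made-up Q rows for K=3 (each row sums to 1); NOT project data.
q_rows = [
    [0.60, 0.30, 0.10],
    [0.50, 0.40, 0.10],
    [0.10, 0.80, 0.10],
    [0.20, 0.20, 0.60],
]
# Hypothetical population label for each row, in the same order.
populations = ["PopA", "PopA", "PopB", "PopC"]

means = population_means(q_rows, populations)
peaks = modal_populations(means, top=1)
```

A label built this way is only a mnemonic for where a component happens to peak among the populations included in a given run, so it can change when K or the reference set changes.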


  1. What is Pak/Cauc that seems strongly associated with Iranians in this sample? Why is it labeled so?

  2. --
    The Kalash component has split into a pure Kalash component (C3) and a Pakistan/Caucasian component (C2) which is highest in Southwestern Pakistan (Brahui, Makrani, Balochi) at 60-57% followed by Georgians, Lezgin, Adygei, Azerbaijan Jews and Iranian Jews (56-50%).

    I respectfully request you not to label components after existing countries/nations. You could call it Persian or something like that or make up a new name.

    • Thanks for your suggestion. I started out with numbered ancestral components only but that is confusing too.

      Let me put the different options to the participants and readers and let's see what everyone finds most useful.

      • Please continue to use countries.

      • Indo-Iranian, representing the Indus Valley and greater Iran (Parthia).

      • I do not agree with ethno-linguistic labels being used. It would not make sense for Georgians to have an excess of "Indo-Iranian".

        I suggest the Pak-Cauc component be relabeled to "Caucasus-Iranian" based on its distribution peaks. The Caucasus mountains, Iranian plateau and Pakistani mountains form one continuous semi-circular belt. Practically all populations located around this geographical feature have an excess of "Pakistani-Caucasian".

        Note the use of Iranian in "Caucasus-Iranian" is purely geographical.

        • I think you are correct, as Pak/Cauc is very high in the Dravidian-speaking Brahui and in Caucasian-speaking groups. An Indo-Iranian label will confuse due to its linguistic connotation, though I meant it in a strict geographical sense.

          • +1 DMXX. Caucasus-Iranian seems more apt. Still, the peak of this component (in any group) isn't as clear as most of the others, reaching only about 60% at best.

        • Caucasus-Iranian might be purely geographical after you have explained that. But my first thought when "Iranian" is mentioned is not the plateau but the country or its people.

          All I am saying is that this naming of ancestral populations is fraught with issues. For one thing, we are naming them based on their probabilities in current population groups. For another, they are very rough mnemonics.

      • Why not just call it Brahui? Or Elamite, since there's that hypothesis about how (Proto-)Dravidian spread from Iran to the subcontinent.

    • The Pak/Caucasian component is rather interesting, but I don't think Iranian/Caucasian is the appropriate label for it. The fact that it peaks in the Brahui/Makrani instead of the Pathan (which I would take as the most Indo-Iranian of the Pakistani populations) probably means that it is something else.
      It might, however, break up at higher Ks.

      The Caucasian part is probably true, as it is high in all the Caucasus populations, and Pakistani is most likely the most appropriate label for the other part of it at the moment.

  3. I do not see any use of existing nation names other than Pak/Cauc, and it would be wise to avoid using Ind/Pak/etc., as it can easily be misunderstood. Since this is a split of Kalash, it seems to be the subcomponent that's common in the peoples listed in your comment (the residual being distinctive Kalash). It may not be a bad idea to use a hierarchical naming scheme once the major components that are easily understood are in.

    • The Kalash component didn't just "split" into "pure Kalash" and "Pak/Cauc". Compare K=11 to K=12: you can see that the Pak/Cauc component for several groups was higher than the previous Kalash component. (For instance, the Balochi were 57% Pak/Cauc at K=12, but only 37% Kalash at K=11.) So you can't really create a hierarchy or phylogeny of components.

      With respect to the component names, there's always going to be a tradeoff between confusion (for people who don't understand what the names represent) and convenience (for people who do). My suggestion would be to continue naming components after the countries, populations, or geographical areas where they're modal, and put up a FAQ explaining that the names shouldn't be taken too seriously and that components aren't necessarily comparable across runs (or other projects that use different reference populations and Ks).

      • Sloppy writing on my part. If I recall correctly, K=11 SW Asian + Kalash is roughly close to K=12 Pak/Cauc + SW Asian + Kalash (with some effect of the European component too), but even that's not exact and doesn't imply any phylogeny or straightforward relationship, since all components are calculated de novo at each new value of K.

        I like your suggestion about explaining component names.

  4. Putting aside the name, it would be interesting to figure out the origin of this component. It seems to have dual peaks, one in Georgians and the second in Baluchestan, but is also consistently high in Iranians. That doesn't quite fit with a conventional dispersion pattern.

  5. Can somebody please tell me what the difference is between Siberian and NE Asian? I thought they meant the same thing.
    Which reference populations are used for each? Thanks.

Trackbacks and Pingbacks:

  6. Balochistan/Caucasian | Harappa Ancestry Project - pingback on March 16, 2011 at 7:10 am
  7. My Harappa Project Results | Procrastination - pingback on March 16, 2011 at 10:58 am
  8. Harappa Participant Admixture Maps | Harappa Ancestry Project - pingback on April 13, 2011 at 1:25 pm