Participation Changes

Now that DIY HarappaWorld is out, I am changing the participation requirements slightly. The requirements for South Asians are now somewhat different from those for people from other regions.

If you have any real ancestry from South Asia, you are eligible to participate. Partial South Asian ancestry is okay. The countries of origin I count as South Asian are:

  • Afghanistan
  • Bangladesh
  • Bhutan
  • India
  • Maldives
  • Nepal
  • Pakistan
  • Sri Lanka

Note that a 2-3% South Asian result from Dr. McDonald's BGA or the Dodecad Project does not count as South Asian ancestry.

If all four of your grandparents are from one of the following countries or regions, you can also send me your data.

  • Burma
  • Tibet
  • Uyghur from Xinjiang, China
  • Tajikistan
  • Kyrgyzstan
  • Kazakhstan
  • Uzbekistan
  • Turkmenistan
  • Iran
  • Turkey
  • Azerbaijan
  • Armenia
  • Georgia
  • North Caucasian Federal District, Russia
  • Iraq
  • Syria
  • Lebanon
  • Jordan
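
As a rough illustration only, the two rules above could be written as a small Python check. This is a sketch under stated assumptions, not how submissions are actually screened: the function name `is_eligible` and the idea of describing a person by four grandparent origins are hypothetical, "any real ancestry" is approximated as at least one grandparent from a South Asian country, the grandparents may come from different listed regions, and the 2-3% caveat about BGA or Dodecad results is not modeled.

```python
# Countries counted as South Asian (from the first list above).
SOUTH_ASIAN = {
    "Afghanistan", "Bangladesh", "Bhutan", "India",
    "Maldives", "Nepal", "Pakistan", "Sri Lanka",
}

# Other accepted countries/regions (from the second list above).
# The two longer entries are shortened labels chosen for this sketch.
OTHER_REGIONS = {
    "Burma", "Tibet", "Uyghur (Xinjiang)", "Tajikistan", "Kyrgyzstan",
    "Kazakhstan", "Uzbekistan", "Turkmenistan", "Iran", "Turkey",
    "Azerbaijan", "Armenia", "Georgia", "North Caucasian Federal District",
    "Iraq", "Syria", "Lebanon", "Jordan",
}


def is_eligible(grandparent_origins):
    """Return True if the four grandparent origins satisfy either rule.

    Assumption: ancestry is approximated by grandparents' origins only.
    """
    # Rule 1: any South Asian ancestry qualifies, even if partial.
    if any(origin in SOUTH_ASIAN for origin in grandparent_origins):
        return True
    # Rule 2: otherwise all four grandparents must be from listed regions.
    return (
        len(grandparent_origins) == 4
        and all(origin in OTHER_REGIONS for origin in grandparent_origins)
    )
```

For example, `is_eligible(["India", "Germany", "Germany", "Germany"])` passes under the first rule, while `is_eligible(["Iran", "Iran", "Iran", "Germany"])` fails because only three grandparents are from a listed region.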

Relatives will only be accepted when they are a better replacement for current participants. For example, replacing a participant with their parents, or with their maternal uncle and paternal aunt, gets us two unrelated participants (assuming, of course, that the two sides of the family are not related by blood). Another example is a participant of partial South Asian ancestry being replaced by a relative who has more South Asian ancestry.

Everyone else can use DIY HarappaWorld. It's fairly easy to use on both Windows and Linux. The only hard part right now is that you have to install R to standardize your genome file. I might look into creating an executable for that to make it easier.

Finally, please be honest.


There has been some discussion in the comments about the C2 ancestral component in the K=12 admixture runs, which I called Pakistani/Caucasian.

First of all, we should remember that these "names" of ancestral populations are just rough mnemonics. They are chosen based on the frequencies of the component among modern reference samples, so the names have nothing at all to do with history.

In the case of the Pakistani/Caucasian component, I wanted to emphasize the peaks of the component in Pakistan and the Caucasus. As commenters pointed out, the component is also quite high among Iranians.

However, I have realized that this name, Pakistani/Caucasian, is a hindrance rather than a help for understanding the Admixture results. Also, this component is lower among the Pathans, Sindhis, and Punjabis than it is among Iranians and others. The Pakistani part of the name is therefore a bit of a misnomer, since the Pakistani populations in which the component is high make up only about 5% of the country's population.

On the other hand, I do not like the name "Iranian" for this component. While it was suggested based on the geographical Iranian plateau, which extends from the Caucasus to Balochistan, it is still confusing and it doesn't emphasize the peak areas.

Thus, I have renamed "Pakistani/Caucasian" as "Balochistan/Caucasus". I didn't use the shorter "Baloch" because this component is equally high among the Baloch, Brahui, and Makrani, all populations living in the province of Balochistan.