Harappa Reference 2 IBS Concordance

Vasishta asked:

Would it be possible to repeat the same exercise with the Reference II populations? These results seem to be far more plausible for every participant as compared to the previous ones.

Since it took only a few minutes, I calculated the scores from the IBS measures between Harappa participants (1-80 only) and Reference 2, using the method detailed in a previous post.

The spreadsheet is here.
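For anyone who wants to pull top-10 lists out of the spreadsheet, here is a minimal sketch of the ranking step only, not of how the scores themselves are computed. It assumes the spreadsheet has been loaded into a nested dict keyed by participant and then by reference population; that layout and the function names are hypothetical. The handful of values used are the ones quoted in the comments on this post.

```python
# Hypothetical layout: scores[participant][population] = score in percent.
# Only a few values quoted in the comments are included here; the real
# spreadsheet covers many more participants and reference populations.
scores = {
    "HRP0010": {"armenians": 93.8, "georgians": 93.1, "cypriots": 91.5, "kurd": 86.75},
    "HRP0059": {"georgians": 87.18, "kurd": 86.92, "cypriots": 85.15, "armenians": 84.90},
    "HRP0018": {"kurd": 86.70, "armenians": 85.08, "georgians": 82.92, "cypriots": 82.17},
}


def top_populations(scores, participant, n=10):
    """Reference populations with the highest scores for one participant."""
    row = scores[participant]
    return sorted(row.items(), key=lambda kv: kv[1], reverse=True)[:n]


def top_participants(scores, population, n=10):
    """Participants with the highest scores for one reference population."""
    column = [(p, row[population]) for p, row in scores.items() if population in row]
    return sorted(column, key=lambda kv: kv[1], reverse=True)[:n]


def match_rank(scores, participant, population):
    """1-based position of a participant in a population's ranked list."""
    ranked = [p for p, _ in top_participants(scores, population, n=len(scores))]
    return ranked.index(participant) + 1


print(top_populations(scores, "HRP0059", n=3))
print(top_participants(scores, "kurd"))
print(match_rank(scores, "HRP0010", "kurd"))
```

The first function gives the per-participant lists, the second gives the per-population lists, and the third gives the "first match / second match" view. With only three participants loaded, the ranks are relative to that subset, not to all 80.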


  1. Thanks, Zack. My top 10:

    HRP0010 (Asy)
    1.armenians 93.8%
    2.uzbek jews 93.5%
    3.georgians 93.1%
    4.iraq jews 92.5%
    5.georgia jews 92.1%
    6.cypriots 91.5%
    7.adygei 89.0%
    8.lezgins 88.2%
    9.iranian jews 88.0%
    10.turks 87.0%

  2. My top 10 are not as high as Paul G's.
    HRP0059 (Kurd):
    1. stalskoe 87.38%
    2. georgians 87.18%
    3. uzbekistan jews 87.10%
    4. kurd 86.92%
    5. lezgins 86.74%
    6. turks 85.60%
    7. cypriots 85.15%
    8. armenians 84.90%
    9. azerbaijan jews 84.02%
    10.georgia jews 83.47%

    The Top10 for kurds:
    1. HRP0059 (Kurd) 86.92%
    2. HRP0010 (Asy) 86.75%
    3. HRP0018 (Iranian) 86.70%
    4. HRP0037 (Iraqi/Egyptian Jewish) 86.14%
    5. HRP0030 (Iranian) 85.37%
    6. HRP0040 (Iranian) 84.49%
    7. HRP0046 (Iranian) 84.45%
    8. HRP0020 (Iranian) 82.32%
    9. HRP0080 (Georgian) 77.86%
    10. HRP0015 (Romani/Serb) 71.14%

    The highest scores are:
    HRP0058 (99.29%) and HRP0068 (99.28%) for gujaratis-a

  3. Yes, that is a good idea, Palisto. Thank you. Going by population order may also be informative.

    Populations for whom I (HRP0010-Assyrian) am the first match:
    Uzbek Jews, Syrians, Lebanese, Iranians, Georgians, Druze, Azeri Jews, and Armenians

    Second match:
    Adygei, Georgian Jews, Iranian Jews, Iraqi Jews, Sephardi Jews, Yemeni Jews, Jordanians, Kurds, Lezgins, Sardinians, and Turks

    Third match:
    Ashkenazi Jews, Bedouin, Egyptians, Ethiopians, Moroccan Jews, Saudis, Palestinians, and Tuscans

    Hopefully I did not make any errors.

  4. Looking at all the Iranian HRPs, I realized that none of them has 'Iranians' in the Top 5. Instead they all have Armenians, Georgians, or other Caucasus populations (Adygei and stalskoe) among their top scores:
    HRP0018 (Iranian):
    1. uzbekistan jews 88.59%
    2. georgia jews 87.14%
    3. kurd 86.70%
    4. stalskoe 85.15%
    5. armenians 85.08%
    6. lezgins 84.23%
    7. iraq jews 84.22%
    8. georgians 82.92%
    9. cypriots 82.17%
    10. turks 81.71%

    HRP0030 (Iranian):
    1. adygei 91.72%
    2. stalskoe 91.63%
    3. georgians 90.47%
    4. georgia jews 89.03%
    5. cypriots 88.54%
    6. lezgins 8.09%
    7. armenians 86.18%
    8. turks 85.80%
    9. uzbekistan jews 85.68%
    10. iraq jews 85.46%

    HRP0040 (Iranian):
    1. uzbekistan jews 91.69%
    2. armenians 88.11%
    3. georgians 85.50%
    4. kurd 84.49%
    5. cypriots 83.85%
    6. adygei 83.60%
    7. lezgins 82.67%
    8. iranians 82.50%
    9. azerbaijan jews 82.50%
    10. georgia jews 81.40%

    HRP0046 (Iranian):
    1. armenians 85.08%
    2. stalskoe 84.99%
    3. uzbekistan jews 84.95%
    4. georgians 84.74%
    5. kurd 84.45%
    6. adygei 84.39%
    7. iraq jews 84.16%
    8. azerbaijan jews 84.08%
    9. iranians 82.96%
    10. lezgins 82.91%

    HRP0020 (Iranian):
    1. armenians 89.47%
    2. adygei 88.34%
    3. stalskoe 85.36%
    4. georgians 84.14%
    5. lezgins 82.97%
    6. iranians 82.81%
    7. kurd 82.32%
    8. iraq jews 82.28%
    9. turks 82.23%
    10. cypriots 81.51%

    HRP0034 (Iranian Khorasani):
    1. georgians 91.06%
    2. lezgins 90.92%
    3. stalskoe 88.24%
    4. armenians 87.43%
    5. adygei 86.83%
    6. ashkenazy jews 84.37%
    7. kurd 84.07%
    8. turks 83.69%
    9. iranian jews 83.69%
    10. iranians 83.49%

  5. Yes. This is a phenomenon that appears to be true for many populations with a predominantly Muslim tradition. A relatively higher degree of heterogeneity in a population, I believe, has such an effect on the values generated. Iran, due to its size, may be an even more extreme example of this phenomenon. In future research studies, I hope, they will divide population reference samples by at least region.

    • I should add that even a reference data set divided by region will not improve results much, I presume, if the heterogeneity has become uniform. Or, at least, effectively so.

    • I'm not sure, some of the results are still weird. My closest matches, and in this case even the Iranian ones, are odd. The one that Dienekes did had results that made more sense, at least for me, so I'm not sure what's causing this.

      • Thanks, Zack :-).

        Simranjit, the last time TN Brahmins were my 11th highest match, and in this one they are the 6th highest match. I'm beginning to wonder whether Xing's TN Brahmins are somewhat less South Asian than me (considering that Gujaratis-a are now my top match), which is strange, considering that I am less SA and more Baloch/Cauc than all of the Gujarati participants. Either that, or IBS is not completely accurate and I should take it with a pinch of salt. My results for related runs have also been quite strange, as in incongruous with my ancestry *shrugs*.

      • There might be other reasons for the difference in results in Dodecad and here, but my guess is that Dienekes is using a fairly limited set of South Asian reference samples since that's not his focus. This results in very clean results instead of the odd overlapping IBS values you get here.

        • "I'm not sure , some of the results are still weird."

          I just checked and both of us have Gujaratis-a as our highest match. Your second highest match is Xing's TN Brahmins. I was going through the paper again yesterday and wasn't able to find any information regarding where the samples were obtained. If the area in Tamil Nadu were specified, that would help us figure out the probability of Xing's TN Brahmins being slightly more Northern than average, which they would have to be to show up as the second highest match for a Jatt; perhaps they are Vadagalai Iyengars or Vadama Iyers. While Xing and Co. specify where they obtained some of the samples, not much is specified about the TN and AP Brahmins. For instance, this is what they have to say about the Punjabi and Nepali samples:

          "Pakistani: Arain agriculturalists from the Punjab region;
          Nepalese: collected from Kathmandu, Nepal (samples consist of 16 Brahman, 2 Magar, 2 Chhetri, 2 Newar, 1 Madhesi, and 2 Nepalese with unknown ethnicity)"

  6. Iranian participants matching the Behar Iranian reference poorly is more likely due to the reference samples' (presumed) Southern origins. If you examine their K breakdowns, you will note that many of the Behar Iranians have African components. The only plausible explanation is that they come from somewhere with a historical connection to Africans, such as Bandar Abbas or Khuzestan.

  7. So, what happens if the Iranian outliers of the Behar reference are excluded? The outliers would be GSM536746, GSM536751, GSM536752, GSM536754 and GSM536758.

  8. HRP0016

    1. Singapore Indians (95.27%)
    2. Gujaratis-A (94.99%)
    3. TN Brahmin (94.67%)
    4. AP Mala (94.05%)
    5. TN Dalit (94.00%)
    6. Gujaratis-B (93.32%)
    7. Sakilli (92.38%)
    8. AP Madiga (92.03%)
    9. AP Brahmin (91.61%)
    10. Malayan (91.25%)

  9. Heavily off-topic, but apparently Osama's been caught(!)