Afghan Dataset

The paper "Afghan Hindu Kush: Where Eurasian Sub-Continent Gene Flows Converge" by Julie Di Cristofaro, Erwan Pennarun, Stéphane Mazières, Natalie M. Myres, Alice A. Lin, Shah Aga Temori, Mait Metspalu, Ene Metspalu, Michael Witzel, Roy J. King, Peter A. Underhill, Richard Villems, and Jacques Chiaroni was published in PLoS One. It examines the genetics of the people of Afghanistan.

Thanks to Mait Metspalu, the data is available online. It consists of:

  • 5 Hazara
  • 5 Pashtun
  • 5 Tajik
  • 4 Turkmen
  • 5 Uzbek

Here are the HarappaWorld Admixture results for the samples in this dataset.

You can check the spreadsheet too.

Tadjik1_44Af and Pashtun2_6Af seem to be outliers and there's a possibility they are mislabeled. I would like to look into these two samples further before I calculate group averages.

You can compare these Pashtun results to HGDP Pathan and HAP Pashtun results.

Behar et al Data

In their paper "The genome-wide structure of the Jewish people", Behar et al. analyzed the genomes of a number of Jewish groups. For our purposes, the Jewish samples (which include two South Asian Jewish groups) matter less than the various South Asian, Middle Eastern, and European groups they also sampled:

Ethnic group Count
Saudis 20
Jordanians 20
Georgians 20
Turks 19
Iranians 19
Hungarians 19
Ethiopians 19
Armenians 19
Lezgins 18
Chuvashs 17
Syrians 16
Romanians 16
Uzbeks 15
Spaniards 12
Egyptians 12
Cypriots 12
Moroccans 10
Lithuanians 10
North Kannadi 9
Belorussian 9
Yemenese 8
Lebanese 7
Sakilli 4
Paniya 4
Cochin Jews 4
Bene Israel 4
Samaritans 2
Russian 2
Malayan 2

Of the 466 samples, I excluded 8 because they were either duplicates or genetically too similar to other samples in the dataset.
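The post does not say how duplicates and overly similar samples were identified, so the following is only a minimal sketch of one common approach: compute the mean identity-by-state (IBS) between every pair of samples and flag pairs above a cutoff. The genotype coding (allele counts 0, 1, 2, with None for a missing call), the toy samples, and the 0.95 threshold are all assumptions made for illustration, not values from the actual dataset.

```python
from itertools import combinations


def ibs_similarity(g1, g2):
    """Mean identity-by-state over SNPs called in both samples.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    Each SNP contributes 1 - |g1 - g2| / 2, so identical genotypes score 1
    and opposite homozygotes score 0.
    """
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not shared:
        return float("nan")  # no overlapping calls, similarity undefined
    return sum(1 - abs(a - b) / 2 for a, b in shared) / len(shared)


def flag_similar_pairs(genotypes, threshold):
    """Return (sample1, sample2, score) for every pair scoring >= threshold."""
    flagged = []
    for (name1, g1), (name2, g2) in combinations(genotypes.items(), 2):
        score = ibs_similarity(g1, g2)
        if score >= threshold:
            flagged.append((name1, name2, score))
    return flagged


# Toy genotypes (made up): s1 and s2 are identical, s3 is unrelated.
toy = {
    "s1": [0, 1, 2, 1, 0, 2],
    "s2": [0, 1, 2, 1, 0, 2],
    "s3": [2, 1, 0, 0, 1, None],
}

# The threshold is a hypothetical cutoff chosen only for this example.
for name1, name2, score in flag_similar_pairs(toy, threshold=0.95):
    print(name1, name2, round(score, 3))
```

With real data, one sample from each flagged pair would then be dropped. Plink can also produce pairwise similarity and relatedness estimates, which scales far better than a pure Python loop over hundreds of thousands of SNPs.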

The series matrix files that I downloaded were not in a format Plink can read directly. To convert them to Plink format, I had to look up the platform file for the Illumina genotyping BeadChip that was used. In addition, Illumina reports genotypes using A/B alleles and Top/Bot strands rather than the usual ACGT alleles and forward/reverse strands. An Illumina Technote explains the system, and I found a Perl script to convert between the two.
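To make the A/B conversion step concrete, here is a minimal Python sketch (not the Perl script mentioned above). It assumes the platform file lists each SNP's alleles in a field such as "[A/G]", with allele A being the first nucleotide and allele B the second, both on Illumina's own strand. The SNP ids, the "NC" no-call code, and the function names are hypothetical. Getting alleles onto the forward strand needs per-SNP strand information as well; the sketch only exposes that as a `flip` argument.

```python
# Complement table used when a SNP has to be reported on the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def parse_snp_field(snp_field):
    """Split a manifest-style '[A/G]' field into (allele_a, allele_b)."""
    first, second = snp_field.strip("[]").split("/")
    return first, second


def ab_to_nucleotides(call, snp_field, flip=False):
    """Translate an 'AA' / 'AB' / 'BB' call into a pair of nucleotides.

    Any other call (for example 'NC' for no call) becomes ('0', '0'),
    the missing-genotype code Plink uses by default.
    With flip=True the complementary nucleotides are returned instead.
    """
    allele_a, allele_b = parse_snp_field(snp_field)
    lookup = {"A": allele_a, "B": allele_b}
    if len(call) != 2 or any(c not in lookup for c in call):
        return ("0", "0")
    pair = tuple(lookup[c] for c in call)
    if flip:
        pair = tuple(COMPLEMENT[n] for n in pair)
    return pair


# Hypothetical platform-file excerpt and genotype calls for one sample.
manifest = {"rs_example_1": "[A/G]", "rs_example_2": "[T/C]"}
calls = {"rs_example_1": "AB", "rs_example_2": "BB", }

for snp_id, call in calls.items():
    print(snp_id, *ab_to_nucleotides(call, manifest[snp_id]))
```

The resulting allele pairs can be written out as the genotype columns of a Plink .ped file, one pair per SNP in the order given by the accompanying .map file.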