
Ref3 + Yunusbayev Caucasus Data Admixture

To my standard reference 3 (list of populations), I added the Yunusbayev et al Caucasus samples which include the following:

  • 20 abhkasians
  • 16 armenians
  • 19 balkars
  • 13 bulgarians
  • 20 chechens
  • 14 kumyks
  • 6 kurds
  • 15 mordovians
  • 16 nogais
  • 15 north-ossetians
  • 15 tajiks
  • 15 turkmens
  • 20 ukranians

These 204 samples increased the total to 4,090.

Then I applied a stricter IBD relationship cutoff than I had used before. Previously my focus was on removing relatives, but this time I also wanted to remove samples that seemed highly inbred or that belonged to small, highly bottlenecked groups, so that they would not form their own clusters in Admixture. This process removed the following 164 samples:

  • maasai 30
  • papuan 15
  • karitiana 12
  • pima 12
  • onge 8
  • surui 7
  • luhya 6
  • melanesian 6
  • colombian 5
  • hadza 5
  • koryaks 5
  • sandawe 5
  • san 4
  • turkmens 4
  • african-americans 3
  • east-greenlanders 3
  • great-andamanese 3
  • nganassans 3
  • chenchu 2
  • evenkis 2
  • han-chinese-south 2
  • maya 2
  • mbutipygmy 2
  • mexicans 2
  • utahn-whites 2
  • aus 1
  • bantukenya 1
  • british 1
  • chinese-americans 1
  • gujaratis-b 1
  • iranians 1
  • naxi 1
  • north-kannadi 1
  • samaritians 1
  • she 1
  • tuvinians 1
  • yemenese 1
  • yoruba 1
  • yukaghirs 1

Finally, I added the 165 founders from the Harappa Project participants (up to HRP0180).
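For anyone who wants to check the arithmetic, here is a minimal sketch that tallies the counts listed above. The counts are copied from the lists in this post; the final total is not stated in the post and assumes the three steps combine by plain addition and subtraction.

```python
# Per-population counts copied from the two lists in this post.
caucasus_added = [20, 16, 19, 13, 20, 14, 6, 15, 16, 15, 15, 15, 20]
removed_by_ibd = [30, 15, 12, 12, 8, 7, 6, 6, 5, 5, 5, 5, 4, 4,
                  3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2] + [1] * 14

total_with_caucasus = 4090  # stated in the post
founders_added = 165        # stated in the post

# Assumption: removals and additions apply directly to the 4,090 total.
final_total = total_with_caucasus - sum(removed_by_ibd) + founders_added

print(sum(caucasus_added))   # Caucasus samples added
print(sum(removed_by_ibd))   # samples removed by the IBD cutoff
print(final_total)           # implied size of the final dataset
```

Under that assumption the two lists sum to 204 and 164, and the implied dataset size is 4,091.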

The crossvalidation error for the admixture results with K (number of ancestral components) from 2 to 20 is plotted here.

Zooming in,

The lowest crossvalidation errors are for K=17 and K=12.
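If you want to do this comparison yourself, here is a minimal sketch of ranking K by cross-validation error. It assumes you saved the ADMIXTURE output of each run and that it contains lines of the form "CV error (K=3): 0.52305". The function names and the numbers in sample_log are made up for illustration and are not the values from my runs.

```python
import re

# Matches lines such as "CV error (K=3): 0.52305".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")

def parse_cv_errors(log_text):
    """Map each K to its cross-validation error."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}

def rank_k(cv_errors, top=2):
    """Return the K values with the lowest CV error, best first."""
    return sorted(cv_errors, key=cv_errors.get)[:top]

# Toy log text with invented error values (hypothetical, not real results).
sample_log = """
CV error (K=2): 0.60
CV error (K=3): 0.50
CV error (K=4): 0.55
"""

cv_errors = parse_cv_errors(sample_log)
best = rank_k(cv_errors)
print(best)
```

With several K values close together, it is worth looking at more than just the single lowest one, which is why the sketch returns the top few.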

The admixture results are in a spreadsheet.

In addition to K=17 and K=12, take a look at the results for K=15.

PS. I should point out that the names for the ancestral components are just useful mnemonics based on the current distribution of that component. Also, a component with the same name at one value of K is different from a similarly named component at another K.

Admixture (Ref3 K=11) HRP0181-HRP0190

Here are the admixture results using Reference 3 for Harappa participants HRP0181 to HRP0190.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

HRP0181 is half-Punjabi Jatt and half-English and the admixture results are not too different from the average of the reference British and our other Punjabi Jatt participants.

HRP0183, a Khatri, has a fairly high European component, lower than the Jatts but higher than most other South Asians.

HRP0186 is the most West Asian (and thus least European) of all our Georgian participants.

HRP0188, a Haryana Jatt, has what I think is the highest European component (29%) of all South Asians. I am surprised at the results for the two Haryana Jatts. I would not have expected their results to be much different from the Punjabi Jatts. If anything, I thought the Haryanavis would be less European than the Punjabis. Now I want to get a few non-Jatt Haryanavi participants. Anybody know someone?

Participation Rate

Just thought I would show you how quickly I was getting data earlier in the year and how it has slowed down now. This shows the number of days it took to get 10 samples.

As you can see, data submission has picked up a little recently.

Admixture (Ref3 K=11) HRP0171-HRP0180

Here are the admixture results using Reference 3 for Harappa participants HRP0171 to HRP0180.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

HRP0171 is our 2nd Tamil Vellalar from Sri Lanka and the results are similar to HRP0169.

HRP0172 has 1/16 Romani ancestry. The Onge component is about 0.4%, which could be noise or possibly evidence of a South Asian connection via the Romani.

HRP0174 and HRP0176 are Kerala Nairs.

HRP0175 is a Georgian Svan and pretty similar to HRP0138 (who is Georgian, though I am not sure of which local ethnic group).

HRP0177 (Azeri) is a bit more northern European than HRP0083.

HRP0178, our first Punjabi Khatri, has admixture results more like the Punjabi Jatts than Punjabi Brahmins.

HRP0179, who is 7/8 Turkish and 1/8 Kurd, has the highest Siberian component (5%) other than the Kazakh participant.

HRP0180 is our first Pashtun even if he's only half-Pathan (the other half being English). I have heard grumblings on the net about the HGDP Pathans not being representative of the Pashtun tribes. If we use the HGDP Pathans and 1000genomes British averages to estimate HRP0180's recent ancestry, we get 45.5% Pashtun and 54.5% British. So it seems that the HGDP Pathan samples are reasonable for at least this individual.
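One way to get such a two-population estimate is a least-squares fit of a single mixing fraction, sketched below. This is an assumption about the approach, not a record of the exact calculation, and the component vectors are toy numbers rather than the real Reference 3 averages.

```python
def mixing_fraction(participant, source_a, source_b):
    """Least-squares fraction t so that t*source_a + (1-t)*source_b
    is closest to the participant's admixture proportions."""
    diff_ab = [a - b for a, b in zip(source_a, source_b)]
    diff_pb = [p - b for p, b in zip(participant, source_b)]
    numerator = sum(x * y for x, y in zip(diff_pb, diff_ab))
    denominator = sum(x * x for x in diff_ab)
    t = numerator / denominator
    # Clamp to [0, 1] since a negative ancestry fraction is meaningless.
    return min(1.0, max(0.0, t))

# Toy two-component averages (hypothetical values for illustration only).
source_a = [0.2, 0.8]
source_b = [0.6, 0.4]
participant = [0.5, 0.5]

t = mixing_fraction(participant, source_a, source_b)
print(f"{t:.1%} source A, {1 - t:.1%} source B")
```

A fit like this only says how well two chosen reference averages explain one individual. It cannot by itself show that the reference samples are representative of the whole population.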

Admixture (Ref3 K=11) HRP0161-HRP0170

Here are the admixture results using Reference 3 for Harappa participants HRP0161 to HRP0170.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

HRP0161 is my mom.

HRP0169 is our first 100% Sri Lankan Tamil. Admixture results are close to the other non-Brahmin Tamils.

HRP0170 is a Haryana Jatt whose results match the other Haryana/UP Jatt.

Admixture Ref3 Dendrogram HRP0001-HRP0160

I haven't done any admixture dendrograms in a while, so I thought you guys might be interested.

This uses admixture results using Reference 3. As usual, I used complete linkage for the hierarchical clustering.

Let's look at the dendrogram using the regular Euclidean distance between admixture results.

I also decided to do the clustering using a chi-squared distance measure.
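To make the two distance measures concrete, here is a minimal pure-Python sketch of complete-linkage clustering with either one. The chi-squared form used here, sqrt(sum((x - y)^2 / (x + y))), is one common definition and is an assumption, as are the function names and the toy admixture vectors.

```python
import math
from itertools import combinations

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def chi_squared(x, y):
    # Squared differences are scaled by the component size, so differences
    # in small components count for more than under Euclidean distance.
    return math.sqrt(sum((a - b) ** 2 / (a + b)
                         for a, b in zip(x, y) if a + b > 0))

def complete_linkage(samples, dist):
    """Return the merge order as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(name,) for name in samples]
    merges = []

    def gap(pair):
        # Complete linkage: distance between two clusters is the
        # largest distance between any pair of their members.
        a, b = pair
        return max(dist(samples[i], samples[j]) for i in a for j in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        merges.append((a, b, gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Toy admixture proportions for four hypothetical samples.
toy = {
    "s1": [0.70, 0.25, 0.05],
    "s2": [0.68, 0.27, 0.05],
    "s3": [0.10, 0.30, 0.60],
    "s4": [0.12, 0.20, 0.68],
}

for name, dist in (("euclidean", euclidean), ("chi-squared", chi_squared)):
    print(name)
    for a, b, d in complete_linkage(toy, dist):
        print("  merge", a, b, round(d, 4))
```

The merge list is what a dendrogram draws: each merge is a join in the tree, at a height equal to the merge distance.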

PS. Any thoughts on the trees based on two different distance measures?

Admixture (Ref3 K=11) HRP0151-HRP0160

Here are the admixture results using Reference 3 for Harappa participants HRP0151 to HRP0160.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

There are several interesting participants here. HRP0151 is a quarter Nepalese and his/her results are actually quite odd. The East Asian ancestry shows up as Native American which is possible. I wonder if the quarter Chinese ancestry is not Han but rather some other Chinese ethnicity.

HRP0155 is Sri Lankan Sinhalese and has a lower Onge component than I expected.

HRP0158 is my dad, and his results are similar to mine (HRP0001).

Harappa Participants Haplogroups

All the ancestry analysis here has been based on the autosomal genome (i.e., the SNPs on chromosomes 1-22) and not on the sex chromosomes X and Y or the mitochondrial DNA. The reason is basically that the autosome provides information about your overall ancestry.

Since the Y chromosome is inherited only from father to son, it is useful for finding out about your paternal line. Similarly, mitochondrial DNA is inherited from mother to child, so that's good for information on the maternal line. Note however that the paternal and maternal lines are not the sum total of your ancestry. In fact, it is quite possible to have very different mtDNA or Y-DNA ancestry compared to your whole genome.

Anyway, many people are interested in paternal (Y-DNA) haplogroups and maternal (mtDNA) haplogroups. AV requested information on the haplogroups of Harappa Project participants and SB created a spreadsheet where project participants can enter their paternal and maternal haplogroups. I am also pulling that information into my Harappa Participants Ethnicity spreadsheet.

If you tested with 23andme, here are the links to their maternal and paternal haplogroup pages.

Now go ahead and enter your information in the haplogroups spreadsheet.

You might also want to take a look at the Harappa Participants Map.

UPDATE: Please be considerate of others' privacy. Only disclose someone else's information (haplogroups, location, or anything else) if you have explicit permission to do so. Thanks!

Admixture (Ref3 K=11) HRP0141-HRP0150

Here are the admixture results using Reference 3 for Harappa participants HRP0141 to HRP0150.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

Admixture (Ref3 K=11) HRP0131-HRP0140

Here are the admixture results using Reference 3 for Harappa participants HRP0131 to HRP0140.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.