Harappa Ancestry Project

HarappaWorld HRP0352-HRP0374

Posted by Zack on November 1, 2013 44 comments

I have added the HarappaWorld Admixture results for HRP0352-HRP0374 to the individual spreadsheet.

Do note that the admixture components do not necessarily represent real ancestral populations. Also, the names I have chosen for the components should be thought of as mnemonics to ease discussion. I chose them based on which populations in my data these components peaked in. They do not tell anything directly about ancestral populations. The best way to look at these admixture results is by comparing individuals and populations. Finally, the standard error estimates on these results can be about 1%. Therefore, it is entirely possible that your 1% exotic admixture result is just noise.

I have also updated the group averages.

Burusho Kalash HarappaWorld Admixture

Posted by Zack on October 29, 2013 45 comments

Someone asked for the individual HarappaWorld Admixture results for the Burusho and Kalash from HGDP.

In the chart below as well as in the spreadsheet, the IDs starting with "b" belong to the Burusho and those starting with "k" belong to the Kalash individuals.

You can check the spreadsheet too.

Personal Journey

Posted by Zack on October 28, 2013 Comments Off

I have had myself, my wife, my daughter, my parents, and my sister genotyped by 23andme.

From time to time, I explore my personal and family data on my other blog. If you are interested, you can read about it under the Genetics category there.

My latest post is about figuring out how my inbred genome has passed on to my daughter.

Affiliate Links

Posted by Zack on October 9, 2013 1 comment

As you can see, I now have affiliate links on the sidebar. If you order a test following those links, I get a small amount for referring you.

Right now, I have FTDNA and 23andme listed there.

HarappaWorld HRP0328-HRP0351

Posted by Zack on August 28, 2013 95 comments

I have added the HarappaWorld Admixture results for HRP0328-HRP0351 to the individual spreadsheet.

Do note that the admixture components do not necessarily represent real ancestral populations. Also, the names I have chosen for the components should be thought of as mnemonics to ease discussion. I chose them based on which populations in my data these components peaked in. They do not tell anything directly about ancestral populations. The best way to look at these admixture results is by comparing individuals and populations. Finally, the standard error estimates on these results can be about 1%. Therefore, it is entirely possible that your 1% exotic admixture result is just noise.

I have also updated the group averages.

Since 10 of the 24 new participants are Punjabis (with 8 being Punjabi Jatts), I now have 33 Punjabis in HAP. Therefore, I will try to write about the Punjabi samples and their results next week.

Genetic Evidence for Recent Population Mixture in India

Posted by Zack on August 8, 2013 130 comments

Finally, the paper I have been waiting for, ever since the conference presentations on ANI-ASI admixture dating by Moorjani et al of the Reich Lab, is out:

Moorjani et al., Genetic Evidence for Recent Population Mixture in India, The American Journal of Human Genetics (2013), http://dx.doi.org/10.1016/j.ajhg.2013.07.006

Here's the abstract:

Most Indian groups descend from a mixture of two genetically divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners, Caucasians, and Europeans; and Ancestral South Indians (ASI) not closely related to groups outside the subcontinent. The date of mixture is unknown but has implications for understanding Indian history. We report genome-wide data from 73 groups from the Indian subcontinent and analyze linkage disequilibrium to estimate ANI-ASI mixture dates ranging from about 1,900 to 4,200 years ago. In a subset of groups, 100% of the mixture is consistent with having occurred during this period. These results show that India experienced a demographic transformation several thousand years ago, from a region in which major population mixture was common to one in which mixture even between closely related groups became rare because of a shift to endogamy.

In this paper, Moorjani et al calculate ANI (Ancestral North Indian) percentage as:

Compared to Reich et al, they changed the outgroup from Papuan to Yoruba and the ANI clade group from CEU (Utahn Whites) to Georgians. I think both are much better choices. Looking at the D-statistics in Table S2, Georgians are definitely an appropriate choice for forming a clade with ANI.

Another important result from the paper is the difference in the date of admixture between Dravidians (108 generations, or 3,132 years) and Indo-Europeans (72 generations, or 2,088 years).
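As a quick check on those dates, here is a minimal sketch of the generations-to-years conversion. The 29 years per generation is an assumption inferred from the figures above (108 generations giving 3,132 years); the function name is just for illustration.

```python
# Assumption: 29 years per generation, inferred from 108 generations -> 3,132 years.
YEARS_PER_GENERATION = 29


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert an admixture date in generations to years before present."""
    return generations * years_per_generation


# Admixture dates in generations, as quoted from the paper above.
dates = {"Dravidian": 108, "Indo-European": 72}

for group, gens in dates.items():
    print(f"{group}: {gens} generations = {generations_to_years(gens):,} years")
```

Changing the assumed generation length shifts both dates proportionally, so the gap between the two groups stays at one third of the Dravidian date.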

When they test for multiple waves of admixture, they find that it is more likely in upper-caste and middle-caste Indo-Europeans, so the admixture history of a lot of Indian groups is more complex than a single mixture event.

UPDATE: Razib and Dienekes comment.

Bengalis

Posted by Zack on August 7, 2013 16 comments

Let's take a look at the Bengali participants of the Harappa Ancestry Project.

I have added a suffix to the IDs where B = Brahmin, V = Vaidya and M = Muslim.

Here are the HarappaWorld Admixture results for the Bengalis which you can also see in a spreadsheet.

It's easy to see the difference between the Brahmins and others.

Razib wanted to know the origin of the East Asian ancestry among the Bengalis. So I ran a supervised ADMIXTURE with the following populations set as ancestral:

Altaian
Burmanese
Buryat
Cambodian
Chukchi
Dai
Daur
Dolgan
Evenki
Georgian
Gujarati-A
Han
Han-NChina
Hezhen
Japanese
Ket
Kinh
Koryak
Lahu
Miao
Mongola
Mongolian
Naxi
Nganassan
Oroqen
Selkup
She
Singapore-Malay
Tibet
Tu
Tujia
Tuvinian
Xibo
Yakut
Yi
Yukaghir

While most of these populations are various East Asian groups, I used the Gujarati-A as the South Asian group since it has the most South Indian + Baloch components without any East Asian influence. I used the Georgians as a proxy for West Asian ancestry.

Since this is a K=36 analysis, I ran ADMIXTURE 10 times with different seeds and computed the average percentages for the Bengali participants. The number of SNPs was about 85,565. I did a similar analysis at K=35 after excluding the Tibetans, which raised the number of SNPs to about 263,000. The results were broadly similar.
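The averaging step can be sketched roughly as follows. This is not my actual script: the data layout (one dictionary per run, mapping a sample ID to its list of component fractions) and the numbers are made up for illustration. It assumes the component columns line up across runs, which is the case in supervised mode since each component is tied to a fixed reference population.

```python
from statistics import mean


def average_runs(runs):
    """Average per-sample component fractions across several runs.

    runs: list of dicts, each mapping sample ID -> list of component fractions.
    Assumes every run has the same samples and the same component order.
    """
    averaged = {}
    for sample_id in runs[0]:
        per_run = [run[sample_id] for run in runs]
        # zip(*per_run) groups the values of each component across runs
        averaged[sample_id] = [mean(values) for values in zip(*per_run)]
    return averaged


# Hypothetical results from two runs with three components (not real data)
runs = [
    {"S1": [0.70, 0.20, 0.10], "S2": [0.50, 0.30, 0.20]},
    {"S1": [0.72, 0.18, 0.10], "S2": [0.48, 0.32, 0.20]},
]

avg = average_runs(runs)
for sample_id, fractions in avg.items():
    print(sample_id, [f"{100 * f:.1f}%" for f in fractions])
```

With unsupervised runs the components would first have to be matched between runs, since their order is arbitrary there.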

I am showing only the first 12 ancestral components since all the rest were less than 0.5% for all the Bengalis (Spreadsheet).

Please do remember that in supervised ADMIXTURE, I assign the ancestral populations and the algorithm has to find the best fit using those populations. So it's not showing actual ancestry but broad affinity. Also, the exact percentages are not important and can vary when I change the parameters of the analysis. Just look at the broad trends.

The general pattern is that Bengali Brahmins have the least Eastern Eurasian and the most West Asian. The Eastern Eurasian ethnicity most closely related to Bengalis is Burmese.

Interestingly, there is a pattern of a small amount of Siberian ancestry among these Bengalis. Let's add all the Siberian and Russian Far East groups.

ID	Ethnicity	Siberian
HRP0244	West Bengal Rajput	5.07%
HRP0077B	Bengali Brahmin	5.01%
HRP0049	Bengali	4.45%
HRP0252B	Bengali Brahmin	4.01%
HRP0268B	Bengali Brahmin	3.90%
HRP0023M	Bengali Muslim	3.54%
HRP0316B	Bengali Brahmin	3.45%
HRP0054B	Bengali Brahmin	3.41%
HRP0300M	Bengali Muslim	2.95%
HRP0240V	Bengali Vaidya	1.78%
HRP0293B	Bengali Brahmin	1.02%
HRP0291V	Bengali Vaidya	0.99%
HRP0317M	Bengali Muslim	0.89%
HRP0321M	Bengali Muslim	0.58%
HRP0322M	Bengali Muslim	0.41%
HRP0022M	Bengali Muslim	0.37%
HRP0091B	Bengali Brahmin	0.01%

I am not sure of the pattern here, but at least the first few are above noise thresholds.
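For anyone who wants to repeat the summation on the spreadsheet, here is a minimal sketch. The set of populations counted as Siberian or Russian Far East is my assumption for this example, and the sample values are hypothetical, not taken from any participant.

```python
# Assumption: reference populations counted as Siberian / Russian Far East.
SIBERIAN_GROUPS = {
    "Altaian", "Buryat", "Chukchi", "Dolgan", "Evenki", "Ket",
    "Koryak", "Nganassan", "Selkup", "Tuvinian", "Yakut", "Yukaghir",
}


def siberian_total(components):
    """Sum the percentages of all components in SIBERIAN_GROUPS.

    components: dict mapping reference population -> percentage.
    """
    return sum(pct for pop, pct in components.items() if pop in SIBERIAN_GROUPS)


# Hypothetical supervised ADMIXTURE percentages for one individual
example = {
    "Gujarati-A": 70.0,
    "Georgian": 13.5,
    "Burmanese": 12.0,
    "Yakut": 2.5,
    "Ket": 1.5,
    "Evenki": 0.5,
}

total = siberian_total(example)
print(f"Siberian total: {total:.2f}%")
```

Keep in mind that a sum of several small components can itself be mostly noise, so only the larger totals deserve attention.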

An Analysis of HAP by Razib

Posted by Zack on August 6, 2013 5 comments

Razib Khan looked over the HarappaWorld Admixture results and posted his analysis on his GNXP blog. Go read it.

Comments: Signal vs Noise

Posted by Zack on August 5, 2013 16 comments

Recently there has been a lot of noise in the comments here with very little real information. That is a waste of time and effort for everyone.

I would appreciate it if all of you thought about any comments you plan to make. The stronger your belief in a proposition, the more you should hesitate before posting it.

One thing you should keep in mind is that I know more than you do. I do not mean that as a boast but as a fact. Of course, I do not know the esoterica of South Asian caste divisions, religious rituals and cultural practices. It can even be said that my knowledge of Indian history is inferior to my knowledge of European, American and Near Eastern history. However, when participants send me their data, they are generous with their personal information. Because of privacy concerns, I have a lot more information about their ethnic backgrounds than I make public. Another factor is that a lot of the analyses I run do not end up here for various reasons, but I use them in creating a complete picture. These two things give me an unfair advantage over you guys.

Generally, I have kept a very light hand on the comment section. This might have to do with this being my hobby. Thus I do not have time to control the conversation and reduce the noise. And I also feel that I have much to learn about history and genetics from all comers.

However, recently I feel that I need to run a much tighter ship so discussions are useful and on-topic and not the hobby horses of a few crazed people. I might have to take a page from my friend Razib Khan's comment policy and be very strict about deleting, warning and banning.

Haber et al Lebanon Data

Posted by Zack on August 3, 2013 45 comments

Haber et al published a paper, "Genome-Wide Diversity in the Levant Reveals Recent Structuring by Culture", in PLoS Genetics. Here's their abstract:

The Levant is a region in the Near East with an impressive record of continuous human existence and major cultural developments since the Paleolithic period. Genetic and archeological studies present solid evidence placing the Middle East and the Arabian Peninsula as the first stepping-stone outside Africa. There is, however, little understanding of demographic changes in the Middle East, particularly the Levant, after the first Out-of-Africa expansion and how the Levantine peoples relate genetically to each other and to their neighbors. In this study we analyze more than 500,000 genome-wide SNPs in 1,341 new samples from the Levant and compare them to samples from 48 populations worldwide. Our results show recent genetic stratifications in the Levant are driven by the religious affiliations of the populations within the region. Cultural changes within the last two millennia appear to have facilitated/maintained admixture between culturally similar populations from the Levant, Arabian Peninsula, and Africa. The same cultural changes seem to have resulted in genetic isolation of other groups by limiting admixture with culturally different neighboring populations. Consequently, Levant populations today fall into two main groups: one sharing more genetic characteristics with modern-day Europeans and Central Asians, and the other with closer genetic affinities to other Middle Easterners and Africans. Finally, we identify a putative Levantine ancestral component that diverged from other Middle Easterners ~23,700–15,500 years ago during the last glacial period, and diverged from Europeans ~15,900–9,100 years ago between the last glacial warming and the start of the Neolithic.

They also released their data consisting of 75 Lebanese from different regions of the country, with 25 samples each for Muslims, Druze and Christians.

Here are the HarappaWorld admixture results for the Lebanese.

You can check the spreadsheet too.

As the authors mention in the summary:

Population stratification caused by nonrandom mating between groups of the same species is often due to geographical distances leading to physical separation followed by genetic drift of allele frequencies in each group. In humans, population structures are also often driven by geographical barriers or distances; however, humans might also be structured by abstract factors such as culture, a consequence of their reasoning and self-awareness. Religion in particular, is one of the unusual conceptual factors that can drive human population structures. This study explores the Levant, a region flanked by the Middle East and Europe, where individual and population relationships are still strongly influenced by religion. We show that religious affiliation had a strong impact on the genomes of the Levantines. In particular, conversion of the region's populations to Islam appears to have introduced major rearrangements in populations' relations through admixture with culturally similar but geographically remote populations, leading to genetic similarities between remarkably distant populations like Jordanians, Moroccans, and Yemenis. Conversely, other populations, like Christians and Druze, became genetically isolated in the new cultural environment. We reconstructed the genetic structure of the Levantines and found that a pre-Islamic expansion Levant was more genetically similar to Europeans than to Middle Easterners.

So the Lebanese can be grouped better by religion than by region. That's why I am using group averages by religion.
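Computing group averages is simple. Here is a minimal sketch, with hypothetical two-component percentages rather than the real Lebanese results; the function name and data layout are assumptions for this example.

```python
from collections import defaultdict


def group_averages(samples):
    """Average component percentages within each group.

    samples: list of (group label, list of component percentages) pairs.
    Returns a dict mapping group label -> list of averaged percentages.
    """
    grouped = defaultdict(list)
    for group, components in samples:
        grouped[group].append(components)
    return {
        group: [sum(column) / len(column) for column in zip(*rows)]
        for group, rows in grouped.items()
    }


# Hypothetical individuals with two components each (not real data)
samples = [
    ("Druze", [60.0, 40.0]),
    ("Druze", [70.0, 30.0]),
    ("Christian", [55.0, 45.0]),
    ("Muslim", [50.0, 50.0]),
    ("Muslim", [40.0, 60.0]),
]

averages = group_averages(samples)
for group, values in averages.items():
    print(group, [f"{v:.1f}%" for v in values])
```

Swapping the group label from religion to region is all it takes to compare the two ways of grouping.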
