Tag Archives: gujarati

June Update

I now have a total of 123 participants in the project who have sent me their raw data. Six of them have relatives who are also participating, so they have to be filtered out of most analyses. The exceptions are individual-level results such as admixture percentages, where I divide participants into small groups.

The following groups are represented:

  • South Asian: 90
    • Tamil: 15
    • Punjab: 13
    • Bengal: 9
    • Karnataka: 7
    • Andhra Pradesh: 5
    • Uttar Pradesh: 5
    • Kerala: 5
    • Bihar: 5
    • Gujarati: 4
    • Sindhi: 4
    • Maharashtra: 3
    • Sri Lankan: 3
    • Caribbean Indian: 2
    • Kashmir: 2
    • Romani: 2
    • Goa: 1
    • Rajasthan: 1
    • Baloch: 1
    • Orissa: 1
    • Anglo-Indian: 1
    • Unknown: 1
  • Others: 33
    • Iran: 8
    • Assyrian: 3
    • Kurd: 2
    • Mexican: 2
    • Ashkenazi: 2
    • Northwest European: 2
    • Iraqi Arab: 2
    • Georgian: 1
    • Azeri: 1
    • Kazakh: 1
    • Brazilian: 1
    • Yemen: 1
    • Irish: 1
    • Egypt: 1
    • Gagauz Turk: 1
    • Afro-Belizean: 1
    • Iraqi Mandaean: 1
    • Egyptian/Iraqi Jew: 1
    • French/Madagascar/Indian: 1

Most of the data files are from 23andMe; 4 are from FTDNA.

We are getting close to 100 South Asian participants.

April Update

I now have a total of 97 participants in the project who have sent me their raw data. Six of them have relatives who are also participating, so they have to be filtered out of most analyses. The exceptions are individual-level results such as admixture percentages, where I divide participants into small groups.

The following groups are represented:

  • Tamil: 14
  • Punjab: 10
  • Bengal: 7
  • Iran: 7
  • Karnataka: 6
  • Andhra Pradesh: 4
  • Uttar Pradesh: 4
  • Gujarati: 3
  • Kerala: 3
  • Maharashtra: 3
  • Assyrian: 3
  • Bihar: 2
  • Caribbean Indian: 2
  • Kashmir: 2
  • Sindhi: 2
  • Sri Lankan: 2
  • Iraqi Arab: 2
  • Anglo-Indian: 1
  • Roma: 1
  • Goa: 1
  • Rajasthan: 1
  • Egyptian/Iraqi Jew: 1
  • Baloch: 1
  • Iraqi Kurd: 1
  • Georgian: 1
  • Azeri: 1
  • French/Madagascar/Indian: 1
  • Kazakh: 1
  • Ashkenazi: 1
  • Brazilian: 1
  • Mexican: 1
  • Unknown: 2

Let's try to get to a hundred soon.

And yes, I am accepting FTDNA Family Finder (new Illumina chip) now.

End of March Update

I now have a total of 67 participants in the project who have sent me their raw data. This does not count those who have relatives participating, since they have to be filtered out of most analyses. The exceptions are individual-level results such as admixture percentages, where I divide participants into small groups.

The following groups are represented:

  • Tamil: 11
  • Punjab: 9
  • Iran: 7
  • Bengal: 5
  • Uttar Pradesh: 4
  • Andhra Pradesh: 3
  • Kerala: 3
  • Gujarati: 3
  • Bihar: 2
  • Karnataka: 2
  • Caribbean Indian: 2
  • Kashmir: 2
  • Sri Lankan: 2
  • Maharashtra: 2
  • Iraqi Arab: 2
  • Anglo-Indian: 1
  • Roma: 1
  • Goa: 1
  • Rajasthan: 1
  • Baloch: 1
  • Sindhi: 1
  • Iraqi Kurd: 1
  • Egyptian/Iraqi Jew: 1

I need to post analyses of Tamils, Bengalis and Punjabis soon.

HapMap Gujaratis

Razib is wondering what's going on with the HapMap Houston Gujaratis.

As you can see, the Chinese simply do not vary much, and are a tight cluster. But, there is a somewhat equivalent Gujarati cluster too! The HapMap sample was collected from Gujaratis in Houston. To me, it looks like that Houston population can be divided into two groups: one of the tight cluster, and the rest of the population, which is all over the place. [...] What’s more interesting is to try and understanding what’s going on with Houston Gujaratis. Anyone in the audience know?

And here is his 3-dimensional PCA plot (those on the right are Gujaratis):
[Figure: PCA plot of Gujaratis and Chinese]

So I thought I would share the admixture results for the Gujaratis for K=8. Here's the spreadsheet of the admixture proportions for Gujaratis. And here is the plot:

[Figure: Gujaratis admixture, K=8]

The ancestral components and their statistics are as follows:

| Component | Population | Range | Mean | Median |
|---|---|---|---|---|
| C1 | South Asian | 64-89% | 81.9% | 85.8% |
| C2 | West Asian | 0-13% | 2.3% | 1.6% |
| C3 | European | 2-22% | 7.6% | 5.0% |
| C4 | Southeast Asian | 0-9% | 4.9% | 5.0% |
| C5 | Austronesian | 1-6% | 2.8% | 2.9% |
| C6 | Northeast Asian | 0-3% | 0.4% | 0.0% |
| C7 | West African | 0-1% | 0.0% | 0.0% |
| C8 | East African | 0-0% | 0.0% | 0.0% |

It looks like a majority of the Gujarati samples are mostly South Asian in ancestry, with small amounts of the West Asian, European and Southeast Asian components, but some Gujarati samples have much larger West Asian and/or European components.
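For readers who want to reproduce this kind of summary from their own admixture output, here is a minimal sketch of how the range, mean and median per component could be computed. The three sample rows are made-up toy values, not the Gujarati results, and the names `summarize`, `COMPONENTS` and `samples` are hypothetical; the spreadsheet itself would be the real input.

```python
from statistics import mean, median

# Hypothetical toy data: one row per sample, one percentage per component.
# These numbers are invented for illustration and are NOT the HapMap results.
COMPONENTS = ["South Asian", "West Asian", "European"]
samples = [
    [88.0, 2.0, 10.0],
    [70.0, 10.0, 20.0],
    [85.0, 5.0, 10.0],
]

def summarize(rows, names):
    """Return {component: (min, max, mean, median)} across all samples."""
    summary = {}
    for idx, name in enumerate(names):
        column = [row[idx] for row in rows]
        summary[name] = (min(column), max(column), mean(column), median(column))
    return summary

stats = summarize(samples, COMPONENTS)
for name, (low, high, avg, med) in stats.items():
    print(f"{name}: {low:.0f}-{high:.0f}%  mean {avg:.1f}%  median {med:.1f}%")
```

A median well above the mean, as with the South Asian component in the table, is what you would expect when most samples sit near the top of the range and a few sit much lower.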

HapMap

I am using several datasets in the public domain for my reference population samples. HapMap is one of those datasets.

According to its website,

The goal of the International HapMap Project is to develop a haplotype map of the human genome, the HapMap, which will describe the common patterns of human DNA sequence variation. The HapMap is expected to be a key resource for researchers to use to find genes affecting health, disease, and responses to drugs and environmental factors. The information produced by the Project will be made freely available.

In the first phase, it genotyped

30 Yoruba adult-and-both-parents trios from Ibadan, Nigeria, 30 trios of U.S. (Utah) residents of northern and western European ancestry, 44 unrelated individuals from Tokyo, Japan and 45 unrelated Han Chinese individuals from Beijing, China.

In their HapMap phase 3 release #3 (NCBI build 36, dbSNP b126), there are 1,397 samples with about 1,457,897 SNPs each.

I removed related individuals as well as individuals whose genomes were too similar to each other. This left me with a total of 1,149 samples with about 474,606 SNPs in common with 23andMe's version 2 data.
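Finding the SNPs shared between a reference dataset and a participant's raw data comes down to intersecting two sets of SNP IDs. The sketch below illustrates the idea only; it is not necessarily how the overlap was computed for this project. It assumes a PLINK .bim file (SNP ID in the second column) and a 23andMe-style raw file (tab-separated, SNP ID in the first column, comment lines starting with `#`). The rsIDs and helper names are made up.

```python
import io

# Hypothetical excerpts; real files hold hundreds of thousands of SNPs.
BIM_TEXT = (
    "1\trs0001\t0\t1000\tA\tG\n"
    "1\trs0002\t0\t2000\tC\tT\n"
    "2\trs0003\t0\t3000\tG\tA\n"
)
RAW_TEXT = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0002\t1\t2000\tCT\n"
    "rs0003\t2\t3000\tGG\n"
    "rs0004\t3\t4000\tAA\n"
)

def bim_snp_ids(handle):
    """Collect SNP IDs from a PLINK .bim file (second column)."""
    return {line.split()[1] for line in handle if line.strip()}

def raw_snp_ids(handle):
    """Collect SNP IDs from a 23andMe-style raw file (first column)."""
    return {
        line.split()[0]
        for line in handle
        if line.strip() and not line.startswith("#")
    }

# SNPs present in both datasets.
common = bim_snp_ids(io.StringIO(BIM_TEXT)) & raw_snp_ids(io.StringIO(RAW_TEXT))
print(sorted(common))
```

With real files, the `io.StringIO` objects would be replaced by `open(path)` handles.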

Since we are not interested in Native American ancestry, I also removed 58 Mexican samples, thus leaving me with 1,091 samples.

Here are the samples I am using from the HapMap data:

| Ethnicity | Region | Count |
|---|---|---|
| African Americans | Africa | 48 |
| European Americans (Utahns) | Europe | 111 |
| Han Chinese | East Asia | 137 |
| US Chinese | East Asia | 106 |
| Gujaratis | South Asia | 98 |
| Japanese | East Asia | 113 |
| Kenyan Luhya | East Africa | 101 |
| Maasai | East Africa | 135 |
| Tuscans | Europe | 102 |
| Yoruba | West Africa | 140 |

The region assignments are mine. They help with the analysis by letting me include or exclude samples by region, or aggregate results by region to look for patterns.
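As a small illustration of how the region labels can be used, the sketch below sums the sample counts by region and checks that they add up to the 1,091 samples mentioned earlier. The counts are copied from the table; the function name `count_by_region` and its `exclude_regions` parameter are hypothetical.

```python
from collections import defaultdict

# (ethnicity, region, count) rows copied from the HapMap sample table.
HAPMAP_SAMPLES = [
    ("African Americans", "Africa", 48),
    ("European Americans (Utahns)", "Europe", 111),
    ("Han Chinese", "East Asia", 137),
    ("US Chinese", "East Asia", 106),
    ("Gujaratis", "South Asia", 98),
    ("Japanese", "East Asia", 113),
    ("Kenyan Luhya", "East Africa", 101),
    ("Maasai", "East Africa", 135),
    ("Tuscans", "Europe", 102),
    ("Yoruba", "West Africa", 140),
]

def count_by_region(rows, exclude_regions=()):
    """Sum sample counts per region, skipping any excluded regions."""
    totals = defaultdict(int)
    for _ethnicity, region, count in rows:
        if region not in exclude_regions:
            totals[region] += count
    return dict(totals)

region_totals = count_by_region(HAPMAP_SAMPLES)
total = sum(region_totals.values())
for region, count in sorted(region_totals.items()):
    print(f"{region}: {count}")
print(f"Total: {total}")
```

Passing something like `exclude_regions=("East Asia",)` gives the counts for a run that leaves a whole region out.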

It was easiest to use the HapMap data since it's available for download in PLINK format.
