Tag Archives: sindhi

Gujaratis HarappaWorld Admixture

Someone asked for the individual HarappaWorld Admixture results for the Gujarati B from HapMap.

To refresh your memory, the Gujarati B are those individuals who do not fall within the large, tightly clustered Gujarati group.

I decided to include HGDP Sindhis as well as Gujaratis, Rajasthanis, Maharashtrians, etc. from the Harappa Ancestry Project in the list so the Gujaratis can be compared to the people of neighboring regions.

You can check the spreadsheet too.

UPDATE: I have added the Thathai Bhatia and Halai Bhatia participants that I had forgotten.

June Update

I have a total of 123 participants in the project right now who have sent me their raw data. Six of those have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages, etc., where I divide participants into small groups.

The following groups are represented:

Most are 23andme data while 4 are from FTDNA.

We are getting close to 100 South Asian participants.

April Update

I have a total of 97 participants in the project right now who have sent me their raw data. Six of those have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages, etc., where I divide participants into small groups.

The following groups are represented:

Let's try to get to a hundred soon.

And yes, I am accepting FTDNA Family Finder (new Illumina chip) now.

End of March Update

I have a total of 67 participants in the project right now who have sent me their raw data. This is not counting those who have relatives participating and thus have to be filtered out for most analyses other than individual admixture percentages, etc., where I divide participants into small groups.

The following groups are represented:

I need to post analyses of Tamils, Bengalis and Punjabis soon.

HGDP

The Human Genome Diversity Project (HGDP) is the best resource for a diverse set of genomic data. It has 1050 individuals from 52 different populations.

I got the Stanford University data, which has 660,918 SNPs for 1,043 samples. It is claimed that the forward strand is given, but that turned out not to be true, so I had to flip strands and make sure I didn't include any ambiguous A/T or C/G SNPs in my dataset.
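To make the strand issue concrete, here is a minimal sketch of the logic, not the exact procedure I ran. It assumes you have a reference allele pair for each SNP (say, from the chip you are merging with); the function names `is_ambiguous`, `flip` and `harmonize` are made up for this illustration. A/T and C/G SNPs are dropped because their alleles look identical on either strand, so there is no way to tell whether they need flipping.

```python
# Complement of each base, used to move an allele to the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_ambiguous(allele1, allele2):
    """True for A/T and C/G SNPs, whose two alleles are each other's complement."""
    return COMPLEMENT.get(allele1) == allele2


def flip(allele):
    """Return the allele on the opposite strand; leave non-base codes untouched."""
    return COMPLEMENT.get(allele, allele)


def harmonize(snp_alleles, reference_alleles):
    """Decide what to do with one SNP given a reference allele pair.

    Returns 'drop' for ambiguous or irreconcilable SNPs, 'keep' when the
    alleles already match the reference, and 'flip' when they match only
    after complementing both alleles.
    """
    a1, a2 = snp_alleles
    if is_ambiguous(a1, a2):
        return "drop"
    if set(snp_alleles) == set(reference_alleles):
        return "keep"
    if {flip(a1), flip(a2)} == set(reference_alleles):
        return "flip"
    return "drop"


print(harmonize(("T", "C"), ("A", "G")))
print(harmonize(("A", "T"), ("A", "T")))
```

The first call is a SNP reported on the opposite strand from the reference, and the second is an ambiguous A/T SNP that gets dropped even though its alleles appear to match.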

I followed the recommendations of Rosenberg (spreadsheet) in excluding some atypical samples and relatives, leaving me with 940 samples.

I also excluded the Native American samples because we are not interested in them and they are very closely related either due to recent endogamy or ancient bottlenecks. (yeah I had the nerve to write that.)

Of the remaining 876 samples, here are the numbers for our populations of interest:

Balochi 24
Brahui 25
Burusho 25
Hazara 22
Kalash 23
Makrani 25
Pathan 22
Sindhi 24
Total South Asians 190

These samples have about 541,560 SNPs in common with 23andme v2.
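Counting the overlap comes down to intersecting the two lists of SNP IDs. Here is a small sketch of that step; the helper `shared_snps` and the tiny ID lists are purely illustrative and are not taken from the actual HGDP or 23andme files.

```python
def shared_snps(ids_a, ids_b):
    """Return the SNP IDs present in both collections, sorted for stable output."""
    return sorted(set(ids_a) & set(ids_b))


# Illustrative IDs only; real lists would be read from each dataset's SNP map.
hgdp_ids = ["rs3094315", "rs12562034", "rs3934834"]
chip_ids = ["rs3934834", "rs3094315", "rs9442372"]

common = shared_snps(hgdp_ids, chip_ids)
print(len(common), common)
```

In practice the intersection should be taken after the strand and ambiguity filtering, so that the count reflects SNPs that are actually usable in both datasets.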