Of interest to readers on this weblog: Pathan parahistory.
Some posts elsewhere:
Several cultural or religious groups claim descent from a common ancestor. The extent to which this claimed ancestry is real or socially constructed can be assessed by means of genetic studies. Syed is a common honorific title given to male Muslims belonging to certain families claiming descent from the Prophet Muhammad through his grandsons Hassan and Hussein, who lived 1,400 years ago and were the sons of the Prophet’s daughter Fatima. If all Syeds really are in direct descent from Hassan and Hussein, we would expect the Y chromosomes of Syeds to be less diverse than those of non-Syeds. Outside the Arab world, we would also expect to find that Syeds share Y chromosomes with Arab populations to a greater extent than they do with their non-Syed geographic neighbours. In this study, we found that the Y chromosomes of self-identified Syeds from India and Pakistan are no less diverse than those non-Syeds from the same regions, suggesting that there is no biological basis to the belief that self-identified Syeds in this part of the world share a recent common ancestry. In addition to Syeds, we also considered members of other hereditary Muslim lineages, which either claim descent from the tribe or family of Muhammad or from the residents of Medinah. Here, we found that these lineages showed greater affinity to geographically distant Arab populations, than to their neighbours from the Indian subcontinent, who do not belong to an Islamic honorific lineage.
The results are pretty simple. First:
1) The Syed lineages don't exhibit a "Syed modal haplotype." If the descent claim were literally true, what you should see is a Syed haplotype at ~50%, and then a range of other lineages which introgressed through people lying about their origins or women being unfaithful to their husbands. Instead there is a wide range of haplotypes. Being Syed is an honorific.
2) I don't think that they really prove higher Arab ancestry as such. They include really diverse populations, from Algerians to Israeli Arabs to Sudanese. The Islamic Honorific Lineages are somewhat closer to these groups, but that could be generic West Asian ancestry. For example, Persian. Or perhaps more African ancestry in cosmopolitan Syed lineages. Or, perhaps Syeds are just former high caste Hindus, who have more West Asian affinities.
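To make point 1 concrete, here is a minimal sketch of how one might summarize a set of Y haplotypes: the frequency of the most common (modal) haplotype, plus Nei's gene diversity. The haplotype labels and the two sample sets are made up for illustration; they are not data from the paper, and the paper may well have used different statistics.

```python
from collections import Counter


def haplotype_summary(haplotypes):
    """Return (modal haplotype, modal frequency, Nei's gene diversity).

    haplotypes: list of hashable haplotype labels, one per sampled man.
    Gene diversity is n/(n-1) * (1 - sum of squared haplotype frequencies).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    modal, modal_count = counts.most_common(1)[0]
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - sum_sq)
    return modal, modal_count / n, diversity


# Hypothetical toy samples (labels are invented, not real haplotypes).
# A lineage with a real shared founder: one haplotype dominates.
founder_like = ["H1"] * 5 + ["H2", "H3", "H4", "H5", "H6"]
# A purely honorific group: every sampled man carries a different haplotype.
honorific_like = ["H%d" % i for i in range(1, 11)]

for name, sample in [("founder-like", founder_like),
                     ("honorific-like", honorific_like)]:
    modal, freq, div = haplotype_summary(sample)
    print(f"{name}: modal={modal} freq={freq:.2f} diversity={div:.3f}")
```

A group with a genuine recent common ancestor should look like the first sample (high modal frequency, lower diversity). The paper's finding is that Syeds look more like the second.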
Below is the PCA and list of Y chromosomal haplogroups. The paper is free at the link above.
I am going on vacation. Unfortunately, the rush of work before a vacation means that I haven't been able to finish the analysis of ASI (Ancestral South Indian) that I wanted to present before going.
Since the only computer I am taking with me is my daughter's netbook, I don't expect to work much on the Harappa Ancestry Project during that time.
I will have Internet access, so I might write some. However, there will not be any new analyses of participants. Also, I am more likely to post a travelogue on my regular blog.
In the meantime, I have Razib lined up as guest blogger here. Hopefully, you guys will have some good discussions.
I encourage people to keep sending me their 23andme or FTDNA data which I'll analyze as soon as I am back.
I expect to be back working on the project at the start of July.
A journalist from Times of India has contacted me to talk about the Harappa Ancestry Project. She is interested in talking to some Indian participants about it.
If any of you are interested, let me know in comments or by email and I'll forward your contact information to the journalist.
UPDATE: I have sent the contact info of the six people who volunteered.
What would you want from this project? What sort of analyses would you like me to do?
I know several of you want regional admixture/PCA analyses and those are coming starting next week.
In addition to that, is there something specific you would like to be investigated?
For example, is there some specific supervised admixture you would like me to run? A specific PCA/MDS analysis?
Or do you want me to try to synthesize all the results we have gotten into some sort of coherent theory instead of throwing out the numbers like I have been doing?
Since Dodecad posted nearest IBS (identity by state) neighbors, I have had requests to do the same for Harappa participants.
I have the data ready but I am not sure how to present it. I don't want to post an R object since I suspect most of you don't have R installed.
The idea is to give you a list of your closest IBS neighbors as well as your match percentage with them. How would you present that for 90 people who might match any of several hundred (thousand?) reference samples too? Give me some ideas.
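One possible shape for this, sketched in Python rather than R: compute the IBS match between each participant and each reference sample, keep only the top few neighbors per participant, and write a long-format table (one row per participant/neighbor pair) that opens in any spreadsheet. This is a toy illustration, not the code behind the project. The IDs and genotypes are invented, genotypes are assumed to be coded as 0/1/2 copies of one allele with `None` for missing calls, and the match percentage is taken to be the share of alleles identical by state.

```python
import csv
import sys


def ibs_similarity(g1, g2):
    """Fraction of alleles shared identical-by-state between two samples.

    g1, g2: equal-length lists of genotypes coded 0, 1 or 2 (allele counts),
    with None for a missing call. SNPs missing in either sample are skipped.
    """
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)   # 2, 1 or 0 alleles shared at this SNP
        compared += 2
    if compared == 0:
        raise ValueError("no overlapping genotype calls")
    return shared / compared


def nearest_neighbors(participants, references, k=3):
    """Return rows (participant, rank, reference, match percent), top k each."""
    rows = []
    for pid, geno in participants.items():
        scored = [(ibs_similarity(geno, ref_geno), rid)
                  for rid, ref_geno in references.items()]
        scored.sort(key=lambda pair: -pair[0])   # best match first
        for rank, (sim, rid) in enumerate(scored[:k], start=1):
            rows.append((pid, rank, rid, round(100 * sim, 2)))
    return rows


# Invented toy data: 2 participants, 3 reference samples, 4 SNPs.
participants = {"HRP_A": [0, 1, 2, 2], "HRP_B": [2, 2, 0, 1]}
references = {"REF1": [0, 1, 2, 2], "REF2": [0, 0, 2, 2], "REF3": [2, 2, 0, 0]}

rows = nearest_neighbors(participants, references, k=2)
writer = csv.writer(sys.stdout)
writer.writerow(["participant", "rank", "reference", "ibs_match_pct"])
writer.writerows(rows)
```

With 90 participants and, say, the top 10 neighbors each, that is a 900-row table, which stays readable and sortable however many reference samples there are.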
23andme is having the DNA Day sale early.
Monday April 11, 2011, they are selling the kits for FREE with a $9/month 1 year commitment. So basically a total of $108. This is compared to $199 + $9/month (=$307) regular price. It's even less than the Christmas sale (assuming you cancel subscription after a year).
The sale starts at midnight Pacific time tonight (3am Eastern time or 7am GMT) and will end April 11 at 11:59pm Pacific time (2:59am Eastern time or 6:59am GMT on April 12).
Spread the word and get people to participate in our Harappa Project too.
I got my daughter a netbook, so now my computer is doing Harappa Project work 24x7.
Also, Simranjit was nice enough to offer me the use of a server. For privacy reasons, I am not going to upload any of the participants' data there but it is much faster than my machine and hence very useful for running Admixture on the reference data (especially with crossvalidation).
As for steps back, I downloaded the current 1000genomes data (1,212 samples, 2.4 million SNPs). It's in vcf format. Using vcftools to convert it to ped format will take about 3 weeks. Yes, you heard that right. BTW, the good stuff from a South Asian point of view will come later this year with 100 Assamese Ahom, 100 Kayadtha from Calcutta, 100 Reddys from Hyderabad, 100 Maratha from Bombay and 100 Lahori Punjabis.
Also, I spent most of Sunday evening and night in the ER and got a diagnosis of ureterolithiasis for my efforts. All I can say is: Three cheers for Percocet!!
UPDATE: Dienekes was kind enough to send me his conversion code, which, judging from the source, should run really fast.
I am still astonished at how slow the vcftools conversion code is. Maybe I should look at their source code.
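For anyone curious what the conversion involves: a vcf file has one row per SNP with all samples across the columns, while a ped file has one row per sample with all SNPs across the columns, so the job is essentially a transpose plus turning genotype codes back into allele letters. The sketch below is my own toy illustration of that idea, not vcftools' implementation and not Dienekes' code. It assumes a small tab-delimited vcf that fits in memory, uses the sample ID as both family and individual ID, writes a haploid call as a homozygous pair, and leaves out the companion map file.

```python
import io


def vcf_to_ped(vcf_lines):
    """Yield PED lines (one per sample) from an iterable of VCF text lines.

    PED columns: family ID, individual ID, father, mother, sex, phenotype,
    then two alleles per SNP. Unknown parents/sex are "0", phenotype is "-9",
    and missing alleles are written as "0".
    """
    samples = []
    calls = []                      # calls[i] = list of "A G" strings for sample i
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                # skip blank lines and meta-information
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]    # sample IDs start at the 10th column
            calls = [[] for _ in samples]
            continue
        # Allele index 0 is REF, 1..n are the comma-separated ALT alleles.
        codes = [fields[3]] + fields[4].split(",")
        for i, entry in enumerate(fields[9:]):
            # GT is the first FORMAT key; treat phased "|" like unphased "/".
            gt = entry.split(":")[0].replace("|", "/").split("/")
            pair = [codes[int(g)] if g.isdigit() else "0" for g in gt]
            if len(pair) == 1:      # assumption: haploid call written twice
                pair = pair * 2
            calls[i].append(" ".join(pair))
    for sample, sample_calls in zip(samples, calls):
        yield " ".join([sample, sample, "0", "0", "0", "-9"] + sample_calls)


# Invented two-sample, two-SNP example.
sample_vcf = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\n"
    "1\t200\trs2\tC\tT\t.\tPASS\t.\tGT:DP\t0/0:5\t./.:0\n"
)

ped = list(vcf_to_ped(io.StringIO(sample_vcf)))
for row in ped:
    print(row)
```

A single pass like this touches each genotype once, so there is nothing inherently slow about the transformation itself. For the full 1000genomes file you would not hold everything in memory as strings the way this toy does.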