Monthly Archives: July 2011

Admixture: Supervised Zombies Vs Unsupervised

I wanted to see how supervised ADMIXTURE using zombies performs compared to regular unsupervised ADMIXTURE. "Zombies" here are genomes generated from allele frequencies using the --simulate option of plink.

So I took the allele frequencies that Admixture computed for the K=11 ancestral components of Reference 3 and used them to generate 25 zombie individuals per ancestral component.
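To make the zombie idea concrete, here is a minimal Python sketch of drawing genotypes from a component's allele frequencies. It is not plink's --simulate implementation. It assumes each SNP is drawn independently with two allele draws per individual (Hardy-Weinberg proportions, no linkage), and the function name and the example frequencies are made up for illustration.

```python
import random

def simulate_zombies(allele_freqs, n_individuals, seed=0):
    """Draw diploid genotypes (0, 1 or 2 copies of the counted allele per SNP).

    allele_freqs: one frequency per SNP for a single ancestral component.
    This helper is hypothetical; it only illustrates the idea.
    """
    rng = random.Random(seed)
    zombies = []
    for _ in range(n_individuals):
        # Two independent allele draws per SNP, i.e. Hardy-Weinberg proportions.
        genotype = [int(rng.random() < p) + int(rng.random() < p)
                    for p in allele_freqs]
        zombies.append(genotype)
    return zombies

# Made-up frequencies for one ancestral component at five SNPs.
component_freqs = [0.05, 0.50, 0.93, 0.0, 1.0]
zombies = simulate_zombies(component_freqs, n_individuals=25)
```

Repeating this for each of the 11 components gives the 11 x 25 = 275 zombies used in the next step.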

Treating each of these 275 zombie samples as belonging 100% to its ancestral component, I ran Admixture in supervised mode on the Reference 3 dataset. You can see the population average results here (compare to unsupervised results).
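As I understand the ADMIXTURE manual, supervised mode is switched on with the --supervised flag and reads a .pop file with the same prefix as the input, one line per individual in the same order as the .fam file: a population label for reference samples and "-" for samples whose ancestry should be estimated. A rough sketch of building those lines follows. The sample IDs, the C1/C2 labels, and the helper name are assumptions for illustration, not the actual files used here.

```python
def pop_file_lines(samples):
    """Build .pop lines from (sample_id, label) pairs given in .fam order.

    label is the ancestral component for a zombie reference sample,
    or None for a real sample whose ancestry should be estimated.
    """
    return [label if label is not None else "-" for _, label in samples]

# Hypothetical samples: two zombies with known components and one real sample.
samples = [
    ("ZOMBIE_C1_01", "C1"),
    ("ZOMBIE_C2_01", "C2"),
    ("SAMPLE_0001", None),
]
lines = pop_file_lines(samples)
pop_text = "\n".join(lines) + "\n"  # contents to write next to the .bed/.fam files
```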

Since I was interested in the difference between the supervised zombie admixture and the unsupervised results, here are the histograms of the difference between the two for all 3,886 samples and each ancestral component. The histogram bins are 0.5% wide.

Most of the results are within the usual error margins, except for the C7 West African component and the C10 San/Pygmy component. Those two show larger differences between the unsupervised and supervised zombie approaches. Basically, individuals with West African or San/Pygmy ancestry get ~5-8% more of the West African component in the supervised zombie case, with a corresponding decrease in the San/Pygmy component.
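The per-component differences discussed above can be binned along these lines. This is only a sketch: it assumes the admixture proportions are fractions between 0 and 1, takes the difference as supervised minus unsupervised (my choice of sign), and uses made-up values for three samples.

```python
import math
from collections import Counter

BIN_WIDTH = 0.5  # bin width in percentage points

def difference_histogram(supervised, unsupervised, bin_width=BIN_WIDTH):
    """Count samples per bin of (supervised - unsupervised), in percent.

    Returns a dict mapping each bin's lower edge to its sample count.
    """
    counts = Counter()
    for s, u in zip(supervised, unsupervised):
        diff_pct = (s - u) * 100.0
        bin_index = math.floor(diff_pct / bin_width)
        counts[bin_index * bin_width] += 1
    return dict(counts)

# Made-up proportions of one component for three samples.
supervised_c7 = [0.102, 0.207, 0.300]
unsupervised_c7 = [0.100, 0.200, 0.303]
hist = difference_histogram(supervised_c7, unsupervised_c7)
```

Running this once per ancestral component over all samples gives one histogram per component.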

Zack's Genome Public

I have released my 23andme version 3 genome into the public domain.

I challenge y'all to find anything interesting about my chromosome 9 (which is 93% homozygous).
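If you want to check a figure like that yourself, here is one simple way to compute the share of homozygous calls on a chromosome from a 23andme raw data file. It is a sketch under assumptions: it expects the tab-separated layout of rsid, chromosome, position, and genotype with "#" comment lines, skips no-calls ("--"), and may not be the same calculation behind the 93% number. The function name and the example lines are invented.

```python
def homozygous_fraction(raw_lines, chromosome="9"):
    """Fraction of called two-letter genotypes on one chromosome that are homozygous."""
    homozygous = 0
    called = 0
    for line in raw_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.strip().split("\t")
        if len(fields) < 4:
            continue  # skip malformed lines
        chrom, genotype = fields[1], fields[3]
        if chrom != chromosome or len(genotype) != 2 or "-" in genotype:
            continue  # other chromosome, single-letter call, or no-call
        called += 1
        if genotype[0] == genotype[1]:
            homozygous += 1
    return homozygous / called if called else float("nan")

# Invented example lines in the assumed raw data layout.
example_lines = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs1\t9\t100\tAA",
    "rs2\t9\t200\tAG",
    "rs3\t9\t300\t--",
    "rs4\t1\t50\tCT",
    "rs5\t9\t400\tCC",
    "rs6\t9\t500\tGG",
]
fraction = homozygous_fraction(example_lines)
```

With a real file you would pass the open file object instead of the example list.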

Also, I got my parents' 23andme results last night.

Harappa Participants Haplogroups

All the ancestry analysis here has been based on the autosomal genome (i.e., the SNPs on chromosomes 1-22) and not on the sex chromosomes X and Y or the mitochondrial DNA. The reason is basically that the autosome provides information about your overall ancestry.

Since the Y chromosome is inherited only from father to son, it is useful for finding out about your paternal line. Similarly, mitochondrial DNA is inherited from mother to child, so that's good for information on the maternal line. Note however that the paternal and maternal lines are not the sum total of your ancestry. In fact, it is quite possible to have very different mtDNA or Y-DNA ancestry compared to your whole genome.

Anyway, many people are interested in paternal (Y-DNA) haplogroups and maternal (mtDNA) haplogroups. AV requested information on the haplogroups of Harappa Project participants and SB created a spreadsheet where project participants can enter their paternal and maternal haplogroups. I am also pulling that information into my Harappa Participants Ethnicity spreadsheet.

If you tested with 23andme, here are the links to their maternal and paternal haplogroup pages.

Now go ahead and enter your information in the haplogroups spreadsheet.

You might also want to take a look at the Harappa Participants Map.

UPDATE: Please be considerate of others' privacy. Only disclose someone else's information (haplogroups, location, or anything else) if you have explicit permission to do so. Thanks!

Admixture (Ref3 K=11) HRP0141-HRP0150

Here are the admixture results using Reference 3 for Harappa participants HRP0141 to HRP0150.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

150 and 100

The total number of participants has reached 150. The Harappa Ancestry Project has been active for a little less than 6 months.

Of these, depending on how you count, we are at about 100 unrelated South Asians.

Sorry for my absence from the comment section and the slow pace of posting. A large backlog of work and home errands accumulated during my vacation.

Admixture (Ref3 K=11) HRP0131-HRP0140

Here are the admixture results using Reference 3 for Harappa participants HRP0131 to HRP0140.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

Admixture (Ref3 K=11) HRP0121-HRP0130

Here are the admixture results using Reference 3 for Harappa participants HRP0121 to HRP0130.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

Back

I got back yesterday. Due to jetlag and things that accumulated during my absence, it'll take me a few days to get back to regular posting here.

I got about 12 data submissions during my vacation. I'll send everyone their Harappa IDs by tomorrow. If you haven't received an ID from me by Thursday morning, drop me an email to remind me.

I have 2 batches of ten to process for Admixture results. I hope to post those by the end of the week but can't make any promises.

If you sent me an email in the last month that required a response, I will try to reply soon. If you haven't heard from me by July 11, please remind me.

Thanks, Razib, for your guest blogging.

Brahui are something old, not new

From Wikipedia:

The ethnonym "Brahui" is a very old term and a purely Dravidian one. The fact that other Dravidian languages only exist further south in India has led to several speculations about the origins of the Brahui. There are three hypotheses regarding the Brahui that have been proposed by academics. One theory is that the Brahui are a relic population of Dravidians, surrounded by speakers of Indo-Iranian languages, remaining from a time when Dravidian was more widespread. Another theory is that they migrated to Baluchistan from inner India during the early Muslim period of the 13th or 14th centuries. A more established theory says the Brahui migrated to Balochistan from central India after 1000 CE. The absence of any older Iranian (Avestan) influence in Brahui supports this hypothesis. The main Iranian contributor to Brahui vocabulary is a western Iranian language like Kurdish.

A lot of ADMIXTURE plots I've seen are more consistent with the first (indigenous) than the latter two (exogenous) models. Here's a result for K = 9 with ~90,000 markers:

Turks and Pathans

Of interest to readers on this weblog: Pathan parahistory.
