Monthly Archives: July 2011

Admixture: Supervised Zombies Vs Unsupervised

I wanted to see how supervised ADMIXTURE using zombies performs compared to regular unsupervised ADMIXTURE. "Zombies" here refers to genomes created from allele frequencies using the --simulate option of plink.

To do this, I took the allele frequencies that ADMIXTURE computed for Reference 3 at K=11 ancestral components and generated 25 zombie individuals per ancestral component.
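As a rough illustration of what a zombie is, here is a minimal Python sketch that draws diploid genotypes from per-component allele frequencies. It is not plink's implementation: it assumes independent SNPs and Hardy-Weinberg proportions, and the function names and the toy frequency table are made up for the example.

```python
import random


def simulate_zombies(freqs, n_individuals, rng):
    """Draw genotypes (0, 1 or 2 copies of the counted allele) for one component.

    freqs is a list of allele frequencies, one per SNP. Each genotype is the
    sum of two independent allele draws (a Hardy-Weinberg assumption).
    """
    zombies = []
    for _ in range(n_individuals):
        genotype = [int(rng.random() < p) + int(rng.random() < p) for p in freqs]
        zombies.append(genotype)
    return zombies


def simulate_all_components(freq_table, per_component=25, seed=0):
    """Return a list of (component_label, genotype) pairs for every component."""
    rng = random.Random(seed)
    labelled = []
    for label, freqs in freq_table.items():
        for genotype in simulate_zombies(freqs, per_component, rng):
            labelled.append((label, genotype))
    return labelled


# Toy table: 11 hypothetical components, 3 SNPs each.
toy_table = {f"C{k}": [0.1, 0.5, 0.9] for k in range(1, 12)}
zombies = simulate_all_components(toy_table, per_component=25)
print(len(zombies))  # 11 components x 25 zombies = 275
```

With 11 components and 25 zombies each, this gives the 275 samples used in the supervised run.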

Treating each of these 275 zombie samples as belonging 100% to a single ancestral component, I ran ADMIXTURE in supervised mode on the Reference 3 dataset. You can see the population average results here (compare to unsupervised results).
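For readers who want to try this: as far as the ADMIXTURE manual describes it, supervised mode (the --supervised flag) reads a .pop file with one line per individual in the same order as the .fam file, giving a population label for reference individuals and "-" for individuals whose ancestry is to be estimated. The Python sketch below builds such a file under that assumption. The individual IDs, labels and function names are hypothetical.

```python
def build_pop_lines(fam_ids, zombie_labels):
    """Return one .pop line per individual, in .fam order.

    zombie_labels maps a zombie's individual ID to its component label.
    Everyone else gets "-", meaning "estimate this individual's ancestry".
    """
    return [zombie_labels.get(iid, "-") for iid in fam_ids]


def write_pop_file(path, fam_ids, zombie_labels):
    """Write the .pop file next to the dataset (same prefix as the .fam file)."""
    with open(path, "w") as handle:
        for line in build_pop_lines(fam_ids, zombie_labels):
            handle.write(line + "\n")


# Hypothetical IDs: two real samples and two zombies.
fam_ids = ["SAMPLE_A", "ZOMBIE_C1_01", "ZOMBIE_C7_01", "SAMPLE_B"]
zombie_labels = {"ZOMBIE_C1_01": "C1", "ZOMBIE_C7_01": "C7"}
print(build_pop_lines(fam_ids, zombie_labels))  # ['-', 'C1', 'C7', '-']
```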

Since I was interested in the difference between the supervised zombie admixture and the unsupervised results, here are the histograms for the difference between the two for all 3,886 samples and each ancestral component. The histogram bins are 0.5% wide.
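The binning behind such a histogram can be sketched in a few lines of Python. This is only an illustration of the idea, not the script used for the plots: the input lists hold one component's percentages per sample, and the function name and sample values are invented.

```python
import math
from collections import Counter

BIN_WIDTH = 0.5  # percentage points, matching the histograms


def difference_histogram(supervised, unsupervised, bin_width=BIN_WIDTH):
    """Count samples per bin of (supervised - unsupervised) for one component.

    Returns a Counter keyed by bin index; bin i covers the half-open range
    [i * bin_width, (i + 1) * bin_width) in percentage points.
    """
    if len(supervised) != len(unsupervised):
        raise ValueError("both runs must cover the same samples")
    counts = Counter()
    for s, u in zip(supervised, unsupervised):
        counts[math.floor((s - u) / bin_width)] += 1
    return counts


# Made-up percentages for three samples of a single component.
supervised = [10.0, 12.3, 50.0]
unsupervised = [9.8, 12.6, 44.4]
for index, count in sorted(difference_histogram(supervised, unsupervised).items()):
    print(f"{index * BIN_WIDTH:+.1f}% to {(index + 1) * BIN_WIDTH:+.1f}%: {count}")
```

Keying the counts by integer bin index rather than by a floating-point bin edge avoids two nearly equal edges ending up as separate keys.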











Most of the results are within the usual error margins, except for the C7 West African and C10 San/Pygmy components. Those two show larger differences between the unsupervised and supervised zombie approaches. Basically, individuals with West African or San/Pygmy ancestry get ~5-8% more of the West African component in the supervised zombie case, with a corresponding decrease in the San/Pygmy component.

Zack's Genome Public

I have released my 23andme version 3 genome into the public domain.

I challenge y'all to find anything interesting about my chromosome 9 (which is 93% homozygous).
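If you want to check a figure like this on a raw data file, here is a minimal Python sketch. The post does not say how the 93% was computed, so this takes the simplest reading: the share of called SNPs on a chromosome whose two alleles are identical. It assumes the usual 23andme raw data layout (tab-separated rsid, chromosome, position, genotype, with "#" comment lines and "--" for no-calls). The function name and sample lines are invented.

```python
def homozygous_fraction(lines, chromosome):
    """Fraction of called SNPs on one chromosome that are homozygous.

    lines is any iterable of raw data lines. Returns None if the chromosome
    has no called diploid genotypes.
    """
    called = 0
    homozygous = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[1] != chromosome:
            continue
        genotype = fields[3].strip()
        if len(genotype) != 2 or "-" in genotype:
            continue  # skip no-calls and single-allele (haploid) calls
        called += 1
        if genotype[0] == genotype[1]:
            homozygous += 1
    return homozygous / called if called else None


# Invented sample lines in the assumed layout.
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t9\t1000\tAA",
    "rs0000002\t9\t2000\tAG",
    "rs0000003\t9\t3000\tCC",
    "rs0000004\t9\t4000\t--",
    "rs0000005\t10\t5000\tTT",
]
print(homozygous_fraction(sample, "9"))  # 2 of the 3 called SNPs are homozygous
```

Note that a per-SNP fraction like this is not the same as measuring long runs of homozygosity, which needs SNP positions as well.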

Also, I got my parents' 23andme results last night.

Harappa Participants Haplogroups

All the ancestry analysis here has been based on the autosomal genome (i.e., the SNPs on chromosomes 1-22) and not on the sex chromosomes X and Y or the mitochondrial DNA. The reason is basically that the autosome provides information about your overall ancestry.

Since the Y chromosome is inherited only from father to son, it is useful for finding out about your paternal line. Similarly, mitochondrial DNA is inherited from mother to child, so that's good for information on the maternal line. Note however that the paternal and maternal lines are not the sum total of your ancestry. In fact, it is quite possible to have very different mtDNA or Y-DNA ancestry compared to your whole genome.

Anyway, many people are interested in paternal (Y-DNA) haplogroups and maternal (mtDNA) haplogroups. AV requested information on the haplogroups of Harappa Project participants and SB created a spreadsheet where project participants can enter their paternal and maternal haplogroups. I am also pulling that information into my Harappa Participants Ethnicity spreadsheet.

If you tested with 23andme, here are the links to their maternal and paternal haplogroup pages.

Now go ahead and enter your information in the haplogroups spreadsheet.

You might also want to take a look at the Harappa Participants Map.

UPDATE: Please be considerate of others' privacy. Only disclose someone else's information (haplogroups, location, or anything else) if you have explicit permission to do so. Thanks!

Admixture (Ref3 K=11) HRP0141-HRP0150

Here are the admixture results using Reference 3 for Harappa participants HRP0141 to HRP0150.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

150 and 100

The total number of participants has reached 150. The Harappa Ancestry Project has been active for a little less than 6 months.

Of these, depending on how you count, we are at about 100 unrelated South Asians.

Sorry for my absence from the comment section and the slow pace of posting. A large backlog of work and home errands accumulated during my vacation.

Admixture (Ref3 K=11) HRP0131-HRP0140

Here are the admixture results using Reference 3 for Harappa participants HRP0131 to HRP0140.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

Admixture (Ref3 K=11) HRP0121-HRP0130

Here are the admixture results using Reference 3 for Harappa participants HRP0121 to HRP0130.

You can see the participant results in a spreadsheet as well as their ethnic breakdowns and the reference population results.

Here's our bar chart and table. Remember you can click on the legend or the table headers to sort.

If the above interactive charts are not working, here's a static bar graph.

Back

I got back yesterday. Due to jetlag and things that accumulated during my absence, it'll take me a few days to get back to regular posting here.

I got about 12 data submissions during my vacation. I'll send everyone their Harappa IDs by tomorrow. If you haven't received an ID from me by Thursday morning, drop me an email to remind me.

I have 2 batches of ten to process for Admixture results. I hope to post those by the end of the week but can't make any promises.

If you sent me an email in the last month that required a response, I will try to reply soon. If you haven't heard from me by July 11, please remind me.

Thanks, Razib, for your guest blogging.

Brahui are something old, not new

From Wikipedia:

The ethnonym "Brahui" is a very old term and a purely Dravidian one. The fact that other Dravidian languages only exist further south in India has led to several speculations about the origins of the Brahui. There are three hypotheses regarding the Brahui that have been proposed by academics. One theory is that the Brahui are a relic population of Dravidians, surrounded by speakers of Indo-Iranian languages, remaining from a time when Dravidian was more widespread. Another theory is that they migrated to Baluchistan from inner India during the early Muslim period of the 13th or 14th centuries. A third, more established theory says the Brahui migrated to Balochistan from central India after 1000 CE. The absence of any older Iranian (Avestan) influence in Brahui supports this hypothesis. The main Iranian contributor to Brahui vocabulary is a western Iranian language like Kurdish.

A lot of ADMIXTURE plots I've seen are more consistent with the first (indigenous) than the latter two (exogenous) models. Here's a result for K = 9 with ~90,000 markers:

Turks and Pathans

Of interest to readers on this weblog: Pathan parahistory.
