Category Archives: Miscellaneous

Project Anniversary

It has been one year for the Harappa Ancestry Project. I announced it on January 17, 2011 and then moved it to its own domain on January 19.

It started out fast and furious, with participants sending their data every day and me blogging about it multiple times a day. Now it has slowed down quite a bit, with only one unrelated South Asian participant (not counting any Romany) in the last 2 months.

Speaking of participants, there have been a couple of complaints.

The decision to include non-South Asian participants from countries that do not neighbour the Subcontinent contradicts the Harappa Project's original inclusion criterion.

I concur with DMXX. While I'm certainly not anyone to tell Zack how to run his own show, I think the project is losing its original focus. Accepting folks from West-Asia was also fine, given that South-Asians derive a lot of ancient ancestry from the area and thus may be deemed a secondary focus area. The same could also be said for those with partial Roma Gypsy ancestry. But, some of the runs seem to be almost entirely dominated by non-South Asian participants, who have absolutely no connection with the subcontinent. I can't help but ask on what basis these Brazilian, Belizean, Mexican, Hispanic, Somali, African-American and European participants were accepted into the project.

My approach has been to make it clear to any potential participants that my focus is on South Asia. Thus any Admixture components I have computed have been with a heavily South Asian dataset. Also, a number of the PCA and clustering analyses I do are limited to South Asians. On the other hand, I have accepted any participant who has asked to be included. I run a basic Admixture run for everyone, which is a fairly automated process, and sometimes include them in other analyses.

Let me illustrate with an example. I am working on a new Admixture calculator. For computing the components, I am using all the South Asian project participants in addition to reference datasets. I am going to select the data in such a way that we get Admixture components that give us a better idea of South Asian genetic ancestry.

While participation in the project has slowed down, the research on South Asian genetics has picked up. We had the Metspalu et al. paper and dataset. Also, the 1000 Genomes Project is expected to release 400 South Asian samples (100 Lahori Punjabis, 100 Bangladeshis, 100 Sri Lankan Tamils and 100 Indian Telugu) over the summer.

My Phased Genome

I released my 23andme raw data last year. Since I had my parents also genotyped, I am now releasing my phased genome.
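For readers wondering why the parents' genotypes matter here: with both parents genotyped, many sites can be phased by simple Mendelian logic. The post does not say which tool produced the phased file, so the sketch below is only an illustration of the per-site idea. The function name `phase_site` and the two-letter genotype strings are assumptions made for this example.

```python
def phase_site(child, mother, father):
    """Phase one SNP using parental genotypes (illustrative sketch only).

    Each argument is an unordered two-letter genotype such as "AG".
    Returns (maternal_allele, paternal_allele), or None when the site
    cannot be resolved by Mendelian logic alone.
    """
    a, b = child
    # Try both ways of assigning the child's alleles to the parents and
    # keep only the assignments each parent could actually have transmitted.
    candidates = {
        (m, p)
        for m, p in ((a, b), (b, a))
        if m in mother and p in father
    }
    if len(candidates) == 1:
        return candidates.pop()
    # Zero candidates: Mendelian inconsistency (or a genotyping error).
    # Two candidates: child and both parents are heterozygous, so the
    # site is ambiguous without information from neighbouring markers.
    return None


if __name__ == "__main__":
    print(phase_site("AG", "AA", "AG"))  # mother can only have given A
    print(phase_site("AG", "AG", "AG"))  # ambiguous site
```

Sites where all three of us are heterozygous stay unresolved in this sketch. Resolving them requires information from neighbouring markers, which is beyond what this per-site logic does.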

I haven't done anything with it yet. So if you have any ideas of what use I can put it to, chime in.

Happy New Year

Hope y'all have a great 2012! And that we see another fruitful year for personal genetics.

I have been away over the holidays driving around the country and am just getting back to try to catch up with my email. I owe several people IDs which I'll send out by tomorrow.

Another 23andme Sale

23andme is having another sale till December 31: $23 off per kit (from $99 up front). The code to take advantage of the sale price is TPHG6P.

UPDATE: Here is another link for a $23 discount for 23andme.

Computer Upgrade Delays

I upgraded my desktop last week.

The bad news is that I am having to upgrade from XP to Windows 7, which means reinstalling everything.

The good news is that Ubuntu ran perfectly after the upgrade.

And the best news is that with an Intel Core i7-2600 and 8GB of RAM, Admixture is running about 6 times faster.

Zack's Genome Public

I have released my 23andme version 3 genome into the public domain.

I challenge y'all to find anything interesting about my chromosome 9 (which is 93% homozygous).
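If you want to check a figure like that yourself, here is a minimal sketch that tallies the share of homozygous calls per chromosome from a 23andme raw data download. It assumes the usual tab-separated layout (rsid, chromosome, position, genotype, with `#` comment lines) and counts only two-letter A/C/G/T calls. The function name and the counting rule are my assumptions for illustration, not necessarily how the 93% above was computed.

```python
from collections import defaultdict


def homozygosity_by_chromosome(lines):
    """Return {chromosome: fraction of called SNPs that are homozygous}."""
    homozygous = defaultdict(int)
    called = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip anything that is not rsid/chrom/pos/genotype
        _rsid, chrom, _pos, genotype = fields
        # Keep only diploid SNP calls; this drops no-calls ("--"),
        # indel codes, and single-letter haploid calls.
        if len(genotype) != 2 or any(base not in "ACGT" for base in genotype):
            continue
        called[chrom] += 1
        if genotype[0] == genotype[1]:
            homozygous[chrom] += 1
    return {chrom: homozygous[chrom] / called[chrom] for chrom in called}


if __name__ == "__main__":
    sample = [
        "# rsid\tchromosome\tposition\tgenotype",
        "rs1\t9\t100\tAA",
        "rs2\t9\t200\tAG",
        "rs3\t9\t300\t--",
        "rs4\t9\t400\tCC",
    ]
    print(homozygosity_by_chromosome(sample))
```

To run it on a real file, pass an open file handle instead of the sample list, e.g. `homozygosity_by_chromosome(open("genome.txt"))`, where `genome.txt` is a placeholder for wherever you saved the download.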

Also, I got my parents' 23andme results last night.

Back

I got back yesterday. Due to jetlag and things that accumulated during my absence, it'll take me a few days to get back to regular posting here.

I got about 12 data submissions during my vacation. I'll send everyone their Harappa IDs by tomorrow. If you haven't received an ID from me by Thursday morning, drop me an email to remind me.

I have 2 batches of ten to process for Admixture results. I hope to post those by the end of the week but can't make any promises.

If you sent me an email in the last month that required a response, I will try to reply soon. If you haven't heard from me by July 11, please remind me.

Thanks, Razib, for your guest blogging.

Brahui are something old, not new

From Wikipedia:

The ethnonym "Brahui" is a very old term and a purely Dravidian one. The fact that other Dravidian languages only exist further south in India has led to several speculations about the origins of the Brahui. There are three hypotheses regarding the Brahui that have been proposed by academics. One theory is that the Brahui are a relic population of Dravidians, surrounded by speakers of Indo-Iranian languages, remaining from a time when Dravidian was more widespread. Another theory is that they migrated to Baluchistan from inner India during the early Muslim period of the 13th or 14th centuries. A more established theory says the Brahui migrated to Balochistan from central India after 1000 CE. The absence of any older Iranian (Avestan) influence in Brahui supports this hypothesis. The main Iranian contributor to Brahui vocabulary is a western Iranian language like Kurdish.

A lot of ADMIXTURE plots I've seen are more consistent with the first (indigenous) than the latter two (exogenous) models. Here's a result for K = 9 with ~90,000 markers:

Turks and Pathans

Of interest to readers on this weblog: Pathan parahistory.

The Pakistan genome

Some posts elsewhere:

- The Pakistani genome

- When will the first Jehovah’s Witness be sequenced?
