Tag Archives: errata

Metspalu Dataset Update

Posted by Zack on January 4, 2012 4 comments

Dr. Metspalu, who has been very good about sharing data and information, has informed me of two cases of mislabeling in the Metspalu et al. dataset:

Our sample labelled D238 and reported as Tharu is in fact a Brahmin sample from Uttar Pradesh.

Following the publication we have identified that sample evo_32 was erroneously labelled as Kanjar before any genetic analyses. We hereby re-label the sample as belonging to Kol population.

Thus, I have updated the Metspalu admixture results and clustering results.

Reference 3 Fixed

Posted by Zack on April 17, 2011 2 comments

I have fixed the problem with Reference 3 but if you notice any strange results, do let me know.

While the Reference 3 admixture results were generally good (and I have some nice surprises on the way, I hope), the Reich et al populations had some weird behavior. From one K value to the next, their admixture would swing wildly, especially among the minor components.

For example, for Chenchu, the 2nd component after South Asian was Southwest Asian (42%) at K=6, European (45%) at K=7 and American (32%) at K=8. That just didn't make any sense. It was similar for other Reich et al populations, but all the other reference populations seemed pretty stable.

The issue was that when I was creating Reference 3, I had to juggle lists of SNPs to figure out a way to include Reich et al with a large (>100,000) number of SNPs in the dataset, since Reich et al doesn't have as many SNPs in common with the other datasets plus 23andMe (v2 and v3) and FTDNA. In that effort, where I was doing lots of SNP set intersections and unions, I messed up. I used 217,000 SNPs. While these SNPs were present in all the other datasets, Reich et al had only 102,000 SNPs in common with that set, so more than half of the SNPs were missing for those samples. Ouch! This was a royal mess, as the high missing rate of Reich et al caused weird instability in its admixture results even though the rest of the results were mostly stable.

Now, I have pared down Reference 3 to 118,000 SNPs. These have a low missing rate in all the datasets. So I don't expect the same problems.
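To make the mistake concrete, here is a minimal Python sketch of the kind of check that would have caught it. It is not the code I actually used: the function names (`missing_rate`, `common_snps`), the dataset names and the SNP IDs are all made up for illustration, and it uses a strict intersection as a stand-in for "low missing rate". With the real numbers, 102,000 of 217,000 SNPs present means roughly 53% of the SNPs were missing for Reich et al.

```python
def missing_rate(candidate_snps, dataset_snps):
    """Fraction of the candidate SNPs that a dataset does not contain."""
    candidate = set(candidate_snps)
    if not candidate:
        return 0.0
    return len(candidate - set(dataset_snps)) / len(candidate)


def common_snps(datasets):
    """SNPs present in every dataset (plain set intersection)."""
    snp_sets = [set(snps) for snps in datasets.values()]
    return set.intersection(*snp_sets) if snp_sets else set()


# Toy data: hypothetical SNP IDs, not the real marker lists.
datasets = {
    "dataset_a": {"rs1", "rs2", "rs3", "rs4"},
    "dataset_b": {"rs1", "rs2", "rs3", "rs4"},
    "sparse": {"rs1", "rs2"},  # stands in for a dataset with fewer SNPs
}

# The mistake: build the SNP list without the sparse dataset.
bad_set = common_snps({k: v for k, v in datasets.items() if k != "sparse"})
bad_rates = {name: missing_rate(bad_set, snps) for name, snps in datasets.items()}

# The fix: only keep SNPs that every dataset has.
good_set = common_snps(datasets)
good_rates = {name: missing_rate(good_set, snps) for name, snps in datasets.items()}

print("bad: ", bad_rates)
print("good:", good_rates)
```

Checking the per-dataset missing rate before running admixture is cheap, and a single dataset with a rate far above the others is the warning sign to look for.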

I am redoing the admixture runs with this new data and will have some of the results up soon.

Reference 3 Admixture

Posted by Zack on April 15, 2011 1 comment

I have withdrawn the Admixture results for Reference 3 for now while I figure out why a few of them were weird and unstable.

I will report back on what I find and will have fixed results soon.

Changes due to San/Pygmy Removal

Posted by Zack on February 4, 2011 Comments Off

As mentioned earlier, I removed San and Pygmy groups from my reference datasets.

For the admixture runs on Reference Dataset I, the only major changes are for K=2 ancestral components where most European, Middle Eastern and South/Central Asian groups increase their African component. The changes for K=3,4,5 were minor as shown by these statistics:

K	Median abs. change	Maximum abs. change
3	0.01%	0.22%
4	0.02%	0.26%
5	0.02%	0.71%

I have updated the spreadsheet and the plots in the original post.

Looking at the changes in the admixture results I already posted for Harappa Project participants HRP0001 to HRP0010, there is a major change for K=2. The African component (C1/red) increased by a lot among all project participants. This seems to be due to the African component best representing West Africans now instead of Pygmies as it did before.

For K=3,4,5, the changes are very minor. Let's look at the absolute value of the changes in the percentages of ancestral components for the ten project participants.

K	Median abs. change	Maximum abs. change
3	0.05%	0.19%
4	0.05%	0.22%
5	0.09%	0.60%
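For anyone who wants to compute the same kind of summary, here is a small Python sketch. The percentages below are invented for illustration, not the actual participant results, and `change_summary` is just a name I picked. It assumes the old and new results are given as rows of component percentages in the same order.

```python
from statistics import median


def change_summary(before, after):
    """Median and maximum absolute change across all individuals and components.

    `before` and `after` are lists of rows, one row per individual,
    each row holding ancestral component percentages in the same order.
    """
    diffs = [
        abs(new - old)
        for old_row, new_row in zip(before, after)
        for old, new in zip(old_row, new_row)
    ]
    return median(diffs), max(diffs)


# Made-up percentages for two individuals at K=3.
before = [[60.0, 30.0, 10.0], [55.0, 35.0, 10.0]]
after = [[60.1, 29.95, 9.95], [55.0, 34.8, 10.2]]

med, mx = change_summary(before, after)
print(f"median abs change: {med:.2f}%, maximum abs change: {mx:.2f}%")
```

The median shows what a typical component did, while the maximum flags the single worst case, which is why both are in the table.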

I have updated the spreadsheets and the charts in the original post.

Harappa Ancestry Project

Genetics and South Asia
