Ref 2 South Asians + Harappa PCA

Posted by Zack on March 30, 2011

I ran PCA on the South Asian populations included in Reference II dataset as well as 38 South Asian participants of Harappa Project. This is sort of a complementary analysis to the Ref1 South Asian one, as this one includes Kalash, Hazara and the additional South Asian groups in Xing et al.

The reference populations included are: Andhra Brahmin, Andhra Madiga, Andhra Mala, Balochi, Bnei Menashe Jews, Brahui, Burusho, Cochin Jews, Gujaratis, Gujaratis-B, Hazara, Irula, Kalash, Makrani, Malayan, Nepalese, North Kannadi, Paniya, Pathan, Punjabi Arain, Sakilli, Sindhi, Singapore Indians, Tamil Nadu Brahmin, and Tamil Nadu Dalit.

Here's the spreadsheet showing the eigenvalues and the first 15 principal components for each sample.

I computed the PCA using Eigensoft which removed 13 samples as outliers. The Tracy-Widom statistics show that about 25 eigenvectors are significant.

Here are the first 15 eigenvalues:

1	6.374483
2	3.650626
3	3.270121
4	2.999767
5	1.937818
6	1.713315
7	1.538295
8	1.503051
9	1.458331
10	1.448079
11	1.433288
12	1.414678
13	1.408943
14	1.390791
15	1.38101

Here is a 3-D PCA plot (hat tip: Doug McDonald) showing the first three eigenvectors. The plot is rotating about the 1st eigenvector which is vertical. Also, I have stretched the principal components based on the corresponding eigenvalues. Also, you can highlight the individual project participants in the plot by using the dropdown list below the plot.

</p> <p>Your browser does not support frames. Go <a href="http://www.harappadna.org/wp-content/uploads/2011/03/r2_sa_hrp_pca.html">here</a> to see the animation.</p> <p>

Now here are plots of the first 14 eigenvectors. In this case, I have not stretched the principal components, so keep in mind that the first eigenvector explains 1.75 times variation compared to the 2nd eigenvector.

PCAeigenvectors, harappa, south asia

← Two Steps Forward, Two Steps Back

Ref 2 South Asians + Harappa PCA Clusters →

2 Comments.

Ref 2 South Asians + Harappa PCA Clusters | Harappa Ancestry Project - pingback on March 30, 2011 at 1:29 pm
Ref2 South Asian + Harappa Admixture | Harappa Ancestry Project - pingback on April 4, 2011 at 6:46 am

Trackbacks and Pingbacks:

Ref 2 South Asians + Harappa PCA Clusters | Harappa Ancestry Project - Pingback on 2011/03/30/ 13:29
Ref2 South Asian + Harappa Admixture | Harappa Ancestry Project - Pingback on 2011/04/04/ 06:46

Harappa Ancestry Project

Genetics and South Asia