The purpose of PCA is to reduce correlated data to a smaller set of principal components. PCA works best when many of your variables are correlated. The principal components are, in an abstract sense, the components of your data that are uncorrelated. You are transforming your data from its current space to an abstract space where each principal component is a linear combination of the original variables and each component is orthogonal to (uncorrelated with) every other component. By looking at the weights (the linear coefficients) you find, you can determine which variables of your original dataset make up each principal component. If you run PCA on truly uncorrelated data, you will find that each column of data is essentially its own principal component, so there is no need to run PCA.

As an example, say you have 100 variables: 30 of them are highly correlated with each other (we'll call them group 1), 40 others are correlated among themselves (group 2), 20 others are correlated among themselves (group 3), 8 others don't fit nicely into any of these groups and are in some sense correlated with all of them, but at different times and in different ways (group 4), and 2 of them are totally uncorrelated with any of the other variables (group 5). When you run PCA, groups 1, 2, and 3 will each tend to be represented as a distinct principal component. You don't need to identify which columns are correlated and which are uncorrelated before running PCA; it will give that to you.
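To make the 100-variable example concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the data is simulated, each of groups 1 to 3 is driven by one hidden random factor plus a little noise, and names such as `factors`, `group_labels`, and `dominant` are hypothetical. It runs PCA as an eigendecomposition of the correlation matrix, then looks at the weights to see which group carries most of each of the first three components.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 2000

# Assumption: one hidden factor drives each of groups 1-3 (30, 40, 20 columns).
group_sizes = [30, 40, 20]
factors = rng.standard_normal((n_samples, 3))

columns = []
for g, size in enumerate(group_sizes):
    # Each column is its group's factor plus a little independent noise.
    noise = 0.3 * rng.standard_normal((n_samples, size))
    columns.append(factors[:, [g]] + noise)

# Group 4: 8 columns that mix all three factors with random weights.
mix = rng.standard_normal((3, 8))
columns.append(factors @ mix + 0.3 * rng.standard_normal((n_samples, 8)))

# Group 5: 2 columns of independent noise, unrelated to everything else.
columns.append(rng.standard_normal((n_samples, 2)))

X = np.hstack(columns)  # shape (2000, 100)
group_labels = np.repeat([1, 2, 3, 4, 5], [30, 40, 20, 8, 2])

# Standardize, then do PCA as an eigendecomposition of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = (Z.T @ Z) / n_samples
eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
order = np.argsort(eigvals)[::-1]         # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()

# For each of the first three components, find the group whose columns
# carry the largest share of the squared weights.
dominant = []
for k in range(3):
    w2 = eigvecs[:, k] ** 2
    share = [w2[group_labels == g].sum() for g in range(1, 6)]
    dominant.append(int(np.argmax(share)) + 1)
    print(f"PC{k + 1}: explained variance {explained[k]:.2f}, "
          f"dominant group {dominant[-1]}")
```

In this simulated setup the first three components should each be dominated by one of groups 1 to 3, with the largest group coming first, while the group 4 columns spread their weight across several components. Real data will rarely separate this cleanly.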