03-14-2013 02:49 PM
I am new to SAS and am using the PRINCOMP procedure to identify the most important inputs affecting the target value. I ran it but cannot find where the principal components are listed, ranked by their strength. The chart shows the components only as numbers, so I assume the original variables were renamed or coded. Where can I see a listing of the most important factors with their NAMES FULLY SHOWN?
03-14-2013 03:23 PM
The default output lists the eigenvectors, which define the principal components. The Proportion column of the Eigenvalues table shows the contribution of each component to the overall variation.
The principal components are listed in order of contribution by default. In the sample I ran, the names did show fully; are yours getting truncated? If so, show the code and let us know your version of SAS and your OS.
You may also want to explain more about what you're trying to achieve. Your question makes me wonder whether principal component analysis is the appropriate procedure.
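To make the Proportion column concrete, here is a small Python sketch (not SAS output) of how that figure is derived from the eigenvalues. The eigenvalues are made-up numbers for a hypothetical 5-variable correlation matrix, and `variance_proportions` is just an illustrative helper name.

```python
# Hypothetical eigenvalues of a correlation matrix for 5 standardized variables.
# With a correlation matrix, the eigenvalues sum to the number of variables.
eigenvalues = [2.5, 1.25, 0.75, 0.25, 0.25]


def variance_proportions(values):
    """Return each eigenvalue's share of the total variance, largest first."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    return [v / total for v in ordered]


proportions = variance_proportions(eigenvalues)
for rank, share in enumerate(proportions, start=1):
    print(f"PC{rank}: {share:.2%} of total variance")
```

Each proportion is simply the component's eigenvalue divided by the sum of all eigenvalues, which is why the components come out already ranked.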
03-14-2013 03:54 PM
Thank you for your reply. I have about 90 parameters recorded for 500 observations of a certain social phenomenon. I need to identify the most important of the 90 collected variables, to be used in a genetic optimization procedure in Excel. I am aware of the need to reduce the number of predictors to maximize the generalization ability of the model. I am using SAS 9.1 on Windows XP SP3.
What would you recommend?
By the way, do you know how many degrees of freedom are allowed for 500 observations? I need a calculator for that.
03-14-2013 04:11 PM
I think proc factor would be the best choice in this case. Look at the factor pattern matrix that proc factor generates and examine the factor loadings. The factor pattern matrix displays the correlations between the variables and the common factors. The loadings (correlations) should be larger in absolute value for the important variables.
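As a rough illustration of reading a factor pattern by absolute loading, here is a Python sketch. The variable names and loading values are invented for the example; in practice they would be copied from one column of the factor pattern matrix.

```python
# Hypothetical loadings of four named variables on the first factor.
# Names and values are invented for illustration only.
loadings = {
    "income": 0.82,
    "age": -0.15,
    "education": -0.74,
    "household_size": 0.31,
}


def rank_by_loading(factor_loadings):
    """Sort (name, loading) pairs by absolute loading, strongest first."""
    return sorted(factor_loadings.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = rank_by_loading(loadings)
for name, value in ranked:
    print(f"{name:15s} {value:+.2f}")
```

The sign is kept in the printout because a large negative loading is just as informative as a large positive one.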
03-14-2013 04:17 PM
Are you expecting to be able to interpret and explain this model, or to predict from it? Generally, factor analysis and PCA don't lend themselves well to interpretation and explanation. They do reduce the number of 'variables', but you still need all the original variables to make any predictions, because the factors and PCs are derived from all of them.
03-14-2013 04:28 PM
Well yes, you nailed it. I need only the most important factors, not all of the variables. Is there a way to single out the most prominent factors without constructing additional principal components? As for proc factor, is that a function or a tool? Thanks!
And yes, I need to predict from this model.
03-14-2013 05:50 PM
If your target value is on a continuous scale and what you are looking for is a subset of your factors that will better predict your target value, you should look at proc reg or proc glmselect. Principal components analysis definitely isn't well suited for your problem.
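For a feel of what "a subset of the original variables" means, here is a deliberately crude Python sketch that ranks predictors by their absolute correlation with the target. This is a simple univariate screen and an assumption of mine, not the selection method proc reg or proc glmselect uses; those fit models and account for overlap between predictors. The data and the names `pearson` and `screen_predictors` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def screen_predictors(predictors, target, keep):
    """Return the `keep` predictors with the largest absolute correlation."""
    scored = [(name, pearson(values, target)) for name, values in predictors.items()]
    scored.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return scored[:keep]


# Made-up data: five observations, three candidate predictors.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
predictors = {
    "x1": [2.0, 4.1, 5.9, 8.2, 9.8],  # nearly linear in the target
    "x2": [5.0, 3.0, 4.0, 1.0, 2.0],  # negative relationship
    "x3": [1.0, 3.0, 2.0, 3.0, 1.0],  # no linear relationship
}

selected = screen_predictors(predictors, target, keep=2)
for name, r in selected:
    print(f"{name}: r = {r:+.3f}")
```

The output is a list of original variable names, untransformed, which is the main difference from principal components.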
03-15-2013 01:47 AM
Running factor analysis generates factor scores, which are uncorrelated variables formed as linear combinations of the original variables. Only the factors that explain most of the variation in the original variables are retained. Usually we select factors that explain a variance of at least 1. Because we are using the correlation matrix, each original variable contributes a variance of exactly 1, so it would not make sense to keep a factor that explains less than that. In your problem there may be 3 or 4 factors that explain 60-70% of the total variation.
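A minimal Python sketch of that retention rule, assuming made-up eigenvalues for 8 standardized variables (so they sum to 8). The function name `kaiser_retain` is only illustrative.

```python
# Hypothetical eigenvalues from a correlation matrix of 8 variables.
eigenvalues = [3.2, 1.6, 1.1, 0.8, 0.5, 0.4, 0.3, 0.1]


def kaiser_retain(values, threshold=1.0):
    """Keep factors whose eigenvalue is at least `threshold`.

    Returns the retained eigenvalues and the share of total variance
    they explain together.
    """
    ordered = sorted(values, reverse=True)
    retained = [v for v in ordered if v >= threshold]
    explained = sum(retained) / sum(ordered)
    return retained, explained


retained, explained = kaiser_retain(eigenvalues)
print(f"Factors retained: {len(retained)}")
print(f"Share of total variance explained: {explained:.1%}")
```

With these numbers three factors pass the cutoff, and the printed share shows how much of the total variation they cover together.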
Since you want to predict from the model, take the standardized values of the variables included in the analysis and multiply them by the Standardized Scoring Coefficients for the selected factors only. That way you can calculate factor scores for new values of the variables appearing in the factor analysis.
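The scoring arithmetic can be sketched in Python as follows. All numbers here are hypothetical: the means and standard deviations are assumed to come from the data used in the factor analysis, and the coefficients stand in for one column of the Standardized Scoring Coefficients table.

```python
def factor_score(observation, means, stds, coefficients):
    """Score one observation on one factor.

    Each variable is standardized with the mean and standard deviation
    of the original sample, then weighted by its scoring coefficient.
    """
    return sum(
        coefficients[name] * (observation[name] - means[name]) / stds[name]
        for name in coefficients
    )


# Hypothetical summary statistics from the original analysis sample.
means = {"income": 50.0, "age": 40.0}
stds = {"income": 10.0, "age": 5.0}

# Hypothetical standardized scoring coefficients for a single factor.
coefficients = {"income": 0.6, "age": -0.2}

# A new observation to score.
new_obs = {"income": 60.0, "age": 35.0}

score = factor_score(new_obs, means, stds, coefficients)
print(f"Factor score: {score:.2f}")
```

Note that every variable in the analysis enters the sum, so scoring a new case still requires all of the original variables.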
Hope this will help in solving your problem.
03-15-2013 04:48 AM
Thank you. I need the list of the variables that are the strongest predictors, and that list should not be transformed by any additional coefficients. Is there a way to do that? I just migrated from SPSS Clementine, which has a feature selection tool that gives you the list of the most prominent inputs.