09-21-2015 04:37 PM
I am working on a linear regression with a data set of approximately 140 variables, of which 93 are categorical.
I can't figure out how to do factor analysis on these variables.
I'm using PROC FACTOR for both the numerical and the categorical variables.
For age, I have an age variable that I want to categorize: 1-15 as 1,
16-30 as 2, and above 30 as 3.
But I can't recode all 93 categorical variables by hand like this.
How should I approach this problem?
09-22-2015 02:23 AM
09-22-2015 09:01 AM - edited 09-22-2015 09:12 AM
I think you might want to consider performing Partial Least Squares Regression on this data (PROC PLS).
This is, in a certain manner of speaking, similar to Principal Components Regression (not Factor Analysis regression), but it has better mathematical properties than Principal Components Regression (and probably better than Factor Analysis regression): PLS chooses its components using both the predictors and the response, whereas Principal Components Regression chooses them from the predictors alone. PLS has no difficulty handling ordinal or categorical predictor variables; in PROC PLS you list the categorical ones in a CLASS statement.
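To illustrate what "handling categorical predictors" usually means in practice: each categorical variable enters the model as a set of 0/1 indicator columns, one per level, and the software normally builds these columns for you, so you don't have to recode 93 variables by hand. The sketch below is in Python rather than SAS, purely to show the idea; the `region` variable, its levels, and the `indicator_columns` helper are made up for illustration and are not part of any SAS procedure.

```python
def indicator_columns(values):
    """Expand one categorical column into 0/1 indicator columns, one per level."""
    levels = sorted(set(values))  # distinct levels, in a fixed order
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical toy column; the name and levels are assumptions for illustration.
region = ["north", "south", "east", "south"]
levels, coded = indicator_columns(region)

print(levels)
for row in coded:
    print(row)
```

Each row has exactly one 1, in the column for that observation's level. With many categorical variables, the same expansion is simply applied to each one in turn.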
For age, I have an age variable that I want to categorize: 1-15 as 1, 16-30 as 2, and above 30 as 3.
Please note that you are creating your own problems by turning a continuous variable, AGE, into a categorical one; your life would be much easier if you treated AGE as continuous. Turning continuous variables into categories is generally not recommended, because you lose information. For example, ages 15 and 16 are very close together on a continuous scale, but once you create categories, 16 is treated as entirely different from 15 and as identical to 30. While I don't know your problem or what results you are trying to achieve, in most cases I wouldn't do this (yes, there are exceptions).
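Here is a small sketch of that information loss, using the cut points from the question. It is written in Python rather than SAS just to show the arithmetic. The `age_group` function is a hypothetical helper, and placing age 30 in group 2 is an assumption, since the original post lists 30 in two groups.

```python
def age_group(age):
    """Bin age as in the question: 1-15 -> 1, 16-30 -> 2, above 30 -> 3.

    Assumption: age 30 goes in group 2 (the original post is ambiguous).
    """
    if age <= 15:
        return 1
    if age <= 30:
        return 2
    return 3


# Compare the gap on the continuous scale with the gap after binning.
for a, b in [(15, 16), (16, 30)]:
    print(a, b, "age gap:", b - a, "group gap:", age_group(b) - age_group(a))
```

Ages 15 and 16 differ by one year but land in different groups, while ages 16 and 30 differ by 14 years and land in the same group, so the binned variable no longer reflects how far apart two people actually are in age.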