Re: Proc PRINQUAL: transformations

mszommer · Posted 07-13-2020 06:18 AM

Hello,

The survey data that I'm analyzing has 180 variables, most of which are ordinal (Likert scale) or nominal/dummy (1/0). I also have about 8 ratio-level variables. The aim is to run Proc FACTOR on the transformed data before running a cluster analysis to see if the consumers can be segmented. I resorted to Proc PRINQUAL due to the nature of my data.

I opted for the maximum total variance (MTV) transformation method. The ratio variables are numbers of pets, and I did not want them to be transformed (was that a correct decision?), hence I used the Linear transformation for them. Opscore and Monotone were used for the nominal and ordinal data, respectively. The proportion of variance explained is a mere 12%.

Below is my code:

proc prinqual data=persona.persona_npca
              method=mtv
              nomiss
              out=persona.npca_results2;
   *transform identity(gender--n_pet7);
   transform opscore(gender--imp_aspc_on_pet_sup8)
             monotone(age--brand_img7)
             linear(n_pet1--n_pet7);
run;

Where am I going wrong? Must I change the transformations?

Kindly help.

If Proc PRINQUAL is not the best bet, would it make sense to use Proc CORR to calculate Polychoric correlations before using the output for Proc FACTOR? I have tried to use it in the past with no success 😞
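For reference, the polychoric route asked about here might look roughly like the sketch below. It rests on several assumptions: a SAS release whose Proc CORR supports the POLYCHORIC and OUTPLC= options, the ordinal items being the age--brand_img7 range from the TRANSFORM statement above, and work.plc as a hypothetical output data set name. NFACTORS=5 is an arbitrary placeholder, not a recommendation.

```sas
/* Sketch only: polychoric correlations for the assumed ordinal block. */
/* This can be slow when the variable list is long.                    */
proc corr data=persona.persona_npca polychoric outplc=work.plc noprint;
   var age--brand_img7;   /* assumed range, taken from the MONOTONE list */
run;

/* Factor the correlation matrix rather than the raw data.             */
/* TYPE=CORR tells Proc FACTOR to read the input as correlations.      */
proc factor data=work.plc(type=corr)
            method=principal
            nfactors=5       /* placeholder value, chosen for illustration */
            rotate=varimax
            scree;
run;
```

Because Proc FACTOR only sees a correlation matrix in this setup, it cannot write factor scores for the individual respondents; scores for the later cluster analysis would have to be computed in a separate step.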

Regards,

MS

PaigeMiller · Posted 07-13-2020 09:02 AM

Where am I going wrong?

It's not clear from what you have written what is wrong. Why do you say something is wrong?

--
Paige Miller

SteveDenham · Posted 07-13-2020 09:56 AM

I see a couple of possibilities for the low % variance explained. The first is that the variables are redundant in some sense: they don't span the full 180-dimensional space because some of them carry overlapping information. For a regression, this would be a multicollinearity issue. That might be addressed with the MGV method, but I can't be sure. Another possibility is that there are some data issues, such that the Likert scales run one way for some variables and the opposite way for others. I realize that's a stretch, but in that case the monotone transformation could lead to some problems. The last one I can think of is too many ties in the Likert variables, such that including the ones with highly tied responses damps down the total variability. Oh, and I doubt your ratio variable, number of pets, is correctly specified. How do you get 7 different variables for that? It should be one variable, with the number as a response, but I may be completely misinterpreting the variable.
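A rough way to look into the redundancy and ties possibilities, as a sketch only. It assumes the Likert items are numeric and are the age--brand_img7 range from the original TRANSFORM statement. Proc PRINCOMP prints the eigenvalues of the correlation matrix of the untransformed items, where eigenvalues near zero point to near-linear dependencies, and Proc FREQ with NLEVELS reports how many distinct values each item takes.

```sas
/* Sketch: eigenvalues of the raw correlation matrix for the assumed Likert block. */
proc princomp data=persona.persona_npca;
   var age--brand_img7;   /* assumed variable range, taken from the MONOTONE list */
run;

/* Number of distinct levels per item; NOPRINT suppresses the full frequency tables. */
proc freq data=persona.persona_npca nlevels;
   tables age--brand_img7 / noprint;
run;
```

Dropping the NOPRINT option would print the full frequency table for each item, which shows whether responses pile up on one or two categories.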

SteveDenham

Rick_SAS · Posted 07-13-2020 05:01 PM

And how many observations are left in the data after you use casewise deletion of observations with missing values (the NOMISS option)?
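One way to get that count before running Proc PRINQUAL, sketched under the assumption that gender--n_pet7 (the range in the commented-out IDENTITY line) covers every analysis variable:

```sas
/* Sketch: count observations with no missing value in the assumed analysis range. */
data _null_;
   set persona.persona_npca end=last;
   /* CMISS counts missing values among numeric and character arguments alike. */
   if cmiss(of gender--n_pet7) = 0 then complete + 1;
   /* On the last observation, _N_ equals the total number of rows read. */
   if last then put 'Complete cases: ' complete ' of ' _n_;
run;
```

The result is written to the SAS log.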

mszommer · Posted 07-14-2020 06:39 AM

@SteveDenham The Likert scales run the same way for all the variables.

The number-of-pets variables record the number of dogs, cats, birds, etc.

@Rick_SAS 977 of 15633 observations were deleted/omitted due to missing values.

SteveDenham · Posted 07-14-2020 11:52 AM

Given that, I am going to go with collinearity of variables. My idea about different directions on the Likert scale would only have reversed the sign of the loading for that variable, so it was a pretty dumb idea. Sorry.

SteveDenham
