mszommer
Obsidian | Level 7

Hello,

 

The survey data that I'm analyzing has 180 variables, most of which are ordinal (Likert scale) or nominal/dummy (1/0). I also have about 8 ratio-level variables. The aim is to run Proc FACTOR on the transformed data before running a cluster analysis to see if the consumers can be segmented. I resorted to Proc PRINQUAL due to the nature of my data.

 

I opted for the maximum total variance (MTV) transformation method. The ratio variables are counts of pets, and I did not want them to be transformed (was that the correct decision?), so I used the Linear transformation for them. Opscore and Monotone were used for the nominal and ordinal data, respectively. The proportion of variance explained is a mere 12%.

 

Below is my code:

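/* MTV: maximize the variance accounted for by the first principal components */
proc prinqual data=persona.persona_npca
      method=mtv
      nomiss                        /* exclude observations with missing values */
      out=persona.npca_results2;
   * transform identity(gender--n_pet7);   /* earlier attempt, left commented out */
   transform opscore(gender--imp_aspc_on_pet_sup8)   /* nominal */
             monotone(age--brand_img7)               /* ordinal */
             linear(n_pet1--n_pet7);                 /* ratio: pet counts */
run;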

 

Where am I going wrong? Must I change the transformations?

Kindly help.

 

If Proc PRINQUAL is not the best bet, would it make sense to use Proc CORR to calculate polychoric correlations and then feed that output to Proc FACTOR? I have tried this in the past with no success 😞

 

Regards,

MS
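For the polychoric route raised in the question, a minimal sketch might look like the following. It assumes a SAS release in which PROC CORR supports the POLYCHORIC and OUTPLC= options, and that the ordinal items are the age--brand_img7 range from the original code. The data set name work.plc_corr and the choices nfactors=5 and rotate=varimax are placeholders for illustration, not recommendations.

```sas
/* Polychoric correlations for the ordinal items only.            */
/* work.plc_corr is a hypothetical name for the output data set.  */
proc corr data=persona.persona_npca polychoric outplc=work.plc_corr noprint;
   var age--brand_img7;
run;

/* Factor the polychoric correlation matrix.                      */
/* nfactors=5 and rotate=varimax are illustrative assumptions.    */
proc factor data=work.plc_corr method=principal nfactors=5 rotate=varimax;
run;
```

It is worth printing work.plc_corr first to confirm that PROC FACTOR will read it as a correlation matrix (check the _TYPE_ and _NAME_ columns). With this many items and observations, the polychoric step can also take a long time to run.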

 

 

5 REPLIES
PaigeMiller
Diamond | Level 26

Where am I going wrong?

It's not clear from what you have written what is wrong. Why do you say something is wrong?

--
Paige Miller
SteveDenham
Jade | Level 19

I see a few possibilities for the low % variance explained. The first is that the variables are redundant in some sense: they don't span the full 180-dimensional space, because some of them carry the same information as others. For a regression, this would be a multicollinearity issue. That might be addressed with the MGV method, but I can't be sure. Another possibility is a data issue, such as the Likert scales running one way for some variables and the opposite way for others. I realize that's a stretch, but in that case the monotone transformation could lead to some problems. The last one I can think of is too many ties in the Likert variables, such that including the ones with highly tied responses damps down the total variability. Oh, and I doubt your ratio variable, number of pets, is correctly specified. How do you get 7 different variables for that? It should be one variable, with the number as a response, but I may be completely misinterpreting the variable.

 

SteveDenham
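One quick way to look into the ties possibility is to count the distinct levels of each ordinal item. This is only a sketch, assuming the ordinal items are the age--brand_img7 range used in the MONOTONE transformation above.

```sas
/* Show only the table of distinct-level counts per variable. */
ods select nlevels;
proc freq data=persona.persona_npca nlevels;
   /* NOPRINT on the TABLES statement suppresses the individual frequency tables */
   tables age--brand_img7 / noprint;
run;
```

Items with very few observed levels are candidates for a closer look with an ordinary one-way frequency table, which shows whether most responses pile up in a single category.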

Rick_SAS
SAS Super FREQ

And how many observations are left in the data after you use casewise deletion of observations with missing values (the NOMISS option)?
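A sketch of one way to count the complete cases directly, assuming the analysis variables are the gender--n_pet7 range that appears in the commented-out TRANSFORM statement:

```sas
data _null_;
   set persona.persona_npca end=last;
   /* CMISS counts missing values across both numeric and character variables */
   if cmiss(of gender--n_pet7) = 0 then n_complete + 1;
   /* On the last row, _N_ equals the total number of observations read */
   if last then put 'Complete cases: ' n_complete ' of ' _n_;
run;
```

The count is written to the SAS log. If the range also covers variables that are not in the TRANSFORM statement, the result will understate the number of observations PROC PRINQUAL actually keeps.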

mszommer
Obsidian | Level 7

@SteveDenham The Likert scales run the same way for all the variables.

The no. of pets variables cover the number of dogs, cats, birds, etc.

 

@Rick_SAS 977 of 15633 observations were deleted/omitted due to missing values.

SteveDenham
Jade | Level 19

Given that, I am going to go with collinearity of variables. My idea about different directions on the Likert scale would only have reversed the sign of the loading for that variable, so it was a pretty dumb idea. Sorry.

 

SteveDenham
