Golumn
Calcite | Level 5

Hi, I was on here a little while ago asking about PROC FACTOR vs. PROC PLS.

 

For the dataset I currently have, I've been advised to continue using PROC FACTOR.

I've done a lot of work to ensure my dataset has only numeric variables, using dummy variables where required.

I then proceeded with below:

 

proc factor data=dataset_used
    method=principal mineigen=0
    rotate=varimax reorder
    outstat=all_pattern;
run;

 

My initial intention was not to limit the number of factors (via nfactors=#) but to see how many factors have an eigenvalue >= 1, and then re-run with a factor limit applied.

 

When I looked at the output, I noticed that the number of records used was different from the number of records read.

 

Question 1: What causes the difference in records used? Is it the presence of a missing value in at least one variable of an observation?

Question 2: If I don't specify a variable list, does it default to using all variables?

Question 3: If you see anything stupid, please feel free to tell me.

 

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

@Golumn wrote:

Question 1: What causes the difference in records used? Is it the presence of a missing value in at least one variable of an observation?

Question 2: If I don't specify a variable list, does it default to using all variables?

Question 3: If you see anything stupid, please feel free to tell me.

 


Answer 1: Yes, missing values are the cause.
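
As a rough check on this, the sketch below counts the rows that contain at least one missing numeric value. PROC FACTOR omits an observation when any analysis variable is missing, so this count should account for the gap between records read and records used. It assumes every numeric variable in dataset_used goes into the analysis; n_incomplete is a counter name introduced here for illustration only.

```sas
/* Per-variable counts of non-missing (N) and missing (NMISS) values */
proc means data=dataset_used n nmiss;
run;

/* Count rows with at least one missing numeric value.
   n_incomplete is a hypothetical counter name. */
data _null_;
    set dataset_used end=last;
    if nmiss(of _numeric_) > 0 then n_incomplete + 1;
    if last then put "Rows with a missing numeric value: " n_incomplete;
run;
```

The count is written to the SAS log. If it does not match the difference reported by PROC FACTOR, the assumption that all numeric variables are analysis variables is worth rechecking.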

Answer 2: It will probably use all numeric variables, but I have never tried it.
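
For reference, the documented default when the VAR statement is omitted is that PROC FACTOR analyzes all numeric variables not named in other statements. One way to remove any doubt is to list the variables explicitly, as in the sketch below. The names x1-x10 are hypothetical placeholders, not variables from the original dataset.

```sas
/* Same analysis as the original call, with an explicit variable list.
   x1-x10 are hypothetical names; substitute the real analysis variables. */
proc factor data=dataset_used
    method=principal mineigen=0
    rotate=varimax reorder
    outstat=all_pattern;
    var x1-x10;
run;
```

An explicit list also keeps ID columns or other numeric fields that are not meant to be factored out of the analysis.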

Answer 3: Just pointing out that if you use PROC PLS, you don't have to create the dummy variables first. You can use the CLASS statement, and the procedure will create the dummy variables internally.
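
A minimal sketch of that idea follows. The response y, the categorical predictor region, and the numeric predictors x1 and x2 are all hypothetical names chosen for illustration.

```sas
/* PROC PLS builds the dummy coding for CLASS variables internally.
   y, region, x1 and x2 are hypothetical names. */
proc pls data=dataset_used;
    class region;
    model y = region x1 x2;
run;
```

Unlike PROC FACTOR, PROC PLS needs a response variable on the left of the MODEL statement, so this only applies when there is an outcome to predict.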

--
Paige Miller

Golumn
Calcite | Level 5

Thanks again for the feedback!
