Dear all
I'd like to delete variables from a dataset based on their values in an outstat set created by PROC FACTOR.
proc factor
OUTSTAT=fout
data=normed
method=principal scree
mineigen=0
priors=smc
var
say -- confirm ;
run;
The code above creates an outstat data set called fout. This dataset stores the communality values (among other values) in a row like this:
_TYPE_ | _NAME_ | var1 | var2 | var_n |
COMMUNAL | | 0.3995 | 0.1133 | 0.3744 |
The full outstat dataset (fout) is attached in CSV format. The row in question is #205.
I'd like to delete the variables whose COMMUNAL value is less than .15. In this case, variable var2 would be dropped from the dataset normed.
I can do this by listing the variables in a DROP= option in a DATA step:
DATA normed (DROP = var2);
SET normed;
RUN;
But since this involves checking hundreds of variables, it'd be nice if it could be automated.
Could this process be automated in a program?
I'm using SAS University Edition.
Thank you all in advance for your time!
Tony
I am going to assume that you really want to delete any column that has a value less than 0.15 and greater than -0.15 even though you didn't say that.
Something like this (UNTESTED CODE)
data fout2;
set fout;
array v var1-varn;
do 1 = 1 to dim(v);
v(i)=abs(v(i));
end;
run;
proc summary data=fout2;
var var1-varn;
output out=minimums min=;
run;
proc transpose data=minimums out=minimums_t;
var var1-varn;
run;
proc sql;
select _name_ into :names separated by ' ' from minimums_t
where col1<0.15;
quit;
data want;
set fout;
drop &names;
run;
Thank you for your reply.
I just want to remove the variables whose communality is less than .15; I don't need to handle negative values.
I oversimplified the data table for fout in my original post. The variable names do not have numeric suffixes. The table looks more like this:
_TYPE_ | _NAME_ | say | coronavirus | covid | people | time | take | make | health |
MEAN | | 6.83115306 | 2.98523827 | 3.30916916 | 3.42976631 | 2.12912599 | 1.78743404 | 1.91479591 | 2.58683648 |
STD | | 5.93316505 | 3.26815184 | 3.48430046 | 3.6285918 | 2.05970761 | 1.75296901 | 1.85799116 | 3.5991221 |
N | | 211455 | 211455 | 211455 | 211455 | 211455 | 211455 | 211455 | 211455 |
CORR | say | 1 | 0.15361343 | 0.01273492 | 0.16232542 | -0.1062952 | 0.02024781 | -0.030496 | 0.14823108 |
CORR | coronavirus | 0.15361343 | 1 | -0.0494022 | 0.13147976 | -0.1120803 | 0.01712126 | -0.1051971 | 0.16906057 |
COMMUNAL | | 0.3857993 | 0.42459764 | 0.42993311 | 0.35747462 | 0.20179172 | 0.13523689 | 0.22256621 | 0.54063112 |
PRIORS | | 0.2803741 | 0.33664296 | 0.34151796 | 0.26550787 | 0.14132677 | 0.07544468 | 0.1414966 | |
EIGENVAL | | 6.09010089 | 4.14327681 | 3.2221195 | 2.32712338 | 2.29719713 | 1.82682613 | 1.54001848 | |
The row with the values of interest is the one where _TYPE_ is 'COMMUNAL'. Based on the cut-off of .15, the variable 'take' would be dropped.
The array is throwing an error:
73 data fout2;
74 set fout;
75 array v var1-varn;
ERROR: Missing numeric suffix on a numbered variable list (var1-varn).
WARNING: Defining an array with zero elements.
76 do 1 = 1 to dim(v);
_
80
200
ERROR 80-322: Expecting a variable name.
ERROR 200-322: The symbol is not recognized and will be ignored.
77 v(i)=abs(v(i));
78 end;
79 run;
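For reference, the two errors in this log have separate causes: var1-varn is a numbered range list, which only works when the names really end in numbers, and "do 1 = 1" uses the digit 1 where the loop variable i was intended. A minimal sketch of the same step that avoids both, assuming every numeric column in fout is one of the analysis variables (_TYPE_ and _NAME_ are character, so they are not picked up):

```sas
/* Sketch only: assumes all numeric variables in FOUT are analysis variables */
data fout2;
  set fout;
  array v {*} _numeric_;   /* every numeric variable, whatever its name */
  do i = 1 to dim(v);      /* loop index is the variable i, not the digit 1 */
    v{i} = abs(v{i});
  end;
  drop i;                  /* keep the loop counter out of the output */
run;
```

Because i is first referenced after the ARRAY statement, it is not part of the _numeric_ list the array is built from.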
Based on @PaigeMiller's code, this worked:
data fout2;
set fout (where=(_TYPE_="COMMUNAL"));
run;
proc transpose data=fout2 out=communal; id _TYPE_; run;
proc sql;
select _name_ into :names separated by ' ' from communal
where communal <.15;
quit;
/* drop variables with low communalities from data set */
data normed_clean ;
set normed ;
drop &names;
run;
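One caveat with the code above: if no communality falls below the cut-off, PROC SQL selects no rows and leaves &names unassigned (or holding a value from an earlier run), so the DROP statement either fails or drops the wrong variables. A minimal sketch that guards against this, assuming the fout and normed data sets from this thread; the macro name drop_low_communal and its cutoff parameter are made up for illustration:

```sas
/* Sketch only: macro name and parameter are hypothetical */
%macro drop_low_communal(cutoff=0.15);
  %local names;   /* starts empty; stays empty if no row qualifies */

  /* One row per analysis variable, with its communality in column COMMUNAL */
  proc transpose data=fout(where=(_TYPE_='COMMUNAL')) out=communal;
    id _TYPE_;
  run;

  /* Collect the names of the variables below the cut-off */
  proc sql noprint;
    select _name_ into :names separated by ' '
      from communal
      where communal < &cutoff;
  quit;

  data normed_clean;
    set normed;
    /* Emit the DROP statement only when there is something to drop */
    %if %length(&names) > 0 %then %do;
      drop &names;
    %end;
  run;
%mend drop_low_communal;

%drop_low_communal(cutoff=0.15)
```

Declaring names as %local keeps the list inside the macro, so a rerun with a different cut-off cannot reuse an old list by accident.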