tonybesas
Obsidian | Level 7

Dear all

 

I'd like to delete variables from a dataset based on their values in an outstat set created by PROC FACTOR. 

 

proc factor
    outstat=fout
    data=normed
    method=principal scree
    mineigen=0
    priors=smc;
    var say -- confirm;
run;


The code above creates an outstat data set called fout. This dataset stores the communality values (among other values) in a row like this:

 

_TYPE_    _NAME_  var1    var2    var_n
COMMUNAL          0.3995  0.1133  0.3744

 

The full outstat dataset (fout) is attached in CSV format. The row in question is #205.

 

I'd like to delete the variables whose COMMUNAL value is less than .15. In this case, variable var2 would be dropped from the dataset normed.

 

I can do this by listing the variables in a DROP= option on the DATA statement:

 

DATA normed (DROP = var2);
SET normed;
RUN;

 

But since this involves checking hundreds of variables, it'd be nice if it could be automated. 

 

Could this process be automated in a program?

 

I'm using SAS University Edition.

 

Thank you all in advance for your time!

Tony

 

 

 

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

I am going to assume that you really want to delete any column that has a value less than 0.15 and greater than -0.15 even though you didn't say that.

 

Something like this (UNTESTED CODE)

data fout2;
    set fout;
    array v var1-varn;
    do 1 = 1 to dim(v);
        v(i)=abs(v(i));
    end;
run;
proc summary data=fout2;
    var var1-varn;
    output out=minimums min=;
run;
proc transpose data=minimums out=minimums_t;
    var var1-varn;
run;
proc sql;
    select _name_ into :names separated by ' ' from minimums_t
        where col1<0.15;
quit;
data want;
    set fout;
    drop &names;
run;
--
Paige Miller

tonybesas
Obsidian | Level 7

Thank you for your reply.

 

I just want to remove the variables with values less than .15, regardless of any negative values.

 

I oversimplified the data table for fout in my original post. The variable names are not numbered (var1, var2, and so on); they are words. The table looks more like this:

 

_TYPE_    _NAME_       say          coronavirus  covid        people       time         take         make         health
MEAN                   6.83115306   2.98523827   3.30916916   3.42976631   2.12912599   1.78743404   1.91479591   2.58683648
STD                    5.93316505   3.26815184   3.48430046   3.6285918    2.05970761   1.75296901   1.85799116   3.5991221
N                      211455       211455       211455       211455       211455       211455       211455       211455
CORR      say          1            0.15361343   0.01273492   0.16232542   -0.1062952   0.02024781   -0.030496    0.14823108
CORR      coronavirus  0.15361343   1            -0.0494022   0.13147976   -0.1120803   0.01712126   -0.1051971   0.16906057
COMMUNAL               0.3857993    0.42459764   0.42993311   0.35747462   0.20179172   0.13523689   0.22256621   0.54063112
PRIORS                 0.2803741    0.33664296   0.34151796   0.26550787   0.14132677   0.07544468   0.1414966
EIGENVAL               6.09010089   4.14327681   3.2221195    2.32712338   2.29719713   1.82682613   1.54001848

 

The row of interest is the one where _TYPE_ is 'COMMUNAL'. Based on the cut-off of .15, the variable 'take' would be dropped.

 

The array is throwing an error:

 

73         data fout2;
 74             set fout;
 75             array v var1-varn;
 ERROR: Missing numeric suffix on a numbered variable list (var1-varn).
 WARNING: Defining an array with zero elements.
 76             do 1 = 1 to dim(v);
                   _
                   80
                   200
 ERROR 80-322: Expecting a variable name.
 
 ERROR 200-322: The symbol is not recognized and will be ignored.
 
 77                 v(i)=abs(v(i));
 78             end;
 79         run;
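
The log above reports two separate problems. First, `var1-varn` is a numbered range list, which SAS cannot resolve because `varn` has no numeric suffix (and the real variables are named `say`, `coronavirus`, and so on). Second, `do 1 = 1` uses the digit 1 where a loop index variable such as `i` is expected. A minimal sketch of that step with both problems addressed, assuming every analysis column in fout is numeric so that the `_numeric_` list picks them all up:

```sas
data fout2;
    set fout;
    /* _numeric_ covers every numeric variable read from fout,  */
    /* so no numbered variable names are required               */
    array v{*} _numeric_;
    do i = 1 to dim(v);
        v(i) = abs(v(i));
    end;
    drop i;   /* the loop index is not needed in the output */
run;
```

Because the array is declared before `i` is first used, `i` is not part of the `_numeric_` list. This only addresses the two errors in the log; it does not restrict the check to the COMMUNAL row.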
tonybesas
Obsidian | Level 7

Based on @PaigeMiller's code, this worked:

 

data fout2;
    set fout (where=(_TYPE_="COMMUNAL"));
run;

proc transpose data=fout2 out=communal; id _TYPE_; run;

proc sql;
    select _name_ into :names separated by ' ' from communal
        where communal <.15;
quit;

/* drop variables with low communalities from data set */

data normed_clean ;
    set normed ;
    drop &names;
run;
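
With hundreds of variables, it may be worth inspecting the list before dropping anything. The sketch below is a hypothetical pre-check, not part of the original post. It assumes the transposed data set `communal` from the code above, initializes `names` so that a value left over from an earlier run is not reused when no row matches, and excludes missing values, which SAS treats as smaller than any number.

```sas
/* Hypothetical pre-check: start with an empty macro variable */
%let names = ;

proc sql noprint;
    select _name_ into :names separated by ' '
        from communal
        where communal < .15 and communal is not missing;
quit;

/* Write the list of variables that would be dropped to the log */
%put NOTE: Variables below the cutoff: &names;
```

If the list printed in the log is empty, no variable fell below the cutoff and the final DATA step with the DROP statement can be skipped.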
