Hi Everyone,
My data has 500+ variables that are generated by a doloop. Thus variable names follow rule as below:
Factori_Return_j (so let put it Fi_R_j)
i and j are random instead of 0 1 2 3...
Many of the variables are useless, so I don't want to drop them before analyzing the data.
So in the sample below, I want to drop f1_R_2 , f1_R_6, f2_R_0 , f2_R_6 from my data have.
Could you please help me with that?
Thank you so much.
HHCFX
data droplist;
input factor return;
datalines;
1 2
1 6
2 0
2 6
;run;
data droplist; length varname $50.; set droplist;
varname="F"||(factor)||"_R_"||(return);run;
data droplist; set droplist;
keep varname;varname=compress(varname," ");run;
data have;
input F_0R_99 F_1R_2 F_1R_5 F_1R_8 F_2R_0 F_2R_3 F_2R_6 F_2R_9 F_3R_0;
datalines;
1 1 2 3 6 2 1 4 5
;run;
Here's an approach that sticks to data step and SAS macro code:
%let droplist = "";
data droplist;
input factor $ return $;
keep varname;
varname=CATS("F",factor,"_R_",return);
* Assign the varnames to a macro variable;
IF _N_ = 1 THEN CALL symput("droplist", varname);
ELSE CALL symput("droplist", CATX(" ", symget("droplist"), varname));
datalines;
1 2
1 6
2 0
2 6
;
run;
data have;
input F0_R_99 F1_R_2 F1_R_5 F1_R_8 F2_R_0 F2_R_3 F2_R_6 F2_R_9 F3_R_0;
drop &droplist;
datalines;
1 1 2 3 6 2 1 4 5
;
run;
I was able to condense your three data steps into one. I created a macro variable to store the space separated listing of variables you wanted to drop, as named by the factor and return values. To apply the drop, I simply had to add a drop statement to the data step creating "have" that referred back to that macro variable. I also needed to adjust the variable names on your input statement because the placement of the underscores was not consistent with what you described in your question.
With this approach you will receive warnings if there are variables in the droplist that are NOT in the list of variables you actually have, but the program will still run as intended.
Hi,
Use Dictionary.clolumns for filtering the columns you need. If the columns you need can be filtered out on a condition then this approach make sense.
proc sql;
select name INTO: vars separated by " "
from dictionary.columns
where LIBNAME="SASHELP" and memname="CLASS"
and UPCASE(name) in ("NAME","AGE");
quit;
DATA NEED(Drop=&Vars);
SET SASHELP.CLASS;
run;
Here's an approach that sticks to data step and SAS macro code:
%let droplist = "";
data droplist;
input factor $ return $;
keep varname;
varname=CATS("F",factor,"_R_",return);
* Assign the varnames to a macro variable;
IF _N_ = 1 THEN CALL symput("droplist", varname);
ELSE CALL symput("droplist", CATX(" ", symget("droplist"), varname));
datalines;
1 2
1 6
2 0
2 6
;
run;
data have;
input F0_R_99 F1_R_2 F1_R_5 F1_R_8 F2_R_0 F2_R_3 F2_R_6 F2_R_9 F3_R_0;
drop &droplist;
datalines;
1 1 2 3 6 2 1 4 5
;
run;
I was able to condense your three data steps into one. I created a macro variable to store the space separated listing of variables you wanted to drop, as named by the factor and return values. To apply the drop, I simply had to add a drop statement to the data step creating "have" that referred back to that macro variable. I also needed to adjust the variable names on your input statement because the placement of the underscores was not consistent with what you described in your question.
With this approach you will receive warnings if there are variables in the droplist that are NOT in the list of variables you actually have, but the program will still run as intended.
Thank you so much.
HHCFX
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.