Dear SAS community,
I have a question. I am currently building models for a predicting project.
I'm starting with hundreds of variables on hand.
Specifically, for example. I have 20 base variables X1,X2,X3...X20.
Each variable has 6 transformations. For example
X1_lag1,
X1_lag2,
X1_lead1,
X1_lead2,
X1_avg4m,
X1_avg4m
Now I'm having this problem: for certain variables, I need to exclude certain transformation.
For example, for variable X5 and X10 in particular, I need to remove the lead transformation of X5 and X10.
I wrote the following code and it works,
PROC SQL;
SELECT DISTINCT variable_name INTO :selected list FROM var_list
WHERE
upcase(name) not like %X5_LEAD1%
AND upcase(name) not like %X5_LEAD2%
AND upcase(name) not like %X10_LEAD1%
AND upcase(name) not like %X10_LEAD2%
;
QUIT;
But the problem is that in reality, I have hundreads of variables whose certain transformations need to excluded.
what I need now is a more flexible way to write the code that I can change the combination of base variables and transformations that I want to exclude. Currently, the code is manual and not very effective.
I tried call execute but it doesn't work. I tried write a loop using arrays in a macro but having some difficulties. Can anyone point me to some examples, that'd be great. Thanks!
Ideally, I want to write a macro, the input has two elements, 1. the list of base variables; 2. the list of tranforms.
Then every possible combination of base variable from list 1+transformations from list 2 will be excluded.
Then by executing the macro, goals are achieved. No need to manually to update/change/write a complete list of
upcase(name) not like %X5_LEAD1%
AND upcase(name) not like %X5_LEAD2%
AND upcase(name) not like %X10_LEAD1%
AND upcase(name) not like %X10_LEAD2%
I am able to allow the baseline variable be a macro, but still the problem of the code below is that only one variable can be executed at one time. Is there anyway to do a list of variables at one time?
%LET VARN=var5;
PROC SQL;
SELECT DISTINCT variable INTO :selected_list FROM var_list
WHERE upcase(name) not like cats('%',"&VARN.",'_LEAD1%')
AND upcase(name) not like cats('%',"&VARN.",'__LEAD2%')
;
QUIT;
Thanks!
Only the base variable changes.
So, as an example, we have a SAS data set which contains the base variables of interest (you could read this from a text file or Excel)
data base_variables;
input base_variable_name $32.;
cards;
x1
x11
x21
x111
paige1
paigemiller1
;
/* Create list of base variable names to exclude */
proc sql noprint;
select quote(cats(upcase(base_variable_name),'_LEAD1')) into :names1 separated by ',' from base_variables;
select quote(cats(upcase(base_variable_name),'_LEAD2')) into :names2 separated by ',' from base_variables;
quit;
%put &=names1;
%put &=names2;
proc sql;
select distinct variable_name into :selected_list from var_list
where upcase(name) not in (&names1,&names2);
quit;
@changxuosu wrote:
But the problem is that in reality, I have hundreads of variables whose certain transformations need to excluded.
what I need now is a more flexible way to write the code that I can change the combination of base variables and transformations that I want to exclude. Currently, the code is manual and not very effective.
How would you know what variables and which transformations need to be excluded? Is this a text list, or an Excel file, or something else?
Is it always exclude LEAD1 and LEAD2, or does that vary?
Only the base variable changes.
So, as an example, we have a SAS data set which contains the base variables of interest (you could read this from a text file or Excel)
data base_variables;
input base_variable_name $32.;
cards;
x1
x11
x21
x111
paige1
paigemiller1
;
/* Create list of base variable names to exclude */
proc sql noprint;
select quote(cats(upcase(base_variable_name),'_LEAD1')) into :names1 separated by ',' from base_variables;
select quote(cats(upcase(base_variable_name),'_LEAD2')) into :names2 separated by ',' from base_variables;
quit;
%put &=names1;
%put &=names2;
proc sql;
select distinct variable_name into :selected_list from var_list
where upcase(name) not in (&names1,&names2);
quit;
Let's assume that the next time you run the analysis, you want to exclude these transformations
LEAD1 LEAD2 BACK7 and MICKEYDOLENZ
%let transforms=LEAD1 LEAD2 BACK7 MICKEYDOLENZ;
data base_variables;
input base_variable_name $32.;
cards;
x1
x11
x21
x111
paige1
paigemiller1
;
%macro dothis;
/* Create list of base variable names to exclude */
proc sql noprint;
%do i=1 %to %sysfunc(countw(&transforms));
%let thisone=%scan(&transforms,&i,%str( ));
select quote(cats(upcase(base_variable_name),"_&thisone")) into
:names&i separated by ',' from base_variables;
%end;
quit;
proc sql;
select distinct variable_name into :selected_list from var_list
where upcase(name) not in
(%do i=1 %to %sysfunc(countw(&transforms)); &&names&i
%if &i<%sysfunc(countw(&transforms)) %then %str(,); %end;);
quit;
%mend;
options mprint;
%dothis
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.