I realize you can do that but similar to the "LIKE" problem, it will grab all of the columns that start with "SOMETHING" and not just the ones that start with "S"
proc sql noprint;
select name into :var_list separated by " "
from sashelp.vcolumn
where libname = "WORK"
and memname = "HAVE"
and name like 'S%';
quit;
%put &var_list;
Here's how you can create that variable list in a macro variable, if desired.
@hubertsng wrote:
I have no idea how to condense my question into one sentence but what I am trying to do is perform a macro on all columns that name starts with a successive pattern (S1-S100).
So let's say there are 500 columns in which that 100 of them are named S1_Something to S100_Something and the rest are named SOMETHING_Something to SOMETHING_Something. I want to perform a macro on all of the columns that start with S1-S100. So what I have so far is kind of an ugly way of doing it but it is
PROC SQL; CREATE TABLE ques AS SELECT * FROM dictionary.columns WHERE memname="HAVE"; QUIT; PROC TRANSPOSE DATA=ques OUT=quest; VAR name; RUN; DATA quest; SET quest; ARRAY cols COL1-COL%EVAL(&sqlobs.-2); DO OVER cols; IF SCAN(cols, 1, "_") IN ("S1", "S2", "S3") THEN PUT cols; END; RUN;
But instead of outputting the column into the log, I would then concatenate it into a macro variable and then shove that macro variable into a macro that will then loop over all of the strings inside the macro variable and perform the macro on that. So what I have should work however I was wondering if there was an easier way of doing it. Basically having an array of "S1"-"S100" and seeing if it is in that array but I don't believe that is possible. I wouldn't be able to do LIKE S% because it would also grab all of the columns that are like S% which is all of the columns and not just S1_Something - S100_Something.
Go ahead and ask any questions if you are confused on what I want exactly since I'm bad at explaining.
Like the example that I had given, that code will grab all of the columns that start with "S" which would grab the columns SOMETHING1_something-SOMETHING400_something and notj ust S1_something-S100_something.
@hubertsng wrote:
Like the example that I had given, that code will grab all of the columns that start with "S" which would grab the columns SOMETHING1_something-SOMETHING400_something and notj ust S1_something-S100_something.
You can make like more complex though, even use REGEX if you want to make it super complicated. You're finding a lot of reasons why our solutions won't work for you. I really suggest explaining what you're doing in more detail at this point.
Suppose you have three groups of variables:
1) V_prefixA_1_suffixA - V_prefixA_100_suffixA
2) V_prefixA_1_suffixB - V_prefixA_100_suffixB
3) V_prefixA_1 - V_prefixA_100
you want to grab only one group.
- using V_prefixA_: will grab all the three groups !
- A list defined by: Array V_prefixA_1 - V_prefixA_100; will grab group 3 only,
but you can't use that method to grab group 1 or group 2.
In that case I suggest to do a massive rename of variables:
- rename group 1 to V_prefixA_suffixA_1 - V_prefixA_suffixA_100 and define list by V_prefixA_suffixA_:
- rename group 2 to V_prefixA_suffixB_1 - V_prefixA_suffixB_100 and define list by V_prefixA_suffixB_:
If your case is similar to above you can do the massive rename by:
proc sql;
create table var_names as select
name from dictionary.columns
where memname='HAVE';
quit;
filename ren temp;
data _null_;
set var_names;
length a_line $80;
chk1 = 'V_prefixA_';
chk2 = '_suffixA';
chk3 = '_suffixB';
ix1 = index(name,trim(chk1));
ix2 = index(name,trim(chk2));
ix3 = index(name,trim(chk3));
if ix1=1 then do;
if ix2 > 1 then new_name = cats(ch1,chk2,scan(name,3,'_')); else
if ix3 > 1 then new_name = cats(ch1,chk3,scan(name,3,'_'));
file ren;
a_line = cats(name,' = ',new_name);
put a_line;
end;
run;
proc datasets lib=<library having data HAVE >;
modify have;
%include ren;
quit; run;
you should adapt the CHK# constants to check according to your specific variable names.
The SCAN function should grab the sequence number of the variable in the list.
Pay attention - using proc datasets - modify statement will modify your original HAVE dataset.
You can rename the variables creating new dataset by:
data new;
set have;
%include ren;
run;
My understanding of your rules (after your Nth explanation) is:
- From a given set of variable names in SAS data set HAVE, select the variable names:
starting with a given prefix (e.g., "S")
followed by up to 3 digits
followed by a non-digit
You don't need to go to the dictionary tables for this - the DATA step compiler is a good enough tool:
data have ;
call missing (S1_a, S10p4q, S77bb12, S85_xyz23, S100, SS, S_33, SAb99, Z01_w) ;
run ;
data want (keep = _vn) ;
length _vn $ 32 ;
_p = "S" ; * prefix ;
_d = 3 ; * number of digits after prefix before non-digit ;
do until (_vn = "") ;
call vnext (_vn) ;
if _vn =: _p and 1 < notdigit (substr (_vn, 1 + length (_p))) < _d + 2 then output ;
end ;
stop ;
set have ;
run ;
Now that you have the data set WANT you can do with the _VN values in it whatever you want. In fact, if you just want to create a bunch of macro variables based on the _VN values, you don't even need to write WANT - it can be done in the same DO loop shown above.
Kind regards
Paul D.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.