I have no idea how to condense my question into one sentence but what I am trying to do is perform a macro on all columns that name starts with a successive pattern (S1-S100).
So let's say there are 500 columns in which that 100 of them are named S1_Something to S100_Something and the rest are named SOMETHING1_Something to SOMETHING400_Something. I want to perform a macro on all of the columns that start with S1-S100. So what I have so far is kind of an ugly way of doing it but it is
PROC SQL;
CREATE TABLE ques AS
SELECT * FROM dictionary.columns
WHERE memname="HAVE";
QUIT;
PROC TRANSPOSE DATA=ques OUT=quest;
VAR name;
RUN;
DATA quest;
SET quest;
ARRAY cols COL1-COL%EVAL(&sqlobs.-2);
DO OVER cols;
IF SCAN(cols, 1, "_") IN ("S1", "S2", "S3") THEN PUT cols;
END;
RUN;
But instead of outputting the column into the log, I would then concatenate it into a macro variable and then shove that macro variable into a macro that will then loop over all of the strings inside the macro variable and perform the macro on that. So what I have should work however I was wondering if there was an easier way of doing it. Basically having an array of "S1"-"S100" and seeing if it is in that array but I don't believe that is possible. I wouldn't be able to do LIKE S% because it would also grab all of the columns that are like S% which is all of the columns and not just S1_Something - S100_Something.
Go ahead and ask any questions if you are confused on what I want exactly since I'm bad at explaining.
Okay so I got closer to what I want by reading the link you gave me with
PROC SQL;
SELECT name || "=" || CATS(SUBSTR(name, 1, 27), "_temp") INTO :rename_list SEPARATED BY " "
FROM dictionary.columns
WHERE libname="WORK"
AND memname="HAVE";
QUIT;
PROC DATASETS LIBRARY=WORK NOLIST;
MODIFY HAVE;
RENAME &rename_list.;
QUIT;
PROC SQL;
SELECT name || "=" || SCAN(name, 1, "_") INTO :rename_list SEPARATED BY " "
FROM dictionary.columns
WHERE libname="WORK"
AND memname="HAVE";
QUIT;
PROC DATASETS LIBRARY=WORK NOLIST;
MODIFY HAVE;
RENAME &rename_list.;
QUIT;
And the only thing that doesn't work now is the macro. I successfully renamed all of the variables to S1-S7 P1-P9, etc. instead of having them be S1_Something-S7_Something
ANOTHER WAY OF DOING IT
If you look at my code, I got all the way to the end but I don't know how to grab the columns that are in that successive pattern. I grabbed all of the columns and I can subset the columns by making it satisfy it being inside an array of strings that I enter manually but not a list of strings that I can set automatically. And I could possibly just loop over the columns instead of doing a macro but I feel as though the same problem that exists when looping over the columns exists when shoving all of the columns I want inside the macro.
Have you checked out shortcut lists? If your variables are S1-S1000 you can refer to them as that within your array code at least, this is valid:
array myVars(*) S1-S1000;
Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html
@hubertsng wrote:
If you look at my code, I got all the way to the end but I don't know how to grab the columns that are in that successive pattern. I grabbed all of the columns and I can subset the columns by making it satisfy it being inside an array of strings that I enter manually but not a list of strings that I can set automatically. And I could possibly just loop over the columns instead of doing a macro but I feel as though the same problem that exists when looping over the columns exists when shoving all of the columns I want inside the macro.
How do you make a string of successive S1-S1000 to then compare it to the string of the column name or how do I get a SCAN(colname, 1, "_") to equal the successive variables S1-S1000.
@hubertsng wrote:
How do you make a string of successive S1-S1000 to then compare it to the string of the column name or how do I get a SCAN(colname, 1, "_") to equal the successive variables S1-S1000.
I don't know what that means. Show some example input data with 5 columns and explain what you want as output, or include the example output and someone can likely provide several different methods to accomplish it. There's clearly people here who are more than eager to help if you provide enough details 🙂
I know I'm terrible at explaining.
So let's say I have 500 columns. 100 which is named S1_something, S2_something, S3_something... , S100_something and and SOMETHING1_something, SOMETHING2_something, SOMETHING3_something,... SOMETHING400_something. I want to grab just the 100 columns that start with S1-S100 and not the ones that start with SOMETHING1-SOMETHING400. The reason why I want to grab those columns is that I want to then perform a macro, or loop through those columns and perform a function. Or another way of looking at it is that I want to only KEEP those columns, S1- S100, and DROP the rest because if I do, then I can just loop over all of the columns of the remaining data set and then merge it with the original. Not sure if that clears anything up.
Okay so I got closer to what I want by reading the link you gave me with
PROC SQL;
SELECT name || "=" || CATS(SUBSTR(name, 1, 27), "_temp") INTO :rename_list SEPARATED BY " "
FROM dictionary.columns
WHERE libname="WORK"
AND memname="HAVE";
QUIT;
PROC DATASETS LIBRARY=WORK NOLIST;
MODIFY HAVE;
RENAME &rename_list.;
QUIT;
PROC SQL;
SELECT name || "=" || SCAN(name, 1, "_") INTO :rename_list SEPARATED BY " "
FROM dictionary.columns
WHERE libname="WORK"
AND memname="HAVE";
QUIT;
PROC DATASETS LIBRARY=WORK NOLIST;
MODIFY HAVE;
RENAME &rename_list.;
QUIT;
And the only thing that doesn't work now is the macro. I successfully renamed all of the variables to S1-S7 P1-P9, etc. instead of having them be S1_Something-S7_Something
ANOTHER WAY OF DOING IT
@hubertsng wrote:
I know I'm terrible at explaining.
So let's say I have 500 columns. 100 which is named S1_something, S2_something, S3_something... , S100_something and and SOMETHING1_something, SOMETHING2_something, SOMETHING3_something,... SOMETHING400_something. I want to grab just the 100 columns that start with S1-S100 and not the ones that start with SOMETHING1-SOMETHING400. The reason why I want to grab those columns is that I want to then perform a macro, or loop through those columns and perform a function. Or another way of looking at it is that I want to only KEEP those columns, S1- S100, and DROP the rest because if I do, then I can just loop over all of the columns of the remaining data set and then merge it with the original. Not sure if that clears anything up.
So your issue is that someone or something has added suffixes to the variables of interest and you want to rename those variables?
Or do you just want to identify those variables?
So is your pattern of interest variable names like Sdddd_XXXX where dddd can be a number between 1 and 9999 and XXXX can be any character? So take the list of names that actually exist and apply some logic to find the one you are interested in. You can then either rename them or just use them. Might help to put the list into a macro variable (or two).
proc contents data=have noprint out=contents; run;
data vars_of_interest;
set contents;
name=upcase(name);
if name=:'S' and index(name,'_');
if 1 <= input(substr(scan(name,1,'_',2),?32.) <= 9999;
new_name = scan(name,1,'_');
run;
proc sql noprint;
select name, new_name, catx('=',name,new_name)
into :old_names separated by ' '
, :new_names separated by ' '
, :rename separated by ' '
from vars_of_interest
;
quit;
Then you can use the macro variable to help you write the code to use your dataset.
data want;
set have;
array svars &old_names ;
....
rename &rename ;
run;
using this dummy dataset:
DATA WORK.HAVE;
FORMAT ID $1. S1 $1. S2 $1. S3 $1. L1 $1.L2 $1. L3 $1.;
INFORMAT ID $1. S1 $1. S2 $1. S3 $1. L1 $1.L2 $1. L3 $1.;
INPUT ID S1 S2 S3 L1 L2 L3;
INFILE DATALINES DLM='|' DSD;
DATALINES;
1|A|B|C|X|Y|Z
;
I tested this logic:
PROC SQL;
SELECT DISTINCT name INTO :ColList separated by ','
FROM dictionary.columns
WHERE memname="HAVE"
AND SUBSTR(name,1,2) IN('S1','S2','S3','S4','S5','S6','S7','S8','S9');
QUIT;
%PUT &=ColList.;
And produced the Macro variable "ColList" with this value printed in the log
%PUT &=ColList.; COLLIST=S1,S2,S3
Then wherever you want to use the list of columns identified, you can drop that macro variable into place.
I see that you can do that but I'm trying to make it so I don't have to specify each individual cell. Not sure if that's possible or not. So basically a "S1"-"S1000" type deal.
You can use the colon prefix to make a variable list of all variables that start with a specific prefix.
array all_cols col: ;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.