BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hubertsng
Fluorite | Level 6

I have no idea how to condense my question into one sentence but what I am trying to do is perform a macro on all columns that name starts with a successive pattern (S1-S100).

 

So let's say there are 500 columns in which that 100 of them are named S1_Something to S100_Something and the rest are named SOMETHING1_Something to SOMETHING400_Something. I want to perform a macro on all of the columns that start with S1-S100. So what I have so far is kind of an ugly way of doing it but it is

PROC SQL;
    CREATE TABLE ques AS
    SELECT * FROM dictionary.columns
    WHERE memname="HAVE";
QUIT;

PROC TRANSPOSE DATA=ques OUT=quest;
    VAR name;
RUN;

DATA quest;
    SET quest;
    ARRAY cols COL1-COL%EVAL(&sqlobs.-2);
    DO OVER cols;
        IF SCAN(cols, 1, "_") IN ("S1", "S2", "S3") THEN PUT cols;
    END;
RUN;

But instead of outputting the column into the log, I would then concatenate it into a macro variable and then shove that macro variable into a macro that will then loop over all of the strings inside the macro variable and perform the macro on that. So what I have should work however I was wondering if there was an easier way of doing it. Basically having an array of "S1"-"S100" and seeing if it is in that array but I don't believe that is possible. I wouldn't be able to do LIKE S% because it would also grab all of the columns that are like S% which is all of the columns and not just S1_Something - S100_Something. 

 

Go ahead and ask any questions if you are confused on what I want exactly since I'm bad at explaining. 

1 ACCEPTED SOLUTION

Accepted Solutions
hubertsng
Fluorite | Level 6

Okay so I got closer to what I want by reading the link you gave me with

PROC SQL;
    SELECT name || "=" || CATS(SUBSTR(name, 1, 27), "_temp") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

PROC SQL;
    SELECT name || "=" || SCAN(name, 1, "_") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

And the only thing that doesn't work now is the macro. I successfully renamed all of the variables to S1-S7 P1-P9, etc. instead of having them be S1_Something-S7_Something

 

ANOTHER WAY OF DOING IT 

https://communities.sas.com/t5/SAS-Programming/Grabbing-all-columns-with-names-that-start-with-a-str...

https://communities.sas.com/t5/SAS-Programming/Grabbing-all-columns-with-names-that-start-with-a-str...

View solution in original post

20 REPLIES 20
Reeza
Super User
SASHELP.VCOLUMN -> has a list of all column names you can query.

If you're looping over columns is there a reason an array wouldn't work instead of macros? Macros are useful but there are other methods.
hubertsng
Fluorite | Level 6

If you look at my code, I got all the way to the end but I don't know how to grab the columns that are in that successive pattern. I grabbed all of the columns and I can subset the columns by making it satisfy it being inside an array of strings that I enter manually but not a list of strings that I can set automatically. And I could possibly just loop over the columns instead of doing a macro but I feel as though the same problem that exists when looping over the columns exists when shoving all of the columns I want inside the macro. 

Reeza
Super User

Have you checked out shortcut lists? If your variables are S1-S1000 you can refer to them as that within your array code at least, this is valid:

 

array myVars(*) S1-S1000;

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 


@hubertsng wrote:

If you look at my code, I got all the way to the end but I don't know how to grab the columns that are in that successive pattern. I grabbed all of the columns and I can subset the columns by making it satisfy it being inside an array of strings that I enter manually but not a list of strings that I can set automatically. And I could possibly just loop over the columns instead of doing a macro but I feel as though the same problem that exists when looping over the columns exists when shoving all of the columns I want inside the macro. 


 

hubertsng
Fluorite | Level 6

How do you make a string of successive S1-S1000 to then compare it to the string of the column name or how do I get a SCAN(colname, 1, "_") to equal the successive variables S1-S1000.

Reeza
Super User

@hubertsng wrote:

How do you make a string of successive S1-S1000 to then compare it to the string of the column name or how do I get a SCAN(colname, 1, "_") to equal the successive variables S1-S1000.


I don't know what that means. Show some example input data with 5 columns and explain what you want as output, or include the example output and someone can likely provide several different methods to accomplish it. There's clearly people here who are more than eager to help if you provide enough details 🙂

hubertsng
Fluorite | Level 6

I know I'm terrible at explaining.

 

So let's say I have 500 columns. 100 which is named S1_something, S2_something, S3_something... , S100_something and and SOMETHING1_something, SOMETHING2_something, SOMETHING3_something,... SOMETHING400_something. I want to grab just the 100 columns that start with S1-S100 and not the ones that start with SOMETHING1-SOMETHING400. The reason why I want to grab those columns is that I want to then perform a macro, or loop through those columns and perform a function. Or another way of looking at it is that I want to only KEEP those columns, S1- S100, and DROP the rest because if I do, then I can just loop over all of the columns of the remaining data set and then merge it with the original. Not sure if that clears anything up. 

hubertsng
Fluorite | Level 6

Okay so I got closer to what I want by reading the link you gave me with

PROC SQL;
    SELECT name || "=" || CATS(SUBSTR(name, 1, 27), "_temp") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

PROC SQL;
    SELECT name || "=" || SCAN(name, 1, "_") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

And the only thing that doesn't work now is the macro. I successfully renamed all of the variables to S1-S7 P1-P9, etc. instead of having them be S1_Something-S7_Something

 

ANOTHER WAY OF DOING IT 

https://communities.sas.com/t5/SAS-Programming/Grabbing-all-columns-with-names-that-start-with-a-str...

https://communities.sas.com/t5/SAS-Programming/Grabbing-all-columns-with-names-that-start-with-a-str...

Reeza
Super User
Is that what you're trying to do? Rename variables? You should be using PROC DATASETS then instead of a data step. You can also rename series quite easily, but I do not recommend using a suffix because you cannot easily reference them later on.

So renaming S1-S7 to S_S1-S_S7 would be:

This would work.

RENAME S1-S7 = S_S1-S_S7;
Tom
Super User Tom
Super User

@hubertsng wrote:

I know I'm terrible at explaining.

 

So let's say I have 500 columns. 100 which is named S1_something, S2_something, S3_something... , S100_something and and SOMETHING1_something, SOMETHING2_something, SOMETHING3_something,... SOMETHING400_something. I want to grab just the 100 columns that start with S1-S100 and not the ones that start with SOMETHING1-SOMETHING400. The reason why I want to grab those columns is that I want to then perform a macro, or loop through those columns and perform a function. Or another way of looking at it is that I want to only KEEP those columns, S1- S100, and DROP the rest because if I do, then I can just loop over all of the columns of the remaining data set and then merge it with the original. Not sure if that clears anything up. 


So your issue is that someone or something has added suffixes to the variables of interest and you want to rename those variables?

Or do you just want to identify those variables?

 

So is your pattern of interest variable names like Sdddd_XXXX where dddd can be a number between 1 and 9999 and XXXX can be any character?  So take the list of names that actually exist and apply some logic to find the one you are interested in.  You can then either rename them or just use them.  Might help to put the list into a macro variable (or two).

proc contents data=have noprint out=contents; run;
data vars_of_interest;
  set contents;
  name=upcase(name);
  if name=:'S' and index(name,'_');
  if 1 <= input(substr(scan(name,1,'_',2),?32.) <= 9999;
  new_name = scan(name,1,'_');
run;
proc sql noprint;
  select name, new_name, catx('=',name,new_name)
    into :old_names separated by ' '
       , :new_names separated by ' '
       , :rename separated by ' '
  from vars_of_interest
  ;
quit;

Then you can use the macro variable to help you write the code to use your dataset.

data want;
  set have;
  array svars &old_names ;
  ....
  rename &rename ;
run; 
tsap
Pyrite | Level 9

using this dummy dataset:

DATA WORK.HAVE;
FORMAT 	 ID $1. S1 $1.	S2 $1. S3 $1. L1 $1.L2 $1. L3 $1.;
INFORMAT ID $1. S1 $1.	S2 $1. S3 $1. L1 $1.L2 $1. L3 $1.;
INPUT    ID 	S1 		S2 	   S3 	  L1 	L2	   L3;
INFILE DATALINES DLM='|' DSD;
DATALINES;
1|A|B|C|X|Y|Z
;

 

 

I tested this logic:

PROC SQL;
	SELECT DISTINCT name	INTO :ColList separated by ','
	FROM dictionary.columns
    WHERE memname="HAVE"
	AND SUBSTR(name,1,2) IN('S1','S2','S3','S4','S5','S6','S7','S8','S9');
QUIT;

%PUT &=ColList.;

 

And produced the Macro variable "ColList" with this value printed in the log

%PUT &=ColList.;
COLLIST=S1,S2,S3

 

Then wherever you want to use the list of columns identified, you can drop that macro variable into place.

hubertsng
Fluorite | Level 6

I see that you can do that but I'm trying to make it so I don't have to specify each individual cell. Not sure if that's possible or not. So basically a "S1"-"S1000" type deal. 

tsap
Pyrite | Level 9
you aren't specifying each cell in my logic. The first two characters of the columns can be S1 through S9... so any number S999, S1000, S845... they would all get picked up using my logic without any changes to the code whatsoever.
Tom
Super User Tom
Super User

You can use the colon prefix to make a variable list of all variables that start with a specific prefix.

array all_cols col: ;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 21334 views
  • 4 likes
  • 6 in conversation