Solved: Re: Grabbing all columns with names that start with a string in a succ...

hubertsng · Posted 07-22-2019 01:00 PM

I have no idea how to condense my question into one sentence but what I am trying to do is perform a macro on all columns that name starts with a successive pattern (S1-S100).

So let's say there are 500 columns in which that 100 of them are named S1_Something to S100_Something and the rest are named SOMETHING1_Something to SOMETHING400_Something. I want to perform a macro on all of the columns that start with S1-S100. So what I have so far is kind of an ugly way of doing it but it is

PROC SQL;
    CREATE TABLE ques AS
    SELECT * FROM dictionary.columns
    WHERE memname="HAVE";
QUIT;

PROC TRANSPOSE DATA=ques OUT=quest;
    VAR name;
RUN;

DATA quest;
    SET quest;
    ARRAY cols COL1-COL%EVAL(&sqlobs.-2);
    DO OVER cols;
        IF SCAN(cols, 1, "_") IN ("S1", "S2", "S3") THEN PUT cols;
    END;
RUN;

But instead of outputting the column into the log, I would then concatenate it into a macro variable and then shove that macro variable into a macro that will then loop over all of the strings inside the macro variable and perform the macro on that. So what I have should work however I was wondering if there was an easier way of doing it. Basically having an array of "S1"-"S100" and seeing if it is in that array but I don't believe that is possible. I wouldn't be able to do LIKE S% because it would also grab all of the columns that are like S% which is all of the columns and not just S1_Something - S100_Something.

Go ahead and ask any questions if you are confused on what I want exactly since I'm bad at explaining.

hubertsng · Posted 07-22-2019 01:57 PM

Okay so I got closer to what I want by reading the link you gave me with

PROC SQL;
    SELECT name || "=" || CATS(SUBSTR(name, 1, 27), "_temp") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

PROC SQL;
    SELECT name || "=" || SCAN(name, 1, "_") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

And the only thing that doesn't work now is the macro. I successfully renamed all of the variables to S1-S7 P1-P9, etc. instead of having them be S1_Something-S7_Something

ANOTHER WAY OF DOING IT

https://communities.sas.com/t5/SAS-Programming/Grabbing-all-columns-with-names-that-start-with-a-str...

View solution in original post

Reeza · Posted 07-22-2019 01:02 PM

SASHELP.VCOLUMN -> has a list of all column names you can query.

If you're looping over columns is there a reason an array wouldn't work instead of macros? Macros are useful but there are other methods.

hubertsng · Posted 07-22-2019 01:08 PM

If you look at my code, I got all the way to the end but I don't know how to grab the columns that are in that successive pattern. I grabbed all of the columns and I can subset the columns by making it satisfy it being inside an array of strings that I enter manually but not a list of strings that I can set automatically. And I could possibly just loop over the columns instead of doing a macro but I feel as though the same problem that exists when looping over the columns exists when shoving all of the columns I want inside the macro.

Reeza · Posted 07-22-2019 01:12 PM

Have you checked out shortcut lists? If your variables are S1-S1000 you can refer to them as that within your array code at least, this is valid:

array myVars(*) S1-S1000;

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

@hubertsng wrote:

If you look at my code, I got all the way to the end but I don't know how to grab the columns that are in that successive pattern. I grabbed all of the columns and I can subset the columns by making it satisfy it being inside an array of strings that I enter manually but not a list of strings that I can set automatically. And I could possibly just loop over the columns instead of doing a macro but I feel as though the same problem that exists when looping over the columns exists when shoving all of the columns I want inside the macro.

hubertsng · Posted 07-22-2019 01:18 PM

How do you make a string of successive S1-S1000 to then compare it to the string of the column name or how do I get a SCAN(colname, 1, "_") to equal the successive variables S1-S1000.

Reeza · Posted 07-22-2019 01:22 PM

@hubertsng wrote:

How do you make a string of successive S1-S1000 to then compare it to the string of the column name or how do I get a SCAN(colname, 1, "_") to equal the successive variables S1-S1000.

I don't know what that means. Show some example input data with 5 columns and explain what you want as output, or include the example output and someone can likely provide several different methods to accomplish it. There's clearly people here who are more than eager to help if you provide enough details 🙂

hubertsng · Posted 07-22-2019 01:27 PM

I know I'm terrible at explaining.

So let's say I have 500 columns. 100 which is named S1_something, S2_something, S3_something... , S100_something and and SOMETHING1_something, SOMETHING2_something, SOMETHING3_something,... SOMETHING400_something. I want to grab just the 100 columns that start with S1-S100 and not the ones that start with SOMETHING1-SOMETHING400. The reason why I want to grab those columns is that I want to then perform a macro, or loop through those columns and perform a function. Or another way of looking at it is that I want to only KEEP those columns, S1- S100, and DROP the rest because if I do, then I can just loop over all of the columns of the remaining data set and then merge it with the original. Not sure if that clears anything up.

Reeza · Posted 07-22-2019 01:29 PM

Read this:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

hubertsng · Posted 07-22-2019 01:57 PM

Okay so I got closer to what I want by reading the link you gave me with

PROC SQL;
    SELECT name || "=" || CATS(SUBSTR(name, 1, 27), "_temp") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

PROC SQL;
    SELECT name || "=" || SCAN(name, 1, "_") INTO :rename_list SEPARATED BY " "
    FROM dictionary.columns
    WHERE libname="WORK"
    AND memname="HAVE";
QUIT;

PROC DATASETS LIBRARY=WORK NOLIST;
    MODIFY HAVE;
    RENAME &rename_list.;
QUIT;

And the only thing that doesn't work now is the macro. I successfully renamed all of the variables to S1-S7 P1-P9, etc. instead of having them be S1_Something-S7_Something

ANOTHER WAY OF DOING IT

https://communities.sas.com/t5/SAS-Programming/Grabbing-all-columns-with-names-that-start-with-a-str...

Reeza · Posted 07-22-2019 02:04 PM

Is that what you're trying to do? Rename variables? You should be using PROC DATASETS then instead of a data step. You can also rename series quite easily, but I do not recommend using a suffix because you cannot easily reference them later on.

So renaming S1-S7 to S_S1-S_S7 would be:

This would work.

RENAME S1-S7 = S_S1-S_S7;

Tom · Posted 07-22-2019 02:14 PM

@hubertsng wrote:

I know I'm terrible at explaining.

So let's say I have 500 columns. 100 which is named S1_something, S2_something, S3_something... , S100_something and and SOMETHING1_something, SOMETHING2_something, SOMETHING3_something,... SOMETHING400_something. I want to grab just the 100 columns that start with S1-S100 and not the ones that start with SOMETHING1-SOMETHING400. The reason why I want to grab those columns is that I want to then perform a macro, or loop through those columns and perform a function. Or another way of looking at it is that I want to only KEEP those columns, S1- S100, and DROP the rest because if I do, then I can just loop over all of the columns of the remaining data set and then merge it with the original. Not sure if that clears anything up.

So your issue is that someone or something has added suffixes to the variables of interest and you want to rename those variables?

Or do you just want to identify those variables?

So is your pattern of interest variable names like Sdddd_XXXX where dddd can be a number between 1 and 9999 and XXXX can be any character? So take the list of names that actually exist and apply some logic to find the one you are interested in. You can then either rename them or just use them. Might help to put the list into a macro variable (or two).

proc contents data=have noprint out=contents; run;
data vars_of_interest;
  set contents;
  name=upcase(name);
  if name=:'S' and index(name,'_');
  if 1 <= input(substr(scan(name,1,'_',2),?32.) <= 9999;
  new_name = scan(name,1,'_');
run;
proc sql noprint;
  select name, new_name, catx('=',name,new_name)
    into :old_names separated by ' '
       , :new_names separated by ' '
       , :rename separated by ' '
  from vars_of_interest
  ;
quit;

Then you can use the macro variable to help you write the code to use your dataset.

data want;
  set have;
  array svars &old_names ;
  ....
  rename &rename ;
run;

tsap · Posted 07-22-2019 01:13 PM

using this dummy dataset:

DATA WORK.HAVE;
FORMAT 	 ID $1. S1 $1.	S2 $1. S3 $1. L1 $1.L2 $1. L3 $1.;
INFORMAT ID $1. S1 $1.	S2 $1. S3 $1. L1 $1.L2 $1. L3 $1.;
INPUT    ID 	S1 		S2 	   S3 	  L1 	L2	   L3;
INFILE DATALINES DLM='|' DSD;
DATALINES;
1|A|B|C|X|Y|Z
;

I tested this logic:

PROC SQL;
	SELECT DISTINCT name	INTO :ColList separated by ','
	FROM dictionary.columns
    WHERE memname="HAVE"
	AND SUBSTR(name,1,2) IN('S1','S2','S3','S4','S5','S6','S7','S8','S9');
QUIT;

%PUT &=ColList.;

And produced the Macro variable "ColList" with this value printed in the log

%PUT &=ColList.;
COLLIST=S1,S2,S3

Then wherever you want to use the list of columns identified, you can drop that macro variable into place.

hubertsng · Posted 07-22-2019 01:20 PM

I see that you can do that but I'm trying to make it so I don't have to specify each individual cell. Not sure if that's possible or not. So basically a "S1"-"S1000" type deal.

tsap · Posted 07-22-2019 01:50 PM

you aren't specifying each cell in my logic. The first two characters of the columns can be S1 through S9... so any number S999, S1000, S845... they would all get picked up using my logic without any changes to the code whatsoever.

Tom · Posted 07-22-2019 01:14 PM

You can use the colon prefix to make a variable list of all variables that start with a specific prefix.

array all_cols col: ;

Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Re: Grabbing all columns with names that start with a string in a successive pattern

Registration is open

SAS Training: Just a Click Away