DATA Step, Macro, Functions and more

Delete all Columns that contain a certain String

Reply
Contributor
Posts: 45

Delete all Columns that contain a certain String

[ Edited ]

Is there some way to reference column names? I mean deleting observations in a ROW is easy but is there a way to delete COLUMNS that are named a certain way?

 

Example of what my data might look like:

 

ID:                            A    A_HET  B   B_HET  C   C_HET  D   D_HET                                   

41234                      0           0     2

12351                      1           1     1

41256                      0           0     0

54432                      2           0     0

 

etc...

 

I basically want to delete ALL columns with "HET" in them. I know for rows you could simply write "If colname := "HET" then delete" but how do I do this the "other" way around?

PROC Star
Posts: 307

Re: Delete all Columns that contain a certain String

Posted in reply to Leon27607

I would use metadata to create a macro variable to use in a drop statement.

 

proc sql noprint;
  SELECT name INTO :dropMe SEPARATED BY ' '
  FROM sashelp.vcolumn
  WHERE libname = 'YOURLIB' AND
                memname = 'HAVE' AND
                prxmatch('/het$/i', strip(name)) > 0;
quit;

data want;
  set have;
  drop &dropMe;
run;
Super User
Posts: 19,789

Re: Delete all Columns that contain a certain String

Posted in reply to Leon27607

If you have control of your naming conventions, if you name the variables with a consistent prefix this can be very simple. Unfortunately it doesn't work for suffixes. 

 

Drop het_: ;

Contributor
Posts: 45

Re: Delete all Columns that contain a certain String

[ Edited ]

@Reeza I don't really have control of the naming conventions. The data I'm working with deals with human subjects so I can't go into too much detail about it, however the names basically are like what I mentioned XXXXXX and XXXXXXX_HET for everything. I was hoping I could delete all the columns with _HET first, then rename the remaining columns to XX1, XX2, XX3, etc.... so I can do analysis easily just by doing a loop. I am just having trouble thinking about how I can reference the column names or maybe their positions because their names are very different and it would be troublesome to do "rename XXXXXX = XX1" for every single one of them.

PROC Star
Posts: 307

Re: Delete all Columns that contain a certain String

Posted in reply to Leon27607

Renaming all your variables does not necessarily have to be that much of a hassle. The following code gives an example of how you could use metadata to flip "_het" to "het_" for many variables. It would have to adapted to the particulars of your data, probably, but this should give you an idea. 

 

 

data have;
    retain a_het xx_het zdfe_het b_het a234_het 0;
    x = 1;
run;

proc sql;
    SELECT catx('=', name, prxchange('s/(\w{1,28})(_)(het)/\3\2\1/i', -1, strip(name))) INTO :renameMe SEPARATED BY ' '
    FROM sashelp.vcolumn
    WHERE libname = 'WORK' AND
          memname = 'HAVE' AND
          prxmatch('/_het$/i', strip(name)) > 0;
quit;

data want;
    set have;
    rename &renameMe;
run;
Ask a Question
Discussion stats
  • 4 replies
  • 245 views
  • 0 likes
  • 3 in conversation