Hello Experts,
I want to drop some variables from a dataset with over 200 variables. The variable names are unique but are grouped using a common prefix/characters.
The following variables table is a sample obtained using the code below. For example, I would like to drop the range of variables starting with AA01 - AA03, variables that start with 'LICAT' and variables containing 'XY' and 'BC'. Thank you.
proc contents data=moi.industry out=moi.names(keep=name) noprint ; run;
NAME |
AA08 |
AA01Z08 |
AA01Y08 |
AA02Z08 |
AA02Y08 |
AA03Z08 |
AA03Y08 |
AA04Z08 |
AA04Y08 |
AA05Z08 |
AA05Y08 |
BBCY08 |
BBCZ08 |
LICATY08 |
LICATN08 |
WXY01Y08 |
WXY01X08 |
proc sql noprint;
select name into :names separated by ' ' from moi.names
where name eqt 'LICAT' or name?'XY' or name?'BC';
quit;
Then in a data step
drop &names;
@PaigeMiller Thank you. It works, except for the range AA01 - AA03.
@michokwu wrote:
@PaigeMiller Thank you. It works, except for the range AA01 - AA03.
You have the change the WHERE clause in PROC SQL to select variables whose names begin with AA01-AA03. Since I showed you how to delete names that begin with LICAT, I leave deleting names that begin with AA01-AA03 up to you as a homework assignment.
For prefixed variables, it's very easy. You can use the colon as a wild card. For example in the data step below, all variables starting with "AA01" - "AA03" and "LICAT."
DATA moi.names;
DROP AA01:;
DROP AA02:;
DROP AA03:;
DROP LICAT:;
SET moi.industry;
RUN;
Jim
Unfortunately the above answer doesn't take care of "XY" embedded in the middle of a variable. However, since the "XY's" all are prefixed with a "W", you could code:
DATA Moi.Names;
DROP WXY:;
DROP AA01:;
DROP AA02:;
DROP AA03:;
DROP LICAT:;
SET Moi.INDUSTRY;
RUN;
Which would give you:
Jim
@jimbarbour Thanks. Surprisingly, it works for me. I don't really want to write out all the prefixes, that's why I'm trying to use a range of variables
I understand. In that case, it's better to go with what @PaigeMiller suggested. Would you like help with that or do understand what he was suggesting?
However, save the colon wildcard in the back of your mind somewhere. It is a useful technique. If you prefix all your work variables with an underscore, you can code a single drop statement such that no work variables wind up in your Production data. For example:
DATA Prod.Very_Important_Data;
DROP _:;
SET Input.Source_Data;
_Daily_Amount = Annual_Amount / 365;
_Standardized_Quarter = (30 * 3);
Quarterly_Amount = _Daily_Amount * _Standardized_Quarter;
RUN;
The above example is a bit contrived, but in this case _Daily_Amount and _Standardized_Quarter are removed from the final results because of the Drop statement. In a program with dozens of intermediate or work variables, the colon modifier is very handy.
Jim
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.