Hello Experts,
I want to drop some variables from a dataset with over 200 variables. The variable names are unique but are grouped using a common prefix/characters.
The following variables table is a sample obtained using the code below. For example, I would like to drop the range of variables starting with AA01 - AA03, variables that start with 'LICAT' and variables containing 'XY' and 'BC'. Thank you.
proc contents data=moi.industry out=moi.names(keep=name) noprint ; run;
NAME |
AA08 |
AA01Z08 |
AA01Y08 |
AA02Z08 |
AA02Y08 |
AA03Z08 |
AA03Y08 |
AA04Z08 |
AA04Y08 |
AA05Z08 |
AA05Y08 |
BBCY08 |
BBCZ08 |
LICATY08 |
LICATN08 |
WXY01Y08 |
WXY01X08 |
proc sql noprint;
select name into :names separated by ' ' from moi.names
where name eqt 'LICAT' or name?'XY' or name?'BC';
quit;
Then in a data step
drop &names;
@PaigeMiller Thank you. It works, except for the range AA01 - AA03.
@michokwu wrote:
@PaigeMiller Thank you. It works, except for the range AA01 - AA03.
You have the change the WHERE clause in PROC SQL to select variables whose names begin with AA01-AA03. Since I showed you how to delete names that begin with LICAT, I leave deleting names that begin with AA01-AA03 up to you as a homework assignment.
For prefixed variables, it's very easy. You can use the colon as a wild card. For example in the data step below, all variables starting with "AA01" - "AA03" and "LICAT."
DATA moi.names;
DROP AA01:;
DROP AA02:;
DROP AA03:;
DROP LICAT:;
SET moi.industry;
RUN;
Jim
Unfortunately the above answer doesn't take care of "XY" embedded in the middle of a variable. However, since the "XY's" all are prefixed with a "W", you could code:
DATA Moi.Names;
DROP WXY:;
DROP AA01:;
DROP AA02:;
DROP AA03:;
DROP LICAT:;
SET Moi.INDUSTRY;
RUN;
Which would give you:
Jim
@jimbarbour Thanks. Surprisingly, it works for me. I don't really want to write out all the prefixes, that's why I'm trying to use a range of variables
I understand. In that case, it's better to go with what @PaigeMiller suggested. Would you like help with that or do understand what he was suggesting?
However, save the colon wildcard in the back of your mind somewhere. It is a useful technique. If you prefix all your work variables with an underscore, you can code a single drop statement such that no work variables wind up in your Production data. For example:
DATA Prod.Very_Important_Data;
DROP _:;
SET Input.Source_Data;
_Daily_Amount = Annual_Amount / 365;
_Standardized_Quarter = (30 * 3);
Quarterly_Amount = _Daily_Amount * _Standardized_Quarter;
RUN;
The above example is a bit contrived, but in this case _Daily_Amount and _Standardized_Quarter are removed from the final results because of the Drop statement. In a program with dozens of intermediate or work variables, the colon modifier is very handy.
Jim
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.