I have the below table;
COLUMN 1 | COLUMN 2 ends in _x | COLUMN 3ends in _y | … | COLUMN 4 ends in _w |
x1 | 1 | 2 | 3 | |
y1 | 1 | 2 | 3 | |
z1 | 1 | 2 | 3 | |
w1 | 1 | 2 | 3 | |
x2 | 5 | 1 | 4 | |
y2 | 5 | 1 | 4 | |
z2 | 5 | 1 | 4 | |
w2 | 5 | 1 | 4 | |
a2 | 5 | 1 | 4 |
That I have aggregated into the below column;
COLUMN 1 | AGGREGATE COLUMN |
x1 | 1 |
y1 | 2 |
z1 | … |
w1 | 3 |
x2 | 5 |
y2 | 1 |
z2 | … |
w2 | … |
a2 | 4 |
using the code;
if find(column 2,' x ',' i ') then new_col=column_x;
else if find(column 3, ' y ', ' i ') then new_col=column_y;
etc.
How can I now drop or delete the old columns that i used to make the aggregate column (column 2, column 3, column 4, etc.) in the same step?
Presumably this code is part of a DATA step. Within the same DATA step you can drop variables:
drop column3 column4;
That can go anywhere within the DATA step. You just have to supply a list of the proper names.
We know they're not column 3 and column 4 because those aren't legal names. If the process of listing the names in a DROP statement is more complex for your application, we can revisit how to get a list of names.
Yes, this is currently in a data step, but my goal is to place the process into a pre-existing macro. Presently there are about 100 columns of column 2, column 3, column 4, ... , column 100, which is why i need to use find(column 2, 'x','i') etc. . column 2, column 3 and x,y,z do have macro variables assigned in the macro the code will eventually end up in.
@r_levy wrote:
Yes, this is currently in a data step, but my goal is to place the process into a pre-existing macro. Presently there are about 100 columns of column 2, column 3, column 4, ... , column 100, which is why i need to use find(column 2, 'x','i') etc. . column 2, column 3 and x,y,z do have macro variables assigned in the macro the code will eventually end up in.
Go back to the beginning. Post actual variable names and use the code with actual variable names and values in the Find statements.
Currently your example if find(column 2,' x ',' i ') then new_col=column_x; makes no sense as you show no values containing "X" in the example data.
Better would be to provide an actual data step that creates a small data set with your "column 1" through "column 4" with actual data, which can be fake but would allow us to test your "aggregate code".
The "drop" question can be very simple or somewhat complex depending on your actual data set structure. If all of the "column 2" through "column 100" variables are adjacent in the data set as sequential columns you can use:
Drop column2 -- column100;
The -- (yes two dashes) indicates as variables in sequence from column2 to column100 if you use the actual variable name.
Another approach is if all of the variables, or groups of variables have similar names. Then you can use the : list creator:
Drop grp: ;
would remove all variables whose names start with the three characters "grp".
Your approach to the "aggregate" looks like an array and possibly no macro code would be needed at all if the variables were named "nicely" and you had a better description of the look up. I suspect no macro would actually be needed if the data was structured in a cleaner manner. Your statement " Presently there are about 100 " implies to me that the number may be increasing meaning a poor data structure and hard to maintain code.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.