I have a large data set with a bunch of redundant values in many columns. each column corresponds to a variable, and repeats the same value for the other remaining variables like below;
COLUMN 1 | COLUMN 2 ends in _x | COLUMN 3ends in _y | … | COLUMN 4 ends in _w |
x1 | 1 | 2 | 3 | |
y1 | 1 | 2 | 3 | |
z1 | 1 | 2 | 3 | |
w1 | 1 | 2 | 3 | |
x2 | 5 | 1 | 4 | |
y2 | 5 | 1 | 4 | |
z2 | 5 | 1 | 4 | |
w2 | 5 | 1 | 4 | |
a2 | 5 | 1 | 4 |
x,y,z,w,a, etc. are the variables.
where x1,y1,z1,w1 are part data for customer 1 and x1,y2,z2,w2,a2 are data of customer 2.
column 1 is the value for variable x, and it's value repeats for the rest of the variables for that customer.
How can I aggregate these into one column for each variable for the customer like below;
COLUMN 1 | AGGREGATE COLUMN |
x1 | 1 |
y1 | 2 |
z1 | … |
w1 | 3 |
x2 | 5 |
y2 | 1 |
z2 | … |
w2 | … |
a2 | 4 |
I'm not sure how to tell SAS to serach the column name for the variable, and return just that value.
You can use FIND() function to search the values of a variable and then create a new variable based on that.
if find(column1,'x','i') then new_col=column_x;
else if find(column1,'y','i') then new_col=column_y;
...................so on.
You can use FIND() function to search the values of a variable and then create a new variable based on that.
if find(column1,'x','i') then new_col=column_x;
else if find(column1,'y','i') then new_col=column_y;
...................so on.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.