Hello,
I have a data set that contains about ~40 variables. But for simplicity I'll do 5 variables in this example.
Basically if 1 of the 5 variable has a value, the others are missing. How do I create one variable containing all values across the 5 variables?
have:
subject ID | P_00_05 | P_71_80 | P_141_50 | P_181_90 | P_291_00 |
1 | aa | ||||
2 | bb | ||||
3 | cc | ||||
4 | ee | ||||
5 | ee | ||||
6 | dd | ||||
7 | bb | ||||
8 | cc | ||||
9 | aa |
want:
subject id | new_var |
1 | aa |
2 | bb |
3 | cc |
4 | ee |
5 | ee |
6 | dd |
7 | bb |
8 | cc |
9 | aa |
If the variables are all character something like this should work:
data want; set have; array v (*) var1 - var5; newvar = cats( of v(*)); run;
Caveat: you will want to define the LENGTH of the Newvar to hold the longest expected value. Otherwise it would be set the length of the first result which could truncate values.
The array is basically just to allow some shorthand and avoid listing multiple variables. If the array is defined with 10, 50 or a 100 variables the only difference would be in the Array definition.
If the variables are all character something like this should work:
data want; set have; array v (*) var1 - var5; newvar = cats( of v(*)); run;
Caveat: you will want to define the LENGTH of the Newvar to hold the longest expected value. Otherwise it would be set the length of the first result which could truncate values.
The array is basically just to allow some shorthand and avoid listing multiple variables. If the array is defined with 10, 50 or a 100 variables the only difference would be in the Array definition.
If the names are not sequential, but the variables are sequential when you look at your SAS data set, then this will work:
array v (*) P_00_05--P_291_00;
Note the double-dash.
If the variable names are not sequential in the data set, then you will have to type them yourself. (maybe there's an exception if they all have names that begin with P_ and then followed only by digits and underscore, then you could use P_: or PROC SQL to determines the names and you would not have to type the names yourself)
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.