BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
HitmonTran
Pyrite | Level 9

Hello,

 

I have a data set that contains about ~40 variables. But for simplicity I'll do 5 variables in this example.

Basically if 1 of the 5 variable has a value, the others are missing.  How do I create one variable containing all values across the 5 variables? 

 

have:

subject IDP_00_05P_71_80P_141_50P_181_90P_291_00
1aa    
2 bb   
3  cc  
4    ee
5    ee
6   dd 
7 bb   
8  cc  
9aa    

 

want:

subject idnew_var
1aa
2bb
3cc
4ee
5ee
6dd
7bb
8cc
9aa
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If the variables are all character something like this should work:

 

data want;
   set have;
   array v (*) var1 - var5;
   newvar = cats( of v(*));
run;

Caveat: you will want to define the LENGTH of the Newvar to hold the longest expected value. Otherwise it would be set the length of the first result which could truncate values.

The array is basically just to allow some shorthand and avoid listing multiple variables. If the array is defined with 10, 50 or a 100 variables the only difference would be in the Array definition.

 

View solution in original post

4 REPLIES 4
ballardw
Super User

If the variables are all character something like this should work:

 

data want;
   set have;
   array v (*) var1 - var5;
   newvar = cats( of v(*));
run;

Caveat: you will want to define the LENGTH of the Newvar to hold the longest expected value. Otherwise it would be set the length of the first result which could truncate values.

The array is basically just to allow some shorthand and avoid listing multiple variables. If the array is defined with 10, 50 or a 100 variables the only difference would be in the Array definition.

 

HitmonTran
Pyrite | Level 9
thanks for the quick response, I think that should work but now I'm having problems with the array statement. The variable names aren't really sequential as in the example (var1- var5). I updated my post with the new variables. Could you please help me? Thank you
PaigeMiller
Diamond | Level 26

If the names are not sequential, but the variables are sequential when you look at your SAS data set, then this will work:

 

array v (*) P_00_05--P_291_00;

Note the double-dash.

 

If the variable names are not sequential in the data set, then you will have to type them yourself. (maybe there's an exception if they all have names that begin with P_ and then followed only by digits and underscore, then you could use P_: or PROC SQL to determines the names and you would not have to type the names yourself)

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 759 views
  • 0 likes
  • 3 in conversation