BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lost_Gary
Quartz | Level 8

my data appears like this:  

 

ID A1 A2  A3 A4 A5 A6 A7 A8 A9 A10
1 B1 B2                
2 B1                  
3                    
4 B1 B2 B3 B4 B5 B6        
5 B1                  
6                    
7                    
8 B1 B2 B3 B4 B5 B6 B7 B8 B9  
9                    
10                    

 

and what I would like is to combine this data into a single data field (i.e. B1,B2,B3) with commas separating the data.  If I use something like this:

newvar = catx(',', A1, A2, A3........A10); 

I get a period at the end of the field - so the Newvar could look like:  B1,B2,.

 

Also my fields could range from A1 to A100 depending on the output.  Is there a simple way for me to make this field dynamic with the data?  

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One way:

data junk;
  input a1 $ a2 $ a3 $ a4 $;
datalines;
b1 b5 b2 bc
bb .  b3 bd
b1 b2 .  b4
;

options missing=' ';
data example;
   set junk;
   newvar= catx(',',of a:);
run;
/* reset the option*/
options missing='.';

The current missing character is used in the CATX when a variable is missing (go figure). If you reset the missing character to a blank then the normal CATX behavior "skips" it. The list form "of a:" says to use all of the variables whose names start with the letter A". So may not be appropriate with more complex data but does get around the not knowing how many you have.

 

I'm  not fond of "compressing" the . as it will remove decimals in numeric values or other places where a period is needed to understand the value.

 

Caveat: you may have to set the LENGTH of the newvariable to a much larger than default value, 200 characters, to accomodate 100 values plus commas. Assuming each of your A1 through An could be up to 20 characters long. Then you need newvar to be 20*100 (or maximum number of A variables ever to be encountered) + 99 (or number of variables - 1) to be sure you will have enough space to store the result.

 

Now for the question of "why"? Notice that the example I provided includes missing values in the "middle". Does the output in that example actually make sense? What can you do with it? Especially when you have 100 or more values involved.

View solution in original post

3 REPLIES 3
r_behata
Barite | Level 11

Try :

 

newvar = compress(catx(',', of a1-a100),'.');
PaigeMiller
Diamond | Level 26

Before you go ahead and do this, think about what are the benefits and drawbacks to working with the data as a string B1,B2,B3... etc. compared to leaving the data as it is and then working with it that way. I really am not seeing a benefit to creating such a string. What in your opinion is the benefit of creating such a string?

--
Paige Miller
ballardw
Super User

One way:

data junk;
  input a1 $ a2 $ a3 $ a4 $;
datalines;
b1 b5 b2 bc
bb .  b3 bd
b1 b2 .  b4
;

options missing=' ';
data example;
   set junk;
   newvar= catx(',',of a:);
run;
/* reset the option*/
options missing='.';

The current missing character is used in the CATX when a variable is missing (go figure). If you reset the missing character to a blank then the normal CATX behavior "skips" it. The list form "of a:" says to use all of the variables whose names start with the letter A". So may not be appropriate with more complex data but does get around the not knowing how many you have.

 

I'm  not fond of "compressing" the . as it will remove decimals in numeric values or other places where a period is needed to understand the value.

 

Caveat: you may have to set the LENGTH of the newvariable to a much larger than default value, 200 characters, to accomodate 100 values plus commas. Assuming each of your A1 through An could be up to 20 characters long. Then you need newvar to be 20*100 (or maximum number of A variables ever to be encountered) + 99 (or number of variables - 1) to be sure you will have enough space to store the result.

 

Now for the question of "why"? Notice that the example I provided includes missing values in the "middle". Does the output in that example actually make sense? What can you do with it? Especially when you have 100 or more values involved.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1307 views
  • 2 likes
  • 4 in conversation