BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AVA_16
Obsidian | Level 7

I have a sas dataset in the form of:

 

Family_id      Members           Patient_ids

   

1                       3                1,4,8    

2                       4                9,10,100,200

3                       5                350,50,60,45,55

 

I would like to have this data in the form such that each patient_id is a single observation. I would like to drop variable members and have two cloums in the final data set, family id in front of each patient id. For eg, I would like to have:

 

Family_id         Patient_id

1                       1

1                       4

1                       8

 

Any help is appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Hi @AVA_16.

 

Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.

 

data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;
   end;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Do like this

 

data have;
input Family_id Members Patient_ids:$20.;
datalines;
1 3 1,4,8 
2 4 9,10,100,200
3 5 350,50,60,45,55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(Patient_ids,',');
      Patient_id=scan(Patient_ids, i, ',');
      output;
   end;
run;
AVA_16
Obsidian | Level 7

Hello,

 

Thank you for your helpful reply. I tried this code. it works perfect. However, it seems I have some other issue in my dataset:

 

# Variable Type Len Format Informat  123

Family_idsChar5$5.$5. 
membersNum8BEST.  
Patient_idsChar337$337.$337.

 

These are the variables in my dataset. Family ids and patient ids both are character variables.

 

After I use the below mentioned code, I get:

 

 

Obs Family_ids Patient_ids123456789101112        
11
1 
1 
29
2 
2 
2 
3350
3 
3 
3 
3 
  
  
  
  
  
  
  
  

 

Only the first patient id is taken and rest of the rows in from a particular family id are empty. I guess, this happens becasue there patient ids in the original data have space after coma as shown below. Any suggestions are appreciated.

 

Obs Family_ids members Patient_ids12

131, 4, 8
249, 10, 100, 200
PeterClemmensen
Tourmaline | Level 20

Hi @AVA_16.

 

Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.

 

data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;
   end;
run;
AVA_16
Obsidian | Level 7
Great! It worked now.
PeterClemmensen
Tourmaline | Level 20

Glad it worked 🙂

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1821 views
  • 1 like
  • 2 in conversation