BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AVA_16
Obsidian | Level 7

I have a sas dataset in the form of:

 

Family_id      Members           Patient_ids

   

1                       3                1,4,8    

2                       4                9,10,100,200

3                       5                350,50,60,45,55

 

I would like to have this data in the form such that each patient_id is a single observation. I would like to drop variable members and have two cloums in the final data set, family id in front of each patient id. For eg, I would like to have:

 

Family_id         Patient_id

1                       1

1                       4

1                       8

 

Any help is appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Hi @AVA_16.

 

Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.

 

data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;
   end;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Do like this

 

data have;
input Family_id Members Patient_ids:$20.;
datalines;
1 3 1,4,8 
2 4 9,10,100,200
3 5 350,50,60,45,55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(Patient_ids,',');
      Patient_id=scan(Patient_ids, i, ',');
      output;
   end;
run;
AVA_16
Obsidian | Level 7

Hello,

 

Thank you for your helpful reply. I tried this code. it works perfect. However, it seems I have some other issue in my dataset:

 

# Variable Type Len Format Informat  123

Family_idsChar5$5.$5. 
membersNum8BEST.  
Patient_idsChar337$337.$337.

 

These are the variables in my dataset. Family ids and patient ids both are character variables.

 

After I use the below mentioned code, I get:

 

 

Obs Family_ids Patient_ids123456789101112        
11
1 
1 
29
2 
2 
2 
3350
3 
3 
3 
3 
  
  
  
  
  
  
  
  

 

Only the first patient id is taken and rest of the rows in from a particular family id are empty. I guess, this happens becasue there patient ids in the original data have space after coma as shown below. Any suggestions are appreciated.

 

Obs Family_ids members Patient_ids12

131, 4, 8
249, 10, 100, 200
PeterClemmensen
Tourmaline | Level 20

Hi @AVA_16.

 

Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.

 

data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;
   end;
run;
AVA_16
Obsidian | Level 7
Great! It worked now.
PeterClemmensen
Tourmaline | Level 20

Glad it worked 🙂

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1875 views
  • 1 like
  • 2 in conversation