BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AVA_16
Obsidian | Level 7

I have a sas dataset in the form of:

 

Family_id      Members           Patient_ids

   

1                       3                1,4,8    

2                       4                9,10,100,200

3                       5                350,50,60,45,55

 

I would like to have this data in the form such that each patient_id is a single observation. I would like to drop variable members and have two cloums in the final data set, family id in front of each patient id. For eg, I would like to have:

 

Family_id         Patient_id

1                       1

1                       4

1                       8

 

Any help is appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Hi @AVA_16.

 

Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.

 

data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;
   end;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Do like this

 

data have;
input Family_id Members Patient_ids:$20.;
datalines;
1 3 1,4,8 
2 4 9,10,100,200
3 5 350,50,60,45,55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(Patient_ids,',');
      Patient_id=scan(Patient_ids, i, ',');
      output;
   end;
run;
AVA_16
Obsidian | Level 7

Hello,

 

Thank you for your helpful reply. I tried this code. it works perfect. However, it seems I have some other issue in my dataset:

 

# Variable Type Len Format Informat  123

Family_idsChar5$5.$5. 
membersNum8BEST.  
Patient_idsChar337$337.$337.

 

These are the variables in my dataset. Family ids and patient ids both are character variables.

 

After I use the below mentioned code, I get:

 

 

Obs Family_ids Patient_ids123456789101112        
11
1 
1 
29
2 
2 
2 
3350
3 
3 
3 
3 
  
  
  
  
  
  
  
  

 

Only the first patient id is taken and rest of the rows in from a particular family id are empty. I guess, this happens becasue there patient ids in the original data have space after coma as shown below. Any suggestions are appreciated.

 

Obs Family_ids members Patient_ids12

131, 4, 8
249, 10, 100, 200
PeterClemmensen
Tourmaline | Level 20

Hi @AVA_16.

 

Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.

 

data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;
   end;
run;
AVA_16
Obsidian | Level 7
Great! It worked now.
PeterClemmensen
Tourmaline | Level 20

Glad it worked 🙂

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 796 views
  • 1 like
  • 2 in conversation