I have a sas dataset in the form of:
Family_id Members Patient_ids
1 3 1,4,8
2 4 9,10,100,200
3 5 350,50,60,45,55
I would like to have this data in the form such that each patient_id is a single observation. I would like to drop variable members and have two cloums in the final data set, family id in front of each patient id. For eg, I would like to have:
Family_id Patient_id
1 1
1 4
1 8
Any help is appreciated.
Hi @AVA_16.
Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.
data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;
data want(keep=Family_id Patient_id);
set have;
do i=1 to countw(compress(Patient_ids, ' '),',');
Patient_id=scan(compress(Patient_ids, ' '), i, ',');
output;
end;
run;
Do like this
data have;
input Family_id Members Patient_ids:$20.;
datalines;
1 3 1,4,8
2 4 9,10,100,200
3 5 350,50,60,45,55
;
data want(keep=Family_id Patient_id);
set have;
do i=1 to countw(Patient_ids,',');
Patient_id=scan(Patient_ids, i, ',');
output;
end;
run;
Hello,
Thank you for your helpful reply. I tried this code. it works perfect. However, it seems I have some other issue in my dataset:
# Variable Type Len Format Informat 123
Family_ids | Char | 5 | $5. | $5. | |
members | Num | 8 | BEST. | ||
Patient_ids | Char | 337 | $337. | $337. |
These are the variables in my dataset. Family ids and patient ids both are character variables.
After I use the below mentioned code, I get:
1 | 1 |
1 | |
1 | |
2 | 9 |
2 | |
2 | |
2 | |
3 | 350 |
3 | |
3 | |
3 | |
3 | |
Only the first patient id is taken and rest of the rows in from a particular family id are empty. I guess, this happens becasue there patient ids in the original data have space after coma as shown below. Any suggestions are appreciated.
Obs Family_ids members Patient_ids12
1 | 3 | 1, 4, 8 |
2 | 4 | 9, 10, 100, 200 |
Hi @AVA_16.
Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.
data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;
data want(keep=Family_id Patient_id);
set have;
do i=1 to countw(compress(Patient_ids, ' '),',');
Patient_id=scan(compress(Patient_ids, ' '), i, ',');
output;
end;
run;
Glad it worked 🙂
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.