I have a sas dataset in the form of:
Family_id Members Patient_ids
1 3 1,4,8
2 4 9,10,100,200
3 5 350,50,60,45,55
I would like to have this data in the form such that each patient_id is a single observation. I would like to drop variable members and have two cloums in the final data set, family id in front of each patient id. For eg, I would like to have:
Family_id Patient_id
1 1
1 4
1 8
Any help is appreciated.
Hi @AVA_16.
Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.
data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;
data want(keep=Family_id Patient_id);
set have;
do i=1 to countw(compress(Patient_ids, ' '),',');
Patient_id=scan(compress(Patient_ids, ' '), i, ',');
output;
end;
run;
Do like this
data have;
input Family_id Members Patient_ids:$20.;
datalines;
1 3 1,4,8
2 4 9,10,100,200
3 5 350,50,60,45,55
;
data want(keep=Family_id Patient_id);
set have;
do i=1 to countw(Patient_ids,',');
Patient_id=scan(Patient_ids, i, ',');
output;
end;
run;
Hello,
Thank you for your helpful reply. I tried this code. it works perfect. However, it seems I have some other issue in my dataset:
# Variable Type Len Format Informat 123
Family_ids | Char | 5 | $5. | $5. | |
members | Num | 8 | BEST. | ||
Patient_ids | Char | 337 | $337. | $337. |
These are the variables in my dataset. Family ids and patient ids both are character variables.
After I use the below mentioned code, I get:
1 | 1 |
1 | |
1 | |
2 | 9 |
2 | |
2 | |
2 | |
3 | 350 |
3 | |
3 | |
3 | |
3 | |
Only the first patient id is taken and rest of the rows in from a particular family id are empty. I guess, this happens becasue there patient ids in the original data have space after coma as shown below. Any suggestions are appreciated.
Obs Family_ids members Patient_ids12
1 | 3 | 1, 4, 8 |
2 | 4 | 9, 10, 100, 200 |
Hi @AVA_16.
Check out the below code. I changed the example data set so it has spaces after the commas. Then, I use the Compress Function handle the spaces and get the desired result.
data have;
input Family_id Members Patient_ids:$20.;
infile datalines dlm='|';
datalines;
1|3|1, 4, 8
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;
data want(keep=Family_id Patient_id);
set have;
do i=1 to countw(compress(Patient_ids, ' '),',');
Patient_id=scan(compress(Patient_ids, ' '), i, ',');
output;
end;
run;
Glad it worked 🙂
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.