Solved: From rows to columns

AVA_16 · Posted 03-11-2019 07:06 AM

I have a SAS dataset in the following form:

Family_id  Members  Patient_ids
1          3        1,4,8
2          4        9,10,100,200
3          5        350,50,60,45,55

I would like to restructure this data so that each patient ID is a single observation. I would like to drop the variable Members and have two columns in the final data set, with the family ID in front of each patient ID. For example, I would like to have:

Family_id  Patient_id
1          1
1          4
1          8

Any help is appreciated.

PeterClemmensen · Posted 03-11-2019 07:11 AM

Do it like this:

data have;
input Family_id Members Patient_ids:$20.;
datalines;
1 3 1,4,8 
2 4 9,10,100,200
3 5 350,50,60,45,55
;

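data want(keep=Family_id Patient_id);
   set have;
   /* COUNTW counts the comma-separated IDs; SCAN extracts the i-th one */
   do i=1 to countw(Patient_ids,',');
      Patient_id=scan(Patient_ids, i, ',');
      output;   /* write one observation per patient ID */
   end;
run;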

The DATA to DATA Step Macro
Blog: SASnrd

AVA_16 · Posted 03-12-2019 03:29 AM

Hello,

Thank you for your helpful reply. I tried this code and it works perfectly on the example. However, it seems there is another issue in my dataset:

#  Variable     Type  Len  Format  Informat
1  Family_ids   Char  5    $5.     $5.
2  members      Num   8    BEST.
3  Patient_ids  Char  337  $337.   $337.

These are the variables in my dataset. Family_ids and Patient_ids are both character variables.

After I run the code mentioned above, I get:

Obs  Family_ids  Patient_ids
 1   1           1
 2   1
 3   1
 4   2           9
 5   2
 6   2
 7   2
 8   3           350
 9   3
10   3
11   3
12   3

Only the first patient ID is extracted; the remaining rows for each family ID are empty. I guess this happens because the patient IDs in the original data have a space after each comma, as shown below. Any suggestions are appreciated.

Obs  Family_ids  members  Patient_ids
1    1           3        1, 4, 8
2    2           4        9, 10, 100, 200

PeterClemmensen · Posted 03-12-2019 04:42 AM

Hi @AVA_16.

Check out the code below. I changed the example data set so that it has spaces after the commas. Then I use the COMPRESS function to remove the spaces and get the desired result.

data have;
/* INFILE is placed before INPUT so the pipe delimiter is in effect when the record is read */
infile datalines dlm='|';
input Family_id Members Patient_ids:$20.;
datalines;
1|3|1, 4, 8 
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

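data want(keep=Family_id Patient_id);
   set have;
   /* COMPRESS strips the blanks, so only commas separate the IDs */
   do i=1 to countw(compress(Patient_ids, ' '),',');
      Patient_id=scan(compress(Patient_ids, ' '), i, ',');
      output;   /* write one observation per patient ID */
   end;
run;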
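
A possible variant of the approach above, offered only as a sketch: instead of calling COMPRESS, both the comma and the blank can be passed as delimiters to COUNTW and SCAN, since by default both functions treat a run of consecutive delimiters as a single one. The sample data matches the example above. The LENGTH of 8 for Patient_id is an assumption about how wide a single ID can be, not something stated in the thread, so adjust it to the real data.

```sas
/* Sample data with a blank after each comma, as in the example above */
data have;
   infile datalines dlm='|';
   input Family_id Members Patient_ids :$20.;
   datalines;
1|3|1, 4, 8
2|4|9, 10, 100, 200
3|5|350, 50, 60, 45, 55
;

data want(keep=Family_id Patient_id);
   set have;
   /* Assumed maximum width of one patient ID; widen if needed */
   length Patient_id $ 8;
   /* Comma and blank both act as delimiters; consecutive
      delimiters are treated as one, so ", " separates IDs */
   do i = 1 to countw(Patient_ids, ', ');
      Patient_id = scan(Patient_ids, i, ', ');
      output;   /* one observation per patient ID */
   end;
run;
```

For the first sample row this should give three observations with Patient_id values 1, 4 and 8. With the real dataset the same loop would use Family_ids in the KEEP= option, since that is the variable name shown in the PROC CONTENTS listing.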
