Re: how to convert long to wide data with grouping in each line

michellel · Posted 07-13-2015 05:41 PM

Hi, I have a question about converting long data to wide data within several groups.

My original data looks like below for same id.

id surgery_date medication_date medication_name

1 03/15/2010 05/01/2010 A

1 03/15/2010 05/01/2010 B

1 03/15/2010 09/01/2010 C

1 03/15/2010 11/01/2010 D

1 05/01/2011 09/01/2011 E

1 05/01/2011 09/01/2011 F

1 05/01/2011 09/01/2011 G

1 05/01/2011 12/01/2011 H

I need the new data looks like the following.

id surgery_date medication_date_1 medication_name_1 medication_name_2 medication_name_2 medication_date_2 medication_name_1 medication_name_2 medication_date_3 medication_name_1 medication_name_2

1 03/15/2010 05/01/2010 A B 09/01/2010 C 11/01/2010 D

1 05/01/2011 09/01/2011 E F G 12/01/2011 H

If there is no medication_date_&i. variables in each line, I know how to put the medication_name from long data to wide data. However now, in each line there are also medication name group. In this case, I am not sure if SAS could get the new dataset like the above.

Thanks so much for your help!

ballardw · Posted 07-13-2015 07:09 PM

I'm guessing as it's hard to read your desired output but does this come close to what you want?

data have;
informat id $4. surgery_date medication_date mmddyy10.   medication_name $4.;
format   surgery_date medication_date mmddyy10.   ;
input id     surgery_date    medication_date     medication_name;
datalines;
1          03/15/2010                  05/01/2010                   A
1          03/15/2010                  05/01/2010                   B
1          03/15/2010                  09/01/2010                   C
1          03/15/2010                  11/01/2010                   D
1          05/01/2011                  09/01/2011                   E
1          05/01/2011                  09/01/2011                   F
1          05/01/2011                  09/01/2011                   G
1          05/01/2011                  12/01/2011                   H
;
run;

proc transpose data=have out=trans1(drop=_name_) Prefix=Medication_date;
by id surgery_date;
var medication_date;
run;

proc transpose data=have out=trans2(drop=_name_) Prefix=medication_name;
by id surgery_date;
var medication_name;
run;

data want;
merge trans1 trans2;
by id surgery_date;
run;

michellel · Posted 07-13-2015 08:46 PM

Thanks ballardw for your effort. Sorry I did not notice that my result data looks not clear on different screen.

I got your answer, transposing the two variables medication_date and medication_name from long to wide and merge together, but I am wondering how to tell which medication_name within which medication_date in one row. The result I was trying to get looks as below for columns. Thanks!

id surgery_date med_date_1 med_name_1 med_name_2 med_date_2 med_name_1 med_name_2

Reeza · Posted 07-13-2015 11:53 PM

It doesn't do what you want. This is a bad data design in so many ways, it'll end up being wider than necessary by far with many empty cells. Why do you want it this way?

michellel · Posted 07-14-2015 08:49 PM

I agree with your point. This is not good data structure. The data format was required by someone else.

ballardw · Posted 07-14-2015 10:30 AM

unless the data get sorted in between the two transpose statements date1 should go with medication1

Reeza · Posted 07-14-2015 11:02 AM

In the OP expected output the date, 05/01/2010, is only outputted once whereas in yours it's outputted twice.

I don't agree with the structure of the expected data however and your output is a better format.

ballardw · Posted 07-14-2015 12:22 PM

As I said, I was guessing at the requirement. The apparent requirement I thought was there was to group on two variables as the statement was unclear. And considering an apparent 3 month gap between the medication dates ...

michellel · Posted 07-14-2015 08:50 PM

Thanks for your help. The data was required by someone else.

Ksharp · Posted 07-14-2015 09:03 AM

The most convenient way is using proc means .

Or use MERGE skill which me , Matt and Arthur.T have been talked at

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

Code: Program

data have;
informat id $4.  surgery_date  medication_date mmddyy10.   medication_name $4.;
format   surgery_date  medication_date mmddyy10.   ;
input id   surgery_date   medication_date   medication_name;
datalines;
1 03/15/2010 05/01/2010 A
1 03/15/2010 05/01/2010 B
1 03/15/2010 09/01/2010 C
1 03/15/2010 11/01/2010 D
1 05/01/2011 09/01/2011 E
1 05/01/2011 09/01/2011 F
1 05/01/2011 09/01/2011 G
1 05/01/2011 12/01/2011 H
;
run;
proc sql;
 select max(count) into : n 
  from (select count(*) as count from have group by id,surgery_date);
quit;
proc summary data=have nway;
 class id surgery_date;
 output out=want idgroup(out[&n]  (medication_date  medication_name )=);
run;

Xia Keshan

michellel · Posted 07-14-2015 08:52 PM

Thanks Xia Ke for your help. Even though the result format is still not what I need, your coding is really nice.

Ksharp · Posted 07-15-2015 08:22 AM

OU. I just realize what you are looking for .

Code: Program.sas

data have;
informat id $4.  surgery_date  medication_date mmddyy10.   medication_name $4.;
format   surgery_date  medication_date mmddyy10.   ;
input id   surgery_date   medication_date   medication_name;
datalines;
1 03/15/2010 05/01/2010 A
1 03/15/2010 05/01/2010 B
1 03/15/2010 09/01/2010 C
1 03/15/2010 11/01/2010 D
1 05/01/2011 09/01/2011 E
1 05/01/2011 09/01/2011 F
1 05/01/2011 09/01/2011 G
1 05/01/2011 12/01/2011 H
;
run;
proc sql;
 select max(count) into : n separated by ' '
  from (select count(*) as count from have group by id,surgery_date,medication_date);
quit;
proc summary data=have nway;
 class id surgery_date medication_date;
 output out=temp(drop=_:) idgroup(out[&n]  (medication_name )=);
run;

proc sql;
 select max(count) into : m separated by ' '
  from (select count(*) as count from temp group by id,surgery_date);
quit;
proc summary data=temp nway;
 class id surgery_date;
 output out=want(drop=_:) idgroup(out[&m]  (medication_date medication_name_1-medication_name_&n )=);
run;

proc sql;
 select name into : list separated by ' '
  from dictionary.columns
   where libname='WORK' and memname='WANT' and lowcase(name) like 'medication%'
   order by input(scan(name,-1,'_'),best8.),name;
quit;
data final_want;
 retain id surgery_date &list ;
 set want;
run;

Xia Keshan

Code: Program

Code: Program.sas

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away