DATA Step, Macro, Functions and more

summarize diagnosis multiple columns and multiple per patients

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 128
Accepted Solution

summarize diagnosis multiple columns and multiple per patients

Hi I have the following dataset: I wan to create a table that summarizes dx1-dx4 by the diagnosis regardless of the "date" where each dx per patient can only appear one. or example. patient3 has 3 diagnosis of 250, I want this to be counted as only 1 "250"

patientid    dx1 dx2 dx3 dx4      date 
1               250 223 224 444    5/5/2009 
1               555 666 120 250    5/6/2008 
2               120  666                 1/2/2007 
2               120  666                 1/1/2007 
3               250                        2/2/2004
3               240 250                      3/3/2004 
3               250                          1/1/2004 

The summary table should look like this:

dx   freq

223  1

224 1

120  1

250 2

 

and so on.

Thanks 

 

 


Accepted Solutions
Solution
‎04-05-2016 12:12 PM
Contributor
Posts: 30

Re: summarize diagnosis multiple columns and multiple per patients

Posted in reply to lillymaginta

So I believe I created what you are looking for.  I turned it into a macro just in case if you have more diagnoses spots than you originally specified.  Let me know if you have any questions.

 

DATA HAVE;
INPUT patientid 1-2 dx1 3-5 dx2 7-9 dx3 11-13 dx4 15-17 date mmddyy10.;
datalines;
1 250 223 224 444 5/5/2009
1 555 666 120 250 5/6/2008
2 120 666 1/2/2007
2 120 666 1/1/2007
3 250 2/2/2004
3 240 250 3/3/2004
3 250 1/1/2004
;
run;

 

PROC SORT DATA = HAVE;
BY PatientID;
RUN;

 

%MACRO FREQ (END = );
%DO i = 1 %TO &END;
PROC FREQ DATA = HAVE NOPRINT;
TABLES PATIENTID*DX&i / OUT = FREQ_DX&i;
RUN;
DATA FREQ_DX&i;
SET FREQ_DX&i(RENAME = (COUNT = COUNT&i DX&i = DX));
DROP PERCENT;
RUN;
PROC SORT DATA = FREQ_DX&i;
BY PatientID DX;
RUN;
%END;
DATA ALMOST;
MERGE FREQ_DX1-FREQ_DX&END;
BY PatientID DX;
RUN;
PROC FREQ DATA = ALMOST;
TABLES DX / OUT = WANT;
RUN;
%MEND;

 

%FREQ(END=4);

View solution in original post


All Replies
Super User
Posts: 11,343

Re: summarize diagnosis multiple columns and multiple per patients

[ Edited ]
Posted in reply to lillymaginta

Assuming your example result miscounted Dx 120 (patient 1 and patient 2) this works for your example data.

 

I'm treating the Dx codes as numeric to simplify reading since you didn't provide a data step to generate data and ignoring dates as there was no apparent need for date in the requested summary.

data have;
   input patientid    dx1 dx2 dx3 dx4   ;
datalines;
1               250 223 224 444    5/5/2009 
1               555 666 120 250    5/6/2008 
2               120  666  .  .     1/2/2007 
2               120  666  .  .     1/1/2007 
3               250   .   .  .     2/2/2004 
3               240 250   .  .     3/3/2004 
3               250  .    .  .     1/1/2004 
;
run;

data trans (keep=patientid Dx);
   set have;
   array d Dx1-dx4;
   do i=1 to dim (d);
      Dx = d[i];
      if not missing(Dx) then output;
   end;
run;

Proc sql;
   select dx, count(dx) as freq
   from (select distinct patientid, dx from trans)
   group by dx;
quit;

 

Frequent Contributor
Posts: 128

Re: summarize diagnosis multiple columns and multiple per patients

Thank you. This worked well. Would it be possible to have the output in order descending or ascending? 

Solution
‎04-05-2016 12:12 PM
Contributor
Posts: 30

Re: summarize diagnosis multiple columns and multiple per patients

Posted in reply to lillymaginta

So I believe I created what you are looking for.  I turned it into a macro just in case if you have more diagnoses spots than you originally specified.  Let me know if you have any questions.

 

DATA HAVE;
INPUT patientid 1-2 dx1 3-5 dx2 7-9 dx3 11-13 dx4 15-17 date mmddyy10.;
datalines;
1 250 223 224 444 5/5/2009
1 555 666 120 250 5/6/2008
2 120 666 1/2/2007
2 120 666 1/1/2007
3 250 2/2/2004
3 240 250 3/3/2004
3 250 1/1/2004
;
run;

 

PROC SORT DATA = HAVE;
BY PatientID;
RUN;

 

%MACRO FREQ (END = );
%DO i = 1 %TO &END;
PROC FREQ DATA = HAVE NOPRINT;
TABLES PATIENTID*DX&i / OUT = FREQ_DX&i;
RUN;
DATA FREQ_DX&i;
SET FREQ_DX&i(RENAME = (COUNT = COUNT&i DX&i = DX));
DROP PERCENT;
RUN;
PROC SORT DATA = FREQ_DX&i;
BY PatientID DX;
RUN;
%END;
DATA ALMOST;
MERGE FREQ_DX1-FREQ_DX&END;
BY PatientID DX;
RUN;
PROC FREQ DATA = ALMOST;
TABLES DX / OUT = WANT;
RUN;
%MEND;

 

%FREQ(END=4);

Frequent Contributor
Posts: 128

Re: summarize diagnosis multiple columns and multiple per patients

Thank you for taking the time to write the macro. Both methods gave me the same answer, I really apprecaite it! 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 312 views
  • 2 likes
  • 3 in conversation