DATA Step, Macro, Functions and more

Count of Patients with particular set of diagnoses ever

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

Count of Patients with particular set of diagnoses ever

[ Edited ]

Hi,

 

Related to: https://communities.sas.com/t5/Base-SAS-Programming/summarize-diagnosis-multiple-columns-and-multipl...

 

I have a very large data set (>3 million observations) for which would like to count the number of patients (not total observations) have one of four possible diagnosis codes. Patients could have multiple observations denoting multiple visits and have up to ten diagnosis variable fields.

 

My four codes of interest are: 72632 or 72631 or 71922 or 71923.

 

I would like to create a command that searches DX01 to DX10 for all observations for each patient for presence of the above codes.

 

Example:

Patientid      dx01    dx02    dx03    dx04    dx05    …       dx10    date

1             72632   72631   71922   71923                          5/5/2009

1              72632   71922   70000   70001                          6/6/2008

2              70001   70000                                                    1/1/2007

2              71900   71901                                                  2/2/2007

3              72632   71900                                                   2/2/2004

3              72631   70000                                                   3/3/2004

3              71922                                                                4/4/2004

 

From the above example, I would like to produce output like the following

1. By Patient ID

                               Patient ID            Has (72632 OR 72631 OR 71922 OR 71923) [1=Yes, 2=No]

                                            1             1

                                           2              0

                                           3              1

2. By Total Number of Patients
                               Total Count of Patients with (72632 OR 72631 OR 71922 OR 71923)

                                          2

 

Separately, I would also like a count of the number of observations each patient has for the diagnoses (72632 OR 72631 OR 71922 OR 71923) that produces output like:

                              Patient ID            Has (72632 OR 72631 OR 71922 OR 71923)             Number of Visits Addressing Diagnosis

                                            1             1                                                                                   2

                                           2              0                                                                                   0

                                           3              1                                                                                    2

Thanks for your help.


Accepted Solutions
Solution
‎10-24-2016 10:25 AM
Super User
Posts: 6,936

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]
data inter (keep=id diag);
set have;
array diags {*} dx1-dx10;
diag = 0;
do i = 1 to dim(diags);
  if diags{i} in (72632,72631,71922 71923) then diag = 1;
end;
run;

proc sort data=inter;
by id;
run;

data result (keep=id hasdiag visitcount) totalcount (keep=total_patients);
set inter end=done;
retain total_patients 0 visitcount hasdiag;
if first.id
then do;
  visitcount = 0;
  hasdiag = 0;
end;
if diag then do;
  visitcount + 1;
  hasdiag = 1;
end;
if last.id then do;
  output result;
  total_patients + hasdiag;
end;
if done then output totalcount;
run;

If your initial dataset is already sorted by id, you can omit the sort step.

 

Some edits done for typos.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Regular Contributor
Posts: 231

Re: Count of Patients with particular set of diagnoses ever

Hi

 

I am sure there are more efficient ways to handle this, but here's how I would do it using SQL:

 

 

proc sql;
/* This will put the data into categories of Yes = 1 and No = 0 */ create table work.DATA2 as  select patient_id,  case when (dx1 in (72632, 72631, 71922, 71923)                or (dx2 in (72632, 72631, 71922, 71923)  . . .             or (dx10 in (72632, 72631, 71922, 71923))  then 1  else 0 end as Yes_No from work.DATA;
/* This will sum the Yes_No column to give you a TOTAL number of instances that met
the Yes criteria above */ select sum(Yes_No) as Total_Count from work.DATA2; 
/* This will give you the total per patient ID */ select patient_ID, sum(Yes_No) as Pt_Count from work.DATA2 group by patient_ID order by patient_ID; quit;

This may require some fine tuning, and there are plenty of papers and articles (including a few written by me!) on PROC SQL, but this should get you started.


Good luck and happy coding!

Chris

Has my article or post helped? Please mark as Solution or Like the article!
Trusted Advisor
Posts: 1,377

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]

Alternative code using data step:

 

%let ds_check = 72632 72631 71922 71923;  /* easy to change list if needed */

 

data want;

 set have;

  by patient_id;

       retain yes_no;

       if first.patient_id then yes_no = 0;

       

       array dx dx1-dx10;

       do i=1 to 10;

            if dx(i) in (&ds_check) then yes_no = 1;

        end;

 

        if last.patient_id then output;

        keep patient_id  yes_no;

        label yes_no = " HAS (&ds_check)";

run;

 

The TOTAL number can be achieved by various ways

     (e.g. by PROC MEANS or PROC SQL or PROC PRINT with SUM )

 

 

 

 

Occasional Contributor
Posts: 17

Re: Count of Patients with particular set of diagnoses ever

How do I deal with leading zeros in my diagnoses categories. For example, DX01, DX02, DX03...Dx10?

 

Thank you so much!

Trusted Advisor
Posts: 1,377

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]

you may change the array definition into:

ARRAY DX DX01-DX10;

 

or alternatively do:

 

  SET WANT (Rename=(dx01=dx1 dx02=dx2 ... dx09=dx9));

Regular Contributor
Posts: 231

Re: Count of Patients with particular set of diagnoses ever

Unless I'm misunderstanding, the DX01 is a column header and at least for the SQL version, won't have an impact.  The only time you'd need to worry about leading 0s is if they're in your data (00745454 versus 745454).  Sometimes the number of leading 0s is important, and then you would have to treat the column as TEXT rather than NUMERIC.

 

hope this helps!

Chris

Has my article or post helped? Please mark as Solution or Like the article!
Super User
Posts: 6,936

Re: Count of Patients with particular set of diagnoses ever


DMMD wrote:

How do I deal with leading zeros in my diagnoses categories. For example, DX01, DX02, DX03...Dx10?

 

Thank you so much!


Just include the zero in the variable list. Just tested this short example:

data have;
input x01 x02 x03;
cards;
1 2 3
;
run;

data want;
set have;
array xx {*} x01-x03;
do i = 1 to dim(xx);
  put xx{i}=;
end;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 17

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]

Fantastic! The problem was the diagnoses codes were character, not number variables as there are some codes starting with letters. Putting the codes in quotations allowed it to work. Thank you so much! This helped to answer #1 and #3 of my output.

Occasional Contributor
Posts: 17

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]
 
Regular Contributor
Posts: 231

Re: Count of Patients with particular set of diagnoses ever

Hey!  I think the issue is the PROC FREQ; if I remember correctly, PROC FREQ counts instances, or something.  At a quick glance, your SQL looks good, and i'm not seeing any issues with your brackets or other logic.  I'll wait and see what @Shmuel says, and if I can i will do some testing tomorrow when I'm at work.

 

Good luck :-)
Chris

Has my article or post helped? Please mark as Solution or Like the article!
Occasional Contributor
Posts: 17

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]

Thanks @DarthPathos, @Shmuel, and @KurtBremser! SAS can be a stressful experience to say the least.

 

Best,

David

Regular Contributor
Posts: 231

Re: Count of Patients with particular set of diagnoses ever

Yeah, I've got to admit i'm stumped LOL. Hopefully one of my esteemed co-users can point you in the right direction, and if I think of anything I'll you know. One of the benefits of being on the community - you get a wide variety of views Smiley Very Happy
Don't give up - I've been using SAS for 6 years, and have come a long, long way. You'll get there!
Chris
Has my article or post helped? Please mark as Solution or Like the article!
Trusted Advisor
Posts: 1,377

Re: Count of Patients with particular set of diagnoses ever

Within my code ( @shmuel ) I conunt per EUPIDE. 

 

It seems that same eupide can exist in more than one observation.

 

@KurtBremser counts each observation.

 

I have no time now to check the 3rd program.

Trusted Advisor
Posts: 1,377

Re: Count of Patients with particular set of diagnoses ever

[ Edited ]

I have analized the 3 programs , here is the explanation for the differences:

 

Let's have an input like:

        EUPIDE      YES-NO         my remark                      counts by:  SH        KB          DP 

              1               0                non of dx in list                                                0            1

              1               1               one or more dx in list                                       1            1               

              1               1                                                                           1          1            1  

                                                        TOTAL                                         1          2            3  

 

in words:

         I count each EUPIDE once only. Therefore my total is the least and I have missings in freq output.

         @KurtBremser assigns each line as 0 or 1 and sums the colomn yes-no.

         @DarthPathos counts each line as 1 if any line of same EUPIDE is 1.

 

@DMMD now it is your turn to choose which of the 3 fits more to you.

            

      

Occasional Contributor
Posts: 17

Re: Count of Patients with particular set of diagnoses ever

Thanks so much @Shmuel for helping to explain the function of each code.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 25 replies
  • 653 views
  • 15 likes
  • 4 in conversation