BookmarkSubscribeRSS Feed
devsas
Pyrite | Level 9

I have a clinic data with variables patient ID, Date of appointment and Physician ID. I want to find out the cases where a specific patient came back for a follow up appointment within a week of his first appointment. If there was no further appointment for that particular patient, its a success. If the second appointment was after a week, its again a success. But, if the first appointment was followed up with another appointment within a week, it's a failure.  I tried doing it with a do loop, but I'm sure there is a better way of doing it with PROC SQL. I'm attaching the data set which is already sorted with patient id.

Million thanks in advance!

3 REPLIES 3
Haikuo
Onyx | Level 15

I don't know why you insist pursuing proc sql solution, since you haven't shown your datastep code, I can't really comment. In my following test, data step is way faster (only use 1/50th of time) than my Proc sql (of course, there will be many ways of doing it using Proc sql); data step will mark the second record, while proc sql will mark the first one.

1    /*PROC SQL, SLOWWOWOW*/

2     proc sql;

3       create table want_SQL as

4         select *, CASE WHEN EXISTS (select APPOINTMENT_DATE  from test1 where PATIENT_ID =

4  ! A.PATIENT_ID

5

6                                     AND APPOINTMENT_DATE between A.APPOINTMENT_DATE+1 and

6  ! a.APPOINTMENT_DATE+7)

7

8                                    THEN 'FAILURE' ELSE 'SUCCESS' END AS FLAG

9            FROM TEST1 A

10    ;

NOTE: Table WORK.WANT_SQL created, with 3260 rows and 4 columns.

11   QUIT;

NOTE: PROCEDURE SQL used (Total process time):

      real time           1.01 seconds

      cpu time            0.99 seconds

12

13   /*DATA STEP: FAST*/

14   PROC SORT DATA=TEST1 OUT=TEST2;

15   BY patient_id APPOINTMENT_DATE;

16   RUN;

NOTE: There were 3260 observations read from the data set WORK.TEST1.

NOTE: The data set WORK.TEST2 has 3260 observations and 3 variables.

NOTE: PROCEDURE SORT used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

17

18   DATA WANT_DS;

19     DO UNTIL (LAST.PATIENT_ID);

20     SET TEST2;

21     LENGTH FLAG $ 10;

22       BY patient_id APPOINTMENT_DATE;

23       FLAG=IFC(FIRST.PATIENT_ID OR DIF(APPOINTMENT_DATE)>6,'SUCCESS','FAILURE');

24       OUTPUT;

25      END;

26   RUN;

NOTE: There were 3260 observations read from the data set WORK.TEST2.

NOTE: The data set WORK.WANT_DS has 3260 observations and 4 variables.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

Haikuo

devsas
Pyrite | Level 9

Thanks for your reply. I tried both your methods and while the Proc Sql one gave me errors and didnt run, the data step method worked and gave me a table which is not fully correct. I mean I need failure cases where the follow up appointment for a specific patient was within a week, but your result shows failure as only those cases where the follow up appointment or the next appointment was on the same day. Any other appointment was considered a success. For instance, 733 and 734 observation for the same patient is just 5 days apart, but it still calls it success. I'm not sure if there is some slight issue with your code or you just didnt understand my question correctly. The code I used earlier is below and it gave back all such cases where there was follow up appointment. Success is 1 and failure is 0.

proc sort data=vtest.test;

by PATIENT_ID APPOINTMENT_DATE;

run;

data vtest.q2;

retain adate i;

length followsuc 6;

set vtest.test;

by PATIENT_ID;

if first.patient_id then do;

i=1;

adate=APPOINTMENT_DATE;

end;

else do;

    i=i+1;

    if i=2 then do;

        if APPOINTMENT_DATE - adate <=604800 then followsuc=0;

        else followsuc=1;

        output;

    end;

end;

run;

overmar
Obsidian | Level 7

devsas,

The below code should have fixed the last issue with the day 5, as it is specified for dates between 0 and 7.

data test1a;

    set desktop.test1;

    apt_date = datepart(appointment_date);

    format apt_date mmddyy10.;

    keep patient_id apt_date;

run;

proc sort data = test1a;

    by patient_id apt_date;

run;

data test1a;

    set test1a;

    count + 1;

    by patient_id;

    if first.patient_id then count = 1;

run;

data test1b;

    set test1a;

    where count <=2; /*This is done because there are a couple of people with more than 2 visits and the point of the exercise is only after their first visit, so eliminate all of the others*/

run;

data test1c;

    set test1b;

    by patient_id apt_date;

    retain adate;

    if first.patient_id then do;

    adate = apt_date;

    end;

    if 0<datdif(adate,apt_date,'act/act')<=7 then do;

    flag = 0;

    output;

    end;

    else do;

    flag = 1;

    output;

    end;

    drop count adate;

run;

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1916 views
  • 0 likes
  • 3 in conversation