BookmarkSubscribeRSS Feed
devsas
Pyrite | Level 9

I have a clinic data with variables patient ID, Date of appointment and Physician ID. I want to find out the cases where a specific patient came back for a follow up appointment within a week of his first appointment. If there was no further appointment for that particular patient, its a success. If the second appointment was after a week, its again a success. But, if the first appointment was followed up with another appointment within a week, it's a failure.  I tried doing it with a do loop, but I'm sure there is a better way of doing it with PROC SQL. I'm attaching the data set which is already sorted with patient id.

Million thanks in advance!

3 REPLIES 3
Haikuo
Onyx | Level 15

I don't know why you insist pursuing proc sql solution, since you haven't shown your datastep code, I can't really comment. In my following test, data step is way faster (only use 1/50th of time) than my Proc sql (of course, there will be many ways of doing it using Proc sql); data step will mark the second record, while proc sql will mark the first one.

1    /*PROC SQL, SLOWWOWOW*/

2     proc sql;

3       create table want_SQL as

4         select *, CASE WHEN EXISTS (select APPOINTMENT_DATE  from test1 where PATIENT_ID =

4  ! A.PATIENT_ID

5

6                                     AND APPOINTMENT_DATE between A.APPOINTMENT_DATE+1 and

6  ! a.APPOINTMENT_DATE+7)

7

8                                    THEN 'FAILURE' ELSE 'SUCCESS' END AS FLAG

9            FROM TEST1 A

10    ;

NOTE: Table WORK.WANT_SQL created, with 3260 rows and 4 columns.

11   QUIT;

NOTE: PROCEDURE SQL used (Total process time):

      real time           1.01 seconds

      cpu time            0.99 seconds

12

13   /*DATA STEP: FAST*/

14   PROC SORT DATA=TEST1 OUT=TEST2;

15   BY patient_id APPOINTMENT_DATE;

16   RUN;

NOTE: There were 3260 observations read from the data set WORK.TEST1.

NOTE: The data set WORK.TEST2 has 3260 observations and 3 variables.

NOTE: PROCEDURE SORT used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

17

18   DATA WANT_DS;

19     DO UNTIL (LAST.PATIENT_ID);

20     SET TEST2;

21     LENGTH FLAG $ 10;

22       BY patient_id APPOINTMENT_DATE;

23       FLAG=IFC(FIRST.PATIENT_ID OR DIF(APPOINTMENT_DATE)>6,'SUCCESS','FAILURE');

24       OUTPUT;

25      END;

26   RUN;

NOTE: There were 3260 observations read from the data set WORK.TEST2.

NOTE: The data set WORK.WANT_DS has 3260 observations and 4 variables.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

Haikuo

devsas
Pyrite | Level 9

Thanks for your reply. I tried both your methods and while the Proc Sql one gave me errors and didnt run, the data step method worked and gave me a table which is not fully correct. I mean I need failure cases where the follow up appointment for a specific patient was within a week, but your result shows failure as only those cases where the follow up appointment or the next appointment was on the same day. Any other appointment was considered a success. For instance, 733 and 734 observation for the same patient is just 5 days apart, but it still calls it success. I'm not sure if there is some slight issue with your code or you just didnt understand my question correctly. The code I used earlier is below and it gave back all such cases where there was follow up appointment. Success is 1 and failure is 0.

proc sort data=vtest.test;

by PATIENT_ID APPOINTMENT_DATE;

run;

data vtest.q2;

retain adate i;

length followsuc 6;

set vtest.test;

by PATIENT_ID;

if first.patient_id then do;

i=1;

adate=APPOINTMENT_DATE;

end;

else do;

    i=i+1;

    if i=2 then do;

        if APPOINTMENT_DATE - adate <=604800 then followsuc=0;

        else followsuc=1;

        output;

    end;

end;

run;

overmar
Obsidian | Level 7

devsas,

The below code should have fixed the last issue with the day 5, as it is specified for dates between 0 and 7.

data test1a;

    set desktop.test1;

    apt_date = datepart(appointment_date);

    format apt_date mmddyy10.;

    keep patient_id apt_date;

run;

proc sort data = test1a;

    by patient_id apt_date;

run;

data test1a;

    set test1a;

    count + 1;

    by patient_id;

    if first.patient_id then count = 1;

run;

data test1b;

    set test1a;

    where count <=2; /*This is done because there are a couple of people with more than 2 visits and the point of the exercise is only after their first visit, so eliminate all of the others*/

run;

data test1c;

    set test1b;

    by patient_id apt_date;

    retain adate;

    if first.patient_id then do;

    adate = apt_date;

    end;

    if 0<datdif(adate,apt_date,'act/act')<=7 then do;

    flag = 0;

    output;

    end;

    else do;

    flag = 1;

    output;

    end;

    drop count adate;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1010 views
  • 0 likes
  • 3 in conversation