DATA Step, Macro, Functions and more

Comparing a row in a dataset with the next row

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 19
Accepted Solution

Comparing a row in a dataset with the next row

Good Day,

I'll start with a little background.  I have a dataset like the following :

Patient_IDAdmit_DateDisch_DateFacility_TypeFlag

12345

02JAN2014:00:00:00.000

05JAN2014:00:00:00.000

Nursing Home
12345

05JAN2014:00:00:00.000

07JAN2014:00:00:00.000

Hospital
23456

02JAN2014:00:00:00.000

06JAN2014:00:00:00.000

Nursing Home

23456

06JAN2014:00:00:00.000

09JAN2014:00:00:00.000

Hospital
23456

08JAN2014:00:00:00.000

11JAN2014:00:00:00.000

Nursing Home
23456

11JAN2014:00:00:00.000

19JAN2014:00:00:00.000

Hospital
23456

19JAN2014:00:00:00.000

23JAN2014:00:00:00.000

Nursing Home
23456

24JAN2014:00:00:00.000

25JAN2014:00:00:00.000

Hospital
23456

25JAN2014:00:00:00.000

30JAN2014:00:00:00.000

Nursing Home
34567

09JAN2014:00:00:00.000

15JAN2014:00:00:00.000

Nursing Home
45678

13JAN2014:00:00:00.000

20JAN2014:00:00:00.000

Nursing Home
etc....

I would like to loop through this dataset and update the Flag column with a 1 depending on information in the next row.  The logic I need to use is something like the following :

I = 1

do while I <= "num rows in dataset"

     If Patient_ID[I+1] = Patient_ID and

          Facility_Type[I+1] = "Hospital" and

          Facility_Type = "Nursing Home" and

          datepart(Admit_Date[I + 1]) >= datepart(Admit_Date) and

          datepart(Admit_Date[I + 1]) <= datepart(Disch_Date)

     Then

          Flag = 1

     Else Flag = 0

I = I+1

Next

This would update the dataset to be the following :

Patient_IDAdmit_DateDisch_DateFacility_TypeFlag

12345

02JAN2014:00:00:00.000

05JAN2014:00:00:00.000

Nursing Home1
12345

05JAN2014:00:00:00.000

07JAN2014:00:00:00.000

Hospital0
23456

02JAN2014:00:00:00.000

06JAN2014:00:00:00.000

Nursing Home

1

23456

06JAN2014:00:00:00.000

09JAN2014:00:00:00.000

Hospital0
23456

09JAN2014:00:00:00.000

11JAN2014:00:00:00.000

Nursing Home

1

23456

10JAN2014:00:00:00.000

19JAN2014:00:00:00.000

Hospital0
23456

19JAN2014:00:00:00.000

23JAN2014:00:00:00.000

Nursing Home0
23456

24JAN2014:00:00:00.000

25JAN2014:00:00:00.000

Hospital0
23456

25JAN2014:00:00:00.000

30JAN2014:00:00:00.000

Nursing Home0
34567

09JAN2014:00:00:00.000

15JAN2014:00:00:00.000

Nursing Home0
45678

13JAN2014:00:00:00.000

20JAN2014:00:00:00.000

Nursing Home0
etc....

I have been going nuts trying to figure out how to code this in SAS.  There's tons of documentation online that sort of applies, but nothing specific enough and try as I might, I can't seem to work it out.  I've been tempted to export the data to a file, parse it using VBS and then reimport, but that is horribly backwards and inefficient. Plus, I won't learn anything about SAS.

Any help would be greatly appreciated.

Thanks,

John


Accepted Solutions
Solution
‎10-14-2014 05:29 PM
SAS Super FREQ
Posts: 708

Re: Comparing a row in a dataset with the next row

SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.

data have;
  infile cards dlm=",";
 
input
    Patient_ID :
8.
    Admit_Date :
anydtdtm.
    Disch_Date :
anydtdtm.
    Facility_Type :
$32.
  ;
  format admit_date disch_date datetime19.;
 
cards;
12345,02JAN2014:00:00:00.000,05JAN2014:00:00:00.000,Nursing Home
12345,05JAN2014:00:00:00.000,07JAN2014:00:00:00.000,Hospital
23456,02JAN2014:00:00:00.000,06JAN2014:00:00:00.000,Nursing Home
23456,06JAN2014:00:00:00.000,09JAN2014:00:00:00.000,Hospital
23456,08JAN2014:00:00:00.000,11JAN2014:00:00:00.000,Nursing Home
23456,11JAN2014:00:00:00.000,19JAN2014:00:00:00.000,Hospital
23456,19JAN2014:00:00:00.000,23JAN2014:00:00:00.000,Nursing Home
23456,24JAN2014:00:00:00.000,25JAN2014:00:00:00.000,Hospital
23456,25JAN2014:00:00:00.000,30JAN2014:00:00:00.000,Nursing Home
34567,09JAN2014:00:00:00.000,15JAN2014:00:00:00.000,Nursing Home
45678,13JAN2014:00:00:00.000,20JAN2014:00:00:00.000,Nursing Home
;

data want;
  obs1 = 1;
 
do while( obs1 <= nobs);
    set have nobs=nobs;
    obs2 = obs1 + 1;
   
set
      have(
       
rename=(
        Patient_ID = pId2
        Admit_Date = ad2  
        Disch_Date = dd2
        Facility_Type = ft2
        )
      )
point=obs2
    ;

   
If pId2 = Patient_ID and
      ft2 =
"Hospital" and
      Facility_Type =
"Nursing Home" and
      datepart(ad2) >= datepart(Admit_Date)
and
      datepart(ad2) <= datepart(Disch_Date)
Then
      Flag =
1;
   
Else Flag = 0;
   
output;
    obs1 +
1;
 
end;
 
drop obs1 obs2 pId2 ad2 dd2 ft2;
run;

View solution in original post


All Replies
Super User
Posts: 5,427

Re: Comparing a row in a dataset with the next row

If you resort your data descending, you could use retain statement (or lag() function) to keep the previous rows values in the current row. That would replace the "i+1" in your code.

Data never sleeps
Solution
‎10-14-2014 05:29 PM
SAS Super FREQ
Posts: 708

Re: Comparing a row in a dataset with the next row

SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.

data have;
  infile cards dlm=",";
 
input
    Patient_ID :
8.
    Admit_Date :
anydtdtm.
    Disch_Date :
anydtdtm.
    Facility_Type :
$32.
  ;
  format admit_date disch_date datetime19.;
 
cards;
12345,02JAN2014:00:00:00.000,05JAN2014:00:00:00.000,Nursing Home
12345,05JAN2014:00:00:00.000,07JAN2014:00:00:00.000,Hospital
23456,02JAN2014:00:00:00.000,06JAN2014:00:00:00.000,Nursing Home
23456,06JAN2014:00:00:00.000,09JAN2014:00:00:00.000,Hospital
23456,08JAN2014:00:00:00.000,11JAN2014:00:00:00.000,Nursing Home
23456,11JAN2014:00:00:00.000,19JAN2014:00:00:00.000,Hospital
23456,19JAN2014:00:00:00.000,23JAN2014:00:00:00.000,Nursing Home
23456,24JAN2014:00:00:00.000,25JAN2014:00:00:00.000,Hospital
23456,25JAN2014:00:00:00.000,30JAN2014:00:00:00.000,Nursing Home
34567,09JAN2014:00:00:00.000,15JAN2014:00:00:00.000,Nursing Home
45678,13JAN2014:00:00:00.000,20JAN2014:00:00:00.000,Nursing Home
;

data want;
  obs1 = 1;
 
do while( obs1 <= nobs);
    set have nobs=nobs;
    obs2 = obs1 + 1;
   
set
      have(
       
rename=(
        Patient_ID = pId2
        Admit_Date = ad2  
        Disch_Date = dd2
        Facility_Type = ft2
        )
      )
point=obs2
    ;

   
If pId2 = Patient_ID and
      ft2 =
"Hospital" and
      Facility_Type =
"Nursing Home" and
      datepart(ad2) >= datepart(Admit_Date)
and
      datepart(ad2) <= datepart(Disch_Date)
Then
      Flag =
1;
   
Else Flag = 0;
   
output;
    obs1 +
1;
 
end;
 
drop obs1 obs2 pId2 ad2 dd2 ft2;
run;
Occasional Contributor
Posts: 19

Re: Comparing a row in a dataset with the next row

Posted in reply to Bruno_SAS

WOW!!  This worked perfectly!!  Thank you so much!  I never would have arrived there without your help.

Frequent Contributor
Posts: 91

Re: Comparing a row in a dataset with the next row

Another way to do this is with a MERGE statement without a BY statement and FIRSTOBS options, like:

/* using same data member HAVE from above */

data want;

  MERGE have (firstobs=1)

               have (firstobs=2

                       rename=(Patient_ID = pId2

                                     Admit_Date=ad2

                                     Disch_Date=dd2

                                     Facility_Type=ft2)

                       );

  if    pId2                        = Patient_ID

   and ft2                         = "Hospital"

   and Facility_Type         = "Nursing Home"

   and datepart(ad2)       >= datepart(Admit_Date)

   and datepart(ad2)       <= datepart(Disch_Date) then Flag = 1;

  else                                                                       Flag = 0;

  drop pid2 ad2 dd2 ft2;

run;

Contributor
Posts: 67

Re: Comparing a row in a dataset with the next row

Good thought!

N/A
Posts: 1

Re: Comparing a row in a dataset with the next row

Starting from the Have dataset from the previous answers, this gives the same result:

proc sql;

  create table want as

    select a.*, case when a.Facility_type='Nursing Home' and b.facility_type='Hospital' and

                          b.admit_date le a.disch_date then 1

                     else 0

                end as flag

    from have a

    left join have b

      on a.patient_id=b.patient_id

      and b.admit_date gt a.admit_date

  group by a.patient_id, a.admit_date

  having b.admit_date=min(b.admit_date);

quit;

Contributor
Posts: 20

Re: Comparing a row in a dataset with the next row

Posted in reply to Bruno_SAS

Super helpful tip! My coworker and I do some crazy lag stuff when we need to do something like this. So happy to have an easier alternative.

Community Manager
Posts: 2,953

Re: Comparing a row in a dataset with the next row

Shout out to who shared a macro version of this approach on his blog: LagLead Macro.

Chris

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 22759 views
  • 10 likes
  • 8 in conversation