BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
aokolo
Fluorite | Level 6

Hello,

 

I'm a novice SAS user (SAS 9.4) and I'm working with a healthcare dataset where I need to count the number of observations that were discharged and admitted on the same day at different places of service. Example, if ID 1 was seen at the ED on 01/01/2020 and was discharged on 01/02/2020 and Inpatient admission on 01/02/2020 then the new variable would count that as 1 service but if ID 1 was seen at the ED on 01/10/2020 and discharged on 01/10/2020 and Inpatient was admitted on 01/31/2020 then the new variable wouldn't count that. How would I create a variable that would count that?

 

 

data SC.test;
input id $ admitdate:mmddyy10. dischargedate:mmddyy10. place $2. ;
format ADMITdate MMDDYY10. DISCHARGEdate MMDDYY10.;
datalines;
1 09/10/2020 09/10/2020 ED
1 09/10/2020 09/11/2020 IN
2 10/10/2020 10/12/2020 IN
3 04/03/2020 04/03/2020 ED
3 05/03/2020 05/03/2020 ED
3 05/29/2020 06/03/2020 IN
4 10/09/2020 10/10/2020 ED
4 10/10/2020 10/12/2020 IN
4 11/03/2020 11/03/2020 ED
4 12/29/2020 12/31/2020 IN
;
run;

 

 

My expected output would be something like this:

IDADMITDATEDISCHARGEDATEPLACENEWVAR
19/10/20209/10/2020ED1
19/10/20209/11/2020IN1
210/10/202010/12/2020IN.
34/3/20204/3/2020ED.
35/3/20205/3/2020ED.
35/29/20206/3/2020IN.
410/9/202010/10/2020ED1
410/10/202010/12/2020IN1
411/3/202011/3/2020ED.
412/29/202012/31/2020IN.

 

I would really appreciate any insight. Thank you!

 

Amanda

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You want to set a dummy variable to one for all records for which a discharge from one place occured on the same date as a admission to another place.  You want to set the dummy to 1 for both records.

 

If so, then:

 

data test;
input id $ admitdate:mmddyy10. dischargedate:mmddyy10. place $2. ;
format ADMITdate MMDDYY10. DISCHARGEdate MMDDYY10.;
datalines;
1 09/10/2020 09/10/2020 ED
1 09/10/2020 09/11/2020 IN
2 10/10/2020 10/12/2020 IN
3 04/03/2020 04/03/2020 ED
3 05/03/2020 05/03/2020 ED
3 05/29/2020 06/03/2020 IN
4 10/09/2020 10/10/2020 ED
4 10/10/2020 10/12/2020 IN
4 11/03/2020 11/03/2020 ED
4 12/29/2020 12/31/2020 IN
run;

data want (drop=nxt_:);
  set test (keep=id admitdate dischargedate);
  by id admitdate dischargedate;
  merge test
        test (firstobs=2 keep=admitdate rename=(admitdate=nxt_admdate));
  if (last.id=0 and nxt_admdate=dischargedate) or (first.id=0 and admitdate=lag(dischargedate)) then dummy=1;
  else dummy=0;
run;

This code assumes the data are sorted by ID/ADMITDATE ID/ADMITDATE/DISCHARGEDATE   The code logic really only needs a BY statement of

BY ID;

But I also included ADMITDATE  ADMITDATE and DISCHARGEDATE, so if the data set is not properly sorted, the program will stop with an error message.

ERROR: BY variables are not properly sorted on data set WORK.TEST.

and you'll know you need to sort the data.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
ballardw
Super User

When you say "count" you have to be pretty explicit about why None of these get a count:

2 10/10/2020 10/12/2020 IN .
3 4/3/2020 4/3/2020 ED .
3 5/3/2020 5/3/2020 ED .
3 5/29/2020 6/3/2020 IN

.

4 11/3/2020 11/3/2020 ED .
4 12/29/2020 12/31/2020 IN .

but this does:

1 9/10/2020 9/10/2020 ED 1

 

I am not seeing any Rule that says id 2 and 3 or part of id=4 should not have anything counted.

 

My observation of just the Want example would be that the DATES really are not playing an actual role in the result at all.

aokolo
Fluorite | Level 6
Thanks for responding! The reason id=2 and id=3 and part of id=4 wouldn't be counted is because their discharge and admit dates aren't on the same day. I am looking to count only encounters that have the same discharge and admit dates, as I'm looking for ED visits that resulted in an inpatient admission and looking at the dates is the only way I can observe this in my dataset. My apologies for not making that clear.
mkeintz
PROC Star

You want to set a dummy variable to one for all records for which a discharge from one place occured on the same date as a admission to another place.  You want to set the dummy to 1 for both records.

 

If so, then:

 

data test;
input id $ admitdate:mmddyy10. dischargedate:mmddyy10. place $2. ;
format ADMITdate MMDDYY10. DISCHARGEdate MMDDYY10.;
datalines;
1 09/10/2020 09/10/2020 ED
1 09/10/2020 09/11/2020 IN
2 10/10/2020 10/12/2020 IN
3 04/03/2020 04/03/2020 ED
3 05/03/2020 05/03/2020 ED
3 05/29/2020 06/03/2020 IN
4 10/09/2020 10/10/2020 ED
4 10/10/2020 10/12/2020 IN
4 11/03/2020 11/03/2020 ED
4 12/29/2020 12/31/2020 IN
run;

data want (drop=nxt_:);
  set test (keep=id admitdate dischargedate);
  by id admitdate dischargedate;
  merge test
        test (firstobs=2 keep=admitdate rename=(admitdate=nxt_admdate));
  if (last.id=0 and nxt_admdate=dischargedate) or (first.id=0 and admitdate=lag(dischargedate)) then dummy=1;
  else dummy=0;
run;

This code assumes the data are sorted by ID/ADMITDATE ID/ADMITDATE/DISCHARGEDATE   The code logic really only needs a BY statement of

BY ID;

But I also included ADMITDATE  ADMITDATE and DISCHARGEDATE, so if the data set is not properly sorted, the program will stop with an error message.

ERROR: BY variables are not properly sorted on data set WORK.TEST.

and you'll know you need to sort the data.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
aokolo
Fluorite | Level 6
Thank you! This was successful! Now I'm studying this to learn more about why and how this worked.
mkeintz
PROC Star

The real key to this solution is in the MERGE statement, which does a self-merge of TEST, simultaneously taking data from both observation i  and observation i+1 (where TEST has the FIRSTOBS=2 parameter).  This allows you to look-ahead and compare data in the current obs to the next obs.   Of course, that means you have to rename variables because both obs start out with the same variable names in a self-merge.

 

The SET statement is there merely to use with the BY statement, thereby allowing use of FIRST.ID and LAST.ID in the IF statement.   You could actually drop the SET and BY statements, through use of a slightly "busier" if statement, as in

data want2 (drop=nxt_:);
  merge test
        test (firstobs=2 keep=id admitdate rename=(id=nxt_id admitdate=nxt_admdate));
  if (id=nxt_id and nxt_admdate=dischargedate) or (id=lag(id) and admitdate=lag(dischargedate)) then dummy=1;
  else dummy=0;
RUN;

Of course, this latter code version doesn't protect you against data that might not be sorted BY ID/ADMITDATE/DISCHARGEDATE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
aokolo
Fluorite | Level 6
Thank you so much for providing this explanation! It helps so much to understand how the code works.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2245 views
  • 1 like
  • 3 in conversation