BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aaronh
Quartz | Level 8

So I am trying to summarize data into distinct hospital stay.

 

data sample;
input member $ admission $ discharge $ hospital $;
datalines;
Jane 20150101 20150305 A
Jane 20150101 20150103 A
John 20160101 20160105 B
John 20160101 20160101 A
Jane 20190101 20190507 C
;

For my research purpose, my definition of a hospital stay is based on member, admission date, and hospital. So I am not considering internal transfer as a separate hospital stay. For example, in the sample, Jane was admitted to hospital A on 20150101, but internally transferred on 20150103 to another department/care team, and discharged from A on 20150305. I consider this a single hospital stay.

 

I tried the following and achieved what I wanted:

proc sort data=sample out=sample_srt;
by member admission discharge hospital;
run;

data want;
set sample_srt;
by member hospital admission;
if last.admission;
run;

What I don't understand is that if I try the following instead, I get something different.

data want2;
set sample_srt;
by member admission hospital;
if last.admission;
run;

I thought that the order of by group variables doesn't matter, but clearly I was wrong. I'd really appreciate if someone can explain why the order of by variables matters.

 

-AH

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

The order of variables in the BY statement creates a hierarchy. So BY A B C will create a different order than BY C B A.

In particular, keeping existing orders within a by group is NOT guaranteed if that order is not enforced by adding additional columns to the BY as needed.

In your case, I would sort

by member hospital admission discharge;

and then, in the following data step, use

by member hospital admission;
if last.admission;

so that you get the last discharge date for stays that share the same admission date.

View solution in original post

9 REPLIES 9
novinosrin
Tourmaline | Level 20

Hi @aaronh  The stay in a hospital would require a sort by member hospital admission discharge;

 

Whether a patient ever admitted to a hospital regardless of how many times or the tenure at a particular hospital would require a sort by member admission discharge;

Reeza
Super User

The Order of BY groups DOES matter. 

 

It's hierarchical, so your groups that are member/admission/hospital will be different than member/admission/hospital. 

 

Here's a quick example of an explanation. Hospitals each have their own ID systems so that IDs that are 4 in Hospital 1 are not the same as the ID=4 in Hospital 2. To account for this you would need to group your variables by Hospital ID Admission. 

 

In theory your example below should result in the same answers as you shouldn't have the same admission date to different hospitals but it clearly happens and I wonder if it's because my example above (ID different at different hospitals) isn't correct.

 

Here's an illustrated example from the documentation that may help. 

https://documentation.sas.com/?docsetId=lrcon&docsetTarget=p0xu93fy5eemkyn1p6mj5elses7j.htm&docsetVe...

 


@aaronh wrote:

So I am trying to summarize data into distinct hospital stay.

 

data sample;
input member $ admission $ discharge $ hospital $;
datalines;
Jane 20150101 20150305 A
Jane 20150101 20150103 A
John 20160101 20160105 B
John 20160101 20160101 A
Jane 20190101 20190507 C
;

For my research purpose, my definition of a hospital stay is based on member, admission date, and hospital. So I am not considering internal transfer as a separate hospital stay. For example, in the sample, Jane was admitted to hospital A on 20150101, but internally transferred on 20150103 to another department/care team, and discharged from A on 20150305. I consider this a single hospital stay.

 

I tried the following and achieved what I wanted:

proc sort data=sample out=sample_srt;
by member admission discharge hospital;
run;

data want;
set sample_srt;
by member hospital admission;
if last.admission;
run;

What I don't understand is that if I try the following instead, I get something different.

data want2;
set sample_srt;
by member admission hospital;
if last.admission;
run;

I thought that the order of by group variables doesn't matter, but clearly I was wrong. I'd really appreciate if someone can explain why the order of by variables matters.

 

-AH

 


 

smantha
Lapis Lazuli | Level 10

The problem could be the same person is admitted to two different hospitals on the same day. 

Kurt_Bremser
Super User

@smantha wrote:

The problem could be the same person is admitted to two different hospitals on the same day. 


To handle this situation, you would need datetimes instead of dates, so you could determine the order of admissions within a day.

 

aaronh
Quartz | Level 8

thanks Smantha for pointing that out. Yes, it happens because a patient could be admitted to hospital A, but it is determined that hospital A does not have the care level needed by the patient, so the patient is transferred immediately to a hospital with higher level of care.

smantha
Lapis Lazuli | Level 10
data sample;
input member $ admission $ discharge $ hospital $;
admission_dt=input(admission, yymmdd8.);
discharge_dt=input(discharge, yymmdd8.);
format admission_dt discharge_dt yymmdd10.;
datalines;
Jane 20150101 20150305 A
Jane 20150101 20150103 A
John 20160101 20160105 B
John 20160101 20160101 A
Jane 20190101 20190507 C
;
proc summary data=sample nway missing noprint;
class Member hospital;
var admission_dt discharge_dt;
output min(admission_dt) = max(discharge_dt)= out= want;
run;
Jagadishkatam
Amethyst | Level 16

It is due to the way the data is sorted and location of variables , like in the below example  sorting by and variables order of

member hospital admission will give 4 records due to hospital values distinguish the date 20160106 whereas in second example sorting by and variable order member admission hospital distinguish by admission date and we get 3 records where hospital is ignored as we are considering the admission date.

 

 

 

image.png

Thanks,
Jag
Kurt_Bremser
Super User

The order of variables in the BY statement creates a hierarchy. So BY A B C will create a different order than BY C B A.

In particular, keeping existing orders within a by group is NOT guaranteed if that order is not enforced by adding additional columns to the BY as needed.

In your case, I would sort

by member hospital admission discharge;

and then, in the following data step, use

by member hospital admission;
if last.admission;

so that you get the last discharge date for stays that share the same admission date.

aaronh
Quartz | Level 8

Thank you all for your quick and informative responses to my question! And thank you @Reeza @Jagadishkatam @Kurt_Bremser for pointing out that the by group variables form a hierarchy.

 

I ran the following to visualize of how the by variables behave differently under different hierarchy.

data visualize1;
set sample_srt;
by member hospital admission;
if first.admission then frst_adm = 1;
else frst_adm = 0;
if last.admission then last_adm = 1;
else last_adm = 0;

if first.hospital then frst_pos = 1;
else frst_pos = 0;
if last.hospital then last_pos = 1;
else last_pos = 0;
run;


data visualize2;
set sample_srt;
by member admission hospital;
if first.admission then frst_adm = 1;
else frst_adm = 0;
if last.admission then last_adm = 1;
else last_adm = 0;

if first.hospital then frst_pos = 1;
else frst_pos = 0;
if last.hospital then last_pos = 1;
else last_pos = 0;
run;

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1213 views
  • 7 likes
  • 6 in conversation