BookmarkSubscribeRSS Feed
Jse
Fluorite | Level 6 Jse
Fluorite | Level 6

Data Example:

                                

MemberID

Admit_date

      cost

Group

1111

1/1/10

3

Unavoidable

1111

1/1/10

2

Avoidable

1111

1/2/10

4

Unavoidable

2222

1/20/10

1

Avoidable

Would like to group by MemberID and consecutive admit dates in to order sum cost, and count the number of admit ‘episodes’ to the ER as well as the length of the ER episode span (unique count of admit days in that span). 

  • Defining an ER  'episode' by all the claims with consecutive admit dates for a member id
  • Going from many claim lines to one line per ER ‘episode’. 
  • Only have a single date variable to work with (no discharge date). 

Deisred Output:

MemberID

Admit_date

Episode cost     (sum of all claims in consecutive days)

Group

Episode count

Episode Span (unique count of admitdt for consecutive claims)

1111

1/1/10

9

Unavoidable

1

2

2222

1/20/10

1

Avoidable

1

1

I made a little headway with the information on this thread:


I am having trouble modifying to meet my needs.  Thank you for your time.  Any help is greatly appreciated. 

10 REPLIES 10
morgalr
Obsidian | Level 7

For needs like you describe I tend to use Proc SQL. The following will create a table, populate it, and give you the described result using Proc SQL.

proc sql;
create table myStuff(MemberID Char(4),
                                 Admit_Date Date,
                                 cost int,
                                 Group char(16));

  insert into myStuff values('1111', '01-JAN-2010'd, 3, 'Unavoidable');
  insert into myStuff values('1111', '01-JAN-2010'd, 2, 'Avoidable');
  insert into myStuff values('1111', '02-JAN-2010'd, 4, 'Unavoidable');
  insert into myStuff values('2222', '20-JAN-2010'd, 1, 'Avoidable');

  select MemberID, Min(Admit_Date) as Admit_Date, sum(cost) as Episode_Cost, 1 as Episode_Count, Count(Distinct(Admit_Date)) as Episode_Span from MyStuff group by MemberID;
quit;

If you want to have your answer by MemberID and Admit_Date, then just add Admit_Date to the "group by" clause: "group by MemberID, Admit_Date".

Jse
Fluorite | Level 6 Jse
Fluorite | Level 6

Thank you for the response.  I dont see how this is grouping by consecutive dates for each member ID.  Members will have multiple spans throughout the set.    Using min(admit_date) will only give me the earliest admit_date for each member. 

Using proc sql, I would want to group by Member ID and consecutive admit dates  to allow for multiple spans for each member (I think...).

Thank you for taking a look at my problem.

Jesse

Reeza
Super User

How did you calculate the group?

You probably need to provide more sample input and output data.

Collapsing data of this form isn't always a trivial process so you might not get a full solution on here.

Jse
Fluorite | Level 6 Jse
Fluorite | Level 6

Group is calculated by comparing all diagnoses on the claim line to a list of diagnoses related to avoidable ER visits.  If the claim line only contains avoidable diagnoses then the claim is flagged as avoidable.  If there are any other diagnoses (not in the avoidable diagnosis list)  present on the claim then it is flagged unavoidable.There are multiple claims per ER 'episode'.  If any of the claim lines in the episode span are flagged as unavoidable then the ER 'episode' is flagged as unavoidable.

I didn't mention this group flagging issue as I am still working on the roll up for ER episode per member.  I will add more input and desired output data.

Yes, this is certainly not trivial for me, so I came to the experts for advice.  Thank you for your response.

Jesse

morgalr
Obsidian | Level 7

Jesse,

Thank you for further defining your requirements, you need to move back to a dataset ordered by MemberID and Admit_Date. Use a manual "output" that is triggered by your MemberID or your Admit_Date changing. Each iteration you have to "retain" your calculation values and only keep the calculated values for your output.

Are you familiar with using "output" with a dataset?

Les

morgalr
Obsidian | Level 7


Jse,

You are correct, it does not group by Member ID and Date, but then neither does your example output data. That is why I add the comment:

If you want to have your answer by MemberID and Admit_Date, then just add Admit_Date to the "group by" clause: "group by MemberID, Admit_Date".

This will give:

   select MemberID, Admit_Date, sum(cost) as Episode_Cost, 1 as Episode_Count, Count(Distinct(Admit_Date)) as Episode_Span from MyStuff group by MemberID, Admit_date;

And if you want to be assured sequential MemberID's and Admit_Date's you need an "order by" clause" which will give:

select MemberID, Admit_Date, sum(cost) as Episode_Cost, 1 as Episode_Count, Count(Distinct(Admit_Date)) as Episode_Span from MyStuff group by MemberID, Admit_date order by MemberID, Admit_date;

Which I believe is what you asked for, but did not show, in the first place.

I am still at a loss as to what you want to do with your group field: in your sample output it is taken from the first obs of the group by MemberID--in both cases shown, it has been ignored.

Haikuo
Onyx | Level 15

More business rules need to be revealed.

How do you select group "Unavoidable" for "1111' in stead of "avoidable", where they have identical memberid and admit_date?

Haikuo
Onyx | Level 15

if take the group(unavailable/available) out of the picture, then the following may get you started:

data have;

     input MemberID Admit_date :mmddyy10.   cost Group:$20.;

     format admit_date mmddyy10.;

     cards;

1111 1/1/10 3 Unavoidable

1111 1/1/10 2 Avoidable

1111 1/2/10 4 Unavoidable

1111 1/7/10 8 Unavoidable

2222 1/20/10 1 Avoidable

;

/*simple dif()+simple SQL */

data have_dif;

     set have;

     by memberid admit_date;

     if first.memberid then

           grp=1;

     else grp+(dif(admit_date)>1);

run;

proc sql;

     create table want_1 as

           select distinct MemberID, Admit_date , sum(cost) as cost , range(admit_date)+1 as Episode_span, grp

                from have_dif

                     group by MemberID,  grp

                           having Admit_date = min(Admit_date)

     ;

quit;

seeing attempt of providing a pure SQL approach, frankly, SQL is not a sequential process, so there is a risk of  overstretching if you try to implement SQL for something that it is NOT good at. As you can see from the following, it is kinda strenuous, not only for the coding, but also for the performance. So the following approach is not recommended and only has 'research' value.

/*Complex Proc SQl*/

proc sql;

     create table want_2 as

           select distinct MemberID, Admit_date , sum(cost) as cost , range(admit_date)+1 as Episode_span ,

                1+(not exists(select * from have where a.memberid=memberid and admit_date ne a.admit_date and admit_date between a.admit_date-1 and a.admit_date+1)) as grp

           from have a

                group by MemberID, calculated grp

                     having Admit_date = min(Admit_date)

     ;

quit;

Ksharp
Super User

As you said : "Defining an ER  'episode' by all the claims with consecutive admit dates for a member id"

Every consecutive admit dates is a group, then Episode_count should always be 1. right ?

data have;
     input MemberID Admit_date :mmddyy10.   cost Group:$20.;
     format admit_date mmddyy10.;
     cards;
1111 1/1/10 3 Unavoidable
1111 1/1/10 2 Avoidable
1111 1/2/10 4 Unavoidable
1111 1/7/10 8 Unavoidable
2222 1/20/10 1 Avoidable
;
run;
data temp;
 set have;
 if MemberID ne lag(MemberID) or dif(Admit_date) gt 1 then n+1;
run;
proc summary data=temp;
by n;
var cost  Admit_date;
output out=summary(drop=_:) sum(cost)=Episode_cost range(Admit_date)=Episode_span;
format Admit_date best8.;
run;
data want;
 merge temp summary;
 by n;
 Episode_span=Episode_span+1;
 if first.n;
 drop n cost;
run;

Xia Keshan

Jse
Fluorite | Level 6 Jse
Fluorite | Level 6

Thank you all so much for the help.  I was able to get it working with this bit of code from Reeza found on a previous thread:

Data Test_ER_2;

set avoid1;

by memid admitdt;

retain continuous_group 1;

prev_date=lag(admitdt);

if first.memid then do;

  prev_date=.;

  continuous_group=1;

end;

format prev_date date9.;

diff=intck('day',prev_date,admitdt);

if intck('day', prev_date, admitdt)>1 then continuous_group + 1;

run;

Assigned a continuous_group number for each span for each memberID.  Was then able to aggregate using member ID and continuous_group.  I am going to try it with the code provided above as well.  Thank you everyone again for the help Smiley Happy

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 3387 views
  • 6 likes
  • 5 in conversation