BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas_student1
Quartz | Level 8

 

Hello,

I have a question on how to consolidate a row of billing data with an admission and discharge date and to get the last row by client.

I have a dataset (dataline below) where I have a client ID, an admit date and discharge date.

Now the admit date and discharge dates can overlap, in some cases the admit date can be the same for a client (like client D admit date of 7/27/15) but the discharge dates are different.

What I want to do is if the difference between the admit date and the previous discharge date is more than 30 then to consolidate the first admit date and the last discharge date before the difference between the two subsequent row data is more than 30 days.

 

Its a bit tough to explain but I hope that the code below and the following tables will help.

First off the rownum variable I have is to help explain which rows I am expecting to get at the end.

 

DATA test;
     INPUT @1 rownum @5 Client $ @9 oadmit_dt mmddyy10. @21 odsch_dt mmddyy10.;
	 FORMAT oadmit_dt mmddyy10. odsch_dt mmddyy10.;
	 DATALINES;
1	A	10/16/2017	10/16/2017
2	B	1/28/2016	1/28/2016
3	B	1/29/2016	1/29/2016
4	B	2/24/2016	2/24/2016
5	B	2/25/2016	2/25/2016
6	B	3/3/2016	3/3/2016
7	B	3/23/2016	3/23/2016
8	B	3/24/2016	3/24/2016
9	B	4/20/2016	4/20/2016
10	B	5/4/2016	5/4/2016
11	B	5/5/2016	5/5/2016
12	B	5/12/2016	5/12/2016
13	B	5/19/2016	5/19/2016
14	B	6/16/2016	6/16/2016
15	B	6/23/2016	6/23/2016
16	B	6/30/2016	6/30/2016
17	B	8/4/2016	8/31/2016
18	B	8/11/2016	8/31/2016
19	B	8/18/2016	8/31/2016
20	B	9/1/2016	9/27/2016
21	B	9/8/2016	9/27/2016
22	B	9/14/2016	9/14/2016
23	C	1/5/2016	1/5/2016
24	C	1/5/2016	1/5/2016
25	D	7/27/2015	3/4/2016
26	D	7/27/2015	3/7/2016
27	D	7/27/2015	3/29/2016
28	D	7/27/2015	4/14/2016
29	D	7/27/2015	4/20/2016
30	D	7/27/2015	5/3/2016
31	D	7/27/2015	5/5/2016
32	D	7/27/2015	5/12/2016
33	D	7/27/2015	5/20/2016
34	D	7/27/2015	5/26/2016
35	D	7/27/2015	7/31/2018
36	D	1/5/2016	1/5/2016
37	D	2/17/2016	2/17/2016
38	D	2/22/2016	2/22/2016
39	D	2/25/2016	2/25/2016
40	D	2/29/2016	2/29/2016
41	D	8/10/2018	8/10/2018
42	D	9/12/2018	9/12/2018
43	D	9/21/2018	9/21/2018
44	D	10/2/2018	10/2/2018
45	D	10/18/2018	10/18/2018
46	D	11/5/2018	11/5/2018
47	D	11/6/2018	11/6/2018
48	D	11/13/2018	11/13/2018
49	D	11/26/2018	11/26/2018
50	D	12/17/2018	12/17/2018

	 ;
RUN;

 

I run the following code, to try to calculate the gaps between the admit date of one row to the discharge date of the row before. This way if there is a gap of more than 30 days then the previous row should be output into a dataset. Furthermore I get the first admit date (i.e start_dt below) and the last discharge date (ie. end_dt in the code below) in the output dataset.

 

 

data test2;
  set test;
    by client oadmit_dt odsch_dt;
            retain dsch_dt2;
              if first.client   then do;
              dsch_dt2=odsch_dt;
              gap=0;
              end;
            else do;
              gap=oadmit_dt-dsch_dt2;
     dsch_dt2=odsch_dt;
            end;
            format dsch_dt2 mmddyy10.;
run;

/*connect nested/continuous pieces*/
data test3;
  count=0;
   do until(last.client);
set test2;
    by client;
            if gap>30  or first.client then do;
              if count>0 then output;
count=0;
     start_dt=oadmit_dt;
            end;
            end_dt=odsch_dt;
count+1;
end;
output;
format start_dt end_dt mmddyy10.;
run;

 

 

What I expect to get from the code above is this (note the rownum to help indicate which row I am expecting)

 

rownumclientstart dateend date
1A10/16/201710/16/2017
16B1/28/20166/30/2016
22B8/4/20169/14/2016
24C1/5/20161/5/2016
34D7/27/20155/26/2016
35D7/27/20157/31/2018
36D1/5/20161/5/2016
40D2/17/20192/29/2016
41D8/10/20188/10/2018
50D9/12/201812/17/2018

 

 

But what I get is this, again note that the row numbers are not the same.

Now the dates seem to be fine (expect for the 2 rows for client D), but its not grabbing the correct row.

Any ideas on what I am missing in my code above?

 

rownumclientstart_dtend_dt
1A10/16/201710/16/2017
17B1/28/20166/30/2016
22B8/4/20169/14/2016
24C1/5/20161/5/2016
37D7/27/20151/5/2016
41D2/17/20162/29/2016
42D8/10/20188/10/2018
50D9/12/201812/17/2018

 

Thank you!

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

easy solution @sas_student1 

 



DATA test;
     INPUT @1 rownum @5 Client $ @9 oadmit_dt mmddyy10. @21 odsch_dt mmddyy10.;
	 FORMAT oadmit_dt mmddyy10. odsch_dt mmddyy10.;
	 DATALINES;
1	A	10/16/2017	10/16/2017
2	B	1/28/2016	1/28/2016
3	B	1/29/2016	1/29/2016
4	B	2/24/2016	2/24/2016
5	B	2/25/2016	2/25/2016
6	B	3/3/2016	3/3/2016
7	B	3/23/2016	3/23/2016
8	B	3/24/2016	3/24/2016
9	B	4/20/2016	4/20/2016
10	B	5/4/2016	5/4/2016
11	B	5/5/2016	5/5/2016
12	B	5/12/2016	5/12/2016
13	B	5/19/2016	5/19/2016
14	B	6/16/2016	6/16/2016
15	B	6/23/2016	6/23/2016
16	B	6/30/2016	6/30/2016
17	B	8/4/2016	8/31/2016
18	B	8/11/2016	8/31/2016
19	B	8/18/2016	8/31/2016
20	B	9/1/2016	9/27/2016
21	B	9/8/2016	9/27/2016
22	B	9/14/2016	9/14/2016
23	C	1/5/2016	1/5/2016
24	C	1/5/2016	1/5/2016
25	D	7/27/2015	3/4/2016
26	D	7/27/2015	3/7/2016
27	D	7/27/2015	3/29/2016
28	D	7/27/2015	4/14/2016
29	D	7/27/2015	4/20/2016
30	D	7/27/2015	5/3/2016
31	D	7/27/2015	5/5/2016
32	D	7/27/2015	5/12/2016
33	D	7/27/2015	5/20/2016
34	D	7/27/2015	5/26/2016
35	D	7/27/2015	7/31/2018
36	D	1/5/2016	1/5/2016
37	D	2/17/2016	2/17/2016
38	D	2/22/2016	2/22/2016
39	D	2/25/2016	2/25/2016
40	D	2/29/2016	2/29/2016
41	D	8/10/2018	8/10/2018
42	D	9/12/2018	9/12/2018
43	D	9/21/2018	9/21/2018
44	D	10/2/2018	10/2/2018
45	D	10/18/2018	10/18/2018
46	D	11/5/2018	11/5/2018
47	D	11/6/2018	11/6/2018
48	D	11/13/2018	11/13/2018
49	D	11/26/2018	11/26/2018
50	D	12/17/2018	12/17/2018
;
RUN;
data temp;
set test;
by client;
retain f;
k=lag(odsch_dt);
if first.client then f=1;
else do;
d=oadmit_dt-k;
if d>30 then f+1;
end;
drop k d;
run;

proc sql;
create table want as 
select client,min(oadmit_dt) as oadmit_dt format=mmddyy10.,max(odsch_dt) as odsch_dt format=mmddyy10.
from temp
group by client, f;
quit;

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

easy solution @sas_student1 

 



DATA test;
     INPUT @1 rownum @5 Client $ @9 oadmit_dt mmddyy10. @21 odsch_dt mmddyy10.;
	 FORMAT oadmit_dt mmddyy10. odsch_dt mmddyy10.;
	 DATALINES;
1	A	10/16/2017	10/16/2017
2	B	1/28/2016	1/28/2016
3	B	1/29/2016	1/29/2016
4	B	2/24/2016	2/24/2016
5	B	2/25/2016	2/25/2016
6	B	3/3/2016	3/3/2016
7	B	3/23/2016	3/23/2016
8	B	3/24/2016	3/24/2016
9	B	4/20/2016	4/20/2016
10	B	5/4/2016	5/4/2016
11	B	5/5/2016	5/5/2016
12	B	5/12/2016	5/12/2016
13	B	5/19/2016	5/19/2016
14	B	6/16/2016	6/16/2016
15	B	6/23/2016	6/23/2016
16	B	6/30/2016	6/30/2016
17	B	8/4/2016	8/31/2016
18	B	8/11/2016	8/31/2016
19	B	8/18/2016	8/31/2016
20	B	9/1/2016	9/27/2016
21	B	9/8/2016	9/27/2016
22	B	9/14/2016	9/14/2016
23	C	1/5/2016	1/5/2016
24	C	1/5/2016	1/5/2016
25	D	7/27/2015	3/4/2016
26	D	7/27/2015	3/7/2016
27	D	7/27/2015	3/29/2016
28	D	7/27/2015	4/14/2016
29	D	7/27/2015	4/20/2016
30	D	7/27/2015	5/3/2016
31	D	7/27/2015	5/5/2016
32	D	7/27/2015	5/12/2016
33	D	7/27/2015	5/20/2016
34	D	7/27/2015	5/26/2016
35	D	7/27/2015	7/31/2018
36	D	1/5/2016	1/5/2016
37	D	2/17/2016	2/17/2016
38	D	2/22/2016	2/22/2016
39	D	2/25/2016	2/25/2016
40	D	2/29/2016	2/29/2016
41	D	8/10/2018	8/10/2018
42	D	9/12/2018	9/12/2018
43	D	9/21/2018	9/21/2018
44	D	10/2/2018	10/2/2018
45	D	10/18/2018	10/18/2018
46	D	11/5/2018	11/5/2018
47	D	11/6/2018	11/6/2018
48	D	11/13/2018	11/13/2018
49	D	11/26/2018	11/26/2018
50	D	12/17/2018	12/17/2018
;
RUN;
data temp;
set test;
by client;
retain f;
k=lag(odsch_dt);
if first.client then f=1;
else do;
d=oadmit_dt-k;
if d>30 then f+1;
end;
drop k d;
run;

proc sql;
create table want as 
select client,min(oadmit_dt) as oadmit_dt format=mmddyy10.,max(odsch_dt) as odsch_dt format=mmddyy10.
from temp
group by client, f;
quit;
sas_student1
Quartz | Level 8

@novinosrin  Thank you! this is great I will try it. Quick question: How do I keep the rownum column as well.

Asking as I would then want to join the dataset created here with another dataset that has the rownum so I can grab the proper items from that table too.

 

Thank you!

 

andreas_lds
Jade | Level 19

Add the variable to the select-statement.

novinosrin
Tourmaline | Level 20

Hi @sas_student1   Good morning

Quick question: How do I keep the rownum column as well.

 

For this, I want to ask why would you need a rownum in a summarized aka in this case so called collapsed records in the first place. So, obviously when the startdate happens to be from a row different enddate , what rownum you would want, logically speaking?

 

Asking as I would then want to join the dataset created here with another dataset that has the rownum so I can grab the proper items from that table too."

 

Perhaps, There is some kinda design improvement needed as a neat dimensional model would seldom require a join based on rownum and rather would most likely be based on keys.. Either way, you could post us a what your bigger picture and the objective.

 

 

sas_student1
Quartz | Level 8

Hello @novinosrin !

 

First thank you for your quick reply.

To answer  your question "why would you need a rownum in a summarized aka in this case so called collapsed records in the first place. So, obviously when the startdate happens to be from a row different enddate , what rownum you would want, logically speaking?". Actually I need to grab the first and last date in a collapsed record but then I also need to figure out what the pattern of treatment a client received. So if on one day they got a hostpital vist, and the next day they were in a clinic, and the third day they visited a radiologist then I would create a pattern that would show hospital-clinic-radiology.

Have created this pattern in the same database using a retain and a last. so the last row would have the full pattern. As such I would want to bring in those columns by matching to row.

But I was able to add another proc sql after your suggested min max SQL to join on client ID and discharge date to get the rownum.

 

So it worked out!

 

Thank you very much for helping me out!

 

novinosrin
Tourmaline | Level 20

Thank you @sas_student1  for the response and details.  We are glad that you made it intuitive and interactive. For us, that's what keeps us continuously interested to be part of offering solutions on the community. Have a good one!

sas_student1
Quartz | Level 8

Thank you @novinosrin for your helpful reply!!!

This community is great!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 880 views
  • 1 like
  • 3 in conversation