BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
allores
Calcite | Level 5

Below is a snap shot of the table, obviously it’s bigger than this, but these are the fields I need to use.  The issue is that lines 3 and 4 are all 1 inpatient stay, with just an interim bill submitted with a 1 day "gap".  Lines 1, 2, 5 are all discrete IP stays.

 

I need to group lines 3 and 4 and assign 06/15/2017 as the IP_START_DATE to both so that a “visit”: would be (123456||06/15/2017||987654) for all lines 3 and 4.  The other part of this is that it is not always just two lines, it could be 3 or more that all combine to a > 90 day stay with interim bills so the lag function doesn't seem to be working for me.

 

Index

Member ID

DOS Start

DOS End

Provider ID

OBS

FK_PTY_ID

MEDI_START_DATE

MEDI_END_DATE

HOSP_CODE

1

12345

02/01/2017

02/01/2017

987654

2

12345

03/01/2017

03/10/2017

987654

3

12345

06/15/2017

06/30/2017

987654

4

12345

07/01/2017

07/10/2017

987654

5

12345

08/20/2017

08/24/2017

987654

 

 

My idea was to make a “sub_visit” of (FK_PTY_ID||HOSP_CODE), sort by that, then list all the MEDI_START_DATE by ascending and get the “gap” from MEDI_END_DATE to MEDI_START_DATE2 (lagged).  If that “gap” is <= 1 then consider it all one visit.  If not, it’s a new visit.  Something like this:

 

Index

Member ID

DOS Start

DOS End

Provider ID

Created

Created

OBS

FK_PTY_ID

MEDI_START_DATE

MEDI_END_DATE

HOSP_CODE

SUB_VISIT

GAP

1

12345

02/01/2017

02/01/2017

987654

12345987654

.

2

12345

03/01/2017

03/10/2017

987654

12345987654

30

3

12345

06/15/2017

06/30/2017

987654

12345987654

97

4

12345

07/01/2017

07/10/2017

987654

12345987654

1

5

12345

08/20/2017

08/24/2017

987654

12345987654

41

 

Then, reading back it will look for a gap <= to 1 and assign the first MEDI_START_DATE within that visit only.  (without reading all the way back to line 1, b/c is was the same member at the same provider but a very different visit months before.)

 

Index

Member ID

DOS Start

DOS End

Provider ID

Created

Created

Created

OBS

FK_PTY_ID

MEDI_START_DATE

MEDI_END_DATE

HOSP_CODE

SUB_VISIT

GAP

IP_START

1

12345

02/01/2017

02/01/2017

987654

12345987654

.

02/01/2017

2

12345

03/01/2017

03/10/2017

987654

12345987654

30

03/01/2017

3

12345

06/15/2017

06/30/2017

987654

12345987654

97

06/15/2017

4

12345

07/01/2017

07/10/2017

987654

12345987654

1

06/15/2015

5

12345

08/20/2017

08/24/2017

987654

12345987654

41

08/20/2017

 

 

I had been trying to use code like this, but it will pnly look back one line.

 

 

PROC SORT DATA=ALLORES.SUB_VISIT  OUT=ALLORES.SUB_VISIT;

      BY SUB_VISIT;

      BY MEDI_START_DATE;

RUN;

 

 

DATA SUB_VISIT;

      SET WORK.SUB_VISIT;

      MEDI_S1= MEDI_START_DATE;

      MEDI_S2 = LAG(MEDI_START_DATE);

      MEDI_E1 = MEDI_END_DATE;

      MEDI_E2 = LAG(MEDI_END_DATE);

      GAP = MEDI_E1 - MEDI_S2;

      IF FK_PTY_ID = LAG(FK_PTY_ID)

            AND HOSP_CODE = LAG(HOSP_CODE)

            AND GAP <= 1

THEN IP_START_DATE = put(MEDI_START_DATE,yymmdd10.);

     ELSE IP_START_DATE = put(lag(MEDI_START_DATE),yymmdd10.);

RUN;

 

Thanks in adance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

A few things stand out about your question.  Areas to learn more: 

 

First, become more comfortable with dates.  When creating IP_START_DATE, there is no need to create it as a character variable.  Keep it as a numeric SAS date, and apply a format to it.

 

Second, as you have seen the LAG function won't handle everything.  Sometimes it won't even give the right answer.  If you want to retrieve the value on the previous observation, LAG has to execute on every observation (outside of any IF/THEN conditions).

 

Third, learning the use of a BY statement in a DATA step is vital.  There's more to learn than can be posted here, but it is vital.

 

All that being said, let's take a look at the problem you posted.  Here is a variation on what you posted that works with the original data set (without adding SUB_VISIT):

 

proc sort data=original_data_set;

   by fk_pty_id hosp_code;

run;

 

The following approach will get you part-way there by assigning observations to a visit, based on the gap.  Other calculations (such as when does the visit begin, when does it end) are not included here:

 

data want;

set original_data_set;

by fk_pty_id hosp_code;

gap = medi_end_date - lag(medi_start_date);

if first.fk_pty_id then visit = 1;

else if first.hosp_code or gap > 1 then visit + 1;

run;

 

Take a look at just that much, and see how close it gets you to where you need to go with this.

View solution in original post

2 REPLIES 2
Astounding
PROC Star

A few things stand out about your question.  Areas to learn more: 

 

First, become more comfortable with dates.  When creating IP_START_DATE, there is no need to create it as a character variable.  Keep it as a numeric SAS date, and apply a format to it.

 

Second, as you have seen the LAG function won't handle everything.  Sometimes it won't even give the right answer.  If you want to retrieve the value on the previous observation, LAG has to execute on every observation (outside of any IF/THEN conditions).

 

Third, learning the use of a BY statement in a DATA step is vital.  There's more to learn than can be posted here, but it is vital.

 

All that being said, let's take a look at the problem you posted.  Here is a variation on what you posted that works with the original data set (without adding SUB_VISIT):

 

proc sort data=original_data_set;

   by fk_pty_id hosp_code;

run;

 

The following approach will get you part-way there by assigning observations to a visit, based on the gap.  Other calculations (such as when does the visit begin, when does it end) are not included here:

 

data want;

set original_data_set;

by fk_pty_id hosp_code;

gap = medi_end_date - lag(medi_start_date);

if first.fk_pty_id then visit = 1;

else if first.hosp_code or gap > 1 then visit + 1;

run;

 

Take a look at just that much, and see how close it gets you to where you need to go with this.

allores
Calcite | Level 5
Thank You! This worked as a step to what I needed to get to. I switched gap to = medi_start - lag(medi_end). I appreciate the help and I am also going to learn more about the 3 bullets in your response.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 636 views
  • 0 likes
  • 2 in conversation