BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
iroy
Calcite | Level 5

Hi I have a dataset that looks something like this:

data test;
input EP_ID $ SOC :mmddyy10. ROC :mmddyy10. ASSMT_DT :mmddyy10. ASSMT_RSN $;
format SOC ROC ASSMT_DT date9.;
datalines;
1 08/10/2016 . 08/10/2016 01
1 08/10/2016 . 08/18/2016 09
1 08/10/2016 . 10/08/2016 04
1 08/10/2016 . 11/25/2016 06
1 08/10/2016 . 12/06/2016 08
2 01/05/2017 . 01/05/2017 01
2 01/05/2017 . 01/15/2017 06
2 01/05/2017 01/19/2017 01/19/2017 03
2 01/05/2017 01/19/2017 01/30/2017 09
3 02/15/2017 . 02/15/2017 01
3 02/15/2017 . 02/25/2017 07
4 03/14/2017 . 03/14/2017 01
4 03/14/2017 . 03/18/2017 04
4 03/14/2017 . 03/26/2017 06
4 03/14/2017 04/02/2017 04/02/2017 03
4 03/14/2017 04/02/2017 04/12/2017 05
4 03/14/2017 04/02/2017 04/22/2017 06
4 03/14/2017 04/02/2017 04/28/2017 08
;

 

In my data set, I want to create a start and stop indicator using values of assmt_rsn. If assmt_rsn takes the values '06', '07', '08' or 09 then I want to stop at that line and count that date as the end of stay within that episode. Every stop of episode , should have a start denoted by assmt_rsn taking the value 01 or 03. In some cases when a 01 or 03 is missing before the stop, a 04 or 05 is considered as a start of stay. So, I finally want to achieve something like this:

 

ep_id  ep_stay_id        SOC             ROC           ep_stay_strt_dt   ep_stay_start_rsn   ep_stay_end_dt   ep_stay_end_rsn

1                  1             08/10/2016         .                 08/10/2016                     01                     08/18/2016                       09

1                  2            08/10/2016         .                 10/08/2016                     04                     11/25/2016                        06 

1                  3            08/10/2016         .                           .                                  .                       12/06/2016                       08

2                 1             01/05/2017         .                  01/05/2017                     01                      01/15/2017                        06

2                 2            01/05/2017    01/19/2017    01/19/2017                     03                     01/30/2017                        09

3                 1             02/15/2017          .                 02/15/2017                      01                     02/25/2017                        07

4                 1             03/14/2017          .                 03/14/2017                      01                     03/26/2017                       06

4                 2            03/14/2017   04/02/2017   04/02/2017                    03                    04/22/2017                       06

4                 3             03/14/2017          .                          .                                    .                      04/28/2017                       08

 

How can I achieve this?

1 ACCEPTED SOLUTION

Accepted Solutions
whymath
Lapis Lazuli | Level 10

You can use a sequence variable to help yourself:

data mid1;
  set test;
  by EP_ID;

  if first.EP_ID then seq=1;
  output;
  if first.EP_ID=0 and ASSMT_RSN in('06','07','08','09') then seq+1;
run;

data want;
  set mid1;
  by EP_ID seq;

  retain ep_stay_strt_dt ep_stay_start_rsn;
  if first.seq then do;
    ep_stay_strt_dt=ASSMT_DT;
    ep_stay_start_rsn=ASSMT_RSN;
  end;
  if last.seq then do;
    if first.seq then call missing(ep_stay_strt_dt,ep_stay_start_rsn);
    ep_stay_end_dt=ASSMT_DT;
    ep_stay_end_rsn=ASSMT_RSN;
    output;
  end;
  format ep_stay_end_dt ep_stay_strt_dt mmddyys10.;
run;

View solution in original post

5 REPLIES 5
whymath
Lapis Lazuli | Level 10

You can use a sequence variable to help yourself:

data mid1;
  set test;
  by EP_ID;

  if first.EP_ID then seq=1;
  output;
  if first.EP_ID=0 and ASSMT_RSN in('06','07','08','09') then seq+1;
run;

data want;
  set mid1;
  by EP_ID seq;

  retain ep_stay_strt_dt ep_stay_start_rsn;
  if first.seq then do;
    ep_stay_strt_dt=ASSMT_DT;
    ep_stay_start_rsn=ASSMT_RSN;
  end;
  if last.seq then do;
    if first.seq then call missing(ep_stay_strt_dt,ep_stay_start_rsn);
    ep_stay_end_dt=ASSMT_DT;
    ep_stay_end_rsn=ASSMT_RSN;
    output;
  end;
  format ep_stay_end_dt ep_stay_strt_dt mmddyys10.;
run;
iroy
Calcite | Level 5

Thank you, your code worked very well!

mkeintz
PROC Star

I took me a while to grasp what you want:

 

data test;
input EP_ID $ SOC :mmddyy10. ROC :mmddyy10. ASSMT_DT :mmddyy10. ASSMT_RSN $2.;
format SOC ROC ASSMT_DT date9.;
datalines;
1 08/10/2016 .          08/10/2016 01
1 08/10/2016 .          08/18/2016 09
1 08/10/2016 .          10/08/2016 04
1 08/10/2016 .          11/25/2016 06
1 08/10/2016 .          12/06/2016 08
2 01/05/2017 .          01/05/2017 01
2 01/05/2017 .          01/15/2017 06
2 01/05/2017 01/19/2017 01/19/2017 03
2 01/05/2017 01/19/2017 01/30/2017 09
3 02/15/2017 .          02/15/2017 01
3 02/15/2017 .          02/25/2017 07
4 03/14/2017 .          03/14/2017 01
4 03/14/2017 .          03/18/2017 04
4 03/14/2017 .          03/26/2017 06
4 03/14/2017 04/02/2017 04/02/2017 03
4 03/14/2017 04/02/2017 04/12/2017 05
4 03/14/2017 04/02/2017 04/22/2017 06
4 03/14/2017 04/02/2017 04/28/2017 08
run;

data episode (drop=i assmt_dt assmt_rsn);
  do i=1 by 1 until (assmt_rsn in ('06','07','08','09') or last.ep_id=1);
    set test;
    by ep_id;
    if first.ep_id then ep_stay_id=1;
    else if i=1 then ep_stay_id+1;

    if i=1 and not (assmt_rsn in ('06','07','08','09')) then do;
      ep_stay_start_dt=assmt_dt;
      ep_stay_start_rsn=assmt_rsn;
    end;
  end;
  ep_stay_end_dt=assmt_dt ;
  ep_stay_end_rsn=assmt_rsn;
  format ep_stay_start_dt mmddyy10. ep_stay_start_rsn $2.
         ep_stay_end_dt mmddyy10.   ep_stay_end_rsn $2. ;
run;

In your last observation in the desired output, you have ROC as missing, but I think your logic requires it to be 4/2/2017.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

This can also be done without embedding the SET statement in a DO loop:

 

data episode (drop=assmt_:);
  set test;
  by ep_id;
  if first.ep_id then ep_stay_id=.;

  format ep_stay_start_dt mmddyy10. ep_stay_start_rsn $2.
         ep_stay_end_dt mmddyy10.   ep_stay_end_rsn $2. ;
  retain ep_stay_start: ;

  if first.ep_id=1 or lag(assmt_rsn) in ('06','07','08','09') then do;
    ep_stay_id+1;
    ep_stay_start_dt=assmt_dt;
    ep_stay_start_rsn=assmt_rsn;
    if assmt_rsn in ('06','07','08','09') then call missing(of ep_stay_start:);
  end;

  if (assmt_rsn in ('06','07','08','09') or last.ep_id);

  ep_stay_end_dt=assmt_dt ;
  ep_stay_end_rsn=assmt_rsn;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
iroy
Calcite | Level 5

My apologies if my question was unclear. 

Thank you, both your code suggestions worked very well. And you are correct, I made a mistake. In my desired output the ROC on the last observation should be 4/2/2017.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1380 views
  • 0 likes
  • 3 in conversation