I have a raw data and not sure how to flag if service_from_date is not a continuous month.
for example, from observation1-3, need flag=1; from observation 4-5, need flag=2; from observation 6-13, need flag=3.
observation 1-5 has a same provider, but there is a gap (not continuous) between observation 3 and 4.
MEMBER_ID | SERVICE_FROM_DATE | SERVICE_THROUGH_DATE | SERVICE_PROVIDER_ID |
ABC | 01/05/2017 | 01/05/2017 | 64058469 |
ABC | 02/24/2017 | 02/24/2017 | 64058469 |
ABC | 03/08/2017 | 03/08/2017 | 64058469 |
ABC | 07/10/2017 | 07/10/2017 | 64058469 |
ABC | 08/21/2017 | 08/21/2017 | 64058469 |
ABC | 03/25/2017 | 03/25/2017 | 64089900 |
ABC | 04/30/2017 | 04/30/2017 | 64089900 |
ABC | 05/28/2017 | 05/28/2017 | 64089900 |
ABC | 06/25/2017 | 06/25/2017 | 64089900 |
ABC | 07/23/2017 | 07/23/2017 | 64089900 |
ABC | 08/20/2017 | 08/20/2017 | 64089900 |
ABC | 09/24/2017 | 09/24/2017 | 64089900 |
ABC | 10/22/2017 | 10/22/2017 | 64089900 |
wanted results:
MEMBER_ID | START_DT | END__DT | SERVICE_PROVIDER_ID |
ABC | 01/05/2017 | 03/08/2017 | 64058469 |
ABC | 07/10/2017 | 08/21/2017 | 64058469 |
ABC | 03/25/2017 | 10/22/2017 | 64089900 |
is your definition of "continuous" that the month of service_from_date is exactly one month after the service_through_date from the previous record when the member_id and service_provider_id are the same?
Since your results do not show any flag values it looks like you want to collapse the from and to period, correct?
data have;
input MEMBER_ID $ SERVICE_FROM_DATE :mmddyy10. SERVICE_THROUGH_DATE :mmddyy10. SERVICE_PROVIDER_ID $;
format SERVICE_FROM_DATE SERVICE_THROUGH_DATE mmddyy10.;
datalines;
ABC 01/05/2017 01/05/2017 64058469
ABC 02/24/2017 02/24/2017 64058469
ABC 03/08/2017 03/08/2017 64058469
ABC 07/10/2017 07/10/2017 64058469
ABC 08/21/2017 08/21/2017 64058469
ABC 03/25/2017 03/25/2017 64089900
ABC 04/30/2017 04/30/2017 64089900
ABC 05/28/2017 05/28/2017 64089900
ABC 06/25/2017 06/25/2017 64089900
ABC 07/23/2017 07/23/2017 64089900
ABC 08/20/2017 08/20/2017 64089900
ABC 09/24/2017 09/24/2017 64089900
ABC 10/22/2017 10/22/2017 64089900
;
data want;
do n=1 by 1 until(last.member_id);
set have;
by MEMBER_ID SERVICE_FROM_DATE notsorted;
if n=1 then _start_DATE=SERVICE_FROM_DATE;
array temp(100);
array _temp2(100);
k=month(SERVICE_FROM_DATE);
retain grp 1;
k1=dif(k);
if k1 not in (1,.) then
do;
grp+1;
temp(n-1)=_start_DATE;
_start_DATE=SERVICE_FROM_DATE;
_temp2(n)=n-1;
end;
*output;
end;
do n1= 1 by 1 until(last.member_id);
set have;
by MEMBER_ID SERVICE_FROM_DATE notsorted;
if n1 in _temp2 then
do;
start_date=temp(n1);
output;
end;
end;
if last.member_id then do;
start_date=_start_DATE;
output;
end;
format start_date _start_date mmddyy10.;
drop n temp: _temp: k: grp n:;
run;
sorry for lazy and unclean code, but you can. i'll see if i can clean after lunch. I am hungry
Better version:
NOTES: The code may require tweaking a bit as the service provider changes for the same member id. I trust if you are capable of maintaining this code, you certainly can tweak on your own. Nevertheless, should you face any discrepancies while testing, let me know. And of course, I am sure some super user will give a far better solution than this
data have;
input MEMBER_ID $ SERVICE_FROM_DATE :mmddyy10. SERVICE_THROUGH_DATE :mmddyy10. SERVICE_PROVIDER_ID $;
format SERVICE_FROM_DATE SERVICE_THROUGH_DATE mmddyy10.;
datalines;
ABC 01/05/2017 01/05/2017 64058469
ABC 02/24/2017 02/24/2017 64058469
ABC 03/08/2017 03/08/2017 64058469
ABC 07/10/2017 07/10/2017 64058469
ABC 08/21/2017 08/21/2017 64058469
ABC 03/25/2017 03/25/2017 64089900
ABC 04/30/2017 04/30/2017 64089900
ABC 05/28/2017 05/28/2017 64089900
ABC 06/25/2017 06/25/2017 64089900
ABC 07/23/2017 07/23/2017 64089900
ABC 08/20/2017 08/20/2017 64089900
ABC 09/24/2017 09/24/2017 64089900
ABC 10/22/2017 10/22/2017 64089900
;
data want(rename=(SERVICE_THROUGH_DATE=end_date));
retain MEMBER_ID start_date;
do _n=1 by 1 until(last.member_id);
set have;
by MEMBER_ID SERVICE_FROM_DATE notsorted;
if _n=1 then _start_DATE=SERVICE_FROM_DATE;
array temp(100) _temporary_;
array _temp2(100) _temporary_;
_k=month(SERVICE_FROM_DATE);
retain _grp 1;
_k1=dif(_k);
if _k1 not in (1,.) then
do;
_grp+1;
temp(_n-1)=_start_DATE;
_start_DATE=SERVICE_FROM_DATE;
_temp2(_n)=_n-1;
end;
end;
do _n_= 1 by 1 until(last.member_id);
set have;
by MEMBER_ID SERVICE_FROM_DATE notsorted;
if _n_ in _temp2 then
do;
start_date=temp(_n_);
output;
end;
if last.member_id then
do;
start_date=_start_DATE;
output;
end;
end;
format start_date mmddyy10.;
drop _: SERVICE_FROM_DATE;
run;
@novinosrin "I am hungry".. gotta love it 😄
@PeterClemmensen Yep indeed loved it . I can't miss the amazing subway's sub of the day oven roasted chicken topping with mayonnaise and hot sauce lol
If you have SAS/ETS do something like this, since the creation of START_DT and END_DT really only depends on looking 1 obs back and 1 ahead 🙂
proc expand data=have out=have_ll method=none;
by SERVICE_PROVIDER_ID;
id SERVICE_FROM_DATE;
convert SERVICE_FROM_DATE=lead_SERVICE_FROM_DATE / transformout=(lead 1);
convert SERVICE_FROM_DATE=lag_SERVICE_FROM_DATE / transformout=(lag 1);
run;
data want(keep=MEMBER_ID START_DT END_DT SERVICE_PROVIDER_ID);
format MEMBER_ID START_DT END_DT SERVICE_PROVIDER_ID;
set have_ll;
by SERVICE_PROVIDER_ID;
monthback=month(SERVICE_FROM_DATE)-month(lag_SERVICE_FROM_DATE);
monthforward=month(lead_SERVICE_FROM_DATE)-month(SERVICE_FROM_DATE);
if first.SERVICE_PROVIDER_ID then START_DT=SERVICE_FROM_DATE;
else if (monthback=1 and monthforward>1) or last.SERVICE_PROVIDER_ID then do;
END_DT=SERVICE_THROUGH_DATE;
output;
START_DT=lead_SERVICE_FROM_DATE;
end;
retain START_DT;format START_DT END_DT mmddyy10.;
run;
I think that the following meets your specifications:
data want (keep=MEMBER_ID START_DT END_DT SERVICE_PROVIDER_ID); set have; format START_DT END_DT mmddyy10.; by SERVICE_PROVIDER_ID; retain START_DT in_process; set have ( firstobs = 2 keep = SERVICE_FROM_DATE rename = (SERVICE_FROM_DATE = Next_Date) ) have ( obs = 1 drop = _all_); if first.SERVICE_PROVIDER_ID or in_process eq 0 then do; in_process=1; START_DT=SERVICE_FROM_DATE; end; if last.SERVICE_PROVIDER_ID or intck('month',SERVICE_FROM_DATE,Next_Date) gt 1 then do; END_DT=SERVICE_THROUGH_DATE; output; in_process=0; end; run;
Art, CEO, AnalystFinder.com
Like this?
data WANT;
retain START_DT;
set HAVE end=LASTOBS;
PREV_ST =lag(SERVICE_FROM_DATE);
PREV_END =lag(SERVICE_THROUGH_DATE);
PREV_PROV =lag(SERVICE_PROVIDER_ID);
if _N_=1 then START_DT=SERVICE_FROM_DATE;
if _N_>1;
if (month(PREV_ST) + 1 ne month(SERVICE_FROM_DATE) )
or ( PREV_PROV ne SERVICE_PROVIDER_ID )
or LASTOBS then do;
END_DT = ifn(LASTOBS, SERVICE_THROUGH_DATE,PREV_END);
PROVIDER_ID= ifn(LASTOBS, SERVICE_PROVIDER_ID ,PREV_PROV);
output;
START_DT = SERVICE_FROM_DATE;
end;
format START_DT END_DT date9.;
keep MEMBER_ID START_DT END_DT PROVIDER_ID;
run;
MEMBER_ID | START_DT | END_DT | PROVIDER_ID |
---|---|---|---|
ABC | 05JAN2017 | 08MAR2017 | 64058469 |
ABC | 10JUL2017 | 21AUG2017 | 64058469 |
ABC | 25MAR2017 | 22OCT2017 | 64089900 |
@ChrisNZ: Using lag or reading the next record (like I did) accomplish the same thing. However, since you didn't use the INTCK function, your code will fail if the visits go across years. e.g.:
data have; input MEMBER_ID $ SERVICE_FROM_DATE :mmddyy10. SERVICE_THROUGH_DATE :mmddyy10. SERVICE_PROVIDER_ID $; format SERVICE_FROM_DATE SERVICE_THROUGH_DATE mmddyy10.; datalines; ABC 01/05/2017 01/05/2017 64058469 ABC 02/24/2017 02/24/2017 64058469 ABC 03/08/2017 03/08/2017 64058469 ABC 07/10/2017 07/10/2017 64058469 ABC 08/21/2017 08/21/2017 64058469 ABC 03/25/2017 03/25/2017 64089900 ABC 04/30/2017 04/30/2017 64089900 ABC 05/28/2017 05/28/2017 64089900 ABC 06/25/2017 06/25/2017 64089900 ABC 07/23/2017 07/23/2017 64089900 ABC 08/20/2017 08/20/2017 64089900 ABC 09/24/2017 09/24/2017 64089900 ABC 10/22/2017 10/22/2017 64089900 ABC 11/22/2017 11/22/2017 64089900 ABC 12/22/2017 12/22/2017 64089900 ABC 01/22/2018 01/22/2018 64089900 ABC 02/22/2018 02/22/2018 64089900 ;
Art, CEO, AnalystFinder.com
True. I just wanted to supply a much shorter data step than the 2 above, which were the only 2 when I started replying..
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.