customer_id | policy_id | policy_effective_dt | policy_expiration_dt | active_policy_flag |
127055 | 135418 | 27.04.2000 | 27.11.2004 | 0 |
127055 | 642156 | 27.11.2002 | 20.10.2003 | 0 |
127055 | 548874 | 20.05.2011 | 21.03.2013 | 0 |
127055 | 561321 | 20.05.2011 | 20.03.2016 | 0 |
127055 | 854876 | 25.04.2012 | 08.04.2013 | 0 |
127055 | 231554 | 12.07.2012 | 03.08.2013 | 0 |
127055 | 564498 | 13.08.2013 | 08.05.2015 | 0 |
127055 | 548879 | 16.06.2016 | 15.06.2017 | 1 |
Thoughts on how to handle it so far
1. Select the min(policy_effective_dt) of active policies (16.06.2016)
2. Establish a new variable that's loaded with the above value minus the 3 month period (16.03.2016)
3. Check each row in the policy_expiration_dt column to see if any date is higher than the variable from step
- If no, then 16.03.2016 is selected
- If yes (which is the case) then select that rows policy_effective_from_dt, subtract 3 months, and restart step 3 until there's no longer any new matches.
Do you have any suggestion on how to write this kind of code? I mostly just work in EG & DIS, so my SAS coding skills are very limited.
Thanks for your time!
OK. If I know what you are talking about .
data have;
infile cards expandtabs truncover;
input customer_id policy_id policy_effective_dt : ddmmyy10. policy_expiration_dt : ddmmyy10.;
format policy_effective_dt policy_expiration_dt ddmmyyp10.;
cards;
127055 135418 27.04.2000 27.11.2004 0
127055 642156 27.11.2002 20.10.2003 0
127055 548874 20.05.2011 21.03.2013 0
127055 561321 20.05.2011 20.03.2016 0
127055 854876 25.04.2012 08.04.2013 0
127055 231554 12.07.2012 03.08.2013 0
127055 564498 13.08.2013 08.05.2015 0
127055 548879 16.06.2016 15.06.2017
;
run;
data want;
array x{9999} _temporary_;
do i=1 by 1 until(last.customer_id);
set have;
by customer_id;
x{i}=policy_expiration_dt;
end;
max_date=.;
do j=1 by 1 until(last.customer_id);
set have;
by customer_id;
new_customer=0;
if intck('month',max_date,policy_effective_dt,'c') gt 3 then new_customer=1;
max_date=max(max_date,x{j});
output;
end;
drop i j max_date;
run;
Try this:
data have;
input
customer_id
policy_id
policy_effective_dt :ddmmyy10.
policy_expiration_dt :ddmmyy10.
active_policy_flag
;
format
policy_effective_dt
policy_expiration_dt
ddmmyyp10.
;
cards;
127055 135418 27.04.2000 27.11.2004 0
127055 642156 27.11.2002 20.10.2003 0
127055 548874 20.05.2011 21.03.2013 0
127055 561321 20.05.2011 20.03.2016 0
127055 854876 25.04.2012 08.04.2013 0
127055 231554 12.07.2012 03.08.2013 0
127055 564498 13.08.2013 08.05.2015 0
127055 548879 16.06.2016 15.06.2017 1
;
run;
proc sort data=have;
by customer_id policy_effective_dt;
run;
data ranges (keep=customer_id range_start range_end);
set have;
by customer_id;
retain range_start range_end;
format range_start range_end ddmmyyp10.;
if first.customer_id
then do;
range_start = policy_effective_dt;
range_end = policy_expiration_dt;
end;
else do;
if policy_effective_dt <= intnx('month',range_end,3)
then do;
range_end = max(policy_expiration_dt,range_end);
if last.customer_id then output;
end;
else do;
output;
range_start = policy_effective_dt;
range_end = policy_expiration_dt;
end;
end;
run;
data want;
merge
have (in=a)
ranges (
in=b
keep=customer_id range_start
rename=(range_start=policy_effective_dt)
)
;
by customer_id policy_effective_dt;
if a;
if b
then new_customer = 1;
else new_customer = 0;
run;
OK. If I know what you are talking about .
data have;
infile cards expandtabs truncover;
input customer_id policy_id policy_effective_dt : ddmmyy10. policy_expiration_dt : ddmmyy10.;
format policy_effective_dt policy_expiration_dt ddmmyyp10.;
cards;
127055 135418 27.04.2000 27.11.2004 0
127055 642156 27.11.2002 20.10.2003 0
127055 548874 20.05.2011 21.03.2013 0
127055 561321 20.05.2011 20.03.2016 0
127055 854876 25.04.2012 08.04.2013 0
127055 231554 12.07.2012 03.08.2013 0
127055 564498 13.08.2013 08.05.2015 0
127055 548879 16.06.2016 15.06.2017
;
run;
data want;
array x{9999} _temporary_;
do i=1 by 1 until(last.customer_id);
set have;
by customer_id;
x{i}=policy_expiration_dt;
end;
max_date=.;
do j=1 by 1 until(last.customer_id);
set have;
by customer_id;
new_customer=0;
if intck('month',max_date,policy_effective_dt,'c') gt 3 then new_customer=1;
max_date=max(max_date,x{j});
output;
end;
drop i j max_date;
run;
Thank you very much, @Ksharp & @Kurt_Bremser. I'm in the process of testing the logic for a much larger dataset, and it's looking very good so far. 🙂
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.