customer_id | policy_id | policy_effective_dt | policy_expiration_dt | active_policy_flag |
127055 | 135418 | 27.04.2000 | 27.11.2004 | 0 |
127055 | 642156 | 27.11.2002 | 20.10.2003 | 0 |
127055 | 548874 | 20.05.2011 | 21.03.2013 | 0 |
127055 | 561321 | 20.05.2011 | 20.03.2016 | 0 |
127055 | 854876 | 25.04.2012 | 08.04.2013 | 0 |
127055 | 231554 | 12.07.2012 | 03.08.2013 | 0 |
127055 | 564498 | 13.08.2013 | 08.05.2015 | 0 |
127055 | 548879 | 16.06.2016 | 15.06.2017 | 1 |
Thoughts on how to handle it so far
1. Select the min(policy_effective_dt) of active policies (16.06.2016)
2. Establish a new variable that's loaded with the above value minus the 3 month period (16.03.2016)
3. Check each row in the policy_expiration_dt column to see if any date is higher than the variable from step
- If no, then 16.03.2016 is selected
- If yes (which is the case) then select that rows policy_effective_from_dt, subtract 3 months, and restart step 3 until there's no longer any new matches.
Do you have any suggestion on how to write this kind of code? I mostly just work in EG & DIS, so my SAS coding skills are very limited.
Thanks for your time!
OK. If I know what you are talking about .
data have;
infile cards expandtabs truncover;
input customer_id policy_id policy_effective_dt : ddmmyy10. policy_expiration_dt : ddmmyy10.;
format policy_effective_dt policy_expiration_dt ddmmyyp10.;
cards;
127055 135418 27.04.2000 27.11.2004 0
127055 642156 27.11.2002 20.10.2003 0
127055 548874 20.05.2011 21.03.2013 0
127055 561321 20.05.2011 20.03.2016 0
127055 854876 25.04.2012 08.04.2013 0
127055 231554 12.07.2012 03.08.2013 0
127055 564498 13.08.2013 08.05.2015 0
127055 548879 16.06.2016 15.06.2017
;
run;
data want;
array x{9999} _temporary_;
do i=1 by 1 until(last.customer_id);
set have;
by customer_id;
x{i}=policy_expiration_dt;
end;
max_date=.;
do j=1 by 1 until(last.customer_id);
set have;
by customer_id;
new_customer=0;
if intck('month',max_date,policy_effective_dt,'c') gt 3 then new_customer=1;
max_date=max(max_date,x{j});
output;
end;
drop i j max_date;
run;
Try this:
data have;
input
customer_id
policy_id
policy_effective_dt :ddmmyy10.
policy_expiration_dt :ddmmyy10.
active_policy_flag
;
format
policy_effective_dt
policy_expiration_dt
ddmmyyp10.
;
cards;
127055 135418 27.04.2000 27.11.2004 0
127055 642156 27.11.2002 20.10.2003 0
127055 548874 20.05.2011 21.03.2013 0
127055 561321 20.05.2011 20.03.2016 0
127055 854876 25.04.2012 08.04.2013 0
127055 231554 12.07.2012 03.08.2013 0
127055 564498 13.08.2013 08.05.2015 0
127055 548879 16.06.2016 15.06.2017 1
;
run;
proc sort data=have;
by customer_id policy_effective_dt;
run;
data ranges (keep=customer_id range_start range_end);
set have;
by customer_id;
retain range_start range_end;
format range_start range_end ddmmyyp10.;
if first.customer_id
then do;
range_start = policy_effective_dt;
range_end = policy_expiration_dt;
end;
else do;
if policy_effective_dt <= intnx('month',range_end,3)
then do;
range_end = max(policy_expiration_dt,range_end);
if last.customer_id then output;
end;
else do;
output;
range_start = policy_effective_dt;
range_end = policy_expiration_dt;
end;
end;
run;
data want;
merge
have (in=a)
ranges (
in=b
keep=customer_id range_start
rename=(range_start=policy_effective_dt)
)
;
by customer_id policy_effective_dt;
if a;
if b
then new_customer = 1;
else new_customer = 0;
run;
OK. If I know what you are talking about .
data have;
infile cards expandtabs truncover;
input customer_id policy_id policy_effective_dt : ddmmyy10. policy_expiration_dt : ddmmyy10.;
format policy_effective_dt policy_expiration_dt ddmmyyp10.;
cards;
127055 135418 27.04.2000 27.11.2004 0
127055 642156 27.11.2002 20.10.2003 0
127055 548874 20.05.2011 21.03.2013 0
127055 561321 20.05.2011 20.03.2016 0
127055 854876 25.04.2012 08.04.2013 0
127055 231554 12.07.2012 03.08.2013 0
127055 564498 13.08.2013 08.05.2015 0
127055 548879 16.06.2016 15.06.2017
;
run;
data want;
array x{9999} _temporary_;
do i=1 by 1 until(last.customer_id);
set have;
by customer_id;
x{i}=policy_expiration_dt;
end;
max_date=.;
do j=1 by 1 until(last.customer_id);
set have;
by customer_id;
new_customer=0;
if intck('month',max_date,policy_effective_dt,'c') gt 3 then new_customer=1;
max_date=max(max_date,x{j});
output;
end;
drop i j max_date;
run;
Thank you very much, @Ksharp & @Kurt_Bremser. I'm in the process of testing the logic for a much larger dataset, and it's looking very good so far. 🙂
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.