08-11-2017 10:18 AM
I have a dataset that looks like the below...
ID Tran Date Enrol Date Expiry Date
12345 04JAN2015 01JAN2015 31DEC2015
12345 03DEC2016 01DEC2016 31NOV2017
12345 04APR2017 01APR2017 31MAR2018
12345 03JAN2018 01JAN2018 31DEC2018
Basically, I have multiple lines of record for each ID for every time they renew a membership. I'm trying to work out the difference bewteen the transaction date and the closest expiry date. This will then tell me whether the renew before or after the expiry date and how many days difference it is.
Any help appreciated.
08-11-2017 10:44 AM
Sounds like you need a catersian join, i.e. merge all the expiry dates to each transaction date (only pseudocode as not typing in that test data nor guessing structure):
proc sort data=have out=x (keep=id expiry) nodupkey; by id expiry; run; proc sql; create table inter as select A.ID, A.TRAN, B.EXPIRY, B.EXPIRY - A.TRAN as DIFF from HAVE A left join X B on A.ID=B.ID order by ID, TRAN, DIFF; quit; data inter; set inter; by id tran; if first.tran then output; run; data want; merge have inter (rename=(expiry=nearest)); by id tran; run;
08-11-2017 03:44 PM
From your description, three cases are possible:
1- Tran before Enrol
2- Tran during Enrol
3- Tran after Expiry
This should handle all:
data have; infile datalines missover; input ID (TranDate EnrolDate ExpiryDate) (:date9.); format TranDate EnrolDate ExpiryDate yymmdd10.; datalines; 12345 30DEC2014 01JAN2015 31DEC2015 Tran before first Enrol 12345 04JAN2015 01JAN2015 31DEC2015 12345 03DEC2016 01DEC2016 30NOV2017 12345 04FEB2017 01APR2017 31MAR2018 12345 03JAN2018 01JAN2018 31DEC2018 12345 03JAN2019 01JAN2018 31DEC2018 Tran when expired ; proc sql; select unique a.ID, a.TranDate, b.EnrolDate, b.expiryDate, intck('DAY', a.TranDate, b.Expirydate) as DaysLeftBeforeExpiry from have as a left join have as b on a.ID=b.ID and b.enrolDate <= a.tranDate group by a.ID, a.TranDate having intck('DAY', a.TranDate, b.Expirydate) = max(intck('DAY', a.TranDate, b.Expirydate)); quit;