Thank you.
No, the qualify dataset is a subset of the History table for the latest month (in this case Jan). There will never be records into the future.
There are many cases where a member will have a transaction across multiple product groups in the same day. When I made the change you suggested, it may pull the exact same transaction that is being evaluated, so the hash variables are equivalent to the qualified transaction.
Is there away to 'skip' that transaction and look for the other transaction in the different product group?
A proposed solution, array-based.
First, building 2 simulated datasets with shorter variable names:
Member_ID -> mbr_id
Transaction_ID -> trs_id
Service_Date -> srv_dt
Prod_Group -> prd_gp
Second, the proposed solution.
I assume that all mbr_id in the qualified table are present in the history table.
/*****************************************************************/
/*** simulated datasets for qualified (t_a) and historic (t_b) ***/
/*****************************************************************/
data t_a(keep=mbr_id srv_dt trs_id prd_gp rnd)
t_b(keep=mbr_id srv_dt trs_id prd_gp rnd);
do mbr_id = 1 to 10;
trs_id = 0;
srv_dt=15000;
do zR = 1 to 200;
rnd=ceil(60*ranuni(3));
srv_dt+rnd;
trs_id+ceil(6*ranuni(3));
prd_gp = ceil(10*ranuni(3));
if zR in (197,198,199,200) then do; output t_a; output t_b; end;
else output t_b;
end;
end;
format srv_dt date9.;
run;
/*****************************/
/*** solution using arrays ***/
/*****************************/
proc sort data=t_a; by mbr_id descending srv_dt; run;
proc sort data=t_b; by mbr_id descending srv_dt; run;
proc sql;
select max(cnts) into :cQual
from (select mbr_id, sum(1) as cnts from t_a group by mbr_id);
select max(cnts) into :cHist
from (select mbr_id, sum(1) as cnts from t_b group by mbr_id);
quit;
data t_c(keep=mbr_id aSrv_dt aPrd_gp aTrs_id bSrv_dt bPrd_gp bTrs_id dday
rename=(aSrv_dt=srv_dt aPrd_gp=prd_gp aTrs_id=trs_id));
array R1(&cQual); * qualified, srv_dt;
array R2(&cQual); * qualified, prd_gp;
array R3(&cQual); * qualified, trs_id;
array S1(&cHist); * history, srv_dt;
array S2(&cHist); * history, prd_gp;
array S3(&cHist); * history, trs_id;
aCnt=0; bCnt=0;
do until (last.mbr_id);
set t_a;
by mbr_id;
aCnt+1;
R1(aCnt)=srv_dt; R2(aCnt)=prd_gp; R3(aCnt)=trs_id;
end;
do until (last.mbr_id);
set t_b;
by mbr_id;
bCnt+1;
S1(bCnt)=srv_dt; S2(bCnt)=prd_gp; S3(bCnt)=trs_id;
end;
do i = 1 to aCnt;
zMatch=0; zOther=0;
aSrv_dt=R1(i); aPrd_gp=R2(i); aTrs_id=R3(i);
do j = 1 to bCnt;
if (R2(i)=S2(j)) and (0< R1(i)-S1(j)<= 365) then do;
zMatch=max(zMatch,1);
bSrv_dt=S1(j); bPrd_gp=S2(j); bTrs_id=S3(j); dday=R1(i)-S1(j);
output;
end;
if not(R2(i)=S2(j)) and (0<=R1(i)-S1(j)<= 365) and (zOther=0) then do;
zOther=max(zOther,1);
cSrv_dt=S1(j); cPrd_gp=S2(j); cTrs_id=S3(j); eday=R1(i)-S1(j);
end;
end;
if (zMatch=0) and (zOther=1) then do;
bSrv_dt=cSrv_dt; bPrd_gp=cPrd_gp; bTrs_id=cTrs_id; dday=eday;
output;
end;
if (zMatch=0) and (zOther=0) then do;
bSrv_dt=aSrv_dt; bPrd_gp=aPrd_gp; bTrs_id=aTrs_id; dday=0;
output;
end;
end;
format aSrv_dt bSrv_dt date9.;
run;
For mbr_id = 1 one gets:
mbr_id srv_dt prd_gp trs_id bSrv_dt bPrd_gp bTrs_id dday
1 20-Dec-17 9 737 14-Dec-17 9 736 6
1 20-Dec-17 9 737 27-Aug-17 9 730 115
1 20-Dec-17 9 737 17-May-17 9 721 217
1 20-Dec-17 9 737 1-Mar-17 9 713 294
1 14-Dec-17 9 736 27-Aug-17 9 730 109
1 14-Dec-17 9 736 17-May-17 9 721 211
1 14-Dec-17 9 736 1-Mar-17 9 713 288
1 25-Oct-17 8 735 27-Aug-17 9 730 59
1 27-Aug-17 9 730 17-May-17 9 721 102
1 27-Aug-17 9 730 1-Mar-17 9 713 179
for (srv_dt=25-Oct-17; prd_gp=8) there are no other transactions for mbr_id=1 and prd_gp=8 which is within 365 days, thus the entry for prd_gp=9.
Thank you. This is a great approach I will look into it with more detail.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.