Hello,
I have 2 tables as shown. Table 2 has duplicates. While joining the tables using left join, we want to pull from table 2 any dates >= the date for the ID from table T1 as well as dates within 7 days prior. Can someone help with the logic?
TABLE T1
ID dischargeDT
101 08/01/2019
TABLE T2
ID DATE
101 06/16/2019
101 07/25/2019
101 08/01/2019
101 09/05/2019
RESULT
ID discharged DATE
101 08/01/2019 07/25/2019 ---7 days prior to 08/01/2019
101 08/01/2019 08/01/2019 --on the same day as 08/01/2019
101 08/01/2019 09/05/2019--anything after 08/01/2019
Thanks
Is this what you are after?
on t1.id=t2.id and date>=intnx('days',dischargeDT,-7);
data T1;
input ID dischargeDT :mmddyy10.;
format dischargeDT mmddyy10.;
cards;
101 08/01/2019
;
data T2;
input ID DATE :mmddyy10.;
format date mmddyy10.;
cards;
101 06/16/2019
101 07/25/2019
101 08/01/2019
101 09/05/2019
;
proc sql;
create table want as
select t1.*,date
from t1 left join t2
on t1.id=t2.id and date>=intnx('days',dischargeDT,-7);
quit;
data t1;
input id dischargedt :mmddyy10.;
format dischargedt mmddyy10.;
datalines;
101 08/01/2019
;
data t2;
input id date :mmddyy10.;
format date mmddyy10.;
datalines;
101 06/16/2019
101 07/25/2019
101 08/01/2019
101 09/05/2019
;
proc sql;
create table want as
select t1.*, t2.date
from t1, t2
where t2.date ge intnx('day', t1.dischargedt, -7, 's')
group by t2.id
order by t2.id, date;
quit;
Result:
id dischargedt date 101 08/01/2019 07/25/2019 101 08/01/2019 08/01/2019 101 08/01/2019 09/05/2019
Please try
data T1;
input ID dischargeDT:mmddyy10.;
format dischargeDT date9.;
cards;
101 08/01/2019
;
data T2;
input ID DATE:mmddyy10.;
format DATE date9.;
cards;
101 06/16/2019
101 07/25/2019
101 08/01/2019
101 09/05/2019
;
proc sql;
create table want as select a.*,b.dischargeDT from t2 as a right join t1 as b on a.id=b.id and a.date>=b.dischargeDT or (a.date)>=(b.dischargeDT-7);
quit;
Here is a solution much faster than SQL if (1) your data are sorted by ID and (2) you have only 1 obs per ID in T1:
data t1;
input id dischargedt :mmddyy10. ;
format dischargedt date9.;
datalines;
101 08/01/2019
run;
data t2;
input id date :mmddyy10.;
format date date9.;
datalines;
101 06/16/2019
101 07/25/2019
101 08/01/2019
101 09/05/2019
run;
data want;
set t1 (keep=id in=in1) t2 (keep=id in=in2);
by id;
if in1 then set t1;
if in2 then set t2;
if nmiss(dischargedt,date)=0;
if date>=dischargedt-7 then output;
if last.id then call missing(of _all_);
run;
The first SET statement does nothing but interleave a sequence of identical ID values (and track their sources by menas of IN1 and IN2). Since T1 is the first object of the SET statement it provides the first instance of the set of identiflcal ID's. When that happens, then IN1=1 and the "if in1 then set t1" statement reads in all the remaining T1 variables - namely dischargedt, which will not be replaced until the next instance of IN1=1.
When IN2=1, then a regular DATE value is read from T2. It is compared to dischargedt-7 and output if it is no earlier.
Now since variables like DISCHARGEDT and DATE are "retained" until the next corresponding SET statement is executed, it is possible that the dischargedt for ID 101 could contaminate T2 data for ID 102 (if there is no T1 data for 102). That is why there is a "call missing (of _all_)" when each ID is exhausted.
Regards
Mark
Hi @mkeintz Beautiful algorithm. More importantly, Thank you for the notes!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.