Hi all,
I have to solve this using proc sql because it will then be converted to a sql script for automation and needs to be as close as possible (just how things work here).
My dataset has duplicate records like below with different dates. I need to join to a second table where b.date > a.date. But i only want to join to the closest match to b.date which is the max (a.date)
Table a | |
BAN | col_date |
123 | 20-Jun-20 |
123 | 21-Jul-20 |
Table b | |
status | status_date |
r | 20-Aug-20 |
to get
Table a | |||
BAN | col_date | status | status_date |
123 | 20-Jun-20 | ||
123 | 21-Jul-20 | r | 20-Aug-20 |
Would appreciate any and all guidance. thank you
@TheNovice wrote:
Apologies, yes, the join key would be ban
According to above I have added BAN to table b.
Next code is tested:
data table_a;
input ban col_date :date9.;
format col_date date9.;
cards;
123 20-jun-20
123 21-jul-20
; run;
data table_b;
input ban status $ status_date :date9.;
format status_date date9.;
cards;
123 r 20-aug-20
; run;
proc sql;
create table want(drop=dif_date min_date) as
select a.ban, a.col_date,
b.status_date - a.col_date as dif_date,
min(calculated dif_date) as min_date,
case when calculated dif_date = calculated min_date
then b.status
else ' ' end as status ,
case when calculated dif_date = calculated min_date
then b.status_date
else . end as status_date format=date9.
from table_a as a
left join table_b as b
on a.ban = b.ban
group by a.ban
;
quit;
There is no BAN in table b?
@TheNovice wrote:
Apologies, yes, the join key would be ban
Then, again: why is the variable BAN not in the second table?
@TheNovice wrote:
Apologies, yes, the join key would be ban
According to above I have added BAN to table b.
Next code is tested:
data table_a;
input ban col_date :date9.;
format col_date date9.;
cards;
123 20-jun-20
123 21-jul-20
; run;
data table_b;
input ban status $ status_date :date9.;
format status_date date9.;
cards;
123 r 20-aug-20
; run;
proc sql;
create table want(drop=dif_date min_date) as
select a.ban, a.col_date,
b.status_date - a.col_date as dif_date,
min(calculated dif_date) as min_date,
case when calculated dif_date = calculated min_date
then b.status
else ' ' end as status ,
case when calculated dif_date = calculated min_date
then b.status_date
else . end as status_date format=date9.
from table_a as a
left join table_b as b
on a.ban = b.ban
group by a.ban
;
quit;
data table_a;
input ban col_date :date9.;
format col_date date9.;
cards;
123 20-jun-20
123 21-jul-20
; run;
data table_b;
input ban status $ status_date :date9.;
format status_date date9.;
cards;
123 r 20-aug-20
; run;
proc sql;
create table want as
select a.ban, a.col_date,
case when abs(a.col_date-b.status_date)=
min(abs(a.col_date-b.status_date)) then b.status
else ' ' end as status
,
case when abs(a.col_date-b.status_date)=
min(abs(a.col_date-b.status_date)) then b.status_date
else . end as status_date format=date9.
from table_a as a
left join table_b as b
on a.ban = b.ban ;
quit;
I think @Ksharp 's answer doesn't quite answer the question. This might do better:
data a;
input ban col_date :date9.;
format col_date date9.;
cards;
123 20-jun-20
123 21-jul-20
567 20-jun-20
567 21-jul-20
;
data b;
input ban status $ status_date :date9.;
format status_date date9.;
cards;
123 r 20-aug-20
456 r 20-aug-20
;
proc sql;
select a.ban, col_date,
case when col_date = max(col_date) then status
else ' ' end as status
,
case when col_date = max(col_date) then status_date
else . end as status_date format=date9.
from a
left join b
on a.ban = b.ban and status_date > col_date
group by a.ban;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.