Hi All,
I have two tables
EMP
id name $ date1 yymmdd10.;
101 ram 2016/01/30
101 ram 2016/02/23
101 ram 2016/02/25
102 sam 2016/02/15
and TESTS
id date yymmdd10.;
101 2016/01/29
101 2016/01/30
102 2016/02/12
I would like to have a table like below:
ID DATE1 Date2
101 2016/01/30 2016/01/29
101 2016/02/23 2016/01/30
101 2016/02/25 2016/01/30
102 2016/02/15 2016/02/12
Please help!
Why is 1/29/2016 in table test closer than 1/30/2016 (also in tests) to 1/30/2016 in EMP?
Run next code:
data one;
input id name $ date1 yymmdd10. ;
format date1 date9.;
seq = _N_;
datalines;
101 ram 2016/01/30
101 ram 2016/02/23
101 ram 2016/02/25
102 sam 2016/02/15
; run;
data two;
input id date yymmdd10.;
format date date9.;
datalines;
101 2016/01/29
101 2016/01/30
102 2016/02/12
; run;
proc sql;
create table temp as select
a.id, a.seq, a.date1,
b.date as date2,
abs(a.date1 - b.date) as diff
from one as a
left join two as b
on a.id = b.id
group by a.id
order by id, seq, diff
;quit;
proc sql
create table want
as select id, date1, date2
from temp
group by id, seq
having min(diff)
; quit;
/* last step can be changed by: */
data want;
set temp;
by id seq;
if first.seq;
run;
this code will produce first output line as:
101 2016/01/30 2016/01/30
instead
101 2016/01/30 2016/01/29
If you insist getting the Jan 29 then change last step to:
data want;
set temp(where=(diff>0));
by id seq;
if first.seq;
run;
/* OR */
data want;
set temp;
by id seq;
retain flag; drop flag;
if first.seq then flag=0;
if flag=0 and diff>0 then do;
output; flag=1;
end;
run;
It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.
proc sql;
create table result (drop=dif) as
select
emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
from emp left join tests
on emp.id = tests.id
where emp.date1 > tests.date
group by emp.id, emp.date1
having dif=min(dif);
quit;
101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12
@ErikLund_Jensen wrote:
It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.
proc sql;
create table result (drop=dif) as
select
emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
from emp left join tests
on emp.id = tests.id
where emp.date1 > tests.date
group by emp.id, emp.date1
having dif=min(dif);
quit;
101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12
Since you apparently only want test dates preceding emp date, you could even abstain from calclulating DIF by a suitable modification of the JOIN .. ON expression, and corresponding change to the HAVING clause.
proc sql;
create table result as
select emp.id, emp.name, emp.date1, tests.date as date2
from emp left join tests
on emp.id = tests.id and emp.date1> tests.date
group by emp.id, emp.date1
having tests.date=max(tests.date);
quit;
Thank you all for your suggestions, appreciate it!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.