Hi All,
I have two tables
EMP
id name $ date1 yymmdd10.;
101 ram 2016/01/30
101 ram 2016/02/23
101 ram 2016/02/25
102 sam 2016/02/15
and TESTS
id date yymmdd10.;
101 2016/01/29
101 2016/01/30
102 2016/02/12
I would like to have a table like below:
ID DATE1 Date2
101 2016/01/30 2016/01/29
101 2016/02/23 2016/01/30
101 2016/02/25 2016/01/30
102 2016/02/15 2016/02/12
Please help!
Why is 1/29/2016 in table test closer than 1/30/2016 (also in tests) to 1/30/2016 in EMP?
Run next code:
data one;
input id name $ date1 yymmdd10. ;
format date1 date9.;
seq = _N_;
datalines;
101 ram 2016/01/30
101 ram 2016/02/23
101 ram 2016/02/25
102 sam 2016/02/15
; run;
data two;
input id date yymmdd10.;
format date date9.;
datalines;
101 2016/01/29
101 2016/01/30
102 2016/02/12
; run;
proc sql;
create table temp as select
a.id, a.seq, a.date1,
b.date as date2,
abs(a.date1 - b.date) as diff
from one as a
left join two as b
on a.id = b.id
group by a.id
order by id, seq, diff
;quit;
proc sql
create table want
as select id, date1, date2
from temp
group by id, seq
having min(diff)
; quit;
/* last step can be changed by: */
data want;
set temp;
by id seq;
if first.seq;
run;
this code will produce first output line as:
101 2016/01/30 2016/01/30
instead
101 2016/01/30 2016/01/29
If you insist getting the Jan 29 then change last step to:
data want;
set temp(where=(diff>0));
by id seq;
if first.seq;
run;
/* OR */
data want;
set temp;
by id seq;
retain flag; drop flag;
if first.seq then flag=0;
if flag=0 and diff>0 then do;
output; flag=1;
end;
run;
It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.
proc sql;
create table result (drop=dif) as
select
emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
from emp left join tests
on emp.id = tests.id
where emp.date1 > tests.date
group by emp.id, emp.date1
having dif=min(dif);
quit;
101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12
@ErikLund_Jensen wrote:
It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.
proc sql;
create table result (drop=dif) as
select
emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
from emp left join tests
on emp.id = tests.id
where emp.date1 > tests.date
group by emp.id, emp.date1
having dif=min(dif);
quit;
101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12
Since you apparently only want test dates preceding emp date, you could even abstain from calclulating DIF by a suitable modification of the JOIN .. ON expression, and corresponding change to the HAVING clause.
proc sql;
create table result as
select emp.id, emp.name, emp.date1, tests.date as date2
from emp left join tests
on emp.id = tests.id and emp.date1> tests.date
group by emp.id, emp.date1
having tests.date=max(tests.date);
quit;
Thank you all for your suggestions, appreciate it!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.