BookmarkSubscribeRSS Feed
renjithr
Quartz | Level 8

Hi All,

 

I have two tables 

 

EMP

 

 id name $ date1 yymmdd10.;

101 ram 2016/01/30
101 ram 2016/02/23

101 ram 2016/02/25
102 sam 2016/02/15

 

and TESTS

 

 id date yymmdd10.;
101 2016/01/29
101 2016/01/30
102 2016/02/12

 

I would like to have a table like below:

 

ID DATE1 Date2

101 2016/01/30 2016/01/29

101 2016/02/23 2016/01/30

101 2016/02/25 2016/01/30

102 2016/02/15 2016/02/12

 

Please help!

 

 

 

5 REPLIES 5
mkeintz
PROC Star

 

Why is 1/29/2016 in table test closer than 1/30/2016 (also in tests) to 1/30/2016 in EMP?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Shmuel
Garnet | Level 18

Run next code:

data one;
  input id name $ date1 yymmdd10. ;
  format date1 date9.;
  seq = _N_;
datalines;
101 ram 2016/01/30
101 ram 2016/02/23
101 ram 2016/02/25
102 sam 2016/02/15
; run;

data two;
  input id date yymmdd10.;
  format date date9.;
datalines;
101 2016/01/29
101 2016/01/30
102 2016/02/12
; run;

proc sql;
    create table temp as select
      a.id, a.seq, a.date1,
      b.date as date2,
      abs(a.date1 - b.date) as diff
    from one as a   
    left join two as b    
    on a.id = b.id 
    group by a.id
    order by id, seq, diff
;quit;

proc sql
create table want
as select id, date1, date2
from temp
group by id, seq
having min(diff)
; quit;
/* last step can be changed by: */ data want; set temp; by id seq; if first.seq; run;

this code will produce first output line as:

  101 2016/01/30 2016/01/30

  instead 

 101 2016/01/30 2016/01/29

 

If you insist getting the Jan 29 then change last step to:

data want;
 set temp(where=(diff>0));
  by id seq;
     if first.seq;
run;


/* OR */

data want;
 set temp;
  by id seq;
retain flag; drop flag; if first.seq then flag=0; if flag=0 and diff>0 then do; output; flag=1; end; run;
ErikLund_Jensen
Rhodochrosite | Level 12

It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.

 
proc sql;
     create table result (drop=dif) as
     select
          emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
          from emp left join tests
          on emp.id = tests.id
          where emp.date1 > tests.date
          group by emp.id, emp.date1
          having dif=min(dif);
quit;

 

101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12

mkeintz
PROC Star

@ErikLund_Jensen wrote:

It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.

 
proc sql;
     create table result (drop=dif) as
     select
          emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
          from emp left join tests
          on emp.id = tests.id
          where emp.date1 > tests.date
          group by emp.id, emp.date1
          having dif=min(dif);
quit;

 

101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12


 

Since you apparently only want test dates preceding emp date, you could even abstain from calclulating DIF by a suitable modification of the JOIN .. ON expression, and corresponding change to the HAVING clause. 

 

proc sql;

  create table result as

  select emp.id, emp.name, emp.date1, tests.date as date2

  from emp left join tests

    on emp.id = tests.id and emp.date1> tests.date

  group by emp.id, emp.date1

    having tests.date=max(tests.date);

quit;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
renjithr
Quartz | Level 8

Thank you all for your suggestions, appreciate it!

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 7918 views
  • 1 like
  • 4 in conversation