BookmarkSubscribeRSS Feed
renjithr
Quartz | Level 8

Hi All,

 

I have two tables 

 

EMP

 

 id name $ date1 yymmdd10.;

101 ram 2016/01/30
101 ram 2016/02/23

101 ram 2016/02/25
102 sam 2016/02/15

 

and TESTS

 

 id date yymmdd10.;
101 2016/01/29
101 2016/01/30
102 2016/02/12

 

I would like to have a table like below:

 

ID DATE1 Date2

101 2016/01/30 2016/01/29

101 2016/02/23 2016/01/30

101 2016/02/25 2016/01/30

102 2016/02/15 2016/02/12

 

Please help!

 

 

 

5 REPLIES 5
mkeintz
PROC Star

 

Why is 1/29/2016 in table test closer than 1/30/2016 (also in tests) to 1/30/2016 in EMP?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Shmuel
Garnet | Level 18

Run next code:

data one;
  input id name $ date1 yymmdd10. ;
  format date1 date9.;
  seq = _N_;
datalines;
101 ram 2016/01/30
101 ram 2016/02/23
101 ram 2016/02/25
102 sam 2016/02/15
; run;

data two;
  input id date yymmdd10.;
  format date date9.;
datalines;
101 2016/01/29
101 2016/01/30
102 2016/02/12
; run;

proc sql;
    create table temp as select
      a.id, a.seq, a.date1,
      b.date as date2,
      abs(a.date1 - b.date) as diff
    from one as a   
    left join two as b    
    on a.id = b.id 
    group by a.id
    order by id, seq, diff
;quit;

proc sql
create table want
as select id, date1, date2
from temp
group by id, seq
having min(diff)
; quit;
/* last step can be changed by: */ data want; set temp; by id seq; if first.seq; run;

this code will produce first output line as:

  101 2016/01/30 2016/01/30

  instead 

 101 2016/01/30 2016/01/29

 

If you insist getting the Jan 29 then change last step to:

data want;
 set temp(where=(diff>0));
  by id seq;
     if first.seq;
run;


/* OR */

data want;
 set temp;
  by id seq;
retain flag; drop flag; if first.seq then flag=0; if flag=0 and diff>0 then do; output; flag=1; end; run;
ErikLund_Jensen
Rhodochrosite | Level 12

It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.

 
proc sql;
     create table result (drop=dif) as
     select
          emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
          from emp left join tests
          on emp.id = tests.id
          where emp.date1 > tests.date
          group by emp.id, emp.date1
          having dif=min(dif);
quit;

 

101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12

mkeintz
PROC Star

@ErikLund_Jensen wrote:

It seems that since you want the test date 2016/01/29 and not 2016/01/30 joined with the emp date 2016/01/30, you want the latest test date that precedes the actual date. The following SQL performs a join on ID where emp date is bigger than tests date and then selects the record with the smallest difference.

 
proc sql;
     create table result (drop=dif) as
     select
          emp.id, emp.name, emp.date1, tests.date as date2, abs(emp.date1-tests.date) as dif
          from emp left join tests
          on emp.id = tests.id
          where emp.date1 > tests.date
          group by emp.id, emp.date1
          having dif=min(dif);
quit;

 

101 2016-01-30 2016-01-29
101 2016-02-23 2016-01-30
101 2016-02-25 2016-01-30
102 2016-02-15 2016-02-12


 

Since you apparently only want test dates preceding emp date, you could even abstain from calclulating DIF by a suitable modification of the JOIN .. ON expression, and corresponding change to the HAVING clause. 

 

proc sql;

  create table result as

  select emp.id, emp.name, emp.date1, tests.date as date2

  from emp left join tests

    on emp.id = tests.id and emp.date1> tests.date

  group by emp.id, emp.date1

    having tests.date=max(tests.date);

quit;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
renjithr
Quartz | Level 8

Thank you all for your suggestions, appreciate it!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 7273 views
  • 1 like
  • 4 in conversation