BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
smiles
Calcite | Level 5

Dear SAS Experts,

 

Need your help in figuring out how to compare dates in 2 datasets having same ID for multiple observations and selecting the latest as I am bit new to SAS and still learning 

 

Table 1 has 3 columns

 

ID   No    Date

1     1      4-Apr-18

1     2     19-May-18

1     3      20-Jul-18

1     4       01-Aug-18

'

Table 2 has 3 columns

 

ID   No    Date

1     1      2-May-18

1     2     10-Jun-18

1     3      15-Jul-18

 

For every record in table1 with the same ID , i need to compare the date with table2  having same ID and ensure t1.date >=t2.date.

If there multiple records satisfying the condition from table 2 then take the latest and create a column valid with values Y or N, if condition not satisfied

 

e.g. for record 3 in table1 for ID=1 all the records from table2 satisfy the t1.date >=t2.date. in this case i have to pick the 15-jul-18 against that record. and put valid flag as Y

 

Thanks for all the help in advance.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Or

data want;
if _n_=1 then do;
if 0 then set have2;
  declare hash H (dataset:'have2(rename=(date=date1))',multidata:'y') ;
   h.definekey  ("id") ;
   h.definedata ("date1") ;
   h.definedone () ;
   call missing(date1);
   declare hash H2 (dataset:'have2(drop=no rename=(date=date2))',multidata:'y') ;
   h2.definekey  ("id",'date2') ;
   h2.defineData(all: 'y');
   h2.definedone () ;
   call missing(date2);
end;
do until(last.id);
set have1;
by id;
call missing(date2,acc);
do rc=h.find() by 0 while(rc=0);
if date>=date1 then date2=max(date1,date2);
rc=h.find_next();
end;
rc=h2.find(key:id,key:date2);
valid=ifc(date2=.,'N','Y');
output;
end;
format date: date9.;
drop date1 rc;
run;

 

This assumes, you have sufficient memory to have two hash tables 

View solution in original post

10 REPLIES 10
novinosrin
Tourmaline | Level 20

Can you post a sample of your requird output for your input sample to avoid assumptions plz

smiles
Calcite | Level 5

The desired output

 

Table 1 with additional columns

 

ID   No    Date                       t2.date      valid

1     1      4-Apr-18                    -                N

1     2     19-May-18              2-May           Y

1     3      20-Jul-18               15-Jul           Y

1     4      01-Aug-18             15-Jul           Y

 

Thanks

novinosrin
Tourmaline | Level 20


 data have1;
 input ID   No    Date :date9.;
 format date date9.;
cards;
1     1      4-Apr-18
1     2     19-May-18
1     3      20-Jul-18
1     4       01-Aug-18
;

data have2;
input ID   No    Date :date9.;
 format date date9.;
 cards;
1     1      2-May-18
1     2     10-Jun-18
1     3      15-Jul-18
;

proc sql;
create table want as
select distinct a.id, a.date as date1,ifn(a.date >=b.date,b.date,.) as date2	format =date9.,
case when calculated date2=. then 'N' else 'Y' end as valid
from have1 a, have2 b
where a.id=b.id
group by a.id,a.date 
having date2=max(date2);

quit;
novinosrin
Tourmaline | Level 20
 data have1;
 input ID   No    Date :date9.;
 format date date9.;
cards;
1     1      4-Apr-18
1     2     19-May-18
1     3      20-Jul-18
1     4       01-Aug-18
;

data have2;
input ID   No    Date :date9.;
 format date date9.;
 cards;
1     1      2-May-18
1     2     10-Jun-18
1     3      15-Jul-18
;
data want;
if _n_=1 then do;
if 0 then set have2;
  declare hash H (dataset:'have2(rename=(date=date1))',multidata:'y') ;
   h.definekey  ("id") ;
   h.definedata ("date1") ;
   h.definedone () ;
   call missing(date1);
end;
do until(last.id);
set have1;
by id;
call missing(date2);
do rc=h.find() by 0 while(rc=0);
if date>=date1 then date2=max(date1,date2);
rc=h.find_next();
end;
valid=ifc(date2=.,'N','Y');
output;
end;
format date: date9.;
drop date1 rc;
run;
smiles
Calcite | Level 5

Thanks for the quick help. It works 🙂

I have started reading about the hash object to understand how it actually works.

 

One quick question, if the table2 has additional columns and I want the same in output as well

 

data have2;. 
input ID No Acc Date :date9.;
format date date9.;
cards;
1 1 123 2-May-18
1 2 456 10-Jun-18
1 3 789 15-Jul-18
;

 

I modified h.definedata (all='Y') ; 

 

I am getting 789 as ACC for all cases. After getting date2 , I need to get the ACC associated with that date

 

ID No ACCDate date2valid
117894-Apr-18.N
1278919-May-182-May-18Y
1378920-Jul-1815-Jul-18Y
147891-Aug-1815-Jul-18Y

 

instead of

ID No ACCDate date2valid
11 4-Apr-18.N
1212319-May-182-May-18Y
1378920-Jul-1815-Jul-18Y
147891-Aug-1815-Jul-18Y

 

 

Thanks. 

novinosrin
Tourmaline | Level 20

That's simple. Can you plz post your complete modified sample or that's the best representative. Just making sure coz I can modify all in one shot

smiles
Calcite | Level 5
That's it. This is the case.

Really appreciate your help. I am beginner so getting to learn . Thanks
novinosrin
Tourmaline | Level 20

@smiles  Good morning,Welcome to SAS forum and my apologies for the delay. Obviously, I am sure you understand the time difference impact between locations where we login from. Messaging you @ chicago time 9:10am with train/bus delays having just made it to my lab and this is my second post for the day. Earlier, I basically acknowledged your message while waking up.

 

Here you go:

 

data have1;
 input ID   No    Date :date9.;
 format date date9.;
cards;
1     1      4-Apr-18
1     2     19-May-18
1     3      20-Jul-18
1     4       01-Aug-18
;


data have2;
input ID No Acc Date :date9.;
format date date9.;
cards;
1 1 123 2-May-18
1 2 456 10-Jun-18
1 3 789 15-Jul-18
;

data want;
if _n_=1 then do;
if 0 then set have2;
  declare hash H (dataset:'have2(drop=no rename=(date=date1))',multidata:'y',ordered:'y') ;
   h.definekey  ("id") ;
  h.defineData(all: 'y');;
   h.definedone () ;
   call missing(date1);
end;

do until(last.id);
set have1;
by id;
call missing(date2);
do rc=h.find() by 0 while(rc=0);
if date>=date1 then date2=max(date1,date2);
rc=h.find_next();
end;
if date2 eq . then call missing(acc);
else
do rc=h.find() by 0 while(rc=0);
if date2=date1 then leave;
rc=h.find_next();
end;
valid=ifc(date2=.,'N','Y');
output;
end;
format date: date9.;
drop date1 rc;
run;

Some NOTES:

1. if date2 eq . then call missing(acc);  This assumes you have one additional var, however if you have a very wide have2 and you wanna use all:y in definedata you may not know to group all that needs to be part of the call missing(var1,var2--varn). 

Call missing(a--z) can work if you know sequence of your definedata

2. No. variable overwrites have1 No. variable during the lookup, therefore I dropped that out of the hash table to maintain sequence order of have1 as is. Of course, you can reassign sequence, learning that No in have2 will overwrite, however there's no point

3. The more clear and best representative your sample is, the better to think through the robust solutions

 

Feel free to have any follow up questions, if you may. 

 

 

novinosrin
Tourmaline | Level 20

Or

data want;
if _n_=1 then do;
if 0 then set have2;
  declare hash H (dataset:'have2(rename=(date=date1))',multidata:'y') ;
   h.definekey  ("id") ;
   h.definedata ("date1") ;
   h.definedone () ;
   call missing(date1);
   declare hash H2 (dataset:'have2(drop=no rename=(date=date2))',multidata:'y') ;
   h2.definekey  ("id",'date2') ;
   h2.defineData(all: 'y');
   h2.definedone () ;
   call missing(date2);
end;
do until(last.id);
set have1;
by id;
call missing(date2,acc);
do rc=h.find() by 0 while(rc=0);
if date>=date1 then date2=max(date1,date2);
rc=h.find_next();
end;
rc=h2.find(key:id,key:date2);
valid=ifc(date2=.,'N','Y');
output;
end;
format date: date9.;
drop date1 rc;
run;

 

This assumes, you have sufficient memory to have two hash tables 

smiles
Calcite | Level 5
Thanks a lot!! Your fixed the problem in such a jiffy which I have been struggling with

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1297 views
  • 0 likes
  • 2 in conversation