BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ifti_ch2002
Obsidian | Level 7

I have following data

data have1;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
   datalines;  
1 4-7-2003 
1 4-7-2009 
1 4-7-2010 
1 4-7-2011 
1 4-7-2012 
1 4-7-2013 
1 4-7-2014 
1 4-7-2015 
1 4-7-2016 
1 4-7-2018 
2 4-7-2005 
2 4-7-2011 
2 4-7-2012 
2 4-7-2013 
2 4-7-2014 
2 4-7-2015 
2 4-7-2016 
2 4-7-2017 
2 4-7-2018 
2 4-7-2019 
2 4-7-2020 
;       
run; 

 And

data have2;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
   datalines;  
1 4-7-2003 
2 4-7-2005 
;       
run; 

I need the ouput as following

data want;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
   datalines;  
1 4-7-2009 
1 4-7-2010 
1 4-7-2011 
1 4-7-2012 
1 4-7-2013 
1 4-7-2014 
1 4-7-2015 
1 4-7-2016 
1 4-7-2018 
2 4-7-2011 
2 4-7-2012 
2 4-7-2013 
2 4-7-2014 
2 4-7-2015 
2 4-7-2016 
2 4-7-2017 
2 4-7-2018 
2 4-7-2019 
2 4-7-2020 
;       
run; 

I need to check by id number, if ID matches in both tables, and if the data in have1 has not exact match in have 2. 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @ifti_ch2002 

Please try this to remove observations of have2 from have1:

 

data want;
	merge have1 (in=x) have2 (in=y);
	by IDnumber date;
	if x and not y;
run;

Best,

View solution in original post

6 REPLIES 6
ed_sas_member
Meteorite | Level 14

Hi @ifti_ch2002 

Please try this to remove observations of have2 from have1:

 

data want;
	merge have1 (in=x) have2 (in=y);
	by IDnumber date;
	if x and not y;
run;

Best,

ed_sas_member
Meteorite | Level 14

Another option could be the use of a SET operator in PROC SQL:

 

proc sql;
	create table want as
	select IDnumber, date from have1
	except
	select IDnumber, date from have2;
quit;

or a hash:

data want;
	if _n_=1 then do;
		declare hash h (dataset:'have2');
		h.definekey('IDnumber', 'date');
		h.definedata('IDnumber', 'date');
		h.definedone();
	end;
	set have1;
	if h.find() ne 0 then output;
run;

Best,

ifti_ch2002
Obsidian | Level 7

Thanks, I spent hours on Join.

ed_sas_member
Meteorite | Level 14
You're welcome @ifti_ch2002
novinosrin
Tourmaline | Level 20

Hi @ifti_ch2002  The effort -"Thanks, I spent hours on Join" is not a bad idea nor a wrong approach. The intricate details to pay attention to in my humble opinion is the risk of a cartesian join triggered by an inequality operator. However, in your case that concern doesn't surface for the reason, the combination of an equality and inequality operator makes it rather straight forward look-up as the SQL optimizer determines the best join algorithm principally on the equality operator.

 

@ChrisNZ  aka Mr Perfect eloquently writes the distinction between JOINS,Subqueries going above and beyond from a performance standpoint and similar practical usage have often been noticed in posts by @PGStats aka Pierre with no peers who is above all. Please do peek into their posts when you have the time in grasping the concepts.  

 

Kind Regards!

 



data have1;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
   format date mmddyy10.;
   datalines;  
1 4-7-2003 
1 4-7-2009 
1 4-7-2010 
1 4-7-2011 
1 4-7-2012 
1 4-7-2013 
1 4-7-2014 
1 4-7-2015 
1 4-7-2016 
1 4-7-2018 
2 4-7-2005 
2 4-7-2011 
2 4-7-2012 
2 4-7-2013 
2 4-7-2014 
2 4-7-2015 
2 4-7-2016 
2 4-7-2017 
2 4-7-2018 
2 4-7-2019 
2 4-7-2020 
;       
run; 

 

data have2;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
     format date mmddyy10.;
   datalines;  
1 4-7-2003 
2 4-7-2005 
;       
run; 

proc sql;
create table  want as
select a.*
from have1 a inner join have2 b
on a.IDnumber=b.IDnumber
and a.date ne b.date
order by a.IDnumber, a.date;
quit;

proc print noobs;run;

 

 

Ksharp
Super User

data have1;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
   format date mmddyy10.;
   datalines;  
1 4-7-2003 
1 4-7-2009 
1 4-7-2010 
1 4-7-2011 
1 4-7-2012 
1 4-7-2013 
1 4-7-2014 
1 4-7-2015 
1 4-7-2016 
1 4-7-2018 
2 4-7-2005 
2 4-7-2011 
2 4-7-2012 
2 4-7-2013 
2 4-7-2014 
2 4-7-2015 
2 4-7-2016 
2 4-7-2017 
2 4-7-2018 
2 4-7-2019 
2 4-7-2020 
;       
run; 

 

data have2;
infile datalines missover; 
   input IDnumber @3 date  mmddyy10.;
     format date mmddyy10.;
   datalines;  
1 4-7-2003 
2 4-7-2005 
;       
run; 

proc sql;
create table  want as
select * from have1 
except
select * from have2;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1858 views
  • 0 likes
  • 4 in conversation