Hello everyone,
I have two datasets and I need to see if observations from one dataset are missing from another dataset. For example,
data have; input Name :$upcase4. id Net Reg Loss; cards;home 344 200 50 . home 543 80 . 5 TDR 453 75 . 3 TDR 543 . 5 .
;;;; run;
data have;
input Name :$upcase4. id Net Reg Loss;
cards;
home344 200 50 . home 543 80 . 5
;;;;
run;
I want to see if any ids are missing in the second dataset compared with the first dataset. For example, id 453 is missing but 543 and 344 are not.
Thanks in advance 🙂
Kind regards,
Jack
Assuming the tables are called First & Second:
proc sql;
/* Get IDs in First but not in Second */
select id from First except select id from Second;
/* Get IDs in Second but not in First */
select id from Second except select id from First;
/* Get IDs in both */
select id from Second intersect select id from First;
quit;
Does this help?
@AndrewHowell wrote:
Assuming the tables are called First & Second:
proc sql; /* Get IDs in First but not in Second */ select id from First except select id from Second; /* Get IDs in Second but not in First */ select id from Second except select id from First; /* Get IDs in both */ select id from Second intersect select id from First; quit;
Does this help?
May want select distinct id if there are duplicates.
proc sort data=have1;
by id;
run;
proc sort data=have2;
by id;
run;
data
miss_from_have2
miss_from_have1
both
;
merge
have1 (in=a keep=id)
have2 (in=b keep=id)
;
by id;
if a and b
then output both;
else if a
then output miss_from_have2;
else output miss_from_have1;
run;
might perform better on large datasets.
Elegant SQL intersection and exception of sets of data
Matching observations from one data to see if any are missing in another dataset
see
https://goo.gl/xCjIsU
https://communities.sas.com/t5/Base-SAS-Programming/Matching-observations-from-one-data-to-see-if-any-are-missing-in/m-p/343684
HAVE
=====
Up to 40 obs WORK.HAV1ST total obs=2
Obs NAME ID
1 home ___
2 home XXX
Up to 40 obs WORK.HAV2ND total obs=4
Obs NAME ID
1 home ___
2 home XXX
3 TDR @@@
4 TDR XXX
WANT
=====
SQLOBS=0 hav1st except hav2nd
SQLOBS=1 hav2nd except hav1st
ID
--------
@@@
SQLOBS=2 hav2nd intersect hav1st
ID
--------
XXX
___
FULL SOLUTION
===============
%utlnopts; /* turn options off so so only the puts statements are in the log */
proc sql;
/* ID @@@ Get IDs in hav1st but not in hav2nd */
select id from hav1st except select id from hav2nd;
%put &=sqlobs hav1st except hav2nd;
/* Get IDs in hav2nd but not in hav1st */
select id from hav2nd except select id from hav1st;
%put &=sqlobs hav2nd except hav1st;
/* Get IDs in both */
select id from hav2nd intersect select id from hav1st;
%put &=sqlobs hav2nd intersect hav1st;
quit;
proc sql;
select id from hav1st where id not in (select id from hav2nd);
%put &=sqlobs hav1st except hav2nd;
select id from hav2nd where id not in (select id from hav1st);
%put &=sqlobs hav2nd except hav1st;
select id from hav1st where id in (select id from hav2nd);
%put &=sqlobs hav2nd intersect hav1st;
quit;
This can also be a good task for beginning to use hash objects:
data first_not_second;
set first;
if _n_=1 then do;
declare hash second_ids (dataset:'second(keep=id)');
second_ids.definekey(all:'Y');
second_ids.definedone();
end;
if second_ids.find()^=0;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.