Re: Difference Between 2 Data sets

suzannep · Posted 08-05-2019 12:13 PM

Hi,

I have two data sets. Data set A has 1,000 observations, and data set B (a subset of data set A) has 800 observations. I want to create a data set C that contains ONLY the other 200 observations from data set A.

I'm sure there is some easy code for this but I just don't know what it is. Please help!

Thanks!

Zad · Posted 08-05-2019 12:25 PM

proc sort data=a;by common_surveyid;
proc sort data=b;by common_surveyid;run;
data a_b;merge a(in=a) b(in=b);
by common_surveyid;
if a and not b;
run;

Kurt_Bremser · Posted 08-05-2019 01:55 PM

While your code will provide the intended results, it looks horrible.

Compare this:

proc sort data=a;
by common_surveyid;
run;

proc sort data=b;
by common_surveyid;
run;
 
data a_b;
merge
  a (in=a)
  b (in=b)
;
by common_surveyid;
if a and not b;
run;

and tell me which one is easier for another coder to read and understand.
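To try the merge approach before running it on real data, a small self-contained test might look like the sketch below. The toy values, the variable score, and the in= names in_a and in_b are assumptions made up for illustration; only common_surveyid comes from the posts above, and the output is named c to match the original question.

```sas
/* Hypothetical toy data: A has five IDs, B is a subset with three of them */
data a;
  input common_surveyid score;
  datalines;
1 10
2 20
3 30
4 40
5 50
;
run;

data b;
  set a;
  where common_surveyid in (1, 3, 4);
run;

/* A BY-group merge requires both inputs to be sorted by the key */
proc sort data=a;
  by common_surveyid;
run;

proc sort data=b;
  by common_surveyid;
run;

data c;
  merge
    a (in=in_a)
    b (in=in_b)
  ;
  by common_surveyid;
  if in_a and not in_b;  /* keep rows present in A but absent from B */
run;

proc print data=c;
run;
```

With this toy input, c should contain only the observations with common_surveyid 2 and 5. Naming the in= flags differently from the data sets is not required, but it avoids any doubt about which a is meant where.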


Kurt_Bremser · Posted 08-05-2019 01:57 PM

You can also do it in SQL:

proc sql;
/* key stands for the common identifier, e.g. common_surveyid in the posts above */
create table want as
select a.*
from a
where key not in (
  select b.key from b
);
quit;
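If an observation is identified by more than one variable, a single-column NOT IN no longer fits. One possible variation is a correlated NOT EXISTS subquery. This is only a sketch: site_id and visit_no are hypothetical key variables that do not appear anywhere in the thread, so substitute whatever combination identifies an observation in your data.

```sas
/* Sketch only: site_id and visit_no are hypothetical key variables */
proc sql;
  create table want as
  select a.*
  from a
  where not exists (
    select *
    from b
    where b.site_id = a.site_id
      and b.visit_no = a.visit_no
  );
quit;
```

Like the NOT IN version, this needs no prior sorting and keeps every column of a.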


ballardw · Posted 08-05-2019 03:39 PM

Or possibly

Proc Sql;
   create table want as
   select * from tableA
   except
   select * from tableB
   ;
quit;
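A few things are worth knowing before relying on EXCEPT. It compares entire rows rather than a key, it lines up columns by position, and by default it removes duplicate rows from the result. It therefore returns the missing observations only if B has the same columns as A, in the same order, with identical values. Where A may legitimately contain fully duplicated rows, the ALL keyword keeps them. A minimal sketch, assuming the two tables are simply named a and b as in the earlier posts:

```sas
/* Sketch: assumes a and b have identical columns in the same order */
proc sql;
  create table want as
  select * from a
  except all
  select * from b
  ;
quit;
```

PROC SQL also accepts the CORR (CORRESPONDING) keyword on set operators, which matches columns by name instead of position and keeps only the columns the two tables share. If B carries fewer columns than A, a key-based approach such as the merge or NOT IN versions above is probably the safer choice.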
