DATA Step, Macro, Functions and more

select the rows of a table that don't match another table in SQL?

Reply
Super Contributor
Posts: 647

select the rows of a table that don't match another table in SQL?

The above can be done in SAS data step

Data x y;

merge a(in=1) b(in=2);

by id;

if a and b then output x;

if a and not b then output y;

run;

how can it be done in Sql?

Respected Advisor
Posts: 3,124

Re: select the rows of a table that don't match another table in SQL?

Hi,

First of all, you have a typo in your code, in=1 and in=2 should be in=a and in=b. When comparing data step merge with SQL join, you need involve some presumptions, as they work in such a different way that in most of cases they can NOT make to be equivalent. So here let us assume that it is one to one match:

proc sql;

create table x as

select * from a,b where a.id=b.id;

create table y as

select * from a left join b on a.id=b.id;

quit;

Haikuo

Respected Advisor
Posts: 4,646

Re: select the rows of a table that don't match another table in SQL?

I believe the queries would work better as (assuming datasets a and b have only variable ID in common) :

proc sql;

create table X as

select * from a natural join b;

select * from X;

create table Y as

select * from a natural left join b

where b.id is missing;

select * from Y;

quit;

PG

PG
Super User
Posts: 9,681

Re: select the rows of a table that don't match another table in SQL?

proc sql;
create table x as
select * from a,b where a.id=b.id;

create table y as
select * from a where id not in (select id from b);
quit;

Ksharp

Regular Contributor
Posts: 241

Re: select the rows of a table that don't match another table in SQL?

I agree with Hai.kuo. In general, data step merge is hard to replicate in proc sql, mainly because the merge assumes order among the observations. To demonstrate:

/* test datasets */
data a;
  id=1; x=1; output;
  id=2; x=2; output;
  id=2; x=3; output;
  id=2; x=4; output;
  id=5; x=5; output;
run;
data b;
  id=2; y=6; output;
  id=2; y=7; output;
  id=3; y=8; output;
  id=3; y=9; output;
  id=4; y=0; output;
run;
 
data sasphile1 sasphile2;
  merge a(in=a) b(in=b);
  by id;
  if a and b then output sasphile1;
  if a and not b then output sasphile2;
run;
 
proc sql noprint;
  create table haikuo1 as select * from a, b where a.id=b.id;
  create table haikuo2 as select * from a left join b on a.id=b.id;
 
  create table pgstats1 as select * from a natural join b;
  create table pgstats2 as select * from a natural left join b where b.id is missing;
 
  create table ksharp1 as select * from a, b where a.id=b.id;
  create table ksharp2 as select * from a where id not in (select id from b);
quit;
 
%macro show(who);
  title "&who.1";
  proc print data=&who.1;
  run;
  title "&who.2";
  proc print data=&who.2;
  run;
  title;
%mend show;
 
%show(sasphile)
/* on lst
sasphile1              sasphile2   
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    3    7      2      5    5    .
3      2    4    7
*/
%show(haikuo)
/*
haikuo1                haikuo2 
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    2    7      2      2    3    7
3      2    3    6      3      2    2    7
4      2    3    7      4      2    4    7
5      2    4    6      5      2    3    6
6      2    4    7      6      2    2    6
                        7      2    4    6
                        8      5    5    .
*/
%show(pgstats)
/*
pgstats1                pgstats2
Obs    id   y    x     Obs    id    y    x
1      2    6    2      1      1    .    1
2      2    7    2      2      5    .    5
3      2    6    3   
4      2    7    3
5      2    6    4
6      2    7    4
*/
%show(ksharp)
/*
ksharp1                ksharp2
Obs    id   x    y     Obs    id    x
1      2    2    6      1      1    1
2      2    2    7      2      5    5
3      2    3    6   
4      2    3    7
5      2    4    6
6      2    4    7
*/

Occasional Contributor
Posts: 16

Re: select the rows of a table that don't match another table in SQL?

How about combining Queries with Set Operators?

Proc sql;

create table anotinb as

select * from A

except

select * from B;

quit;

proc sql;

  create table bothinab as

   select * from a

   intersect

   select * from b;

quit;


Ask a Question
Discussion stats
  • 5 replies
  • 3182 views
  • 0 likes
  • 6 in conversation