Re: select the rows of a table that don't match another table in SQL?

SASPhile · Posted 08-06-2012 04:37 PM

The above can be done in SAS data step

Data x y;

merge a(in=1) b(in=2);

by id;

if a and b then output x;

if a and not b then output y;

run;

how can it be done in Sql?

Haikuo · Posted 08-06-2012 05:51 PM

Hi,

First of all, you have a typo in your code, in=1 and in=2 should be in=a and in=b. When comparing data step merge with SQL join, you need involve some presumptions, as they work in such a different way that in most of cases they can NOT make to be equivalent. So here let us assume that it is one to one match:

proc sql;

create table x as

select * from a,b where a.id=b.id;

create table y as

select * from a left join b on a.id=b.id;

quit;

Haikuo

PGStats · Posted 08-06-2012 08:08 PM

I believe the queries would work better as (assuming datasets a and b have only variable ID in common) :

proc sql;

create table X as

select * from a natural join b;

select * from X;

create table Y as

select * from a natural left join b

where b.id is missing;

select * from Y;

quit;

PG

Ksharp · Posted 08-06-2012 10:46 PM

proc sql;
create table x as
select * from a,b where a.id=b.id;

create table y as
select * from a where id not in (select id from b);
quit;

Ksharp

sandyzman1 · Posted 03-12-2021 08:25 AM

Awesome. Simple and Straightforward.

chang_y_chung_hotmail_com · Posted 08-07-2012 04:32 PM

I agree with Hai.kuo. In general, data step merge is hard to replicate in proc sql, mainly because the merge assumes order among the observations. To demonstrate:

/* test datasets */
data a;
id=1; x=1; output;
id=2; x=2; output;
id=2; x=3; output;
id=2; x=4; output;
id=5; x=5; output;
run;
data b;
id=2; y=6; output;
id=2; y=7; output;
id=3; y=8; output;
id=3; y=9; output;
id=4; y=0; output;
run;

data sasphile1 sasphile2;
merge a(in=a) b(in=b);
by id;
if a and b then output sasphile1;
if a and not b then output sasphile2;
run;

proc sql noprint;
create table haikuo1 as select * from a, b where a.id=b.id;
create table haikuo2 as select * from a left join b on a.id=b.id;

create table pgstats1 as select * from a natural join b;
create table pgstats2 as select * from a natural left join b where b.id is missing;

create table ksharp1 as select * from a, b where a.id=b.id;
create table ksharp2 as select * from a where id not in (select id from b);
quit;

%macro show(who);
title "&who.1";
proc print data=&who.1;
run;
title "&who.2";
proc print data=&who.2;
run;
title;
%mend show;

%show(sasphile)
/* on lst
sasphile1              sasphile2
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    3    7      2      5    5    .
3      2    4    7
*/
%show(haikuo)
/*
haikuo1                haikuo2
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    2    7      2      2    3    7
3      2    3    6      3      2    2    7
4      2    3    7      4      2    4    7
5      2    4    6      5      2    3    6
6      2    4    7      6      2    2    6
                        7      2    4    6
                        8      5    5    .
*/
%show(pgstats)
/*
pgstats1                pgstats2
Obs    id   y    x     Obs    id    y    x
1      2    6    2      1      1    .    1
2      2    7    2      2      5    .    5
3      2    6    3
4      2    7    3
5      2    6    4
6      2    7    4
*/
%show(ksharp)
/*
ksharp1                ksharp2
Obs    id   x    y     Obs    id    x
1      2    2    6      1      1    1
2      2    2    7      2      5    5
3      2    3    6
4      2    3    7
5      2    4    6
6      2    4    7
*/

RD2 · Posted 08-10-2012 04:20 AM

How about combining Queries with Set Operators?

Proc sql;

create table anotinb as

select * from A

except

select * from B;

quit;

proc sql;

create table bothinab as

select * from a

intersect

select * from b;

quit;

pdhokriya · Posted 08-30-2019 03:42 AM

The above can be done in SAS data step

Data x y z a b ;

merge a(in=1) b(in=2);

by id name;

if id ne . and name ne "" ;

if a and b then output x;

if a and not b then output y;

if b and not a then output z;

if a then output a ;

if b then output b;

run;

How can we write these queries in proc sql ?

pdhokriya · Posted 09-03-2019 09:37 AM

Kindly share proc sql query for below mentioned query

data final;
length flag1 $100.;
merge ae (in=a) ex(in=b);
by subject site;
if a and not b ;
flag1 = "Subject is NOT present in Exposure form";
run;

Patrick · Posted 08-30-2019 05:09 AM

@SASPhile

A SQL can only ever have one target and though any solution will require two separate SQL's.

hashman · Posted 08-31-2019 02:05 AM

@SASPhile:

You have to define "IT" first:

1. "IT" = generate tables X and Y identical to those output by MERGE.

a. A.ID and B.ID are related as many-to-one or one-to-many: YES, via two separate queries.

b. A.ID and B.ID are related as many-to-many: NO. In this case, MERGE works completely differently from SQL.

2. "IT" = generate X and Y from the same SQL query and/or from a single pass through A and B: NO.

Methinks this pretty much sums it up.

Kind regards

Paul D.

Register Today!

SAS Training: Just a Click Away