BookmarkSubscribeRSS Feed
SASPhile
Quartz | Level 8

The above can be done in SAS data step

Data x y;

merge a(in=1) b(in=2);

by id;

if a and b then output x;

if a and not b then output y;

run;

how can it be done in Sql?

10 REPLIES 10
Haikuo
Onyx | Level 15

Hi,

First of all, you have a typo in your code, in=1 and in=2 should be in=a and in=b. When comparing data step merge with SQL join, you need involve some presumptions, as they work in such a different way that in most of cases they can NOT make to be equivalent. So here let us assume that it is one to one match:

proc sql;

create table x as

select * from a,b where a.id=b.id;

create table y as

select * from a left join b on a.id=b.id;

quit;

Haikuo

PGStats
Opal | Level 21

I believe the queries would work better as (assuming datasets a and b have only variable ID in common) :

proc sql;

create table X as

select * from a natural join b;

select * from X;

create table Y as

select * from a natural left join b

where b.id is missing;

select * from Y;

quit;

PG

PG
Ksharp
Super User
proc sql;
create table x as
select * from a,b where a.id=b.id;

create table y as
select * from a where id not in (select id from b);
quit;

Ksharp

sandyzman1
Obsidian | Level 7
Awesome. Simple and Straightforward.
chang_y_chung_hotmail_com
Obsidian | Level 7

I agree with Hai.kuo. In general, data step merge is hard to replicate in proc sql, mainly because the merge assumes order among the observations. To demonstrate:

/* test datasets */
data a;
  id=1; x=1; output;
  id=2; x=2; output;
  id=2; x=3; output;
  id=2; x=4; output;
  id=5; x=5; output;
run;
data b;
  id=2; y=6; output;
  id=2; y=7; output;
  id=3; y=8; output;
  id=3; y=9; output;
  id=4; y=0; output;
run;
 
data sasphile1 sasphile2;
  merge a(in=a) b(in=b);
  by id;
  if a and b then output sasphile1;
  if a and not b then output sasphile2;
run;
 
proc sql noprint;
  create table haikuo1 as select * from a, b where a.id=b.id;
  create table haikuo2 as select * from a left join b on a.id=b.id;
 
  create table pgstats1 as select * from a natural join b;
  create table pgstats2 as select * from a natural left join b where b.id is missing;
 
  create table ksharp1 as select * from a, b where a.id=b.id;
  create table ksharp2 as select * from a where id not in (select id from b);
quit;
 
%macro show(who);
  title "&who.1";
  proc print data=&who.1;
  run;
  title "&who.2";
  proc print data=&who.2;
  run;
  title;
%mend show;
 
%show(sasphile)
/* on lst
sasphile1              sasphile2   
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    3    7      2      5    5    .
3      2    4    7
*/
%show(haikuo)
/*
haikuo1                haikuo2 
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    2    7      2      2    3    7
3      2    3    6      3      2    2    7
4      2    3    7      4      2    4    7
5      2    4    6      5      2    3    6
6      2    4    7      6      2    2    6
                        7      2    4    6
                        8      5    5    .
*/
%show(pgstats)
/*
pgstats1                pgstats2
Obs    id   y    x     Obs    id    y    x
1      2    6    2      1      1    .    1
2      2    7    2      2      5    .    5
3      2    6    3   
4      2    7    3
5      2    6    4
6      2    7    4
*/
%show(ksharp)
/*
ksharp1                ksharp2
Obs    id   x    y     Obs    id    x
1      2    2    6      1      1    1
2      2    2    7      2      5    5
3      2    3    6   
4      2    3    7
5      2    4    6
6      2    4    7
*/

RD2
Fluorite | Level 6 RD2
Fluorite | Level 6

How about combining Queries with Set Operators?

Proc sql;

create table anotinb as

select * from A

except

select * from B;

quit;

proc sql;

  create table bothinab as

   select * from a

   intersect

   select * from b;

quit;


pdhokriya
Pyrite | Level 9

The above can be done in SAS data step

 

Data x y z a b ;

merge a(in=1) b(in=2);

by id name;

if id ne . and name ne "" ;

if a and b then output x;

if a and not b then output y;

if b and not a then output z;

if a then output a ;

if b then output b;

run;

 

How can we write these queries in proc sql ?

pdhokriya
Pyrite | Level 9

Kindly share proc sql query for below mentioned query

 

data final;
length flag1 $100.;
merge ae (in=a) ex(in=b);
by subject site;
if a and not b ;
flag1 = "Subject is NOT present in Exposure form";
run;

Patrick
Opal | Level 21

@SASPhile 

A SQL can only ever have one target and though any solution will require two separate SQL's.

hashman
Ammonite | Level 13

@SASPhile:

You have to define "IT" first:

 

1. "IT" = generate tables X and Y identical to those output by MERGE.

  a. A.ID and B.ID are related as many-to-one or one-to-many: YES, via two separate queries.

  b. A.ID and B.ID are related as many-to-many: NO. In this case, MERGE works completely differently from SQL.

 

2. "IT" = generate X and Y from the same SQL query and/or from a single pass through A and B: NO. 

 

Methinks this pretty much sums it up.

 

Kind regards

Paul D.

    

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 37044 views
  • 2 likes
  • 10 in conversation