BookmarkSubscribeRSS Feed
SASPhile
Quartz | Level 8

The above can be done in SAS data step

Data x y;

merge a(in=1) b(in=2);

by id;

if a and b then output x;

if a and not b then output y;

run;

how can it be done in Sql?

10 REPLIES 10
Haikuo
Onyx | Level 15

Hi,

First of all, you have a typo in your code, in=1 and in=2 should be in=a and in=b. When comparing data step merge with SQL join, you need involve some presumptions, as they work in such a different way that in most of cases they can NOT make to be equivalent. So here let us assume that it is one to one match:

proc sql;

create table x as

select * from a,b where a.id=b.id;

create table y as

select * from a left join b on a.id=b.id;

quit;

Haikuo

PGStats
Opal | Level 21

I believe the queries would work better as (assuming datasets a and b have only variable ID in common) :

proc sql;

create table X as

select * from a natural join b;

select * from X;

create table Y as

select * from a natural left join b

where b.id is missing;

select * from Y;

quit;

PG

PG
Ksharp
Super User
proc sql;
create table x as
select * from a,b where a.id=b.id;

create table y as
select * from a where id not in (select id from b);
quit;

Ksharp

sandyzman1
Obsidian | Level 7
Awesome. Simple and Straightforward.
chang_y_chung_hotmail_com
Obsidian | Level 7

I agree with Hai.kuo. In general, data step merge is hard to replicate in proc sql, mainly because the merge assumes order among the observations. To demonstrate:

/* test datasets */
data a;
  id=1; x=1; output;
  id=2; x=2; output;
  id=2; x=3; output;
  id=2; x=4; output;
  id=5; x=5; output;
run;
data b;
  id=2; y=6; output;
  id=2; y=7; output;
  id=3; y=8; output;
  id=3; y=9; output;
  id=4; y=0; output;
run;
 
data sasphile1 sasphile2;
  merge a(in=a) b(in=b);
  by id;
  if a and b then output sasphile1;
  if a and not b then output sasphile2;
run;
 
proc sql noprint;
  create table haikuo1 as select * from a, b where a.id=b.id;
  create table haikuo2 as select * from a left join b on a.id=b.id;
 
  create table pgstats1 as select * from a natural join b;
  create table pgstats2 as select * from a natural left join b where b.id is missing;
 
  create table ksharp1 as select * from a, b where a.id=b.id;
  create table ksharp2 as select * from a where id not in (select id from b);
quit;
 
%macro show(who);
  title "&who.1";
  proc print data=&who.1;
  run;
  title "&who.2";
  proc print data=&who.2;
  run;
  title;
%mend show;
 
%show(sasphile)
/* on lst
sasphile1              sasphile2   
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    3    7      2      5    5    .
3      2    4    7
*/
%show(haikuo)
/*
haikuo1                haikuo2 
Obs    id   x    y     Obs    id    x    y
1      2    2    6      1      1    1    .
2      2    2    7      2      2    3    7
3      2    3    6      3      2    2    7
4      2    3    7      4      2    4    7
5      2    4    6      5      2    3    6
6      2    4    7      6      2    2    6
                        7      2    4    6
                        8      5    5    .
*/
%show(pgstats)
/*
pgstats1                pgstats2
Obs    id   y    x     Obs    id    y    x
1      2    6    2      1      1    .    1
2      2    7    2      2      5    .    5
3      2    6    3   
4      2    7    3
5      2    6    4
6      2    7    4
*/
%show(ksharp)
/*
ksharp1                ksharp2
Obs    id   x    y     Obs    id    x
1      2    2    6      1      1    1
2      2    2    7      2      5    5
3      2    3    6   
4      2    3    7
5      2    4    6
6      2    4    7
*/

RD2
Fluorite | Level 6 RD2
Fluorite | Level 6

How about combining Queries with Set Operators?

Proc sql;

create table anotinb as

select * from A

except

select * from B;

quit;

proc sql;

  create table bothinab as

   select * from a

   intersect

   select * from b;

quit;


pdhokriya
Pyrite | Level 9

The above can be done in SAS data step

 

Data x y z a b ;

merge a(in=1) b(in=2);

by id name;

if id ne . and name ne "" ;

if a and b then output x;

if a and not b then output y;

if b and not a then output z;

if a then output a ;

if b then output b;

run;

 

How can we write these queries in proc sql ?

pdhokriya
Pyrite | Level 9

Kindly share proc sql query for below mentioned query

 

data final;
length flag1 $100.;
merge ae (in=a) ex(in=b);
by subject site;
if a and not b ;
flag1 = "Subject is NOT present in Exposure form";
run;

Patrick
Opal | Level 21

@SASPhile 

A SQL can only ever have one target and though any solution will require two separate SQL's.

hashman
Ammonite | Level 13

@SASPhile:

You have to define "IT" first:

 

1. "IT" = generate tables X and Y identical to those output by MERGE.

  a. A.ID and B.ID are related as many-to-one or one-to-many: YES, via two separate queries.

  b. A.ID and B.ID are related as many-to-many: NO. In this case, MERGE works completely differently from SQL.

 

2. "IT" = generate X and Y from the same SQL query and/or from a single pass through A and B: NO. 

 

Methinks this pretty much sums it up.

 

Kind regards

Paul D.

    

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 35310 views
  • 2 likes
  • 10 in conversation