- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The above can be done in SAS data step
Data x y;
merge a(in=1) b(in=2);
by id;
if a and b then output x;
if a and not b then output y;
run;
how can it be done in Sql?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
First of all, you have a typo in your code, in=1 and in=2 should be in=a and in=b. When comparing data step merge with SQL join, you need involve some presumptions, as they work in such a different way that in most of cases they can NOT make to be equivalent. So here let us assume that it is one to one match:
proc sql;
create table x as
select * from a,b where a.id=b.id;
create table y as
select * from a left join b on a.id=b.id;
quit;
Haikuo
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I believe the queries would work better as (assuming datasets a and b have only variable ID in common) :
proc sql;
create table X as
select * from a natural join b;
select * from X;
create table Y as
select * from a natural left join b
where b.id is missing;
select * from Y;
quit;
PG
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql; create table x as select * from a,b where a.id=b.id; create table y as select * from a where id not in (select id from b); quit;
Ksharp
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I agree with Hai.kuo. In general, data step merge is hard to replicate in proc sql, mainly because the merge assumes order among the observations. To demonstrate:
/* test datasets */
data a;
id=1; x=1; output;
id=2; x=2; output;
id=2; x=3; output;
id=2; x=4; output;
id=5; x=5; output;
run;
data b;
id=2; y=6; output;
id=2; y=7; output;
id=3; y=8; output;
id=3; y=9; output;
id=4; y=0; output;
run;
data sasphile1 sasphile2;
merge a(in=a) b(in=b);
by id;
if a and b then output sasphile1;
if a and not b then output sasphile2;
run;
proc sql noprint;
create table haikuo1 as select * from a, b where a.id=b.id;
create table haikuo2 as select * from a left join b on a.id=b.id;
create table pgstats1 as select * from a natural join b;
create table pgstats2 as select * from a natural left join b where b.id is missing;
create table ksharp1 as select * from a, b where a.id=b.id;
create table ksharp2 as select * from a where id not in (select id from b);
quit;
%macro show(who);
title "&who.1";
proc print data=&who.1;
run;
title "&who.2";
proc print data=&who.2;
run;
title;
%mend show;
%show(sasphile)
/* on lst
sasphile1 sasphile2
Obs id x y Obs id x y
1 2 2 6 1 1 1 .
2 2 3 7 2 5 5 .
3 2 4 7
*/
%show(haikuo)
/*
haikuo1 haikuo2
Obs id x y Obs id x y
1 2 2 6 1 1 1 .
2 2 2 7 2 2 3 7
3 2 3 6 3 2 2 7
4 2 3 7 4 2 4 7
5 2 4 6 5 2 3 6
6 2 4 7 6 2 2 6
7 2 4 6
8 5 5 .
*/
%show(pgstats)
/*
pgstats1 pgstats2
Obs id y x Obs id y x
1 2 6 2 1 1 . 1
2 2 7 2 2 5 . 5
3 2 6 3
4 2 7 3
5 2 6 4
6 2 7 4
*/
%show(ksharp)
/*
ksharp1 ksharp2
Obs id x y Obs id x
1 2 2 6 1 1 1
2 2 2 7 2 5 5
3 2 3 6
4 2 3 7
5 2 4 6
6 2 4 7
*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How about combining Queries with Set Operators?
Proc sql;
create table anotinb as
select * from A
except
select * from B;
quit;
proc sql;
create table bothinab as
select * from a
intersect
select * from b;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The above can be done in SAS data step
Data x y z a b ;
merge a(in=1) b(in=2);
by id name;
if id ne . and name ne "" ;
if a and b then output x;
if a and not b then output y;
if b and not a then output z;
if a then output a ;
if b then output b;
run;
How can we write these queries in proc sql ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Kindly share proc sql query for below mentioned query
data final;
length flag1 $100.;
merge ae (in=a) ex(in=b);
by subject site;
if a and not b ;
flag1 = "Subject is NOT present in Exposure form";
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A SQL can only ever have one target and though any solution will require two separate SQL's.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You have to define "IT" first:
1. "IT" = generate tables X and Y identical to those output by MERGE.
a. A.ID and B.ID are related as many-to-one or one-to-many: YES, via two separate queries.
b. A.ID and B.ID are related as many-to-many: NO. In this case, MERGE works completely differently from SQL.
2. "IT" = generate X and Y from the same SQL query and/or from a single pass through A and B: NO.
Methinks this pretty much sums it up.
Kind regards
Paul D.