BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nketata
Obsidian | Level 7

Hi all,

I want to create Table_C from Table_A by excluding the values found on TABLE_B based on both variables : ALPHA and BETA

Table_A has ALPHA and BETA as well as many other variables.

Table_B  has only the variables ALPHA and BETA

Table_C should have less observations then Table_A.

All SQL commands I tried (like the one below) gave me 9 times more observations than expected, meaning that the cartesian product is not the one expected.

proc sql;

create table table_C     as select g1.*,g2.*

from Table_A g1         left join TABLE_B g2

ON g1.ALPHA ne g2.ALPHA  and    g1.BETA ne g2.BETA  ;

quit;

Other way to describe the problem :

I want sashelp.class without the follwing observations :

Ageweight
1384
1398

Thanks in advance for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

A correlated sub-query is the simplest and most explicit approach:

proc sql;

create table tableC as

select *

from TableA as A

where not exists (select * from TableB where ALPHA=A.ALPHA and BETA=A.BETA);

quit;

PG

PG

View solution in original post

5 REPLIES 5
DBailey
Lapis Lazuli | Level 10

proc sql;

create table table_C     as select g1.*,g2.*

from Table_A g1         left join TABLE_B g2

ON g1.ALPHA eq g2.ALPHA  and    g1.BETA eq g2.BETA 

where g2.beta is null;

quit;

nketata
Obsidian | Level 7

Great, thanks. The log gave me a warning about alpha and beta being on both files, but the output is valid.

Steelers_In_DC
Barite | Level 11

Here is a solution using a datastep or proc sql;

data table_a;

infile cards;

input age weight;

cards;

13 84

13 98

14 101

15 250

;

run;

data table_b;

infile cards;

input age weight;

cards;

13 84

13 98

;

run;

data table_c_1;

merge table_a(in=a)

      table_b(in=b);

by age weight;

if a and not b;

run;

proc sql;

create table table_c_2 as

select age,weight

from table_a a where not exists (

select *

from table_b b

    where a.age = b.age and

          a.weight = b.weight);

PGStats
Opal | Level 21

A correlated sub-query is the simplest and most explicit approach:

proc sql;

create table tableC as

select *

from TableA as A

where not exists (select * from TableB where ALPHA=A.ALPHA and BETA=A.BETA);

quit;

PG

PG
nketata
Obsidian | Level 7

Thanks, I was aware of the if a and not b in the data-merge but I wanted to avoid the sorting on huge files.

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1981 views
  • 6 likes
  • 4 in conversation