Solved: Re: proc sql returning more rows than expected

aaronh · Posted 09-25-2019 05:12 PM

Hello experts,

So I have a dataset X that I want to subset based on the key values stored in dataset Y:

proc sql;
create table test as
select x.*
from x, y
where x.key1 = y.key1 and 
      x.key2 = y.key2;
quit;

However, when the proc sql step finished running, I noticed that the TEST dataset had almost twice as many rows as X. Why is this happening, and what alternatives are there?

To give you an example of what I am trying to achieve: say X has 'apple' and 'banana' under key1, but Y only has 'banana' under key1, how do I make sure the subset of X only has banana for key1?

Thank you!

PGStats · Posted 09-25-2019 05:25 PM

You must make sure that there are no duplicate key1-key2 pairs in table y; otherwise each row of x is returned once for every matching row of y. This can be done with:

proc sql;
create table test as
select x.*
from x, (select distinct key1, key2 from y) as yy
where x.key1 = yy.key1 and x.key2 = yy.key2;
quit;

PG
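
To confirm that duplicates in y are really the cause, a quick count per key pair can help. The sketch below is only an illustration using the table and key names from this thread; the column name n_rows is made up for the example. It also shows one possible alternative, an EXISTS subquery, which keeps each row of x at most once no matter how many matching rows y has.

```sas
proc sql;
  /* List key1-key2 pairs that occur more than once in y.
     n_rows is a hypothetical column name used for illustration. */
  select key1, key2, count(*) as n_rows
  from y
  group by key1, key2
  having count(*) > 1;

  /* Alternative to the join: keep a row of x only if at least one
     row of y has the same key1 and key2. Duplicates in y do not
     multiply the output because x is never joined to y. */
  create table test as
  select x.*
  from x
  where exists (select *
                from y
                where y.key1 = x.key1 and
                      y.key2 = x.key2);
quit;
```

If the first query returns no rows, y has no duplicate key pairs and the extra rows must come from somewhere else.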

aaronh · Posted 09-25-2019 05:48 PM

Thank you so much, PG! I was thinking of using a data step merge, but that would require a proc sort dedup as well as a lot of renaming and sorting, whereas this can be achieved in SQL in one step.
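
For comparison, the data step route mentioned here might look roughly like the sketch below. It assumes the table and key names from this thread; y_keys, x_sorted, in_x and in_y are hypothetical names chosen for the example. Keeping only key1 and key2 from y is one way to avoid the renaming, since no other variable in y can then overwrite a variable in x.

```sas
/* Deduplicate the key pairs in y, keeping only the key columns. */
proc sort data=y(keep=key1 key2) out=y_keys nodupkey;
  by key1 key2;
run;

/* The merge needs x sorted by the same keys. */
proc sort data=x out=x_sorted;
  by key1 key2;
run;

/* Keep rows of x whose key pair also appears in y_keys. */
data test;
  merge x_sorted(in=in_x) y_keys(in=in_y);
  by key1 key2;
  if in_x and in_y;
run;
```

Unlike the SQL versions, this leaves TEST sorted by key1 and key2 rather than in the original order of X.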
