BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AmirSari
Quartz | Level 8

Hi all,

 

I am trying to subset a dataset based on conditions that are true in another dataset. Basically, I have two datasets, say test_1 and test_2. The dataset test_1 also includes all observations in test_2. I want to drop the test_2 observations from the test_1 dataset. I tried to do that using proc sql but had no success so far.

Here is what I have.

 

proc sql;

create table want as select *

from test_2

where YEAR not in (select YEAR from test_1) and ID not in (select ID from test_1);
quit;

 

I am not sure what the problem is but I cannot get the sql to apply both conditions at the same time.

 

Thanks for your help,

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If the variables in test_1 and test_2 are the same and you actually want to exclude the records that are in test_2 from test_1 then perhaps:

proc sql;
   create table want as
   select * from test_1
   except 
   select * from test_2
   ;
quit;

However if the records are not actually identical, different variables for instance, this wouldn't quite work.

 

In which case you can use except with a subset of variables to identify the records you want from test_1 and then join back with test_1 to get the other variables.

View solution in original post

7 REPLIES 7
Patrick
Opal | Level 21

@AmirSari

What is not working? What are you trying to achieve here?

Your where clause will always be true so you won't get any rows returned. What you're doing here: You lookup if a value exists in the same table - that will always be true for any variable as you will always find at least one match (the variable matching with itself from the same row).

AmirSari
Quartz | Level 8
I want to to create a dataset from test_2 where the observations that satisfy the two conditions are excluded.
AmirSari
Quartz | Level 8
I want to create a dataset from test_2. I edited my post. That was my bad.
ballardw
Super User

If the variables in test_1 and test_2 are the same and you actually want to exclude the records that are in test_2 from test_1 then perhaps:

proc sql;
   create table want as
   select * from test_1
   except 
   select * from test_2
   ;
quit;

However if the records are not actually identical, different variables for instance, this wouldn't quite work.

 

In which case you can use except with a subset of variables to identify the records you want from test_1 and then join back with test_1 to get the other variables.

Ksharp
Super User

Maybe you need this in the future.

 

proc sql;

create table want as select *

from test_2

where catx(' ',YEAR,ID) not in (select   catx(' ',YEAR,ID)  from  test_1 );
quit;

AmirSari
Quartz | Level 8
Thanks, @Ksharp!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2533 views
  • 2 likes
  • 5 in conversation