BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

Assuming I'm creating a dataset based on WHERE clause. It means only the data which passed the condition will be filtered and inserted into the dataset which we create. Is there a way to capture the non-matching records in one dataset which doesn't statisfy the condition mentioned in WHERE clause?

 

e.g. If there are 10 records in HAVE dataset and we're getting only 5 records in WANT dataset due to the WHERE clause and the other 5 records was excluded. I want to capture excluded records in other dataset EXCLUDED.

16 REPLIES 16
David_Billa
Rhodochrosite | Level 12
How do you tackle this if I ask you to write the condition in WHERE instead
of IF?
ballardw
Super User

@David_Billa wrote:
How do you tackle this if I ask you to write the condition in WHERE instead
of IF?

Basically you don't.

The WHERE is applied to the input vector.  Anything that does not match is excluded and cannot be manipulated in that data step.

You would have to use a second data step with a where clause that is basically the NOT of the one you are using.

David_Billa
Rhodochrosite | Level 12
Could you please help me with one example?
ballardw
Super User

@David_Billa wrote:
Could you please help me with one example?

If you mean of two data steps:

data want;
   set have;
   where value in ('a' 'b' 'c');
run;

data want2;
   set have;
   where not (value in ('a' 'b' 'c'));
run;

The NOT operation negates the logical value of the expression.

 

David_Billa
Rhodochrosite | Level 12

It works with one condition and I'm unable to mame it work for more than one condition.

 

Following code is working correctly:

 

data want;
   set sashelp.class;
   where not (age in (11,12,13));
run;

where as the below code is not working if I tweak the above code with more than one condition:

 

 

6         data want2;
27            set sashelp.class;
28            where not (age in (11,12,13)
29         and weight=112);
30         run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
      WHERE age not in (11, 12, 13) or (weight not = 112);
data want2;
   set sashelp.class;
   where not (age in (11,12,13)
and weight=112);
run;

Log:

 

 

 

David_Billa
Rhodochrosite | Level 12

I tried as syou suggested but it still seem to be not working. Not sure why second condition is resolving it as OR instead of AND

 

26         data want2;
27            set sashelp.class;
28            where not ((age in (11,12,13)
29         and weight=112));
30         run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
      WHERE age not in (11, 12, 13) or (weight not = 112);
NOTE: The data set WORK.WANT2 has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
David_Billa
Rhodochrosite | Level 12

then may I know how to make it work?

Kurt_Bremser
Super User

@David_Billa wrote:

then may I know how to make it work?


It DOES work.

See this example:

data want1;
set sashelp.class;
where age in (11,12,13) and weight=128;
run;

data want2;
set sashelp.class;
where not (age in (11,12,13) and weight=128);
run;

(I had to change the weight value, so that the first condition selects at least one observation)

Log:

 73         data want1;
 74         set sashelp.class;
 75         where age in (11,12,13) and weight=128;
 76         run;
 
 NOTE: There were 1 observations read from the data set SASHELP.CLASS.
       WHERE age in (11, 12, 13) and (weight=128);
 NOTE: The data set WORK.WANT1 has 1 observations and 5 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.00 seconds
       
 
 77         
 78         data want2;
 79         set sashelp.class;
 80         where not (age in (11,12,13) and weight=128);
 81         run;
 
 NOTE: There were 18 observations read from the data set SASHELP.CLASS.
       WHERE age not in (11, 12, 13) or (weight not = 128);
 NOTE: The data set WORK.WANT2 has 18 observations and 5 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.02 seconds
       cpu time            0.01 seconds

18 + 1 gives 19, the number of observations in sashelp.class.

Patrick
Opal | Level 21

@David_Billa 

Even though using the WHERE statement in a SAS data step is syntactically correct, you won't find a lot of experienced SAS users coding this way. 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 3036 views
  • 4 likes
  • 5 in conversation