DATA Step, Macro, Functions and more

Difference between IF and WHERE statements

Accepted Solution Solved
Reply
Contributor
Posts: 61
Accepted Solution

Difference between IF and WHERE statements

data test;

input ID$ flag_c$ flag_n;

cards;

A 0 0

B 1 1

C 2 2

;run;

/*Where statement: put function sensitive*/

*working;

data result1a;set test;

where put(flag_c, $1.) in ('0','1');

run;

*working;

data result1b;set test;

where put(flag_n, 1.) in ('0','1');

run;

*NOT working;

data result1c;set test;

where put(flag_c, 1.) in ('0','1');

run;

*NOT working;

data result1d;set test;

where put(flag_n, $1.) in ('0','1');

run;

/*IF statement: put function not sensitive*/

*working;

data result2a;set test;

if put(flag_c, $1.) in ('0','1');

run;

*working;

data result2b;set test;

if put(flag_n, 1.) in ('0','1');

run;

*working;

data result2c;set test;

if put(flag_c, 1.) in ('0','1');

run;

*working;

data result2d;set test;

if put(flag_n, $1.) in ('0','1');

run;

I was writing up a macro that has to subset a data set based on one flag. The flag can be character variable

in one data and numeric in another data. So I have to come up with a solution to handle both situations.

I decided to convert the flag (both character and numeric ones) into character variables using put function.

First, I tried where statement with put function. It doesn't work if the put function was not used properly or as intended

(see result1c , result1d).

Then, I tried if statement with put funciton. It looks like IF statement is not that sensentive to the proper and improper use of

put function and hanlded all the situations I may encounter.

MY question is:

How comes the difference between IF and WHERE statements in this case, why they act differently to

different (may be improper) uses of PUT function.

Thanks.


Accepted Solutions
Solution
‎01-22-2015 01:31 PM
Occasional Contributor
Posts: 15

Re: Difference between IF and WHERE statements

I was trying to answer your question about the difference between IF and WHERE.

But, there are some good points that have been brought up.

Here is a semi-generic macro that checks the flag variable to determine if it is numeric or character and allows you to use a where statement,

which is more efficient for large datasets since it is done as the data is brought in and not after, and avoids the automatic conversions done by SAS.

%macro chkit(indata=test,outdata=result,flag=flag);

%let dsid=%sysfunc(open(&indata));
%let vartype = %sysfunc(vartype(&dsid,%sysfunc(varnum(&dsid,&flag))));
%let rc=%sysfunc(close(&dsid));

data &outdata;
    set &indata;

    %if &vartype = N %then
        %do;
        where &flag in (0,1);
        %end;
    %else
        %do;
        where &flag in ('0','1');
        %end;
run;

%mend chkit;

Good luck!

View solution in original post


All Replies
Occasional Contributor
Posts: 15

Re: Difference between IF and WHERE statements

Where statements occur prior to the data being brought into the dataset.

If statements occur after the data is brought into the dataset.

Since the if statement occurs after the data is brought into the dataset SAS, SAS can do an automatic conversion of the data.

You should get notes in your log about this, something like

NOTE: Numeric data converted to character data at line so many times

NOTE: Character data converted to numeric data at line so many times

But, because the where statement occurs prior to the data being brought into the dataset, this automatic conversion can not occur.

So, you should get notes in your log, something like

NOTE: Data type must match in where statement

I'm doing the notes from memory, so I'm sure they are not right.

But, you get the idea.

Good luck!

Solution
‎01-22-2015 01:31 PM
Occasional Contributor
Posts: 15

Re: Difference between IF and WHERE statements

I was trying to answer your question about the difference between IF and WHERE.

But, there are some good points that have been brought up.

Here is a semi-generic macro that checks the flag variable to determine if it is numeric or character and allows you to use a where statement,

which is more efficient for large datasets since it is done as the data is brought in and not after, and avoids the automatic conversions done by SAS.

%macro chkit(indata=test,outdata=result,flag=flag);

%let dsid=%sysfunc(open(&indata));
%let vartype = %sysfunc(vartype(&dsid,%sysfunc(varnum(&dsid,&flag))));
%let rc=%sysfunc(close(&dsid));

data &outdata;
    set &indata;

    %if &vartype = N %then
        %do;
        where &flag in (0,1);
        %end;
    %else
        %do;
        where &flag in ('0','1');
        %end;
run;

%mend chkit;

Good luck!

Super User
Posts: 10,516

Re: Difference between IF and WHERE statements

If the datasets are supposed to represent similar data structures you need to go back further in your process and find out why one has the variable flag_n as text in some sets and numeric as others and address the issue at the "source".

Most likely culprit is a Proc Import step.

The earlier a standard is set the easier the later processing goes.

Super User
Posts: 5,085

Re: Difference between IF and WHERE statements

Well, the first thing to realize is that the PUT function should probably be removed from the program entirely.  Most likely, it gives you the wrong answer even when there are no syntax errors.  For example, this combination produces a match:

flag='123';

if put(flag, $1.) in ('0', '1');

It's perfectly OK to code:

if flag in ('0', '1');

At least that's assuming this matches your intentions.  If you want flags that begin with "0" or with "1", you can still code it without a PUT function:

if flag in : ('0', '1');

The second thing to realize is that this will not solve your problem.  To use a WHERE statement, you cannot compare numbers to characters.  An IF statement will convert one to the other, for purposes of making the comparison.  (You can check your log and see that a conversion took place for those cases.)  But the WHERE statement does not convert; it simply gives you the error you have already seen.

To solve this type of problem, you will need to write a more sophisticated macro.  Your macro will need to determine if your flag is numeric or character, and generate the appropriate WHERE statement.  There is more than one way to do this, but assuming you find one of them your macro will need to code the equivalent of:

%if (I found that FLAG is numeric in this particular data set) %then %do;

  where flag in (0, 1);

%end;

%else %do;

   where flag in ('0', '1');

%end;

Good luck.

Contributor
Posts: 61

Re: Difference between IF and WHERE statements

Thank you all.

I will study your comments and give feed back later.

Joe

Super User
Super User
Posts: 6,502

Re: Difference between IF and WHERE statements

A good way to silently convert unknown variable to character is to use one of the CAT() functions.

where cats(flag_n)  in ('0','1');


Note that you will still get an error message if the FLAG_N variable does not exist.

You really should fix the data type issue earlier in your process.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 785 views
  • 6 likes
  • 5 in conversation