data test;
input ID$ flag_c$ flag_n;
cards;
A 0 0
B 1 1
C 2 2
;run;
/*Where statement: put function sensitive*/
*working;
data result1a;set test;
where put(flag_c, $1.) in ('0','1');
run;
*working;
data result1b;set test;
where put(flag_n, 1.) in ('0','1');
run;
*NOT working;
data result1c;set test;
where put(flag_c, 1.) in ('0','1');
run;
*NOT working;
data result1d;set test;
where put(flag_n, $1.) in ('0','1');
run;
/*IF statement: put function not sensitive*/
*working;
data result2a;set test;
if put(flag_c, $1.) in ('0','1');
run;
*working;
data result2b;set test;
if put(flag_n, 1.) in ('0','1');
run;
*working;
data result2c;set test;
if put(flag_c, 1.) in ('0','1');
run;
*working;
data result2d;set test;
if put(flag_n, $1.) in ('0','1');
run;
I was writing up a macro that has to subset a data set based on one flag. The flag can be character variable
in one data and numeric in another data. So I have to come up with a solution to handle both situations.
I decided to convert the flag (both character and numeric ones) into character variables using put function.
First, I tried where statement with put function. It doesn't work if the put function was not used properly or as intended
(see result1c , result1d).
Then, I tried if statement with put funciton. It looks like IF statement is not that sensentive to the proper and improper use of
put function and hanlded all the situations I may encounter.
MY question is:
How comes the difference between IF and WHERE statements in this case, why they act differently to
different (may be improper) uses of PUT function.
Thanks.
I was trying to answer your question about the difference between IF and WHERE.
But, there are some good points that have been brought up.
Here is a semi-generic macro that checks the flag variable to determine if it is numeric or character and allows you to use a where statement,
which is more efficient for large datasets since it is done as the data is brought in and not after, and avoids the automatic conversions done by SAS.
%macro chkit(indata=test,outdata=result,flag=flag);
%let dsid=%sysfunc(open(&indata));
%let vartype = %sysfunc(vartype(&dsid,%sysfunc(varnum(&dsid,&flag))));
%let rc=%sysfunc(close(&dsid));
data &outdata;
set &indata;
%if &vartype = N %then
%do;
where &flag in (0,1);
%end;
%else
%do;
where &flag in ('0','1');
%end;
run;
%mend chkit;
Good luck!
Where statements occur prior to the data being brought into the dataset.
If statements occur after the data is brought into the dataset.
Since the if statement occurs after the data is brought into the dataset SAS, SAS can do an automatic conversion of the data.
You should get notes in your log about this, something like
NOTE: Numeric data converted to character data at line so many times
NOTE: Character data converted to numeric data at line so many times
But, because the where statement occurs prior to the data being brought into the dataset, this automatic conversion can not occur.
So, you should get notes in your log, something like
NOTE: Data type must match in where statement
I'm doing the notes from memory, so I'm sure they are not right.
But, you get the idea.
Good luck!
I was trying to answer your question about the difference between IF and WHERE.
But, there are some good points that have been brought up.
Here is a semi-generic macro that checks the flag variable to determine if it is numeric or character and allows you to use a where statement,
which is more efficient for large datasets since it is done as the data is brought in and not after, and avoids the automatic conversions done by SAS.
%macro chkit(indata=test,outdata=result,flag=flag);
%let dsid=%sysfunc(open(&indata));
%let vartype = %sysfunc(vartype(&dsid,%sysfunc(varnum(&dsid,&flag))));
%let rc=%sysfunc(close(&dsid));
data &outdata;
set &indata;
%if &vartype = N %then
%do;
where &flag in (0,1);
%end;
%else
%do;
where &flag in ('0','1');
%end;
run;
%mend chkit;
Good luck!
If the datasets are supposed to represent similar data structures you need to go back further in your process and find out why one has the variable flag_n as text in some sets and numeric as others and address the issue at the "source".
Most likely culprit is a Proc Import step.
The earlier a standard is set the easier the later processing goes.
Well, the first thing to realize is that the PUT function should probably be removed from the program entirely. Most likely, it gives you the wrong answer even when there are no syntax errors. For example, this combination produces a match:
flag='123';
if put(flag, $1.) in ('0', '1');
It's perfectly OK to code:
if flag in ('0', '1');
At least that's assuming this matches your intentions. If you want flags that begin with "0" or with "1", you can still code it without a PUT function:
if flag in : ('0', '1');
The second thing to realize is that this will not solve your problem. To use a WHERE statement, you cannot compare numbers to characters. An IF statement will convert one to the other, for purposes of making the comparison. (You can check your log and see that a conversion took place for those cases.) But the WHERE statement does not convert; it simply gives you the error you have already seen.
To solve this type of problem, you will need to write a more sophisticated macro. Your macro will need to determine if your flag is numeric or character, and generate the appropriate WHERE statement. There is more than one way to do this, but assuming you find one of them your macro will need to code the equivalent of:
%if (I found that FLAG is numeric in this particular data set) %then %do;
where flag in (0, 1);
%end;
%else %do;
where flag in ('0', '1');
%end;
Good luck.
Thank you all.
I will study your comments and give feed back later.
Joe
A good way to silently convert unknown variable to character is to use one of the CAT() functions.
where cats(flag_n) in ('0','1');
Note that you will still get an error message if the FLAG_N variable does not exist.
You really should fix the data type issue earlier in your process.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.