BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jiangmi
Calcite | Level 5

data test;

input ID$ flag_c$ flag_n;

cards;

A 0 0

B 1 1

C 2 2

;run;

/*Where statement: put function sensitive*/

*working;

data result1a;set test;

where put(flag_c, $1.) in ('0','1');

run;

*working;

data result1b;set test;

where put(flag_n, 1.) in ('0','1');

run;

*NOT working;

data result1c;set test;

where put(flag_c, 1.) in ('0','1');

run;

*NOT working;

data result1d;set test;

where put(flag_n, $1.) in ('0','1');

run;

/*IF statement: put function not sensitive*/

*working;

data result2a;set test;

if put(flag_c, $1.) in ('0','1');

run;

*working;

data result2b;set test;

if put(flag_n, 1.) in ('0','1');

run;

*working;

data result2c;set test;

if put(flag_c, 1.) in ('0','1');

run;

*working;

data result2d;set test;

if put(flag_n, $1.) in ('0','1');

run;

I was writing up a macro that has to subset a data set based on one flag. The flag can be character variable

in one data and numeric in another data. So I have to come up with a solution to handle both situations.

I decided to convert the flag (both character and numeric ones) into character variables using put function.

First, I tried where statement with put function. It doesn't work if the put function was not used properly or as intended

(see result1c , result1d).

Then, I tried if statement with put funciton. It looks like IF statement is not that sensentive to the proper and improper use of

put function and hanlded all the situations I may encounter.

MY question is:

How comes the difference between IF and WHERE statements in this case, why they act differently to

different (may be improper) uses of PUT function.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
BurntDirt
Calcite | Level 5

I was trying to answer your question about the difference between IF and WHERE.

But, there are some good points that have been brought up.

Here is a semi-generic macro that checks the flag variable to determine if it is numeric or character and allows you to use a where statement,

which is more efficient for large datasets since it is done as the data is brought in and not after, and avoids the automatic conversions done by SAS.

%macro chkit(indata=test,outdata=result,flag=flag);

%let dsid=%sysfunc(open(&indata));
%let vartype = %sysfunc(vartype(&dsid,%sysfunc(varnum(&dsid,&flag))));
%let rc=%sysfunc(close(&dsid));

data &outdata;
    set &indata;

    %if &vartype = N %then
        %do;
        where &flag in (0,1);
        %end;
    %else
        %do;
        where &flag in ('0','1');
        %end;
run;

%mend chkit;

Good luck!

View solution in original post

6 REPLIES 6
BurntDirt
Calcite | Level 5

Where statements occur prior to the data being brought into the dataset.

If statements occur after the data is brought into the dataset.

Since the if statement occurs after the data is brought into the dataset SAS, SAS can do an automatic conversion of the data.

You should get notes in your log about this, something like

NOTE: Numeric data converted to character data at line so many times

NOTE: Character data converted to numeric data at line so many times

But, because the where statement occurs prior to the data being brought into the dataset, this automatic conversion can not occur.

So, you should get notes in your log, something like

NOTE: Data type must match in where statement

I'm doing the notes from memory, so I'm sure they are not right.

But, you get the idea.

Good luck!

BurntDirt
Calcite | Level 5

I was trying to answer your question about the difference between IF and WHERE.

But, there are some good points that have been brought up.

Here is a semi-generic macro that checks the flag variable to determine if it is numeric or character and allows you to use a where statement,

which is more efficient for large datasets since it is done as the data is brought in and not after, and avoids the automatic conversions done by SAS.

%macro chkit(indata=test,outdata=result,flag=flag);

%let dsid=%sysfunc(open(&indata));
%let vartype = %sysfunc(vartype(&dsid,%sysfunc(varnum(&dsid,&flag))));
%let rc=%sysfunc(close(&dsid));

data &outdata;
    set &indata;

    %if &vartype = N %then
        %do;
        where &flag in (0,1);
        %end;
    %else
        %do;
        where &flag in ('0','1');
        %end;
run;

%mend chkit;

Good luck!

ballardw
Super User

If the datasets are supposed to represent similar data structures you need to go back further in your process and find out why one has the variable flag_n as text in some sets and numeric as others and address the issue at the "source".

Most likely culprit is a Proc Import step.

The earlier a standard is set the easier the later processing goes.

Astounding
PROC Star

Well, the first thing to realize is that the PUT function should probably be removed from the program entirely.  Most likely, it gives you the wrong answer even when there are no syntax errors.  For example, this combination produces a match:

flag='123';

if put(flag, $1.) in ('0', '1');

It's perfectly OK to code:

if flag in ('0', '1');

At least that's assuming this matches your intentions.  If you want flags that begin with "0" or with "1", you can still code it without a PUT function:

if flag in : ('0', '1');

The second thing to realize is that this will not solve your problem.  To use a WHERE statement, you cannot compare numbers to characters.  An IF statement will convert one to the other, for purposes of making the comparison.  (You can check your log and see that a conversion took place for those cases.)  But the WHERE statement does not convert; it simply gives you the error you have already seen.

To solve this type of problem, you will need to write a more sophisticated macro.  Your macro will need to determine if your flag is numeric or character, and generate the appropriate WHERE statement.  There is more than one way to do this, but assuming you find one of them your macro will need to code the equivalent of:

%if (I found that FLAG is numeric in this particular data set) %then %do;

  where flag in (0, 1);

%end;

%else %do;

   where flag in ('0', '1');

%end;

Good luck.

jiangmi
Calcite | Level 5

Thank you all.

I will study your comments and give feed back later.

Joe

Tom
Super User Tom
Super User

A good way to silently convert unknown variable to character is to use one of the CAT() functions.

where cats(flag_n)  in ('0','1');


Note that you will still get an error message if the FLAG_N variable does not exist.

You really should fix the data type issue earlier in your process.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2172 views
  • 6 likes
  • 5 in conversation