BookmarkSubscribeRSS Feed
StevWPPI
Calcite | Level 5

This is roughly the DATA step I am running:

DATA work.dataset;

           SET db.table;

WHERE StringVariable = 'LOAD' AND NumberVariable = 1;

RUN;

In db.table, NumberVariable takes the value of either 0 or -1. Running this DATA step, one would think that this would bring in 0 observations since there is no instance where NumberVariable=1. However, It does load all observations into work.dataset where NumberVariable= -1 but not where NumberVariable = 0, so it's as if the WHERE statement only looks at the absolute value of NumberVariable

Both Format and Informat of NumberVariable is "2." If I remove both the format and informat (i.e. so that running a PROC CONTENTS of this dataset shows blank spaces under the FORMAT and INFORMAT columns for NumberVariable), it still brings in observations where NumberVariable = -1.

HOWEVER, when I use an IF statement instead of the WHERE statement, so that it reads:

DATA work.dataset;

SET db.table;

IF StringVariable = 'LOAD' AND NumberVariable = 1;

RUN;

….then 0 observations are read into work.dataset, as one would think as there are no observations in db.table where NumberVariable =1, only those where it =0 or =-1

Does anyone know why this is?

6 REPLIES 6
art297
Opal | Level 21

Can you provide a simple example, including a dataset in a data step (like the following) where this occurs?  When I run the following I don't get any records.  Do you if you run it?

data table;

  input StringVariable $ NumberVariable;

  cards;

LOAD 0

LOAD 0

LOAD -1

LOAD 0

LIGHT -1

LOAD 0

LOAD -1

;

DATA work.dataset;

  SET table;

  WHERE StringVariable = 'LOAD' AND NumberVariable = 1;

RUN;

StevWPPI
Calcite | Level 5

Thanks for your reply.

When I run your test code, I get the same results, 0 records in work.dataset.

I wonder if it has to do with how the data are stored on the database from which I am pulling the data? I believe our IT dept. stores these data on our servers using SQL. I am not terribly experienced with database management, so I don't want to say too much about unfamiliar waters.

Going back to my original example, if I PROC PRINT on the dataset in my work folder:

PROC PRINT DATA= work.dataset;

WHERE NumberVariable = 1;

RUN;

There were 0 observations read from the data set, according to my Log window.

However, if I PROC PRINT on the original dataset, db.table:

PROC PRINT DATA= db.table;

WHERE NumberVariable = 1;

RUN;

Then all observations in db.table are read, even though for every observation in the Ouptut window, NumberVariable is -1

art297
Opal | Level 21

Since the data are coming from SQL Server, one possibility is that they are actually boolean values.  In SQL Server true is represented as -1, where in SAS it is +1.  Are you using an older version of SAS (e.g., ver 8)?  I ask because there is a hotfix to correct for an optimizer problem related to that specific issue.  It was supposed to be corrected in ver 9.

StevWPPI
Calcite | Level 5

This is entirely possible. So far in other SQL server datasets at my company that I have encountered in SAS, binary variables are also either 0 or -1. I'm using SAS 9.2

Is SAS reading "WHERE NumberVariable=1" as "bring in observations from the SQL server where NumberVariable is true"? But then when the data are stored in a SAS dataset in my work folder, (in work.dataset), what was "true" is now stored as the number "-1" and SAS treats the variable NumberVariable in work.dataset as a number variable and not a boolean, so that -1 and 0 are disassociated from their orignial "true" and "false" meanings?

art297
Opal | Level 21

Steve, If that is indeed the issue, there is a saving grace.  SAS treats not zero as true.  Thus you could get around it by specifying the logic as boolean.  For example:

data table;

  format NumberVariable 2.;

  informat NumberVariable 2.;

  input StringVariable $ NumberVariable;

  cards;

LOAD 0

LOAD 0

LOAD -1

LOAD 0

LIGHT -1

LOAD 0

LOAD -1

;

DATA work.dataset;

  SET table;

  WHERE StringVariable = 'LOAD' AND NumberVariable;

RUN;

StevWPPI
Calcite | Level 5

Excellent, I imagine that this is the issue, and I will also consult with our DB admins.

Thank you very much for your assistance!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1267 views
  • 5 likes
  • 2 in conversation