I would like to bin a column with some numeric values, for some reason missing values are treated as negative in a CASE WHEN
statement, but works fine is a WHERE statement. Why?
data test;
infile datalines delimiter=',';
input data;
datalines;
-1
0
1
.
;RUN;
/* Why is missing treated as negative */
PROC SQL;
CREATE TABLE temp AS
SELECT
data
, CASE
WHEN data LT 0 THEN 'Negative'
WHEN data EQ 0 THEN 'Zero'
WHEN data GT 0 THEN 'Posiive'
WHEN data EQ . THEN 'Missing'
ELSE 'No Match' END AS data2
FROM
test
;QUIT;
/* Works as expected */
PROC SQL;
SELECT
*
FROM
test
WHERE
data EQ .
;QUIT;
Missing in SAS is considered to be less than any negative number, so when you test to see if DATA is less than 0, this is true, missing is less than zero, and your CASE clause never gets to the part where it tests for missing; hence the missing should return the value "Negative".
Please include the below step
data ne . and data LT 0
PROC SQL;
CREATE TABLE temp AS
SELECT
data
, CASE
WHEN data ne . and data LT 0 THEN 'Negative'
WHEN data EQ 0 THEN 'Zero'
WHEN data GT 0 THEN 'Posiive'
WHEN data EQ . THEN 'Missing'
ELSE 'No Match' END AS data2
FROM
test
;QUIT;
Slightly reorder and simplify your case logic:
data test;
infile datalines delimiter=',';
input data;
datalines;
-1
0
1
.
;
run;
proc sql;
create table temp AS
select
data,
case
when data eq . then 'Missing'
when data lt 0 then 'Negative'
when data eq 0 then 'Zero'
when data gt 0 then 'Positive'
end as data2
from test
;
quit;
proc print data=temp noobs;
run;
Result:
data data2 -1 Negative 0 Zero 1 Positive . Missing
HI @Jeg123 You could consider using formats (user defined format using proc format. Ideal for reuse
Hi @Jeg123 If you really want in proc sql. I would do
WHEN .<data<0 THEN 'Negative'
as that is very convenient to see and edit.
Applying the same in your example-
data test;
infile datalines delimiter=',';
input data;
datalines;
-1
0
1
.
;RUN;/* Why is missing treated as negative */
PROC SQL;
CREATE TABLE temp AS
SELECT
data
, CASE
WHEN .<data<0 THEN 'Negative'
WHEN data EQ 0 THEN 'Zero'
WHEN data GT 0 THEN 'Posiive'
WHEN data EQ . THEN 'Missing'
ELSE 'No Match' END AS data2
FROM
test
;QUIT;
data | data2 |
---|---|
-1 | Negative |
0 | Zero |
1 | Posiive |
. | Missing |
If this is part of a real problem needing information of negative, positive or 0 perhaps the SIGN function is sufficient. Though instead of returning text it returns values of -1 for negative, 0 or +1 for positive and missing when the value is missing.
data test; infile datalines delimiter=','; input data; datalines; -1 0 1 . -34 3.14156 ; RUN; PROC SQL; CREATE TABLE temp AS SELECT data ,sign(data) AS data2 FROM test ; QUIT;
As others have pointed out SAS treats missing values as less than any actual number.
Also note that SAS has 28 different missing values and your code is only testing for one of them.
It is probably simplest to just test for missing first in the CASE statement.
proc sql;
create table temp as
select
data
, case
when missing(data) then 'Missing'
when data lt 0 then 'Negative'
when data gt 0 then 'Posiive'
when data eq 0 then 'Zero'
else 'Impossible Value'
end as data2
from
test
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.