Interesting article, and I think one of the things you did in the program is worthy of its own article someday.
You used the DIVIDE function to eliminate "divide by zero" messages in the log. It also eliminates "missing values were generated" and "mathematical operations could not be performed" messages.
DIVIDE is an underused function. There is great benefit to eliminating those messages from the log in cases when you know problematic values might appear in your data. A large number of unneeded messages in the log makes it too easy to overlook real errors. It's more work, but better programming practice, to create only messages that actually need attention.
DIVIDE is a relatively new function. You can create a similar result with old-fashioned code like
if y = 0 or nmiss(x, y) then
   z = .;
else
   z = x / y;
The result is similar but not identical to the DIVIDE function, because DIVIDE might also return the special missing values .I, .M, and ._.
In the olden days, we hardly ever encountered missing values other than the ordinary missing value (.), and the test for missing values in a program was a simple
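To see the difference for yourself, here is a minimal sketch that runs both approaches side by side. The data set name COMPARE, the variable names, and the test values are made up for illustration; check the log and the printed output on your own system rather than taking my word for which rows differ.

```sas
/* Hypothetical test data: compare DIVIDE with the hand-coded test. */
data compare;
   input x y;
   z_divide = divide(x, y);          /* no division-by-zero note in the log */
   if y = 0 or nmiss(x, y) then
      z_manual = .;                  /* always the ordinary missing value   */
   else
      z_manual = x / y;
   datalines;
6 3
5 0
-5 0
0 0
. 2
;
run;

proc print data=compare;
run;
```

The rows with a zero divisor are the ones to watch: Z_MANUAL is always the ordinary missing value there, while Z_DIVIDE may show one of the special missing values.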
if x = . then /* do stuff */;
But with the introduction of the DIVIDE function, that no longer suffices. It's better to code
if missing(x) then /* do stuff */;
or
if nmiss(x) then /* do stuff */;
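A quick way to convince yourself is a small sketch like the one below, which writes the result of each test to the log. The variable names EQ_DOT, IS_MISS, and N_MISS are just illustrative, and I have picked the ordinary missing value plus the three special values mentioned above as test cases.

```sas
/* Compare the old-style test with MISSING and NMISS. */
data _null_;
   do x = ., .I, .M, ._;
      eq_dot  = (x = .);      /* true only for the ordinary missing value */
      is_miss = missing(x);   /* true for any missing value               */
      n_miss  = nmiss(x);     /* counts any missing value                 */
      put x= eq_dot= is_miss= n_miss=;
   end;
run;
```

Only the first iteration should satisfy the x = . comparison, which is exactly why that test no longer suffices.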
In PROC FORMAT, where those tests aren't available, you have to specify all the missing values explicitly:
proc format;
   value missb
      other = [best4.]
      ., .a-.z, ._ = 'missing';
run;
I wouldn't know without looking it up whether .A comes before or after . or ._, so it's easier for me to list all three missing value ranges. Looking it up, I see that I could have specified
proc format;
   value missb
      other = [best4.]
      ._ - .Z = 'missing';
run;
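If you want to check that the single range really catches every missing value, something like this sketch should do it. It repeats the format definition so that it runs on its own; the test values are arbitrary choices of mine.

```sas
/* Define the format with the single range, then try it on a few values. */
proc format;
   value missb
      ._ - .Z = 'missing'
      other   = [best4.];
run;

data _null_;
   do x = ._, ., .A, .Z, 0, 12.5;
      put x= +1 x missb.;   /* raw value, then the formatted value */
   end;
run;
```

Each missing value in the list should be written to the log as "missing", while the two nonmissing values fall through to BEST4.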
That's not an obvious order. The SAS missing value sort order is
._
.
.A-.Z
but even knowing the ASCII sorting sequence doesn't help; it's
<space>
.
A-Z
_
and the EBCDIC sort sequence is
<space>
.
_
A-Z
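Rather than memorizing any of this, you can let SAS show you its own order. Here is a small sketch (the data set name MISS and the values are invented for the example) that sorts a handful of missing and nonmissing numbers:

```sas
/* Build a few values in scrambled order, then sort them. */
data miss;
   do x = .Z, 1, ., .A, ._, 0;
      output;
   end;
run;

proc sort data=miss;
   by x;
run;

proc print data=miss;
run;
```

After the sort, the missing values should come first in the ._, ., .A-.Z order listed above, followed by the nonmissing numbers.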