Hello,
I am practicing with these data steps and if/else scenarios. Both data steps run fine but for the missing values being output to work.misschol I get different results. I can't understand why the highlited statement below isn't working when using = . but missing(cholesterol) is working. The cholesterol variable is numeric.
data work.lowchol work.highchol work.misschol; set sashelp.heart; if cholesterol lt 200 then output work.lowchol; else if cholesterol ge 200 then output work.highchol; else if cholesterol = . then output work.misschol; run; data work.lowchol work.highchol work.misschol; set sashelp.heart; if missing(cholesterol) then output work.misschol; else if cholesterol lt 200 then output work.lowchol; else if cholesterol ge 200 then output work.highchol; run;
Thank you for your time and help in advance!
In SAS, missing values for numerics are taken to be "less than" all valid values, including all valid negative values. So missing values meet your first test condition in
if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
else if cholesterol = . then
output work.misschol;
As a result, the 3rd condition ("else if cholesterol=.") is never assessed.
If you change the order of conditions, you should get what you expect:
if cholesterol = . then
output work.misschol;
else if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
or better yet (to include missing values from .A to .Z and ._):
if missing(cholesterol) then
output work.misschol;
else if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
In SAS, missing values for numerics are taken to be "less than" all valid values, including all valid negative values. So missing values meet your first test condition in
if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
else if cholesterol = . then
output work.misschol;
As a result, the 3rd condition ("else if cholesterol=.") is never assessed.
If you change the order of conditions, you should get what you expect:
if cholesterol = . then
output work.misschol;
else if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
or better yet (to include missing values from .A to .Z and ._):
if missing(cholesterol) then
output work.misschol;
else if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
Hi @mkeintz,
Thank you for your help and explanation. I wasn't aware that the order of my statements would cause this to occur. Glad to have learned that moving the =. to the first if statement would work. I did test and got the correct results. That being said I will continue to use missing() for this scenario to consider other values. Thanks again!
Well, it could be a couple of things.
Regarding #3, try this:
Change this:
if cholesterol lt 200 then
output work.lowchol;
else if cholesterol ge 200 then
output work.highchol;
else if cholesterol = . then
output work.misschol;
to the below and re-run. I'd be very interested to know your results.
if cholesterol = . then
output work.misschol;
else if cholesterol lt 200 then
output work.lowchol;
else
output work.highchol;
Hope that helps,
Jim
Hello @jimbarbour , Thanks for your time.I did try the below code and it did work perfect. It does seem to simply the code as all left over results are sent to highcol. The misschol also had the correct results, but to your point I think from here on out I will utilize missing() for this. Dumb question but can I use the missing function for character values as well or is this a numeric function only?
if cholesterol = . then output work.misschol; else if cholesterol lt 200 then output work.lowchol; else output work.highchol;
MISSING() works on character values also. Strings with only spaces are considered missing.
@JC411911 wrote:
Hello @jimbarbour , Thanks for your time.I did try the below code and it did work perfect. It does seem to simply the code as all left over results are sent to highcol. The misschol also had the correct results, but to your point I think from here on out I will utilize missing() for this. Dumb question but can I use the missing function for character values as well or is this a numeric function only?
if cholesterol = . then output work.misschol; else if cholesterol lt 200 then output work.lowchol; else output work.highchol;
As mentioned by @Tom, the MISSING function works on both Character or Numeric data. It's a great function. 🙂
So several I think good points here:
Jim
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.