BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jannie_D
Calcite | Level 5

Hi. I have a big dataset on a secure server.

I am cleaning my data, but now I found out that one code does not work and I can not figure out why, maybe you have an suggestion. 

The code is:

data total;

set mergetot;

where not missing(dato_diagnose) and not missing(totsssy) or not missing(totlmdb) or not missing(totdrga) or not missing(totdrgh);

run;

 

I want to only have the observations where there is a value for dato_diagnose and where at least one of totsssy, totlmdb, totdrga and totdrgh has a value.

somehow I still get observations where there is no value for dato_diagnose.

 

can you find the mistake?

 

kind regards 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

In Boolean (yes/no) logic, the AND operator takes precedence over OR (like multiplication takes precedence over addition in calculus). This means that your condition will be evaluated like this:

where
  (not missing(dato_diagnose) and not missing(totsssy))
  or not missing(totlmdb)
  or not missing(totdrga)
  or not missing(totdrgh)
;

To make your condition work as intended, you need to force precedence by using brackets:

where
  not missing(dato_diagnose) and (
    not missing(totsssy)
    or not missing(totlmdb)
    or not missing(totdrga)
    or not missing(totdrgh)
  )
;

Note that it is always a great help to make functional code blocks visible by using indentation.

View solution in original post

2 REPLIES 2
PaigeMiller
Diamond | Level 26

Perhaps this, which will test to see if there is at least one non-missing among the rest of the variables

 

where not missing(dato_diagnose) AND (not missing(totsssy) + not missing(totlmdb) + not missing(totdrga) + not missing(totdrgh))>0;

--
Paige Miller
Kurt_Bremser
Super User

In Boolean (yes/no) logic, the AND operator takes precedence over OR (like multiplication takes precedence over addition in calculus). This means that your condition will be evaluated like this:

where
  (not missing(dato_diagnose) and not missing(totsssy))
  or not missing(totlmdb)
  or not missing(totdrga)
  or not missing(totdrgh)
;

To make your condition work as intended, you need to force precedence by using brackets:

where
  not missing(dato_diagnose) and (
    not missing(totsssy)
    or not missing(totlmdb)
    or not missing(totdrga)
    or not missing(totdrgh)
  )
;

Note that it is always a great help to make functional code blocks visible by using indentation.