09-13-2012 05:28 PM
I'm doing some quality checking on a data set and want to include in a WHERE statement a condition that identifies values that are not integers within a certain range.
In particular, I'm trying to do this for the maternal age variable (mage). How do I pull out observations where mage is not an integer?
PROC PRINT DATA=work.DATA;
  /* Print every observation that fails at least one validity check */
  WHERE
    (dead NE 1 AND dead NE 0) OR
    (sex NE 1 AND sex NE 0) OR
    (rc_gp NE 1 AND rc_gp NE 2 AND rc_gp NE 3 AND rc_gp NE 4 AND rc_gp NE 5) OR
    (hisp NE 0 AND hisp NE 1) OR
    (8 GE mage OR mage GE 85) OR
    (medu LE 5 OR medu GT 17);
RUN;
09-14-2012 01:39 AM
Maybe try a quite different approach:
- Suppose you have SAS formats defined for the valid integer values you want.
For the variable sex:
0 = 'M' (male),
1 = 'F' (female),
9 = ' ' (missing),
other = '-' (invalid ... the meaning).
- Testing on the formatted value of your variable gives a very strict data quality check.
Use the PUT or INPUT function to recode/format the values.
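As a rough sketch of the format idea above: the format name sexchk. is made up for illustration, and the labels simply follow the list given for sex. Any value that falls into the OTHER range gets the label '-', so testing the formatted value picks out the invalid observations.

```sas
/* Hypothetical format: sexchk. is an illustrative name, not an existing one */
proc format;
  value sexchk
    0     = 'M'   /* male */
    1     = 'F'   /* female */
    9     = ' '   /* coded as missing */
    other = '-';  /* anything else counts as invalid */
run;

/* Print the observations whose sex value maps to the "invalid" label */
proc print data=work.data;
  where put(sex, sexchk.) = '-';
run;
```

Note that a SAS missing value (.) is not listed explicitly here, so it also lands in OTHER and is reported as invalid; add a separate range for it if that is not what you want.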
Remember that numbers in SAS are always stored as floating point, normally in 8 bytes.
Sometimes a value that looks like 1 is not exactly 1 but only nearly 1, and you don't see the difference in the printed output. That can be the reason a value fails to match when you expect it to. Rounding may be necessary.
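For the original question about mage, one possible way to flag non-integer values is to compare the value with its rounded version. This is only a sketch under the assumption that the data set and variable are named as in the question; the tolerance of 1e-8 in the second step is an arbitrary choice for illustration.

```sas
/* Strict check: flags any non-missing mage that is not exactly an integer */
proc print data=work.data;
  where not missing(mage) and mage ne round(mage, 1);
run;

/* Tolerant check: ignores tiny floating-point noise around an integer.
   The 1e-8 threshold is an assumption; pick one that suits your data. */
proc print data=work.data;
  where not missing(mage) and abs(mage - round(mage, 1)) > 1e-8;
run;
```

The first version also reports values that are only nearly an integer, which is useful if you want to find exactly the "1 is not 1" cases. The second version reports only values that are clearly fractional, such as 27.5.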
To circumvent the floating-point issues, you can consider storing the values as character type. 0, 1 and 9 can be stored and processed perfectly well as character values. The requirement is that you don't need to calculate with these values. Just counting the values in a population is no problem.
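A minimal sketch of the character-type idea, assuming the numeric sex variable from the question. The names work.data_chr and sex_c are invented for this example.

```sas
/* Copy the data and add a one-character version of sex.
   work.data_chr and sex_c are hypothetical names. */
data work.data_chr;
  set work.data;
  length sex_c $1;
  /* The 1. format rounds to a whole digit, so run the validity
     checks on the numeric variable before converting. */
  sex_c = put(sex, 1.);
run;

/* Counting the character values works without any arithmetic */
proc freq data=work.data_chr;
  tables sex_c;
run;
```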