@Kurt_Bremser wrote:
I guess you did something to the dataset between the two PROC FREQs, or used different datasets. A format changes the number of missing values only if it explicitly maps values to missing, or maps missing values to a non-missing value. Your format does neither.
A value not "caught" by the format would be used raw (unformatted).
Run both PROC FREQs which create your outputs, and the PROC FORMAT in immediate succession, and post the complete log of all three steps.
PROC FREQ distinguishes between character values that are all blanks (which it counts as missing) and those that are blank only after the format is applied. If no observation has a completely empty value, the formatted blanks are treated as a normal category. If at least one value is completely empty, all of the formatted values that print as blank are treated as missing.
data have ;
input crust_type $char30.;
cards;
Crust Thickness Unknown
Continental Crust
Intermediate Crust
Oceanic Crust
;
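/* All observations, including any completely blank value */
proc freq data=have;
  tables crust_type;
  format crust_type $1.;
run;
/* Exclude the completely blank values, keep the ones that only format as blank */
proc freq data=have;
  where crust_type ne ' ';
  tables crust_type;
  format crust_type $1.;
run;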
And if all of the values format as blank and at least one of them is completely blank, you get the output from the original posting: an empty table section with the count of missing values in the footer.
/* Keep only values that begin with a blank, so every value formats as blank */
proc freq data=have;
  where crust_type =: ' ';
  tables crust_type;
  format crust_type $1.;
run;
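The WHERE clauses above only make a difference if HAVE contains values that start with a blank and at least one value that is completely blank. In case the leading blanks in the CARDS lines do not survive copy and paste, here is a minimal sketch that builds such data with assignment statements, so the blanks are explicit. The names DEMO, DEMO_FLAGS and BLANK_KIND are made up for illustration, and which value gets the leading blank is an assumption, not something taken from the original data.

```sas
/* Hypothetical test data: blanks are assigned directly so they cannot be lost */
data demo;
  length crust_type $30;
  crust_type = ' Crust Thickness Unknown'; output; /* leading blank (assumed): prints as blank under $1. */
  crust_type = 'Continental Crust';        output;
  crust_type = 'Oceanic Crust';            output;
  crust_type = ' ';                        output; /* completely blank: a true missing value */
run;

/* Label each observation so the two kinds of blank can be counted separately */
data demo_flags;
  set demo;
  length blank_kind $20;
  if missing(crust_type) then blank_kind = 'completely blank';
  else if substr(crust_type, 1, 1) = ' ' then blank_kind = 'leading blank';
  else blank_kind = 'no leading blank';
run;

/* How many observations fall into each group */
proc freq data=demo_flags;
  tables blank_kind;
run;

/* Same test as above, run against the explicit data */
proc freq data=demo;
  tables crust_type;
  format crust_type $1.;
run;
```

Running the three PROC FREQ steps above with data=demo instead of data=have should then show the difference between the two kinds of blank.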