
## Count Missing Values

Hello All.

I am currently a student and working on a [simple] problem where we must count the missing values in the DATA SET. Can anyone tell me what I'm doing wrong? Here is my source data and my code thus far:

1 2 3
4 5
6 7 8
9 10 11

data miss;
infile "/folders/myfolders/Chapter_8/missing.txt";
input A : $1.
B : $1.
C : $1.;
if missing(A) then MissA + 1;
else if missing(B) then MissB + 1;
else if missing(C) then MissC + 1;

run;

## Re: Count Missing Values

@yautja33 wrote:

> Can anyone tell me what I'm doing wrong? Here is my source data and my code thus far:

Tell us what you see that is wrong.

And we don't have your .txt file, so we can't run this ourselves.

--
Paige Miller

## Re: Count Missing Values

A few things to think about ...

Did you mean to read in your variables as character variables instead of numeric? That's the impact of adding $ to the INPUT statement.

There is nothing in your DATA step that prints the results.

Adding ELSE to the logic is probably the wrong thing to do.  If A has a missing value, you should not skip over B.  You still need to inspect whether B has a missing value.

The syntax you use to check for missing values looks like this:

missing(A)

That is valid: the MISSING function works for both character and numeric variables. For a character variable, an equivalent explicit test is:

A = " "

If you switch and read the variables as numeric, you would check for a missing value using:

A = .

So in context you might end up with:

if A = . then MissA + 1;
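Putting those points together, here is a minimal sketch rather than a definitive answer. It assumes the variables are meant to be numeric, and it uses inline DATALINES as a stand-in for the poster's missing.txt, which we don't have. The data set name `sample` and the second `data _null_` step are illustrative choices, not something from the original post.

```sas
/* Step 1: read the sample rows. DATALINES stands in for missing.txt (assumption). */
data sample;
    infile datalines missover;   /* MISSOVER: a short line leaves C missing */
    input A B C;                 /* no $ informat, so A, B and C are numeric */
    datalines;
1 2 3
4 5
6 7 8
9 10 11
;
run;

/* Step 2: count missing values and print the totals once, at the last row. */
data _null_;
    set sample end=last;
    if A = . then MissA + 1;     /* independent IFs: every variable is checked */
    if B = . then MissB + 1;
    if C = . then MissC + 1;
    if last then put MissA= MissB= MissC=;   /* totals go to the SAS log */
run;
```

The sum statements (`MissA + 1`) start at 0 and keep their values across rows, so the PUT statement on the last row writes the running totals to the log. In the sample data only the second row is short, so only C has a missing value.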


## Re: Count Missing Values

If you have a file of any size (in terms of both number of observations and number of variables), you're going to find it much more efficient to use one of the procedures that gives you this statistic. I'd use PROC MEANS. Here's an example using a modified version of SASHELP.CARS:

data cars;
set sashelp.cars;
if type="SUV" then msrp=.;
if origin="Asia" then invoice=.;
if make="Acura" then origin="";
run;

proc means data=cars nmiss n;
run;

If you run that code you'll see it gives you the number of missing (NMISS) and non-missing (N) values for all the numeric variables without you having to specify the variable names. You'll also see that you don't get a figure for the missing values in the character variable ORIGIN, which I introduced in the data step. To get that, you can use a trick I learnt from Rick Wicklin's blog post here -> https://blogs.sas.com/content/iml/2011/09/19/count-the-number-of-missing-values-for-each-variable.ht...

proc format;
value $missfmt ' '='Missing' other='Not Missing';
value  missfmt  . ='Missing' other='Not Missing';
run;

proc freq data=cars;
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;

The output looks different to the previous example as it uses Proc Freq but it gives you values for all variables, including the character ones.


## Re: Count Missing Values

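Correction: add MISSOVER to the INFILE statement so that a short line does not pull values from the next line, and drop the ELSE so that every variable is checked on every row:

data miss;
infile "/folders/myfolders/Chapter_8/missing.txt" dlm=' ' missover;
input A : $1.
B : $1.
C : $1.;
/* Separate IF statements: a missing A must not skip the checks on B and C */
if missing(A) then MissA + 1;
if missing(B) then MissB + 1;
if missing(C) then MissC + 1;
run;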

Please let us know if it worked for you.
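For anyone wondering what MISSOVER changes, here is a small sketch comparing it with the default behaviour (FLOWOVER). It uses inline DATALINES in place of the file and reads the values as numeric; both are assumptions for illustration, as are the data set names `flow` and `over`.

```sas
/* Default (FLOWOVER): a short line borrows its missing values from the next line. */
data flow;
    infile datalines;
    input A B C;
    datalines;
1 2 3
4 5
6 7 8
9 10 11
;
run;

/* MISSOVER: a short line leaves the remaining variables missing. */
data over;
    infile datalines missover;
    input A B C;
    datalines;
1 2 3
4 5
6 7 8
9 10 11
;
run;

proc print data=flow; run;
proc print data=over; run;
```

Without MISSOVER, the second row takes its C value from the start of the third line and the rest of that line is discarded, so `flow` should end up with three observations and no missing values. With MISSOVER, `over` keeps all four rows and C is missing on the second one, which is what the counting logic needs.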
