Calcite | Level 5

## count missing values by grouped variables

I have a data set with grouped variables, call them x y z. I also have a variable 'a' with observations corresponding to each x, y, z. I want to count missing values of variable 'a' for each record of grouped x y z.

My thoughts to do this were make a variable representing a numeric assignment for each unique x y z and then make a binary variable representing missing and non-missing for each value of 'a'. The next step would be to generate a frequency table for each numeric identifier given to groups x y z.

I know how to do the binary assignment of missing/non-missing for a but do not know how to assign an identifier for each x y z. I also do not know how to generate a freq. table for the numeric assignment of x y z. I am comfortable with proc freq as opposed to a more exotic procedure statement.

thanks much
3 REPLIES 3
PROC Star

## Re: count missing values by grouped variables

Something like this might do the trick without any pre-processing:

proc sql;
create table test2 as
select x
,y
,z
,sum(missing(a)) as miss_count
from test
group by x, y, z
;
quit;
Super User

## Re: count missing values by grouped variables

proc sort data=have;
by x y z;
run;

proc means data=have n nmiss;
by x y z;
var a;
run;

X Y Z do not have to be numeric.
In this example A would have to be numeric. If its not numeric you can do your conversion mentioned. Or you could apply a format to it.

There's a neat trick in the paper by Lois Levin (Paper 001-30). Add in the line
By x y z;
and replace the _numeric_ and _character_ by what type your variable is and delete the other.

[pre]
proc format;
value \$ missfmt ' '="Missing"
other="Not Missing"
;
value nmissfmt . ="Missing"
other="Not Missing"
;
run;
proc freq data=home.ex2;
tables _numeric_ _character_/missing;
format _numeric_ nmissfmt.
_character_ \$missfmt.;
run;
[/pre]
Super User

## Re: count missing values by grouped variables

proc format;

value nmissfmt . ="Missing"

other=
"Not Missing"

;

run;

data temp;

set sashelp.class;

if _n_ in ( 2 3 5 8) then call missing(weight);

run;

proc freq data=temp;

tables sex*age*weight /missing list nopercent nocum out=want;

format weight nmissfmt.;

run;

[pre]

Ksharp[/pre]

Discussion stats
• 3 replies
• 3975 views
• 0 likes
• 4 in conversation