- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posted 03-15-2011 02:14 PM
(4721 views)
I have a data set with grouped variables, call them x y z. I also have a variable 'a' with observations corresponding to each x, y, z. I want to count missing values of variable 'a' for each record of grouped x y z.
My thoughts to do this were make a variable representing a numeric assignment for each unique x y z and then make a binary variable representing missing and non-missing for each value of 'a'. The next step would be to generate a frequency table for each numeric identifier given to groups x y z.
I know how to do the binary assignment of missing/non-missing for a but do not know how to assign an identifier for each x y z. I also do not know how to generate a freq. table for the numeric assignment of x y z. I am comfortable with proc freq as opposed to a more exotic procedure statement.
thanks much
My thoughts to do this were make a variable representing a numeric assignment for each unique x y z and then make a binary variable representing missing and non-missing for each value of 'a'. The next step would be to generate a frequency table for each numeric identifier given to groups x y z.
I know how to do the binary assignment of missing/non-missing for a but do not know how to assign an identifier for each x y z. I also do not know how to generate a freq. table for the numeric assignment of x y z. I am comfortable with proc freq as opposed to a more exotic procedure statement.
thanks much
3 REPLIES 3
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Something like this might do the trick without any pre-processing:
proc sql;
create table test2 as
select x
,y
,z
,sum(missing(a)) as miss_count
from test
group by x, y, z
;
quit;
proc sql;
create table test2 as
select x
,y
,z
,sum(missing(a)) as miss_count
from test
group by x, y, z
;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
what about something like this:
proc sort data=have;
by x y z;
run;
proc means data=have n nmiss;
by x y z;
var a;
run;
X Y Z do not have to be numeric.
In this example A would have to be numeric. If its not numeric you can do your conversion mentioned. Or you could apply a format to it.
There's a neat trick in the paper by Lois Levin (Paper 001-30). Add in the line
By x y z;
and replace the _numeric_ and _character_ by what type your variable is and delete the other.
[pre]
proc format;
value $ missfmt ' '="Missing"
other="Not Missing"
;
value nmissfmt . ="Missing"
other="Not Missing"
;
run;
proc freq data=home.ex2;
tables _numeric_ _character_/missing;
format _numeric_ nmissfmt.
_character_ $missfmt.;
run;
[/pre]
proc sort data=have;
by x y z;
run;
proc means data=have n nmiss;
by x y z;
var a;
run;
X Y Z do not have to be numeric.
In this example A would have to be numeric. If its not numeric you can do your conversion mentioned. Or you could apply a format to it.
There's a neat trick in the paper by Lois Levin (Paper 001-30). Add in the line
By x y z;
and replace the _numeric_ and _character_ by what type your variable is and delete the other.
[pre]
proc format;
value $ missfmt ' '="Missing"
other="Not Missing"
;
value nmissfmt . ="Missing"
other="Not Missing"
;
run;
proc freq data=home.ex2;
tables _numeric_ _character_/missing;
format _numeric_ nmissfmt.
_character_ $missfmt.;
run;
[/pre]
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc format;
value nmissfmt . ="Missing"
other="Not Missing"
;
run;
data temp;
set sashelp.class;
if _n_ in ( 2 3 5 8) then call missing(weight);
run;
proc freq data=temp;
tables sex*age*weight /missing list nopercent nocum out=want;
format weight nmissfmt.;
run;
[pre]
Ksharp[/pre]