Percent Missing List

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

Percent Missing List

I have a large survey data set with hundreds of variables, both numeric and character.  Each variable can take two missing values, M or N.  I want to make a list that shows the percent of M values for each variable.  I can define formats to collapse all the variables into M and not-M, but I'm stuck at producing the list.  With this many variables, hand-coding is out of the question.  I thought to make an output dataset with PROC FREQ, but that allows only one frequency distribution per output dataset.  The regular output from PROC FREQ contains all the headers, which I don't want to have to edit out.  I've also tried PROC TABULATE, but I can't work out how to get it to do what I want.  I could set up a macro to loop through each variable, run PROC FREQ, and concatenate the output datasets.  However, I feel like there must be a smack-in-the-head-of-course method that I'm missing.  What is it?

 

I want a list like this:

 

Variable Name    Percent M.

 

--Dav


Accepted Solutions
Solution
‎04-17-2017 09:40 AM
Super User
Posts: 9,865

Re: Percent Missing List

" both numeric and character.  Each variable can take two missing values, M or N."

 

Show us the sample data and the output you are looking for .

 

data have;
input a $ b;
cards;
M 12
c .M
x .N
n .M
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type
 where=(upcase(libname)='WORK' and upcase(memname)='HAVE')) end=last;
 if _n_=1 then call execute('proc sql;create table temp as select ');
 if type='char' then call execute(cat('sum(',name,'="M")/count(*) as ',name));
  else call execute(cat('sum(',name,'=.M)/count(*) as ',name));
 if last then call execute('from have;quit;');
  else call execute(',');
run;
proc transpose data=temp out=want;
run;

View solution in original post


All Replies
Super User
Posts: 11,114

Re: Percent Missing List

Some example data would be helpful. You may end up going back to re-reading your raw data.

 

SAS has more than one "missing" actually a single variable can have up to 27 different missing values.

 

Here's a brief example of recoding values to the missing for numeric values.

data example;
   input q1 q2 q3;
   /* assume questions q1 through 3 are coded the same
   and a recorded respose of 77 means "Don't know"
   and 99 is "Refused"*/
   array q q1-q3;
   do i=1 to dim(q);
      if q[i]= 77 then q[i]= .D;
      if q[i]= 99 then q[i]= .R;
   end;

datalines;
1 2 77
2 3 1
99 4 3
1 . 2
;
run;

proc freq data=example;
   tables q1 q2 q3 ;
run;
proc format library=work;
value myq
.D= "Don't Know"
.R= "Refused"
. = "No Answer"
;
run;

proc freq data=example;
   tables q1 q2 q3/missing ;
   format q: myq.; 
run;

proc tabulate data=example;
   class q1 q2 q3/missing;
   format q1 q2 q3 myq.;
   tables q1 q2 q3, n pctn;
run;

If you need to do the exact same thing to all of the numeric variables you can use: Array arrayname _numeric_; which places all of the numeric variables into the array.

 

Same with character variables:

Array charvars _character_;

 

For anything else you should provide some concrete input data and what the result from that data is expected.

Super User
Posts: 5,360

Re: Percent Missing List

(a) are your variables character (possibly with a value of "M") or numeric (possibly with a value of .M)?

 

(b) when computing percentages, do you want all non-M values (including missing values) included in the denominator?

 

The programming isn't that difficult, but I need the answers first.

Solution
‎04-17-2017 09:40 AM
Super User
Posts: 9,865

Re: Percent Missing List

" both numeric and character.  Each variable can take two missing values, M or N."

 

Show us the sample data and the output you are looking for .

 

data have;
input a $ b;
cards;
M 12
c .M
x .N
n .M
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type
 where=(upcase(libname)='WORK' and upcase(memname)='HAVE')) end=last;
 if _n_=1 then call execute('proc sql;create table temp as select ');
 if type='char' then call execute(cat('sum(',name,'="M")/count(*) as ',name));
  else call execute(cat('sum(',name,'=.M)/count(*) as ',name));
 if last then call execute('from have;quit;');
  else call execute(',');
run;
proc transpose data=temp out=want;
run;

Contributor
Posts: 40

Re: Percent Missing List

Thanks to everyone who replied.  I am aware that SAS variables can take multiple missing values, and that there is a difference in the way they are treated for numeric an character variables.  In my case, I have both character and numeric variables.  For character variables, missing values are indcated by 'M' or 'N'; for numeric, .M or .N.  I want to list the percentage of all cases that have the 'M' or .M value, as applicable, for every variable, of which there are a great many.  

 

For a totally made-up example, see the pictures below:

 

have.PNG

 

want.PNG

Super User
Posts: 9,865

Re: Percent Missing List

So did you try my code ?


Contributor
Posts: 40

Re: Percent Missing List

Yes, I did.  It does work.  Thank you for your help.  My pride is assuaged by the fact that it was not as simple and obvious as I had feared.

 

--Dav

Super User
Posts: 9,865

Re: Percent Missing List

if you want simpler, IML code might do it .

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 208 views
  • 0 likes
  • 4 in conversation