BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Davanden
Obsidian | Level 7

I have a large survey data set with hundreds of variables, both numeric and character.  Each variable can take two missing values, M or N.  I want to make a list that shows the percent of M values for each variable.  I can define formats to collapse all the variables into M and not-M, but I'm stuck at producing the list.  With this many variables, hand-coding is out of the question.  I thought to make an output dataset with PROC FREQ, but that allows only one frequency distribution per output dataset.  The regular output from PROC FREQ contains all the headers, which I don't want to have to edit out.  I've also tried PROC TABULATE, but I can't work out how to get it to do what I want.  I could set up a macro to loop through each variable, run PROC FREQ, and concatenate the output datasets.  However, I feel like there must be a smack-in-the-head-of-course method that I'm missing.  What is it?

 

I want a list like this:

 

Variable Name    Percent M.

 

--Dav

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

" both numeric and character.  Each variable can take two missing values, M or N."

 

Show us the sample data and the output you are looking for .

 

data have;
input a $ b;
cards;
M 12
c .M
x .N
n .M
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type
 where=(upcase(libname)='WORK' and upcase(memname)='HAVE')) end=last;
 if _n_=1 then call execute('proc sql;create table temp as select ');
 if type='char' then call execute(cat('sum(',name,'="M")/count(*) as ',name));
  else call execute(cat('sum(',name,'=.M)/count(*) as ',name));
 if last then call execute('from have;quit;');
  else call execute(',');
run;
proc transpose data=temp out=want;
run;

View solution in original post

7 REPLIES 7
ballardw
Super User

Some example data would be helpful. You may end up going back to re-reading your raw data.

 

SAS has more than one "missing" actually a single variable can have up to 27 different missing values.

 

Here's a brief example of recoding values to the missing for numeric values.

data example;
   input q1 q2 q3;
   /* assume questions q1 through 3 are coded the same
   and a recorded respose of 77 means "Don't know"
   and 99 is "Refused"*/
   array q q1-q3;
   do i=1 to dim(q);
      if q[i]= 77 then q[i]= .D;
      if q[i]= 99 then q[i]= .R;
   end;

datalines;
1 2 77
2 3 1
99 4 3
1 . 2
;
run;

proc freq data=example;
   tables q1 q2 q3 ;
run;
proc format library=work;
value myq
.D= "Don't Know"
.R= "Refused"
. = "No Answer"
;
run;

proc freq data=example;
   tables q1 q2 q3/missing ;
   format q: myq.; 
run;

proc tabulate data=example;
   class q1 q2 q3/missing;
   format q1 q2 q3 myq.;
   tables q1 q2 q3, n pctn;
run;

If you need to do the exact same thing to all of the numeric variables you can use: Array arrayname _numeric_; which places all of the numeric variables into the array.

 

Same with character variables:

Array charvars _character_;

 

For anything else you should provide some concrete input data and what the result from that data is expected.

Astounding
PROC Star

(a) are your variables character (possibly with a value of "M") or numeric (possibly with a value of .M)?

 

(b) when computing percentages, do you want all non-M values (including missing values) included in the denominator?

 

The programming isn't that difficult, but I need the answers first.

Ksharp
Super User

" both numeric and character.  Each variable can take two missing values, M or N."

 

Show us the sample data and the output you are looking for .

 

data have;
input a $ b;
cards;
M 12
c .M
x .N
n .M
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type
 where=(upcase(libname)='WORK' and upcase(memname)='HAVE')) end=last;
 if _n_=1 then call execute('proc sql;create table temp as select ');
 if type='char' then call execute(cat('sum(',name,'="M")/count(*) as ',name));
  else call execute(cat('sum(',name,'=.M)/count(*) as ',name));
 if last then call execute('from have;quit;');
  else call execute(',');
run;
proc transpose data=temp out=want;
run;

Davanden
Obsidian | Level 7

Thanks to everyone who replied.  I am aware that SAS variables can take multiple missing values, and that there is a difference in the way they are treated for numeric an character variables.  In my case, I have both character and numeric variables.  For character variables, missing values are indcated by 'M' or 'N'; for numeric, .M or .N.  I want to list the percentage of all cases that have the 'M' or .M value, as applicable, for every variable, of which there are a great many.  

 

For a totally made-up example, see the pictures below:

 

have.PNG

 

want.PNG

Ksharp
Super User
So did you try my code ?


Davanden
Obsidian | Level 7

Yes, I did.  It does work.  Thank you for your help.  My pride is assuaged by the fact that it was not as simple and obvious as I had feared.

 

--Dav

Ksharp
Super User
if you want simpler, IML code might do it .

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2309 views
  • 0 likes
  • 4 in conversation