BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Davanden
Obsidian | Level 7

I have a large survey data set with hundreds of variables, both numeric and character.  Each variable can take two missing values, M or N.  I want to make a list that shows the percent of M values for each variable.  I can define formats to collapse all the variables into M and not-M, but I'm stuck at producing the list.  With this many variables, hand-coding is out of the question.  I thought to make an output dataset with PROC FREQ, but that allows only one frequency distribution per output dataset.  The regular output from PROC FREQ contains all the headers, which I don't want to have to edit out.  I've also tried PROC TABULATE, but I can't work out how to get it to do what I want.  I could set up a macro to loop through each variable, run PROC FREQ, and concatenate the output datasets.  However, I feel like there must be a smack-in-the-head-of-course method that I'm missing.  What is it?

 

I want a list like this:

 

Variable Name    Percent M.

 

--Dav

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

" both numeric and character.  Each variable can take two missing values, M or N."

 

Show us the sample data and the output you are looking for .

 

data have;
input a $ b;
cards;
M 12
c .M
x .N
n .M
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type
 where=(upcase(libname)='WORK' and upcase(memname)='HAVE')) end=last;
 if _n_=1 then call execute('proc sql;create table temp as select ');
 if type='char' then call execute(cat('sum(',name,'="M")/count(*) as ',name));
  else call execute(cat('sum(',name,'=.M)/count(*) as ',name));
 if last then call execute('from have;quit;');
  else call execute(',');
run;
proc transpose data=temp out=want;
run;

View solution in original post

7 REPLIES 7
ballardw
Super User

Some example data would be helpful. You may end up going back to re-reading your raw data.

 

SAS has more than one "missing" actually a single variable can have up to 27 different missing values.

 

Here's a brief example of recoding values to the missing for numeric values.

data example;
   input q1 q2 q3;
   /* assume questions q1 through 3 are coded the same
   and a recorded respose of 77 means "Don't know"
   and 99 is "Refused"*/
   array q q1-q3;
   do i=1 to dim(q);
      if q[i]= 77 then q[i]= .D;
      if q[i]= 99 then q[i]= .R;
   end;

datalines;
1 2 77
2 3 1
99 4 3
1 . 2
;
run;

proc freq data=example;
   tables q1 q2 q3 ;
run;
proc format library=work;
value myq
.D= "Don't Know"
.R= "Refused"
. = "No Answer"
;
run;

proc freq data=example;
   tables q1 q2 q3/missing ;
   format q: myq.; 
run;

proc tabulate data=example;
   class q1 q2 q3/missing;
   format q1 q2 q3 myq.;
   tables q1 q2 q3, n pctn;
run;

If you need to do the exact same thing to all of the numeric variables you can use: Array arrayname _numeric_; which places all of the numeric variables into the array.

 

Same with character variables:

Array charvars _character_;

 

For anything else you should provide some concrete input data and what the result from that data is expected.

Astounding
PROC Star

(a) are your variables character (possibly with a value of "M") or numeric (possibly with a value of .M)?

 

(b) when computing percentages, do you want all non-M values (including missing values) included in the denominator?

 

The programming isn't that difficult, but I need the answers first.

Ksharp
Super User

" both numeric and character.  Each variable can take two missing values, M or N."

 

Show us the sample data and the output you are looking for .

 

data have;
input a $ b;
cards;
M 12
c .M
x .N
n .M
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type
 where=(upcase(libname)='WORK' and upcase(memname)='HAVE')) end=last;
 if _n_=1 then call execute('proc sql;create table temp as select ');
 if type='char' then call execute(cat('sum(',name,'="M")/count(*) as ',name));
  else call execute(cat('sum(',name,'=.M)/count(*) as ',name));
 if last then call execute('from have;quit;');
  else call execute(',');
run;
proc transpose data=temp out=want;
run;

Davanden
Obsidian | Level 7

Thanks to everyone who replied.  I am aware that SAS variables can take multiple missing values, and that there is a difference in the way they are treated for numeric an character variables.  In my case, I have both character and numeric variables.  For character variables, missing values are indcated by 'M' or 'N'; for numeric, .M or .N.  I want to list the percentage of all cases that have the 'M' or .M value, as applicable, for every variable, of which there are a great many.  

 

For a totally made-up example, see the pictures below:

 

have.PNG

 

want.PNG

Ksharp
Super User
So did you try my code ?


Davanden
Obsidian | Level 7

Yes, I did.  It does work.  Thank you for your help.  My pride is assuaged by the fact that it was not as simple and obvious as I had feared.

 

--Dav

Ksharp
Super User
if you want simpler, IML code might do it .

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1761 views
  • 0 likes
  • 4 in conversation