The following code gets the job done for numeric variables but not for character variables because proc summary only deals with numeric. The output of the code in the table _final_Alpha gives me the number of missing values only for the numeric variables but I would like it to display that output for the character variables as well. I'm not looking for missing levels data but rather the absolute number of missing values.
The problem is in the proc summary part of the code:
proc summary data=&set;
var _numeric_;
output out=smry nmiss=;
run;
proc transpose data=smry(drop=_type_ _freq_) out=smry_;
run;
Do you know the equivalent code that deals with character variables? My idea is to then get two tables, one dealing with character variables and the other, with numeric. And then, finally, to just append one below the other.
This is what I have with me so far:
%macro task_Oct(set);
proc contents data=&set out=con_(rename=(name=tablevar));
run;
proc summary data=&set;
var _numeric_;
output out=smry nmiss=;
run;
proc transpose data=smry(drop=_type_ _freq_) out=smry_;
run;
proc sort data=smry_;
by _name_;
run;
proc freq nlevels data = &set;
ods output nlevels=distinct;
run;
proc sort data=distinct;
by tablevar;
run;
data final_Alpha (keep=TableVar distinct_ TYPE miss);
merge distinct(rename=(Nlevels=distinct_)) con_(keep=tablevar type) smry_(rename=(_name_= tablevar col1 = miss));
by tablevar;
run;
proc sort data=final_Alpha out=_final_Alpha;
by descending miss;
run;
ods select nlevels;
proc freq nlevels data=&set;
tables _all_;
run;
%mend task_Oct;
%task_Oct(sashelp.cars)
Thank you.
A bit overkill but this does missing/non missing for both numeric/character in one data set.
https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111
%macro task_Oct(set);
proc contents data=&set out=con_(rename=(name=tablevar));
run;
proc format;
value $ missfmt ' '="Missing" other="Not Missing";
value nmissfmt .="Missing" other="Not Missing";
run;
*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=_summary;
proc freq data=&set.;
table _all_ / missing;
format _numeric_ nmissfmt. _character_ $missfmt.;
run;
ods select all;
data long;
length tablevar $32. variable_value $50.;
set _summary;
tablevar=scan(table, 2);
Variable_Value=strip(trim(vvaluex(tablevar)));
presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
keep tablevar variable_value frequency percent cum: presentation;
label tablevar='Variable' variable_value='Variable Value';
run;
proc sort data=long;
by tablevar;
run;
proc transpose data=long out=wide_N prefix=N_;
by tablevar;
id variable_value;
var frequency;
run;
proc sort data=wide_n;
by tablevar;
run;
proc freq nlevels data = &set;
ods output nlevels=distinct;
run;
proc sort data=distinct;
by tablevar;
run;
data final_Alpha;
length tablevar $32.;
merge distinct(rename=(Nlevels=distinct_)) con_(keep=tablevar type) wide_n;
by tablevar;
run;
proc sort data=final_Alpha out=_final_Alpha;
by descending n_missing;
run;
ods select nlevels;
proc freq nlevels data=&set;
tables _all_;
run;
%mend task_Oct;
%task_Oct(sashelp.cars)
If you can use sas/iml have a look at https://blogs.sas.com/content/iml/2011/09/19/count-the-number-of-missing-values-for-each-variable.ht...
Slightly modified, but most likely in need of improvement:
data work.cars;
set sashelp.cars;
if mod(_n_, 31)=0 then
do;
Origin=' ';
end;
if mod(_n_, 77)=0 then
do;
DriveTrain=' ';
end;
run;
proc iml;
use work.cars;
read all var _NUM_ into x[colname=nNames];
nmiss=countmiss(x, "col");
read all var _CHAR_ into x[colname=cNames];
cmiss=countmiss(x, "col");
close one;
Names=cNames || nNames;
cnt=cmiss || nmiss;
numMissing = cnt`;
variable = Names`;
create work.result var {variable numMissing};
append;
close work.result;
quit;
If on Viya, the distinct action returns NMiss for all the variables:
data casuser.cars;
set sashelp.cars;
run;
proc cas;
simple.distinct result=r / table={caslib="casuser", name="cars"};
saveresult r dataout=work.smry;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.