## count missing value for character variable

Solved
Frequent Contributor
Posts: 76

# count missing value for character variable

Hi

Accepted Solutions
Solution
‎07-14-2016 02:42 PM
Super Contributor
Posts: 320

## Re: count missing value for character variable

Do you want to take another way to get what you want? This way may help you to get more information of the observations.

Be brave to do some programming with arrays.

``````data want;
set sashelp.class end = eof;
N + 1;
array num[*] _numeric_;
array char[*] _character_;
array n_count[100] _temporary_ (100*0); * Assuming data set has no more than 100 OBS;
array c_count[100] _temporary_ (100*0);
do i = 1 to dim(num);
if missing(num[i]) then n_count[i] + 1;
end;
do i = 1 to dim(char);
if missing(char[i]) then c_count[i] + 1;
end;
if eof then do;
do i = 1 to dim(num);
Variable = vname(num[i]);
NMiss = n_count[i];
output;
end;
do i = 1 to dim(char);
Variable = vname(char[i]);
NMiss = c_count[i];
output;
end;
end;
keep Variable N NMiss;
run;``````

All Replies
Super User
Posts: 13,498

## Re: count missing value for character variable

Character variables in Proc Tabulate may only be CLASS variables. Because of the default that any record with a missing value for a class variable is excluded you would need to add the missing option. The only statistics you may request for character variables in Proc Tabulate are N and the various N-related such a Pctn, ColPctN, RowPctn, PagePctn and RepPctn.

The following code adds a separate table to your output that has each value of the character var and its count.

``````proc tabulate data=sashelp.class f=comma6.0;
title ;
var _numeric_;
table _numeric_,
nmiss n/ box='Variable';
class _character_ / missing;
table _character_,n;
run;``````

If you want a simplier count you might want to go through your data and add a numeric flag for when the character variable isn't missing and count (or sum) that. Though there isn't going be a short easy way to reference the character variable counted.

Frequent Contributor
Posts: 76

## Re: count missing value for character variable

thank you your help!!! But the result looks like a proc freq for each character variable, but I just want a count of non-missing, instead add them up manually do you have other solution? since the data set I deal with now have almost 1000 variables, so go through them one by one is not very realistic....
Solution
‎07-14-2016 02:42 PM
Super Contributor
Posts: 320

## Re: count missing value for character variable

Do you want to take another way to get what you want? This way may help you to get more information of the observations.

Be brave to do some programming with arrays.

``````data want;
set sashelp.class end = eof;
N + 1;
array num[*] _numeric_;
array char[*] _character_;
array n_count[100] _temporary_ (100*0); * Assuming data set has no more than 100 OBS;
array c_count[100] _temporary_ (100*0);
do i = 1 to dim(num);
if missing(num[i]) then n_count[i] + 1;
end;
do i = 1 to dim(char);
if missing(char[i]) then c_count[i] + 1;
end;
if eof then do;
do i = 1 to dim(num);
Variable = vname(num[i]);
NMiss = n_count[i];
output;
end;
do i = 1 to dim(char);
Variable = vname(char[i]);
NMiss = c_count[i];
output;
end;
end;
keep Variable N NMiss;
run;``````
Frequent Contributor
Posts: 118

## Re: count missing value for character variable

[ Edited ]

You can use PROC SQL to get the count of missing values for character or numeric variables.  A coworker asked how to find all the variables in a data set that are missing for all observations.  The code example below can answer that question.

``````options mprint;

/* Using SASHELP data sets while testing macro.
Macro should list variables that are
completely missing in data set, so need to
create some data sets with such variables.  */

data baseball;
set sashelp.baseball;
suzanne_favorite_team='';
cleveland_indians_rank=.;
run;

data class;
set sashelp.class;
teacher='';
run;

%macro missing_summary(lib=, dsn=);

/* use dummy macro so that syntax highlighting shows up within missing macro */

%macro dummy();
%mend dummy;

/* get data set variable names */

proc sql;
create table vars_to_summarize as
select name,
label
from sashelp.vcolumn
where libname="%upcase(&lib)" and
memname="%upcase(&dsn)";
quit;

/* count number of variables to process */

proc sql noprint;
select count(*) into
:var_counter
from vars_to_summarize;
quit;

/* do PROC SQL one variable at a time,
append details for variable to summary
report, then delete the variable's data
set so that work library doesn't get
cluttered. */

%do i=1 %to &var_counter;

data _null_;
obsnum=&i;
set vars_to_summarize point=obsnum;
if _error_ then abort;
call symputx('field',name);
call symputx('description',label);
stop;
run;

proc sql;
create table &field._report as
select count(*) as n,
count(&field) as not_missing,
nmiss(&field) as missing
from &lib..&dsn;
quit;

data &field._report;
length variable \$ 32 dataset \$ 41 label \$ 256;
set &field._report;
variable="&field";
dataset="&lib..&dsn";
label="&description";
run;

proc append base=summary data=&field._report;
run;

proc datasets nolist;
delete &field._report;
quit;

%end;

proc datasets nolist;
delete vars_to_summarize;
quit;

%mend missing_summary;

%missing_summary(lib=work, dsn=class);
%missing_summary(lib=work, dsn=baseball);

proc print data=summary noobs;
by dataset notsorted;
* uncomment statement below if you only want to see the variables missing for ALL obs ;
* where n=missing;
var dataset variable label n not_missing missing;
title 'Missing and not-missing counts by variable';
run;

* clean up;

proc datasets nolist;
delete summary;
quit;``````

References:

Top 10 Most Powerful Functions for PROC SQL

Dictionary Tables and Views: Essential Tools for Serious Applications

Frequent Contributor
Posts: 76

## Re: count missing value for character variable

Dear @SuzanneDorinski,

Thank you so much for working on this problem and wrote such a nice code with all comments on! But the data set i'm dealing with is very large, this code takes a lot time(about 10 minutes). Sorry I didn't accept this as solution......
Since i am a beginner in SAS, i did learn a lot from your code. Thanks again!!

Best,
Frequent Contributor
Posts: 76

## Re: count missing value for character variable

This is perfect! I got exactly what I want!! Just to double check with you, I just need to change all the "100" in this code to the number of obs in my data set, right? @datasp
Super User
Posts: 13,498

## Re: count missing value for character variable

Xiaoningdemao wrote:
This is perfect! I got exactly what I want!! Just to double check with you, I just need to change all the "100" in this code to the number of obs in my data set, right? @datasp

Ideally the 100 would be replaced with the number of numeric variables for the numeric array and the number of character variables for the character array.

Here is an example code snippet to get the counts of char and num variables.:

``````proc sql;
select type, count(*) as n
from dictionary.columns
where libname='SASHELP' and memname='CLASS'
group by type;
quit;``````
Frequent Contributor
Posts: 76

## Re: count missing value for character variable

Hi @ballardw,
Thank you!!
But I meant the code you fist provided me produced a proc freq of each character variable. Result like this:
N
Gender
11,972
male 1220
female 30

But the value I wanted is the number of nonmissing values in this example 1220+30=1250.

I'm wondering is there a solution to get this value?

thanks again!!

Best,

Super User
Posts: 13,498

## Re: count missing value for character variable

For character variables add a variable to count or sum depending on how you are summarizing.

data want;

set have;

GenderNotMiss= (not missing(gender)); /* creates a 1 for not missing, 0 for missing values of gender*/

run;

Then in proc tabulate request SUM of GenderNotMiss;

or: if not missing(gender0 then GenderNotMiss=1;

and you could use the NMISS in tabulate along with the other numerics.

I would suggest using a good label with these indicator variables.

Super Contributor
Posts: 320

## Re: count missing value for character variable

'100' in my code represents total number of VARIABLES in the data set and NOT the number of observations as I wrote on the comments. The array will work even with 10 Million Variables. It just provides memory space to hold that many variables but in reality it will much less and the counting is done for as many NUMERIC and CHARACTER Variables.

If you know the numbers of numeric variables (say 15) and character variables (say 20) before running the data step, you can size the arrays as:

``````   array n_count[15] _temporary_ (15*0); * Assuming data set has no more than 100 OBS;
array c_count[20] _temporary_ (20*0);``````

Hope this helps you.

Frequent Contributor
Posts: 76

## Re: count missing value for character variable

OK, I see. Thank you!!! @datasp
☑ This topic is solved.

Discussion stats
• 11 replies
• 5542 views
• 3 likes
• 4 in conversation