I am trying to count the total number of missing and non-missing values for each character variable.
Below is the sample data.
data temp;
input (disp1-disp8) ($);
datalines;
1 2 . 4 5 6 7 8
. 2 3 4 5 . 7 8
1 2 3 4 5 6 7 8
1 2 3 . . 6 . 8
1 2 3 4 5 6 7 8
1 . 3 . 5 . 7 .
;
run;
My goal is to get the following result. Is there any way to do this for character variables? Thanks.
Variable | Missing | Non Missing |
disp1 | 1 | 5 |
disp2 | 1 | 5 |
disp3 | 1 | 5 |
disp4 | 2 | 4 |
disp5 | 1 | 5 |
disp6 | 2 | 4 |
disp7 | 1 | 5 |
disp8 | 1 | 5 |
Do you want to count the number of non-missing values? Or the number of DISTINCT non-missing values?
The NLEVELS option on PROC FREQ will do the latter.
proc freq nlevels;
tables disp1-disp8 / noprint;
run;
For the former you can also use PROC FREQ but first make a user defined format.
proc format;
value $cmiss ' '='Missing' other='Non-missing';
run;
proc freq ;
tables disp1-disp8;
format disp1-disp8 $cmiss.;
run;
Thanks Tom, this works like a charm. I need the former approach and noticed that the result shows the non-missing count, but is there any way to show the missing count as well?
You probably need to add the MISSING option to the TABLES statement.
tables disp1-disp8 / missing;
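Put together, the full step might look like the sketch below. It assumes the $cmiss format from the earlier reply has already been created and that the data set is the temp data set from the question; the data= option is added here only to make that explicit.

```sas
/* One-way frequencies with missing values counted as their own row */
proc freq data=temp;
  tables disp1-disp8 / missing;   /* MISSING: treat blank values as a valid level */
  format disp1-disp8 $cmiss.;     /* group values into Missing / Non-missing */
run;
```

Each table should then have a Missing row and a Non-missing row instead of reporting the missing count only as a footnote.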
Thank you Tom.
If you want output that matches the layout in your original post exactly, try this:
data trans_v(keep=vname miss nmiss) / view=trans_v;
  length vname $5 nmiss miss 4;
  set temp;
  array disps {8} $ disp1-disp8;
  label nmiss='Non Missing'
        miss='Missing'
  ;
  /* write one row per variable per observation */
  do i=1 to 8;
    /* reset both flags so values from the previous iteration do not carry over */
    call missing(miss,nmiss);
    if (missing(disps{i})) then miss=1;
    else nmiss=1;
    vname=vname(disps{i});
    output;
  end;
run;

/* sum the flags for each variable name */
proc summary data=trans_v nway missing;
  class vname;
  var miss nmiss;
  output out=stats(drop=_:) sum=;
run;
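To look at the result, a simple print of the stats data set should be enough. This is only a sketch; the LABEL and NOOBS options are my additions so that the column headings read Missing and Non Missing, assuming the labels carry through from the view.

```sas
/* Show the per-variable totals, using labels as column headings */
proc print data=stats label noobs;
  var vname miss nmiss;
run;
```

One thing to keep in mind: if a variable had no missing values at all, every miss value passed to SUM would be missing, so I would expect its Missing total to print as . rather than 0. That does not happen with the sample data, where every variable has at least one missing value.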
Ahmed
I have one more question, why do we need call missing(miss,nmiss) here?
I would suggest running that data step without that statement and comparing the output to what you get with it.
It resets those variables to missing on each iteration of the loop.
Otherwise, a value set to 1 on an earlier iteration is still there on later iterations (when i > 1).
Thanks Ballardw. I ran the data step without the call missing statement and noticed that the 1s in miss and nmiss carry over from one iteration to the next, so the counts come out too high.
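A small sketch of that experiment, using only the first row of temp (1 2 . 4 5 6 7 8). The data set name demo is made up for illustration.

```sas
/* Same loop as in the view, but WITHOUT the call missing reset */
data demo;
  set temp(obs=1);                 /* first row only: disp3 is the missing one */
  array disps {8} $ disp1-disp8;
  do i = 1 to 8;
    /* no reset here, so a 1 assigned on an earlier iteration stays in place */
    if missing(disps{i}) then miss = 1;
    else nmiss = 1;
    output;
  end;
  keep i miss nmiss;
run;

proc print data=demo noobs;
run;
```

For i = 1 and 2 only nmiss is 1. From i = 3 onward both miss and nmiss are 1, because nothing sets them back to missing inside the loop, and summing those rows would overcount.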