Help using Base SAS procedures

Count missing values of 100+ variables in with column output

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 142
Accepted Solution

Count missing values of 100+ variables in with column output

HI!  I have a table with 100+ variables and I would like to know the easiest/ fastest way to check for missing/nonmissing values without having to run frequency count for each varaible in separate tables.  Preferably, I would like all variables in on column with missing and nonmissing values in columns.  For example: 

 

Variable         Missing      Non_Missing

RAWREG       2               8

PVPRISM        0              10

DAYS              1               9

OVRIDE3        10             0

 

 

Again, I have 100+ variables, so can I use some kind of Select (*) function w/o having to type out all variables?

 

Here is a sample data set:

 

Obs RAWREG PVPRISM OVRIDE3 DAYS
1 01 0939   408
2   0943   379
3 09 0029   170
4 11 0948    
5   0948   59
6 04 6108   57
7 06 0993   59
8 06 6111   59
9 06 0993   220
10 04 0901   59

Accepted Solutions
Solution
‎11-19-2015 10:38 AM
SAS Super FREQ
Posts: 709

Re: Count missing values of 100+ variables in with column output

The NLEVELS of Proc FREQ does not exactly do what you need. Had another go using the Hash object in the DATA Step.

 

Have a look

 

data have;
  infile cards dlm="," dsd;
  input
    RAWREG 
    PVPRISM $
    OVRIDE3 $
    DAYS 
  ;
cards;
01,0939,,408
,0943,,379
09,0029,,170
11,0948,,
,0948,,59
04,6108,,57
06,0993,,59
06,6111,,59
06,0993,,220
04,0901,,59
,,3,
;
data _null_;
  set have end=last;
  array _xchar{*} _character_;
  array _xnum{*} _numeric_;

  if _n_ = 1 then do;
    length 
      _colName $ 32
      _nMiss 8
      _n 8
    ;
    declare hash hmiss( ordered: "Y");
    hmiss.defineKey("_colName");
    hmiss.defineData("_colName", "_n", "_nMiss");
    hmiss.defineDone();
  end;

  do i = 1 to dim(_xchar);
    _nMiss=0;
    _n=0;
    _colName = vname( _xchar{i} );
    _rc = hmiss.find();
    _nMiss + (missing(_xchar{i}) = 1);
    _n + (missing(_xchar{i}) = 0);

    if _rc = 0 then do;
      hmiss.replace();
    end;
    else do;
      hmiss.add();
    end;
  end;

  do i = 1 to dim(_xnum);
    _nMiss=0;
    _n=0;
    _colName = vname( _xnum{i} );
    _rc = hmiss.find();
    _nMiss + (missing(_xnum{i}) = 1);
    _n + (missing(_xnum{i}) = 0);

    if _rc = 0 then do;
      hmiss.replace();
    end;
    else do;
      hmiss.add();
    end;

  end;

  if last = 1 then do;
    hmiss.output(dataset: "hashmiss");
  end;
run;

proc print data=hashmiss;
run;

proc print data=have;
run;

Bruno

View solution in original post


All Replies
Trusted Advisor
Posts: 1,931

Re: Count missing values of 100+ variables in with column output

UNTESTED CODE

 

proc summary data=have;
     var rawreg--days;
     output out=stats n= nmiss=/autoname;
run;
Frequent Contributor
Posts: 142

Re: Count missing values of 100+ variables in with column output

Posted in reply to PaigeMiller
I'm getting error with this code:

ERROR: Variable RAWREG in list does not match type prescribed for this list
ERROR: Variable PVPRISM in list does not match type prescribed for this list

...these error line go on for each variable in my table. I'm not sure what this means?

Super User
Posts: 5,435

Re: Count missing values of 100+ variables in with column output

It seems I can't say this often enough: don't have tables with hundreds of variables. It just causes headache.

Change your data model (e.g. by transposing your data), then it's much easier to analyze it.

Data never sleeps
Super User
Posts: 5,516

Re: Count missing values of 100+ variables in with column output

This means that you created your data set with all character variables, no numeric variables.  You have two basic choices at this point.  Either change your data, so that you actually have numeric variables.  Or, write a more complex program that will count missings and nonmissings for character variables.  As you make your choice, you might want to consider how you plan to use the data in the future ... will you actually need numeric variables in your data set. 

 

At this point, first you make your choice.  Then the programming can begin.

Respected Advisor
Posts: 4,930

Re: Count missing values of 100+ variables in with column output

It means that your variables are character strings, not numbers as they appear. You should fix that first.

PG
Frequent Contributor
Posts: 108

Re: Count missing values of 100+ variables in with column output

Did you tried this ?

proc means data = sample n nmiss;
var _numeric_;
run;
Trusted Advisor
Posts: 1,118

Re: Count missing values of 100+ variables in with column output

Wouldn't this be sufficient to get a quick overview?

 

proc format;
value miss
._-.z = 'Missing'
other = 'Non_Missing';

value $miss
' ' = 'Missing'
other = 'Non_Missing';
run;

proc freq data=have;
format _character_ $miss.
       _numeric_ miss.;
tables _all_ / missing;
run;
SAS Super FREQ
Posts: 3,755

Re: Count missing values of 100+ variables in with column output

See the ideas, examples, and links in the article "Count the number of missing values in for each variable"

SAS Super FREQ
Posts: 709

Re: Count missing values of 100+ variables in with column output

Proc FREQ has the NLEVELS option that is helpfull for this, see code sample below.

 

data have;
infile cards dlm="," dsd;
input
 RAWREG $
 PVPRISM $
 OVRIDE3 $
 DAYS $
;
cards;
01,0939,,408
,0943,,379
09,0029,,170
11,0948,,
,0948,,59
04,6108,,57
06,0993,,59
06,6111,,59
06,0993,,220
04,0901,,59
;

proc freq data=have nlevels;
  table _all_ / noprint;
run;

It will create this output:

The FREQ Procedure

 

Number of Variable Levels
Variable Levels Missing Levels Nonmissing Levels
RAWREG 6 1 5
PVPRISM 8 0 8
OVRIDE3 1 1 0
DAYS 7 1 6
Bruno
Frequent Contributor
Posts: 142

Re: Count missing values of 100+ variables in with column output

Posted in reply to Bruno_SAS
Hi! Nlevels definitely works, but I data set is huge...14 Million observations and 100+ variables. It keeps running out of memory every time I run the code. It worked great for a smaller test data set!
SAS Super FREQ
Posts: 3,755

Re: Count missing values of 100+ variables in with column output

Please clarify: Are you interested in only NUMERIC variables, or do you want the tabulation for numeric and character variables?

Frequent Contributor
Posts: 142

Re: Count missing values of 100+ variables in with column output

Preferably bother character and numeric. But I know that code can get a little tricky. I can do it separate, counting character using:

proc format;
value $missfmt ' '= 'Missing' other='Not Missing';
run;
proc freq data= have;
format _CHAR_ $missfmt.;
tables _char_ / missing missprint nocum nopercent;
by
run;

But is there a way to for the output to be alphabetical in the results viewer?
Frequent Contributor
Posts: 142

Re: Count missing values of 100+ variables in with column output

Preferably bother character and numeric. But I know that code can get a little tricky. I can do it separate, counting character using:

 

proc format;

value $missfmt ' '= 'Missing' other='Not Missing';

run;

proc freq data= have;

format _CHAR_ $missfmt.;

tables _char_ / missing missprint nocum nopercent;

run;

 

 

***But is there a way to for the character sas variables to be alphabetical in the results viewer?

Valued Guide
Posts: 860

Re: Count missing values of 100+ variables in with column output

[ Edited ]

Here is a solution, the _all_ adds some noise but it'll work.:

 

data have;
infile cards dsd;
input Obs     RAWREG$     PVPRISM$     OVRIDE3     DAYS;
cards;
1,01,0939,,408
2,,0943,,379
3,09,0029,,170
4,11,0948,,
5,,0948,,59
6,04,6108,,57
7,06,0993,,59
8,06,6111,,59
9,06,0993,,220
10,04,0901,,59
;

proc sql;
create table get_max as
select *,max(obs) as max_obs
from have;

proc transpose data=get_max out=tran;by max_obs;var _all_;

data want (keep=Variable missing non_missing);
set tran(rename=(_name_ = Variable));
Missing=nmiss(of col: );
Non_Missing=max_obs - missing;

if variable in ('Obs','max_obs') then delete;
run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 23 replies
  • 1254 views
  • 6 likes
  • 10 in conversation