Is there a simple way to create a list of variables that contain nulls? I found a way to do it if I manually refer to the column by name.
E.g.
select count(*)
from my table
where employee_category is null
group by employee_category
having count(*) > 0) as employee_category_nulls
Returns number of nulls in column.
Even better is to only return a list of columns that contains nulls and the number of nulls.
ods select none;
ods output nlevels=want;
proc freq data=sashelp.heart nlevels;
table _all_;
run;
ods select all;
proc print data=want(where=(NMissLevels ne 0));run;
I believe you have to check each variable on all observations.
That can be done in one step using two arrays, as in next code (not tested):
data _NULL_;
set have end=eof;
array vx {*} _numeric_; /* if need create separate array for _charcter_ */
array nx $ n1-n100; /* up to 100 variables. adapt to need */
length list $1000; /* adapt length to contain full list of var names */
/* fulfill variable names */
do i=1 to dim(vx);
nx(i) = vname(vx(i));
end;
/* check each numeric var in an observation */
do i=1 to dim(vx);
if vx(i) = . and findw(list,nx(i)) = 0
then catx(' ',trim(list), nx(i));
end;
if eof then put list=;
run;
Currently debugging this to see if it works.
If there was a way to do the below and only include instances where the column is character and has missing:
PROC FORMAT;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
RUN;
PROC FREQ DATA=CCW.INTGRD_INPA;
FORMAT _CHAR_ $missfmt.;
TABLES _CHAR_ / missing missprint nocum nopercent;
FORMAT _NUMERIC_ missfmt.;
TABLES _NUMERIC_ / missing missprint nocum nopercent;
;
RUN;
Here is the code adapted to check CHAR variables: (the formats not needed)
data _NULL_;
set have end=eof;
array vx $ _charcter_;
array nx $ n1-n100; /* up to 100 variables. adapt to need */
length list $1000; /* adapt length to contain full list of var names */
/* fulfill variable names */
do i=1 to dim(vx);
nx(i) = vname(vx(i));
end;
/* check each numeric var in an observation */
do i=1 to dim(vx);
if missing(vx(i)) and findw(list,nx(i)) = 0
then catx(' ',trim(list), nx(i));
end;
if eof then put list=;
run;
When I run this:
then catx(' ',trim(list), nx(i));
_
22
76
ERROR: Undeclared array referenced: catx.
Sorry, line should be:
list = catx(' ',trim(list), nx(i));
The output is list=_charcte
Change array definitions (tested):
array vx {*} $ _character_;
array nx {*} $ n1-n100;
ods select none;
ods output nlevels=want;
proc freq data=sashelp.heart nlevels;
table _all_;
run;
ods select all;
proc print data=want(where=(NMissLevels ne 0));run;
Ksharp, thank you, your code is extremely helpful for profiling. I found that adding table _CHARACTER_ is useful as there is not as much purpose to mass checking for null numbers.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.