Hello All,
I would like the community's help on looping around the columns of a data set, check if a column has at least one row with a missing value.
data values;
input @1 var1 3. @5 var2 3.;
if missing(var1) then
do;
put 'Variable 1 is Missing.';
end;
else if missing(var2) then
do;
put 'Variable 2 is Missing.';
end;
datalines;
127
988 195
;
run;
This is the manual way I can see from Missing() documentation and it would work for small datasets. But I would like the ability to do this over large data sets easily.
Slightly modified version
data values;
input @1 var1 3. @5 var2 3.;
array var[*] var:;
do i=1 to dim(var);
if missing(var[i]) then do;
put "Variable " i "is missing";
end;
end;
drop i ;
datalines;
127
988 195
;
run;
You can loop through columns by specifying an ARRAY structure (typically a one-dimensional array), where there is one array element for each variables of interest:
data values;
input @1 var1 3. @5 var2 3.;
array v {2} var1 var2;
do i=1 to 2;
if missing(v{I}) then put 'Variable ' I 'is Missing.' v{I}=;
end;
datalines;
127
988 195
;
run;
@mkeintz,
This looks great. However, I would like to request some extra details around the array declaration for my learning:
1. Why are you manually declaring column names after the declaration of the array ?
data values;
input @1 var1 3. @5 var2 3.;
array v {2} var1 var2; * Manually declaring columns here ?;
do i=1 to 2;
if missing(v{I}) then put 'Variable ' I 'is Missing.' v{I}=;
end;
datalines;
127
988 195
;
run;
2. Is there a way to automatically get all the column names and use it in the loop ?
@UdayGuntupalli wrote:
This looks great. However, I would like to request some extra details around the array declaration for my learning:
1. Why are you manually declaring column names after the declaration of the array ?
Because
@UdayGuntupalli@ also wrote:
2. Is there a way to automatically get all the column names and use it in the loop ?
By "all the columns names" do you mean all the numeric variables? If yes, then use
array v {*} _numeric_ ;
and
do i=1 to dim(v);
Slightly modified version
data values;
input @1 var1 3. @5 var2 3.;
array var[*] var:;
do i=1 to dim(var);
if missing(var[i]) then do;
put "Variable " i "is missing";
end;
end;
drop i ;
datalines;
127
988 195
;
run;
ods select none;
ods output nlevels=want;
proc freq data=have nlevels;
table _all_;
run;
ods select all;
proc sql noprint;
select TableVar into : missing_variables separated by ' '
from want
where NNonMissLevels=0;
quit;
%put Missing variables are: &missing_variables ;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.