Hello All,
I would like the community's help on looping around the columns of a data set, check if a column has at least one row with a missing value.
data values;
input @1 var1 3. @5 var2 3.;
if missing(var1) then
do;
put 'Variable 1 is Missing.';
end;
else if missing(var2) then
do;
put 'Variable 2 is Missing.';
end;
datalines;
127
988 195
;
run;
This is the manual way I can see from Missing() documentation and it would work for small datasets. But I would like the ability to do this over large data sets easily.
Slightly modified version
data values;
input @1 var1 3. @5 var2 3.;
array var[*] var:;
do i=1 to dim(var);
if missing(var[i]) then do;
put "Variable " i "is missing";
end;
end;
drop i ;
datalines;
127
988 195
;
run;
You can loop through columns by specifying an ARRAY structure (typically a one-dimensional array), where there is one array element for each variables of interest:
data values;
input @1 var1 3. @5 var2 3.;
array v {2} var1 var2;
do i=1 to 2;
if missing(v{I}) then put 'Variable ' I 'is Missing.' v{I}=;
end;
datalines;
127
988 195
;
run;
@mkeintz,
This looks great. However, I would like to request some extra details around the array declaration for my learning:
1. Why are you manually declaring column names after the declaration of the array ?
data values;
input @1 var1 3. @5 var2 3.;
array v {2} var1 var2; * Manually declaring columns here ?;
do i=1 to 2;
if missing(v{I}) then put 'Variable ' I 'is Missing.' v{I}=;
end;
datalines;
127
988 195
;
run;
2. Is there a way to automatically get all the column names and use it in the loop ?
@UdayGuntupalli wrote:
This looks great. However, I would like to request some extra details around the array declaration for my learning:
1. Why are you manually declaring column names after the declaration of the array ?
Because
@UdayGuntupalli@ also wrote:
2. Is there a way to automatically get all the column names and use it in the loop ?
By "all the columns names" do you mean all the numeric variables? If yes, then use
array v {*} _numeric_ ;
and
do i=1 to dim(v);
Slightly modified version
data values;
input @1 var1 3. @5 var2 3.;
array var[*] var:;
do i=1 to dim(var);
if missing(var[i]) then do;
put "Variable " i "is missing";
end;
end;
drop i ;
datalines;
127
988 195
;
run;
ods select none;
ods output nlevels=want;
proc freq data=have nlevels;
table _all_;
run;
ods select all;
proc sql noprint;
select TableVar into : missing_variables separated by ' '
from want
where NNonMissLevels=0;
quit;
%put Missing variables are: &missing_variables ;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.