Hi,
I have a data of 100+ variables.
I want to apply the simple operation to all variable: If variable_value >10000 then variable_value=.;
I dont want to repeat this line for 100+ line.
So is there any way to array all variables without listing their name so I can get it done?
Also, if I have the first 3 variable that should be excluded from the rule , how can I array all except for them. I can list their 3 name for excluding.
Thank you for your help.
HC
data have;
input var1 a b c;
cards;
1 10 2 1000000
2 10 3 1
3 10 4 0
4 . 100000 1000000
5 100000 1 2
6 10 2 300000
run;
If they really are the first three variables then just index over the array starting with the index set to 4.
data want;
set have;
array _all _numeric_;
do _n_=4 to dim(_all);
if _all(_n_) > 10000 then _all(_n_)=.;
end;
run;
If they aren't the first three, you can make them the first three by adding another SET statement with a KEEP= option.
data want;
set have(keep=var1 var2 var3);
set have;
array _all _numeric_;
do _n_=4 to dim(_all);
if _all(_n_) > 10000 then _all(_n_)=.;
end;
run;
I am not sure if it is possible for us to do the arrays without the variable names. We need the variable names in arrays. However, alternatively what we could do is we could get the list of all other variable names except the names which are not required (3 variable what you mentioned) into a macro variable.
Something like below
after you create the dataset with all the variables and data
write
proc sql;
select distinct name into : vars separated by ' ' from dictionary.columns where memname='dataset name' and name not in ('x1','x2','x3');
select count(distinct name) into : cnt from dictionary.columns where memname='dataset name' and name not in ('x1','x2','x3');
quit;
we get list of all variable names excluding the three variables into vars macro variable separated by blank space. In teh cnt macro variable we get the number of variable count.
then write the arrays as below
data arrays;
set have;
array vr(&cnt) &vars;
do i = 1 to &cnt;
If vr(i) >10000 then vr(i)=.;
end;
run;
If the list of variables is fairly long and the list of excluded variables is short, it would be simpler to list the latter explicitly. For example, to exclude var1 and b :
data want;
set have;
array _v{*} _numeric_;
do _i = 1 to dim(_v);
if upcase(vname(_v{_i})) not in ("VAR1", "B") then
if _v{_i} > 10000 then call missing(_v{_i});
end;
drop _i;
run;
If they really are the first three variables then just index over the array starting with the index set to 4.
data want;
set have;
array _all _numeric_;
do _n_=4 to dim(_all);
if _all(_n_) > 10000 then _all(_n_)=.;
end;
run;
If they aren't the first three, you can make them the first three by adding another SET statement with a KEEP= option.
data want;
set have(keep=var1 var2 var3);
set have;
array _all _numeric_;
do _n_=4 to dim(_all);
if _all(_n_) > 10000 then _all(_n_)=.;
end;
run;
These codes are awesome!
Very easy to follow.
So much thanks.
HC
If it is important to keep the variables in order as in @PGStats program does you can alternatevly built the variable list explicitly.
data have;
input var1 a b c;
var3 = _n_*10;
d = a;
var2 = var1*10;
cards;
1 10 2 1000000
2 10 3 1
3 10 4 0
4 . 100000 1000000
5 100000 1 2
6 10 2 300000
run;
proc print;
run;
proc transpose data=have(obs=0 drop=var: keep=_numeric_) out=vars;
run;
proc sql noprint;
select nliteral(_name_) into :varlist separated by ' ' from vars;
quit;
%put NOTE: &=varlist;
data change;
set have;
array _x[*] &varlist;
do _n_ = 1 to dim(_x);
if _x[_n_] ge 1e5 then _x[_n_]=.;
end;
run;
proc print;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.