I have variables var1 - var28; there are other variables in the dataset too, like name, race, sex, etc. For the var1-var28 variables, I want to replace any value "other" with "oth." Without listing out all 28 variables, is there a way to do this? (I don't want to list out the variables because (1) there are too many and listing them is not concise, and (2) this code is automated and we have a different number of these variables every year; e.g., next year we may have var1-var35.) In my head I want to do something like the following, but I don't know how to execute it in SAS.
if var* = "other" then var* = "oth"
Here is a reference that illustrates the shortcut lists you can use to refer to variables and datasets:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html
Here's a tutorial on using arrays in SAS:
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/
_numeric_ : all numeric variables
_character_ : all character variables
_all_ : all variables
prefix1 - prefix# : all variables with the same prefix assuming they're numbered
prefix: : all variables that start with prefix
firstVar -- lastVar : variables based on location between first and last variable, including the first and last.
firstVar-numeric-lastVar : numeric variables located between the first and last variable
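As a rough sketch of how a few of these lists look in practice, the steps below build a tiny dataset and print it three ways. The dataset name have, the columns name and score, and the sample values are made up for illustration; only var1-var3 mirror the question.

/* Hypothetical sample data: name and score are made-up columns */
data have;
  length name $10 var1-var3 $5;
  name = 'Ann';
  var1 = 'other'; var2 = 'a'; var3 = 'other';
  score = 1;
run;

/* numbered range: var1, var2, var3 */
proc print data=have; var var1-var3; run;

/* prefix list: every variable whose name starts with var */
proc print data=have; var var: ; run;

/* all character variables: name plus var1-var3 */
proc print data=have; var _character_; run;

The prefix form is the one that does not need editing when the number of variables changes from year to year, provided nothing else in the dataset starts with the same prefix.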
This kind of task is usually handled with an array, and the array can be declared with one of the SAS variable-list shortcuts. DO OVER works in SAS, but it is not documented; there are documented ways to do the same thing with an indexed DO loop.
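data;
  length var1-var3 $5;
  array var var1-var3;  /* implicitly subscripted array */
  /* fill every element with "other" and write the first row */
  do over var;
    var="other";
  end;
  output;
  /* replace "other" with "oth" in every element and write the second row */
  do over var;
    if var="other" then var="oth";
  end;
  output;
run;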
In my view, DO OVER is the best example we can give in the spirit of using wildcards: it does more with less syntax.
Fix your read step so the values come in the way you need them to begin with. With a data step and a custom informat, you don't have to "fix" anything "next year".
proc format;
  invalue $fixoth (upcase default=20)
    'OTHER' = 'oth.'
  ;
data example;
  informat var1 - var3 $fixoth.;
  input var1 - var3;
datalines;
other sometext thattext
this that those
that this other
other other other
;
There may be other details to settle about the actual length of the invalue/informat when it is used, but the only change needed next time would be the variable list in the INFORMAT and INPUT statements.
Note that my specific example will work with text like Other, OTHER, other, oTHer, and any other change in capitalization, because the UPCASE option converts the raw value to uppercase before it is compared.
If the variables share a consistent prefix, the key is how you declare the array using one of the shortcut methods. For example, this would work, as long as the prefix is used only for those variables and is used consistently.
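data want;
  set have;
  /* the colon selects every variable whose name starts with var_prefix */
  array _myvars(*) var_prefix: ;
  do i=1 to dim(_myvars);
    /* the comparison is case-sensitive: only "Other" matches */
    if _myvars(i) = "Other" then _myvars(i) = "Oth";
  end;
  drop i;  /* keep the loop counter out of the output dataset */
run;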
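Because character comparisons in SAS are case-sensitive, the loop above only catches one spelling. If the data might contain "other", "Other", or "OTHER", a variant along these lines could be used. This is only a sketch: the dataset have and its values are invented for illustration, and it assumes every variable starting with var is a character variable (a numeric one in the list would make the ARRAY statement fail).

/* Hypothetical sample data; in practice HAVE already exists */
data have;
  length name $10 var1-var3 $5;
  name = 'Ann';
  var1 = 'other'; var2 = 'Other'; var3 = 'text';
run;

data want;
  set have;
  /* prefix list: picks up var1, var2, ... however many exist this year */
  array _v(*) var: ;
  do i = 1 to dim(_v);
    /* lowercase a copy for the comparison so any capitalization matches */
    if lowcase(_v(i)) = 'other' then _v(i) = 'oth';
  end;
  drop i;  /* loop counter is not needed in the output */
run;

Here var1 and var2 should both end up as oth, while var3 and name are left alone.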