I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)
data have:
Var1 Var2 Var3 ... VarN
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
data want:
Var1 Var2 Var3 ... VarN
01 99
14 99
08 99
As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.
data Have;
set Want;
length var1 4;
var1 = substr(var1, 1, index(var1, '-') - 1);
run;
Thanks in Advance
data have;
input (Var1 Var2 Var3 Var4) ($);
cards;
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
;
data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;
data have;
input (Var1 Var2 Var3 Var4) ($);
cards;
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
;
data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;
Fantastic! Thank you this is perfect!
Are your variables actually named Var1-VarN? Do they have a specific naming convention? Do you have all character variables or mixed types?
You can use _character_ and _numeric_ to list all character and numeric variables respectively.
@OPrxwr7 wrote:
I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)
data have:
Var1 Var2 Var3 ... VarN01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
data want:
Var1 Var2 Var3 ... VarN01 99
14 99
08 99
As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.
data Have;
set Want;
length var1 4;
var1 = substr(var1, 1, index(var1, '-') - 1);
run;
Thanks in Advance
Sorry I should have mentioned, yes all the variable names will have a consistent "var" at the beginning, so novinosrin's use of the wildcards will work. Thanks for looking into it though!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.