I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)
data have:
Var1 Var2 Var3 ... VarN
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
data want:
Var1 Var2 Var3 ... VarN
01 99
14 99
08 99
As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.
data Have;
set Want;
length var1 4;
var1 = substr(var1, 1, index(var1, '-') - 1);
run;
Thanks in Advance
data have;
input (Var1 Var2 Var3 Var4) ($);
cards;
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
;
data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;
data have;
input (Var1 Var2 Var3 Var4) ($);
cards;
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
;
data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;
Fantastic! Thank you this is perfect!
Are your variables actually named Var1-VarN? Do they have a specific naming convention? Do you have all character variables or mixed types?
You can use _character_ and _numeric_ to list all character and numeric variables respectively.
@OPrxwr7 wrote:
I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)
data have:
Var1 Var2 Var3 ... VarN01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
data want:
Var1 Var2 Var3 ... VarN01 99
14 99
08 99
As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.
data Have;
set Want;
length var1 4;
var1 = substr(var1, 1, index(var1, '-') - 1);
run;
Thanks in Advance
Sorry I should have mentioned, yes all the variable names will have a consistent "var" at the beginning, so novinosrin's use of the wildcards will work. Thanks for looking into it though!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.