I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)
data have:
Var1 Var2 Var3 ... VarN
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
data want:
Var1 Var2 Var3 ... VarN
01 99
14 99
08 99
As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.
data Have;
set Want;
length var1 4;
var1 = substr(var1, 1, index(var1, '-') - 1);
run;
Thanks in Advance
data have;
input (Var1 Var2 Var3 Var4) ($);
cards;
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
;
data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;
data have;
input (Var1 Var2 Var3 Var4) ($);
cards;
01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
;
data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;
Fantastic! Thank you this is perfect!
Are your variables actually named Var1-VarN? Do they have a specific naming convention? Do you have all character variables or mixed types?
You can use _character_ and _numeric_ to list all character and numeric variables respectively.
@OPrxwr7 wrote:
I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)
data have:
Var1 Var2 Var3 ... VarN01-AB NA NA 99-XX
NA NA 14-EF 99-XX
NA 08-BC NA 99-XX
data want:
Var1 Var2 Var3 ... VarN01 99
14 99
08 99
As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.
data Have;
set Want;
length var1 4;
var1 = substr(var1, 1, index(var1, '-') - 1);
run;
Thanks in Advance
Sorry I should have mentioned, yes all the variable names will have a consistent "var" at the beginning, so novinosrin's use of the wildcards will work. Thanks for looking into it though!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.