Solved
New Contributor
Posts: 3

# Apply substring Function to all variables in a dataset

I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)

data have:
Var1      Var2         Var3       ...  VarN

01-AB    NA            NA             99-XX

NA         NA            14-EF        99-XX

NA         08-BC       NA             99-XX

data want:
Var1      Var2         Var3       ... VarN

01                                             99

14               99

08                              99

As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.

data Have;
set Want;
length var1 4;
var1  = substr(var1, 1, index(var1, '-') - 1);
run;

Accepted Solutions
Solution
‎06-06-2018 12:16 PM
PROC Star
Posts: 1,784

## Re: Apply substring Function to all variables in a dataset

``````data have;
input (Var1      Var2         Var3        Var4) (\$);
cards;
01-AB    NA            NA             99-XX
NA         NA            14-EF        99-XX
NA         08-BC       NA             99-XX
;

data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;``````

All Replies
Solution
‎06-06-2018 12:16 PM
PROC Star
Posts: 1,784

## Re: Apply substring Function to all variables in a dataset

``````data have;
input (Var1      Var2         Var3        Var4) (\$);
cards;
01-AB    NA            NA             99-XX
NA         NA            14-EF        99-XX
NA         08-BC       NA             99-XX
;

data want;
set have;
array t(*) var:;
do _n_=1 to dim(t);
if index(t(_n_), '-')>0 then t(_n_)=substr(t(_n_), 1, index(t(_n_), '-') - 1);
else call missing(t(_n_));
end;
run;``````
New Contributor
Posts: 3

## Re: Apply substring Function to all variables in a dataset

Fantastic! Thank you this is perfect!

Super User
Posts: 23,700

## Re: Apply substring Function to all variables in a dataset

[ Edited ]

Are your variables actually named Var1-VarN? Do they have a specific naming convention? Do you have all character variables or mixed types?

You can use _character_ and _numeric_ to list all character and numeric variables respectively.

@OPrxwr7 wrote:

I'm looking to apply a substring function to all columns in a dataset. Currently I've only had success doing it to one column at a time, but this isn't efficient or viable as this is a macro that won't always know the names or the amount of columns(There's also 100's of columns)

data have:
Var1      Var2         Var3       ...  VarN

01-AB    NA            NA             99-XX

NA         NA            14-EF        99-XX

NA         08-BC       NA             99-XX

data want:
Var1      Var2         Var3       ... VarN

01                                             99

14               99

08                              99

As I said I can achieve the desired result using the below code, but I have to specify the column names. Really, I need the following function to apply to ALL columns, without specifying their names. I ran into the same problem using the compress function.

data Have;
set Want;
length var1 4;
var1  = substr(var1, 1, index(var1, '-') - 1);
run;

New Contributor
Posts: 3

## Re: Apply substring Function to all variables in a dataset

Sorry I should have mentioned, yes all the variable names will have a consistent "var" at the beginning, so novinosrin's use of the wildcards will work. Thanks for looking into it though!

☑ This topic is solved.