Hello
I want to split words in column model .
Each word will be in a new column.
Let's say that I don't know in advance how many new variables should be defined and I want to calculate it with countw function.
I am using array to split the string into multiple varaibles.
My question is related to array statement.
How should I tell SAS that the new variables are model1 until model number k?
data cars_new;
set sashelp.cars;
k=countw(model,",");
array modelname[k] $15 model1-modelK;/*Here is my question*/
do i = 1 to k;
modelname[i] = scan(model,i,", ");
end;
run;
Something like below should work.
%let max_wrd=;
proc sql noprint;
select max(countw(model,', ')) into :max_wrd trimmed
from sashelp.cars
;
quit;
data cars_new(drop=_:);
set sashelp.cars;
array model_[&max_wrd] $15 ;
do _i = 1 to dim(model_);
model_[_i] = scan(model,_i,', ');
if missing(model_[_i]) then leave;
end;
run;
proc print;
run;
You first need to determine max(countw(model)) in a SQL step (SELECT INTO), store this in a macro variable, and use that macro variable for the definition of the array.
Arrays are static in size for the duration of a data step.
Thank you very much!
Is it fine to write
model1-model&K.
proc sql noprint;
select max(countw(model)) into :K
from sashelp.cars
;
quit;
data cars_new;
set sashelp.cars;
array modelname[&K.] $15 model1-model&K.;
do i = 1 to &k.;
modelname[i] = scan(model,i,", ");
end;
run;
Once again: Maxim 4.
I am on my tablet at the moment, so can't test anything.
Yes, model1-model&K
is correct.
However, you will want to ensure the countw
parse uses the same delimiter as the later scan
. Trimmed results are needed for usage such as model&K
proc sql noprint;
select max(countw(model, ", " )) /* word count based on comma or space as delimiters */
into :K trimmed /* remove leading and trailing spaces of value placed in macro variable K */
from sashelp.cars
;
quit;
Something like below should work.
%let max_wrd=;
proc sql noprint;
select max(countw(model,', ')) into :max_wrd trimmed
from sashelp.cars
;
quit;
data cars_new(drop=_:);
set sashelp.cars;
array model_[&max_wrd] $15 ;
do _i = 1 to dim(model_);
model_[_i] = scan(model,_i,', ');
if missing(model_[_i]) then leave;
end;
run;
proc print;
run;
a Few questions please:
1- When you declare the array called model_, shouldn't you type the names of the new varaibles?
It was written :
array model_[&max_wrd] $15 ;
Where did you tell SAS what will be the names of the new varaibles?
2- What is the purpose of :
if missing(model_[_i]) then leave;
What will happen If you don't type it?
thanks
Once again, Maxim 1.
From the documentation for the ARRAY Statement:
Range | The names must be either variables that you define in the ARRAY statement or variables that SAS creates by concatenating the array name and a number. For example, when the subscript is a number (not the asterisk), you do not need to name each variable in the array. Instead, SAS creates variable names by concatenating the array name and the numbers 1, 2, 3, …n. |
Thanks,
You wrote _i
Was it typing error?
What is the purpose of the code:
%let max_wrd=;
@Ronein wrote:
What is the purpose of the code:
%let max_wrd=;
It is a "safety valve" in case the dataset contains no observations.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.