Hello! I'm new to SAS (but not programming) and would really appreciate advice on an issue I'm working on. Please note that the following is a simple test I've constructed for my own understanding before I apply it to a larger dataset with more criteria.
Objective: I'd like to declare and populate a dynamic array (word_array) with the words that precede each instance of the word "SAS."
Output: Using the example string below with 4 instances of "SAS", the desired output I'm after is...
word_array{1} = It looks like
word_array{2} = is the way to go I love
word_array{3} = oh yasss
word_array{4} = woohoo
Current Code: Is as follows...
%let string=It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS; data _null_; string=symget('string'); cnt=countc(string,'SAS')/3; array word_array{*} _CHARACTER_; j=1; do i = 1 by 1 to cnt; put i=; put j=; put string=; if length(word_array{j})>lengthc(word_array{j}) then j+1; word_array{j}=trim(scan(string,i,'SAS')); put word_array{j}=; end; stop; run;
What I can see from my puts (put i=; etc) is that my dynamic array word_array is not advancing to the next instance of "SAS" - and I'm not sure whether it's my poor syntax around (1) declaring the macro variable string or (2) the dynamic array word_array or (3) that j is not incrementing forward.
Current Output: Is shown below...
i=1 j=1 string=It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS string=It looks like i=2 j=1 string=It looks like string= i=3 j=1 string= string= i=4 j=1 string= string=
Many thanks in advance - I've been reading through a lot of online documentation and I'm sure there's got to be a simple solution that I'm just not seeing. Let me know if I'm not following posting protocols and I will revise.
An easy way to split strings is to use a special character as the marker. I chose 01x here.
It may or not be useful to you:
data _null_;
STR1='It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS';
STR2=tranwrd(STR1,'SAS','01'x);
do I=1 to countw(STR2,'01'x);
STR3=scan(STR2,I,'01'x);
if STR3 ne ' ' then putlog STR3=;
end;
run;
STR3=It looks like
STR3=is the way to go I love
STR3=oh yasss
STR3=woohoo
cnt=countc(string,'SAS')/3;
Perhaps use COUNTW to find the number of times SAS is in the string?
COUNTC counts character so I'm not sure what that's supposed to be doing here.
array word_array{*} _CHARACTER_;
This declares an array of all the character variables. SAS does not support dynamic sized arrays. _CHARACTER_ is a shortcut reference to all character variables.
do i = 1 by 1 to cnt;
A do statement is usually
DO i=1 to cnt by 1 (wrong order in yours).
if length(word_array{j})>lengthc(word_array{j}) then j+1;
Since you only have one variable this will work the first time and then not again, which is what you're seeing.
Since SAS does not support dynamic arrays, an option that I would suggest would be to output the strings as a new line each and then transpose if desired.
%let string=It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS;
data outStrings;
string="&string";
*Find the number of times SAS is in the string;
count_sas=countw(string, 'SAS');
start=1;
do i=1 to count_sas;
*Find the word in the string;
loc=findw(string, 'SAS', start);
*substring the component;
outString=substr(string, start, loc-1-start);
*output to a data set;
output;
*set the start location to after the last word, this can be dynamic if needed;
start=loc+3;
end;
*keep only relevant variables;
rename i=index;
keep string outString i;
run;
*reshape to a wide format;
proc transpose data=outStrings out=Want prefix=String;
by string;
id index;
var outString;
run;
@DBloom wrote:
Hello! I'm new to SAS (but not programming) and would really appreciate advice on an issue I'm working on. Please note that the following is a simple test I've constructed for my own understanding before I apply it to a larger dataset with more criteria.
Objective: I'd like to declare and populate a dynamic array (word_array) with the words that precede each instance of the word "SAS."
Output: Using the example string below with 4 instances of "SAS", the desired output I'm after is...
word_array{1} = It looks like
word_array{2} = is the way to go I love
word_array{3} = oh yasss
word_array{4} = woohoo
Current Code: Is as follows...
%let string=It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS; data _null_; string=symget('string'); cnt=countc(string,'SAS')/3; array word_array{*} _CHARACTER_; j=1; do i = 1 by 1 to cnt; put i=; put j=; put string=; if length(word_array{j})>lengthc(word_array{j}) then j+1; word_array{j}=trim(scan(string,i,'SAS')); put word_array{j}=; end; stop; run;
What I can see from my puts (put i=; etc) is that my dynamic array word_array is not advancing to the next instance of "SAS" - and I'm not sure whether it's my poor syntax around (1) declaring the macro variable string or (2) the dynamic array word_array or (3) that j is not incrementing forward.
Current Output: Is shown below...
i=1 j=1 string=It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS string=It looks like i=2 j=1 string=It looks like string= i=3 j=1 string= string= i=4 j=1 string= string=Many thanks in advance - I've been reading through a lot of online documentation and I'm sure there's got to be a simple solution that I'm just not seeing. Let me know if I'm not following posting protocols and I will revise.
Many thanks Reeza! Quick question - my complete dataset will inevitably contain special characters...for instance, in place of "SAS" in the below example, the data may contain multiple semicolons ";" that will be used as delimiters - how should I adjust the macro declaration for this?
@DBloom wrote:
Many thanks Reeza! Quick question - my complete dataset will inevitably contain special characters...for instance, in place of "SAS" in the below example, the data may contain multiple semicolons ";" that will be used as delimiters - how should I adjust the macro declaration for this?
It shouldn't affect the code I provided, unless you need to account for them somehow. In that case you need to provide an example.
An easy way to split strings is to use a special character as the marker. I chose 01x here.
It may or not be useful to you:
data _null_;
STR1='It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS';
STR2=tranwrd(STR1,'SAS','01'x);
do I=1 to countw(STR2,'01'x);
STR3=scan(STR2,I,'01'x);
if STR3 ne ' ' then putlog STR3=;
end;
run;
STR3=It looks like
STR3=is the way to go I love
STR3=oh yasss
STR3=woohoo
@ChrisNZ Sir, That's very smart clever. I am jealous!!!!!!!!!!!!!!!!!!!!!!
Hi ChrisNZ,
May I ask what does your hexadecimal marker "01X" do here? For instance, if I had to apply this to a different string that was looking up numerical strings as delimiters, would this still work?
Many thanks!
Dale
'01'x is a non printable-character with hexadecimal code 01.
Like 09x is a tab and 20x is the space character.
Since 01x is extremely rare in strings, it is a good choice for such cases.
Thank you! I accepted this as an answer due to its brevity and updated it for my use case:
data w; str1='It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS'; str2=tranwrd(str1,'SAS','01'x); do i=1 to countw(str1,'SAS'); str3=trim(scan(str2,i,'01'x)); if str3 ne ' ' then putlog str3=; output; end; keep str1 i str3; run; proc transpose data=w label=i name=str1 prefix=str3; run;
data w;
string='It looks like SAS is the way to go I love SAS oh yasss SAS woohoo SAS';
length word $50;
k=1;
do n=1 to countw(string);
if scan(string,n)='SAS' then do;
call scan(string, n, position, length);
word=substr(string, k, position-k);
k=position+length;
output;
end;
end;
keep word;
run;
proc transpose data=w out=want prefix=word;
var word;
run;
Many thanks novinosrin! Out of curiosity (since the full dataset I'm working with will inevitably have special characters)... if the string contained semicolons ";" as delimiters instead of the word "SAS" , how would I adjust this?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.