Hi
I've found a number of solutions to count words within a string in SAS. However, none of them work for strings without spaces.
I have this string (it's in character format!): 121121121121121
The string can also look like this: 121556121112141215
I would like SAS to return 5 in the first case and 4 in the second case. How do I do this?
Many thanks !!!
B
If you don't have a delimiter in the string, you have to explain what "word" means in your problem.
Please elaborate why SAS should return 4 and 5 respectively.
Hi @Billybob73
I am assuming you want to count how many times the character '2' occurs in each string. You haven't stated this, and it is what @PeterClemmensen specifically asked about.
If my assumption is correct:
/*Creating sample data HAVE using your 2 string values*/
data have;
length str $20;
str='121121121121121';
output;
str='121556121112141215';
output;
run;
/*This assumes you want the count of the character '2', which gives the 5 and 4 you expect*/
data want;
set have;
want=countc(str,'2');
run;
My guess is that your pattern is '121' (PAT).
There are several ways to skin the cat. Here is one:
data have;
length str $18;
input str;
datalines;
121121121121121
121556121112141215
;
run;
data want;
retain pat '121';
lengpat = length(pat);
set have;
count = 0;
k = 1;
/* Search repeatedly, restarting just past the previous match */
do while(1);
  k = find(str, pat, k);
  if k = 0 then leave;
  count + 1;
  k + lengpat;
end;
keep str count;
run;
Another, much shorter way is to use the difference between the length of STR before and after removing the pattern. Use TRANSTRN with TRIMN('') as the replacement to remove the substring itself; COMPRESS would instead delete every individual '1' and '2' character. The LENGTH function returns 1 for a null string, so LENGTHN is needed after the removal to catch the null string.
data need;
set have;
retain pat '121';
lengpat = length(pat);
lengB = length(str);
/* Remove every occurrence of the substring PAT */
temp = transtrn(str, trim(pat), trimn(''));
lengE = lengthn(temp);
count = floor((lengB - lengE)/lengpat);
keep str count;
run;
A still better way is to write your own function. The advantage is that you can change the pattern as you like. It is coded once, compiled, and saved to a library of your choice.
proc fcmp outlib = work.cmput.lib;
function patcount(str $, pat $);
  patlen = length(pat);
  Blen = length(str);
  /* Remove every occurrence of the substring PAT, then measure what is left */
  Elen = lengthn(transtrn(str, trim(pat), trimn('')));
  rc = floor((Blen - Elen)/patlen);
  return(rc);
endsub;
quit;
options cmplib = work.cmput; /* Like any other SAS function, found in this library */
data fun;
set have;
pat = '121';
count = patcount(str, pat);
drop pat;
run;
If my guess is wrong, the code stays the same and you only need to change PAT. I hope you will try this out.
I think you read the pattern well:) Nicely done!
data have;
length str $18;
input str;
datalines;
121121121121121
121556121112141215
;
run;
data want;
set have;
count=count(str,'121');
run;
novinosrin:
The COUNT() function is a very good choice.
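Both readings suggested in this thread (counting the substring '121' with COUNT, and counting the character '2' with COUNTC) return 5 and 4 for the two sample strings, so the sample data alone cannot tell them apart. The sketch below puts the two side by side. The dataset name COMPARE, the variable names, and the third value '2222' are made up for illustration; that extra value is one where the two readings disagree.

```sas
/* Hypothetical comparison: the third data line is invented for illustration */
data compare;
   length str $18;
   input str;
   n_substr = count(str, '121');   /* occurrences of the substring '121' */
   n_char   = countc(str, '2');    /* occurrences of the character '2'   */
   datalines;
121121121121121
121556121112141215
2222
;
run;

proc print data=compare noobs;
run;
```

For '2222', COUNT finds no '121' substring while COUNTC finds four '2' characters, so which function is right depends on what "word" means in the real data.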