Hi
I've found a number of solutions to count words within a string in SAS. However, none of them work for strings without spaces.
I have this string (it's in character format!): 121121121121121
The string can also be like this: 121556121112141215
I would like SAS to return 5 in the first case and 4 in the second case. How do I do this?
Many thanks !!!
B
If you don't have a delimiter in the string, you have to explain what "word" means in your problem.
Please elaborate why SAS should return 4 and 5 respectively.
Hi @Billybob73
I am assuming you want a count of the character '2' in the string values. You haven't stated the rule, which is what @PeterClemmensen is specifically asking about.
If my assumption is correct:
/*Creating sample data HAVE using your 2 string values*/
data have;
length str $20;
str='121121121121121';
output;
str='121556121112141215';
output;
run;
/*This assumes you want the number of 2s, which matches the expected 5 and 4*/
data want;
set have;
want=countc(str,'2');
run;
My guess is that your pattern is '121' (PAT).
There are several ways to skin the cat:
data have;
length str $18;
input str;
datalines;
121121121121121
121556121112141215
;
run;
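data want;
retain pat '121'; /* pattern to count */
lengpat = length(pat);
set have;
count = 0;
k = 1; /* start position for the next search */
do while(1);
k = find(str, trim(pat), k); /* position of the next match, 0 if none */
if k = 0 then leave;
count + 1;
k + lengpat; /* jump past the match, so matches do not overlap */
end;
keep str count;
run;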
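Another, much shorter way is to use the difference between the length of STR before and after removing the pattern. COMPRESS would not do here, because it removes every individual '1' and '2' rather than the substring '121', so TRANSTRN with TRIMN('') is used to delete each occurrence. LENGTH returns 1 for a blank string, so LENGTHN is needed after the removal to get 0 when nothing is left.
data need;
set have;
retain pat '121';
lengpat = length(pat);
lengB = length(str); /* length before removal */
temp = transtrn(str, trim(pat), trimn('')); /* delete every occurrence of PAT */
lengE = lengthn(temp); /* length after removal, 0 if blank */
count = floor((lengB - lengE)/lengpat);
keep str count;
run;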
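To see why the removal step has to work on the substring and not on its individual characters, here is a minimal sketch with a made-up value, '112233', which contains no '121' at all. The data set name CHECK and the variable names are only for illustration.

```sas
data check;
length str $18;
str = '112233'; /* hypothetical test value: no '121' in it */
pat = '121';
/* COMPRESS deletes every '1' and every '2', wherever they occur */
by_chars = floor((length(str) - lengthn(compress(str, pat))) / length(pat));
/* TRANSTRN deletes only whole occurrences of '121' */
by_substr = floor((length(str) - lengthn(transtrn(str, trim(pat), trimn('')))) / length(pat));
/* COUNT as a cross-check */
by_count = count(str, trim(pat));
put by_chars= by_substr= by_count=;
run;
```

Here BY_CHARS comes out as 1, because COMPRESS strips all four 1s and 2s, while BY_SUBSTR and BY_COUNT are both 0.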
A still better way is to write your own function. The advantage is that you can change the pattern as you like. It is coded once, compiled, and saved to a library of your choice.
proc fcmp outlib = work.cmput.lib;
function patcount(str $, pat $);
patlen = length(pat);
Blen = length(str); /* length before removal */
/* delete every occurrence of PAT; LENGTHN gives 0 if nothing is left */
Elen = lengthn(transtrn(str, trim(pat), trimn('')));
rc = floor((Blen - Elen)/patlen);
return(rc);
endsub;
quit;
options cmplib = work.cmput; /* tell SAS where to find the compiled function */
data fun;
set have;
pat = '121';
count = patcount(str, pat); /* called like any other SAS function */
drop pat;
run;
If my guess is wrong, the code stays the same and you only need to change PAT. Hope you will try this out.
I think you read the pattern well:) Nicely done!
data have;
length str $18;
input str;
datalines;
121121121121121
121556121112141215
;
run;
data want;
set have;
count=count(str,'121');
run;
novinosrin:
The COUNT() function is a very good choice.
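COUNT also works when the pattern sits in a variable. A minimal sketch, assuming the HAVE data set from earlier in the thread; the PAT variable and the data set name WANT_VAR are made up for illustration. The 't' modifier trims trailing blanks from both arguments, which matters when PAT is declared longer than the pattern itself.

```sas
data want_var;
set have;
length pat $8; /* deliberately longer than the pattern */
pat = '121';
count = count(str, pat, 't'); /* 't' trims trailing blanks from str and pat */
drop pat;
run;
```

Without the 't' modifier (or TRIM around PAT), the padded value '121' followed by five blanks would be searched for as is and would not match.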