Obsidian | Level 7

## How many times a word occurs in a string?

Hi All,

How can I count the number of times a string occurs in a bigger string in SAS?

Basically a variable with a lot of text and I need to determine how many times a certain thing occurs in each of the rows

Example: A variable has a value "This is is a great day" for row 1
I need to know how many time 'is' occurred.

Similarly for all rows

I have been using prxmatch to know if it occurs or not. Cant figure out how to calculate number of occurrences.

Thanks.

6 REPLIES 6
Super User

## Re: How many times a word occurs in a string?

How much further are you taking this analysis?
My suggestion would be to separate each word from a sentence into it's own record and then use PROC FREQ.

Here's an example at finding all two word combinations or how to split a file:

``````*Create sample data;
data random_sentences;
infile cards truncover;
informat sentence \$256.;
input sentence \$256.;
cards;
This is a random sentence
This is another random sentence
Happy Birthday
My job sucks.
This is a good idea, not.
This is an awesome idea!
How are you today?
Does this make sense?
Have a great day!
;

;
;
;
*Partition into words;
data f1;
set random_sentences;
id=_n_;
nwords=countw(sentence);
nchar=length(compress(sentence));

do word_order=1 to nwords;
word=scan(sentence, word_order);
output;
end;
run;

proc sql;
create table words2 as
select t1.sentence, lowcase(t1.word) as word1, lowcase(t2.word) as word2
from f1 as t1
cross join f1 as t2
where t1.sentence=t2.sentence
and t1.word_order > t2.word_order
order by t1.sentence, t1.word_order;
quit;

proc freq data=words2 noprint order=freq;
table word1*word2 /list out=want;
run;``````
Super User

## Re: How many times a word occurs in a string?

``````count = 0;
do i = 1 to countw(yourvar);
count = count + (scan(yourvar,i) = testvar);
end;
drop i;``````

``````data want;
testvar = 'is';
yourvar = "This is is a great day";
count = 0;
do i = 1 to countw(yourvar);
count = count + (scan(yourvar,i) = testvar);
end;
drop i;
run;``````
Diamond | Level 26

## Re: How many times a word occurs in a string?

Example: A variable has a value "This is is a great day" for row 1
I need to know how many time 'is' occurred.

Seems to me the answer is 3. Is that right? Can you confirm? Your title indicates you are counting words, but your first sentence indicates you want to know how many times a STRING appears in a bigger string. These are not the same! We need clarification.

The code provided by others is looking for entire words that match 'is' and would not count the letters 'is' in the word "This".

--
Paige Miller
Obsidian | Level 7

## Re: How many times a word occurs in a string?

Hi Paige,

I'm sorry for that error.

I didnt see the is in This.

Yes, the count should be 3.

Thank you.

Super User

## Re: How many times a word occurs in a string?

So it should be

"count how many times a string appears in another string"

Obsidian | Level 7

## Re: How many times a word occurs in a string?

Hi Kurt,

I figured out a way on achieving my problem. Is there a better way of achieving the same?

OpportunityDescription is my variable while "F2F", "face to face" are my test strings.

``````data out.open_records_opp_desc;
set out.open_records;

Count_1 = count(OpportunityDescription,'F2F','i');

Count_2 = count(OpportunityDescription,'face to face','i');
Count_3 = count(OpportunityDescription,'in person','i');
Count_4 = count(OpportunityDescription,'meeting','i');
Count_5 = count(OpportunityDescription,'met','i');
Count_6 = count(OpportunityDescription,'discussed','i');
Count_7 = count(OpportunityDescription,'one on one','i');
Count_8 = count(OpportunityDescription,'discussed','i');
Count_9 = count(OpportunityDescription,'contact','i');

count_f2f_occurance = sum(Count_1,Count_2,Count_3,Count_4,Count_5,Count_6,Count_7,Count_8,Count_9);

run;``````

Discussion stats
• 6 replies
• 10104 views
• 4 likes
• 4 in conversation