Count repeated alphabets in a string

Contributor
Posts: 22

Hi all,

I have been trying to find the repeated letters in a given string.

For example, given a string like "REPEATED", I want to find each repeated letter and its count.

Can someone help me with the code?

Thanks in advance

Super User
Posts: 9,441

Re: Count repeated alphabets in a string

There are numerous examples out there; just search on Google. One way would be to output each character as a new observation and then run PROC FREQ on it. Another would be to loop over each character in the string, keep an array with one element per letter, and add to the matching element. The real question here is why? I don't see any value in counting characters, unless of course you have more than one data item in a variable, which isn't good practice. For better answers, post test data in the form of a data step, along with the output you want to see.
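
As a rough illustration of the array approach, something like the sketch below might work. It assumes an input dataset named have with a character variable named string, counts only the letters A to Z, and ignores case; the names letter_counts, alphabet, cnt, letter and count are made up for the example.

```sas
data have;
  string='REPEATED';
run;

data letter_counts(keep=string letter count);
  set have;
  length letter $1;
  /* lookup string: position 1..26 corresponds to A..Z (illustrative helper) */
  retain alphabet 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  array cnt{26} _temporary_;

  /* temporary arrays are retained, so reset the counters for every row */
  do i=1 to 26;
    cnt{i}=0;
  end;

  /* add 1 to the counter of each letter found; non-letters give k=0 */
  do i=1 to lengthn(string);
    k=index(alphabet, upcase(char(string,i)));
    if k>0 then cnt{k}=cnt{k}+1;
  end;

  /* write one row per letter that occurs more than once */
  do i=1 to 26;
    if cnt{i}>1 then do;
      letter=substr(alphabet,i,1);
      count=cnt{i};
      output;
    end;
  end;
run;
```

For 'REPEATED' this should leave a single row, letter E with a count of 3, since every other letter occurs only once.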

 

Super User
Posts: 13,358

Re: Count repeated alphabets in a string

Does case matter? For instance in "Aa" do you want a count of 1 for "A" and 1 for "a" or 2 for "A"?

 

You really need to provide some example data and what the output would look like as there are several ways to interpret and provide results.
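
To make the case question concrete, here is a small sketch using the COUNT function with and without its 'i' (ignore case) modifier. The variable names s, case_sensitive and case_insensitive are only for illustration.

```sas
data _null_;
  s='Aa';
  /* default: only the exact character 'A' is counted */
  case_sensitive=count(s,'A');
  /* 'i' modifier: 'A' and 'a' are treated as the same letter */
  case_insensitive=count(s,'A','i');
  put case_sensitive= case_insensitive=;
run;
```

The first count is 1 and the second is 2, so the two interpretations already give different answers for a two-character string.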

Super User
Posts: 23,354

Re: Count repeated alphabets in a string

Show us exactly what you want as output.

PROC Star
Posts: 1,604

Re: Count repeated alphabets in a string

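data have;
  string='REPEATED';
run;

data _null_;
  length char $1 count 8;
  if _n_=1 then do;
    /* hash keyed on the character, carrying its running count */
    dcl hash h();
    h.definekey('char');
    h.definedata('char','count');
    h.definedone();
  end;
  set have end=last;
  do i=1 to length(string);
    char=substr(string,i,1);
    if h.find() ne 0 then do; count=1; h.add(); end;    /* first time seen */
    else do; count=count+1; h.replace(); end;           /* seen before */
  end;
  /* counts accumulate over all rows of HAVE; write them once at the end */
  if last then h.output(dataset:'want');
run;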
Valued Guide
Posts: 560

Re: Count repeated alphabets in a string

[ Edited ]

Why do you need the count of letters?

Here is one way:

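data test (keep=String STR_Count STR_Val);
  length STR_Val $1;
  String="REPEATED";
  String_=String;   /* working copy that shrinks as letters are consumed */
  /* lengthn() is 0 once every character has been removed; testing
     length(String_)=1 stops one step early and drops the last letter */
  do while (lengthn(String_)>0);
    STR_Val=substr(String_,1,1);
    STR_Count=count(String_,STR_Val);
    String_=compress(String_,STR_Val);
    output;
  end;
run;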

PROC TRANSPOSE data=test out=want(drop=_name_) prefix=Alphabet_;
by String;
id STR_Val;
var STR_Count;
run;

 

If the string is mixed case, add the "i" modifier to both COUNT and COMPRESS so that case is ignored.

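data test (keep=String STR_Count STR_Val);
  length STR_Val $1;
  String="RePEATED";
  String_=String;   /* working copy that shrinks as letters are consumed */
  /* lengthn() is 0 once every character has been removed; testing
     length(String_)=1 stops one step early and drops the last letter */
  do while (lengthn(String_)>0);
    STR_Val=substr(String_,1,1);
    STR_Count=count(String_,STR_Val,'i');
    String_=compress(String_,STR_Val,'i');
    output;
  end;
run;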

PROC TRANSPOSE data=test out=want(drop=_name_) prefix=Alphabet_;
by String;
id STR_Val;
var STR_Count;
run;
Thanks,
Suryakiran
PROC Star
Posts: 1,604

Re: Count repeated alphabets in a string

Simplest approach:

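data have;
  string='REPEATED';
run;

/* one observation per character; grp numbers each input string */
data temp;
  set have;
  grp+1;
  do _n_=1 to length(string);
    char=char(string,_n_);
    output;
  end;
run;

/* frequency of each character within each string */
proc freq data=temp;
  by grp;
  tables char/out=want(keep=char count);
run;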
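
Since the original question asks only for the repeated letters, a variation on the PROC FREQ route could filter the output to counts above 1. This is just a sketch for a single input string; the dataset name repeated and the variable name letter are made up for the example.

```sas
data have;
  string='REPEATED';
run;

/* one observation per character of the string */
data temp;
  set have;
  do i=1 to lengthn(string);
    letter=char(string,i);
    output;
  end;
run;

/* keep only the letters that occur more than once */
proc freq data=temp noprint;
  tables letter / out=repeated(keep=letter count where=(count>1));
run;
```

For 'REPEATED' the repeated dataset should contain one row: E with a count of 3.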