Dear Colleagues,
Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before.
I would like to count how many times multiple substrings (words) appear in a string. For example in the string below, I would like to count how many times the substrings appear. i want the substrings to be counted regardless of the case (whether upper or lowwer)
String | Substrings to be counted | Num_substrings |
Abc, def, jhi klm, abc,def | Abc, def, klm | 5 |
Abc,jhi,zem,lmh,skj,abc | Abc,skj | 3 |
Thanks a bunch.
Dathan Byonaneby
Hi @DathanMD
Another approach could be the following one:
data have;
infile datalines dlm="09"x;
input String :$50. Substrings :$50.;
datalines;
Abc, def, jhi klm, abc,def Abc, def, klm
Abc,jhi,zem,lmh,skj,abc Abc,skj
;
run;
data want;
set have;
count=0;
do i=1 to countw(Substrings,",");
count + count(String, strip(scan(Substrings,i,",")),"i");
end;
drop i;
run;
Does your dataset look like this:
data have;
infile datalines dlm='|' dsd;
input String :$30. Substrings :$30.;
datalines;
Abc, def, jhi klm, abc,def|Abc, def, klm
Abc,jhi,zem,lmh,skj,abc|Abc,skj
;
?
Look at the dataset created by my data step; does it look like the dataset you have?
@DathanMD wrote:
The delimeter in the string is ',"
Hi @DathanMD
Here is an approach to do this:
data have;
infile datalines dlm="09"x;
input String :$50. Substrings :$50.;
datalines;
Abc, def, jhi klm, abc,def Abc, def, klm
Abc,jhi,zem,lmh,skj,abc Abc,skj
;
run;
proc sql noprint;
select max(countw(Substrings)) into:total trimmed from have;
quit;
data want;
set have;
count=0;
array _substring (&total) $50. _temporary_;
array _nb_substring (&total) _temporary_;
do i=1 to dim(_substring);
_substring(i)=scan(Substrings,i,",");
_nb_substring(i)=count(String, strip(_substring(i)),"i");
end;
count=sum(of _nb_substring(*));
drop i;
run;
Hi @DathanMD
Another approach could be the following one:
data have;
infile datalines dlm="09"x;
input String :$50. Substrings :$50.;
datalines;
Abc, def, jhi klm, abc,def Abc, def, klm
Abc,jhi,zem,lmh,skj,abc Abc,skj
;
run;
data want;
set have;
count=0;
do i=1 to countw(Substrings,",");
count + count(String, strip(scan(Substrings,i,",")),"i");
end;
drop i;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.