Dear Colleagues,
Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before.
I would like to count how many times multiple substrings (words) appear in a string. For example in the string below, I would like to count how many times the substrings appear. i want the substrings to be counted regardless of the case (whether upper or lowwer)
String | Substrings to be counted | Num_substrings |
Abc, def, jhi klm, abc,def | Abc, def, klm | 5 |
Abc,jhi,zem,lmh,skj,abc | Abc,skj | 3 |
Thanks a bunch.
Dathan Byonaneby
Hi @DathanMD
Another approach could be the following one:
data have;
infile datalines dlm="09"x;
input String :$50. Substrings :$50.;
datalines;
Abc, def, jhi klm, abc,def Abc, def, klm
Abc,jhi,zem,lmh,skj,abc Abc,skj
;
run;
data want;
set have;
count=0;
do i=1 to countw(Substrings,",");
count + count(String, strip(scan(Substrings,i,",")),"i");
end;
drop i;
run;
Does your dataset look like this:
data have;
infile datalines dlm='|' dsd;
input String :$30. Substrings :$30.;
datalines;
Abc, def, jhi klm, abc,def|Abc, def, klm
Abc,jhi,zem,lmh,skj,abc|Abc,skj
;
?
Look at the dataset created by my data step; does it look like the dataset you have?
@DathanMD wrote:
The delimeter in the string is ',"
Hi @DathanMD
Here is an approach to do this:
data have;
infile datalines dlm="09"x;
input String :$50. Substrings :$50.;
datalines;
Abc, def, jhi klm, abc,def Abc, def, klm
Abc,jhi,zem,lmh,skj,abc Abc,skj
;
run;
proc sql noprint;
select max(countw(Substrings)) into:total trimmed from have;
quit;
data want;
set have;
count=0;
array _substring (&total) $50. _temporary_;
array _nb_substring (&total) _temporary_;
do i=1 to dim(_substring);
_substring(i)=scan(Substrings,i,",");
_nb_substring(i)=count(String, strip(_substring(i)),"i");
end;
count=sum(of _nb_substring(*));
drop i;
run;
Hi @DathanMD
Another approach could be the following one:
data have;
infile datalines dlm="09"x;
input String :$50. Substrings :$50.;
datalines;
Abc, def, jhi klm, abc,def Abc, def, klm
Abc,jhi,zem,lmh,skj,abc Abc,skj
;
run;
data want;
set have;
count=0;
do i=1 to countw(Substrings,",");
count + count(String, strip(scan(Substrings,i,",")),"i");
end;
drop i;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.