BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DathanMD
Obsidian | Level 7

Dear Colleagues,

 

Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. 

 

I would like to count how many times multiple substrings (words) appear in a string. For example in the string below, I would like to count how many times the substrings appear. i want the substrings to be counted regardless of the case (whether upper or lowwer)

 

String

Substrings to be counted

Num_substrings

Abc, def, jhi klm, abc,def

Abc, def, klm

5

Abc,jhi,zem,lmh,skj,abc

Abc,skj

3

 

Thanks a bunch.

 

Dathan Byonaneby

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Another approach could be the following one:

 

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

data want;
	set have;
	count=0;
	do i=1 to countw(Substrings,",");
		count + count(String, strip(scan(Substrings,i,",")),"i");
	end;
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

Does your dataset look like this:

data have;
infile datalines dlm='|' dsd;
input String :$30. Substrings :$30.;
datalines;
Abc, def, jhi klm, abc,def|Abc, def, klm
Abc,jhi,zem,lmh,skj,abc|Abc,skj
;

?

DathanMD
Obsidian | Level 7
The delimeter in the string is ',"
ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Here is an approach to do this:

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

proc sql noprint;
	select max(countw(Substrings)) into:total trimmed from have;
quit;

data want;
	set have;
	count=0;
	array _substring (&total) $50. _temporary_;
	array _nb_substring (&total) _temporary_;
	do i=1 to dim(_substring);
		_substring(i)=scan(Substrings,i,",");
		_nb_substring(i)=count(String, strip(_substring(i)),"i");
	end;
	count=sum(of _nb_substring(*));
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Another approach could be the following one:

 

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

data want;
	set have;
	count=0;
	do i=1 to countw(Substrings,",");
		count + count(String, strip(scan(Substrings,i,",")),"i");
	end;
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2266 views
  • 3 likes
  • 3 in conversation