BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DathanMD
Obsidian | Level 7

Dear Colleagues,

 

Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. 

 

I would like to count how many times multiple substrings (words) appear in a string. For example in the string below, I would like to count how many times the substrings appear. i want the substrings to be counted regardless of the case (whether upper or lowwer)

 

String

Substrings to be counted

Num_substrings

Abc, def, jhi klm, abc,def

Abc, def, klm

5

Abc,jhi,zem,lmh,skj,abc

Abc,skj

3

 

Thanks a bunch.

 

Dathan Byonaneby

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Another approach could be the following one:

 

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

data want;
	set have;
	count=0;
	do i=1 to countw(Substrings,",");
		count + count(String, strip(scan(Substrings,i,",")),"i");
	end;
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

Does your dataset look like this:

data have;
infile datalines dlm='|' dsd;
input String :$30. Substrings :$30.;
datalines;
Abc, def, jhi klm, abc,def|Abc, def, klm
Abc,jhi,zem,lmh,skj,abc|Abc,skj
;

?

DathanMD
Obsidian | Level 7
The delimeter in the string is ',"
ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Here is an approach to do this:

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

proc sql noprint;
	select max(countw(Substrings)) into:total trimmed from have;
quit;

data want;
	set have;
	count=0;
	array _substring (&total) $50. _temporary_;
	array _nb_substring (&total) _temporary_;
	do i=1 to dim(_substring);
		_substring(i)=scan(Substrings,i,",");
		_nb_substring(i)=count(String, strip(_substring(i)),"i");
	end;
	count=sum(of _nb_substring(*));
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Another approach could be the following one:

 

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

data want;
	set have;
	count=0;
	do i=1 to countw(Substrings,",");
		count + count(String, strip(scan(Substrings,i,",")),"i");
	end;
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2159 views
  • 3 likes
  • 3 in conversation