BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DathanMD
Obsidian | Level 7

Dear Colleagues,

 

Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. Hi friends, I am new to SAS and would like someone’s favour to crack the code for my data manipulation before. 

 

I would like to count how many times multiple substrings (words) appear in a string. For example in the string below, I would like to count how many times the substrings appear. i want the substrings to be counted regardless of the case (whether upper or lowwer)

 

String

Substrings to be counted

Num_substrings

Abc, def, jhi klm, abc,def

Abc, def, klm

5

Abc,jhi,zem,lmh,skj,abc

Abc,skj

3

 

Thanks a bunch.

 

Dathan Byonaneby

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Another approach could be the following one:

 

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

data want;
	set have;
	count=0;
	do i=1 to countw(Substrings,",");
		count + count(String, strip(scan(Substrings,i,",")),"i");
	end;
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

Does your dataset look like this:

data have;
infile datalines dlm='|' dsd;
input String :$30. Substrings :$30.;
datalines;
Abc, def, jhi klm, abc,def|Abc, def, klm
Abc,jhi,zem,lmh,skj,abc|Abc,skj
;

?

DathanMD
Obsidian | Level 7
The delimeter in the string is ',"
ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Here is an approach to do this:

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

proc sql noprint;
	select max(countw(Substrings)) into:total trimmed from have;
quit;

data want;
	set have;
	count=0;
	array _substring (&total) $50. _temporary_;
	array _nb_substring (&total) _temporary_;
	do i=1 to dim(_substring);
		_substring(i)=scan(Substrings,i,",");
		_nb_substring(i)=count(String, strip(_substring(i)),"i");
	end;
	count=sum(of _nb_substring(*));
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

ed_sas_member
Meteorite | Level 14

Hi @DathanMD 

Another approach could be the following one:

 

data have;
	infile datalines dlm="09"x;
	input String :$50. Substrings :$50.;
	datalines;
Abc, def, jhi klm, abc,def	Abc, def, klm
Abc,jhi,zem,lmh,skj,abc	Abc,skj
;
run;

data want;
	set have;
	count=0;
	do i=1 to countw(Substrings,",");
		count + count(String, strip(scan(Substrings,i,",")),"i");
	end;
	drop i;
run;

 Capture d’écran 2020-02-25 à 09.18.59.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1082 views
  • 3 likes
  • 3 in conversation