count number of specific string in a char variable

Solved
Frequent Contributor
Posts: 87


Hi,

I have a dataset with many rows and a character variable that contains a list of numbers separated by colons (:), for example:

41:31:53:52:32:33:34:35:36:54:53:52:51:61:48:47:46:45:61:62

I need to count how many of the numbers in the string belong to each of two groups, to create a classification like this:

Group1: the number is in (48,47,46,45,44,43,42,41,31,32,33,34,35,36), and I need the total number of matches found in the string.

Group2: the number is in (55,54,53,52,51,61,62,63,64), and I need the total number of matches found in the string.

So the result is Group1=11 and Group2=7.

Any ideas how to do it?

Accepted Solutions
Solution
‎11-03-2017 05:45 PM
Super User
Posts: 13,887

Re: count number of specific string in a char variable

Show some example input and the desired output for that input.

It is not at all clear what you are summing.

When I look at your group 2 example

```
41:31:53:52:32:33:34:35:36:54:53:52:51:61:48:47:46:45:61:62
value  occurrences
55     0
54     1
53     2
52     2
51     1
61     2
62     1
63     0
64     0
```

I get a total of 9 occurrences of the values or 6 values found. So how do you get 7?

This gets the counts of the substrings in group2 as an example:

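```
data example;
   infile datalines truncover;
   informat string $100.;
   input string;
   /* Group 2 values to look for */
   array g {9} $ _temporary_ ("55" "54" "53" "52" "51" "61" "62" "63" "64");
   /* gcount1-gcount9 hold the occurrences of each value in string */
   array gcount {9};
   do i = 1 to dim(g);
      /* 'i' ignores case, 't' trims trailing blanks from both arguments */
      gcount[i] = count(string, g[i], 'it');
   end;
   drop i;
datalines;
41:31:53:52:32:33:34:35:36:54:53:52:51:61:48:47:46:45:61:62
;
run;
```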

gcount1 will hold the number of times 55 occurs in the string variable, and so on through gcount9, which will hold the count for 64.

The sum of the counts could be:

grp2sum= sum(of gcount(*));

You would need a separate temporary array and separate counting array (unless you are VERY careful) to use this approach for both groups.
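
As a minimal sketch of that two-array variant (not tested against the original poster's data): the names `both_groups`, `g1`, `g2`, `group1` and `group2` are made up for illustration, and the code assumes every token in the string is a two-digit number, so a two-character match can never straddle a colon.

```
data both_groups;
   infile datalines truncover;
   input string $100.;
   /* One temporary lookup array per group (hypothetical names) */
   array g1 {14} $ 2 _temporary_
      ("48" "47" "46" "45" "44" "43" "42" "41" "31" "32" "33" "34" "35" "36");
   array g2 {9} $ 2 _temporary_
      ("55" "54" "53" "52" "51" "61" "62" "63" "64");
   group1 = 0;
   group2 = 0;
   /* Add up the occurrences of every Group1 value */
   do i = 1 to dim(g1);
      group1 = group1 + count(string, g1[i], 't');
   end;
   /* Same for Group2, using its own array */
   do i = 1 to dim(g2);
      group2 = group2 + count(string, g2[i], 't');
   end;
   drop i;
datalines;
41:31:53:52:32:33:34:35:36:54:53:52:51:61:48:47:46:45:61:62
;
run;
```

For the sample row this should give group1=11 and group2=9, which matches the occurrence table above rather than the 7 stated in the question.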

All Replies
PROC Star
Posts: 548

Re: count number of specific string in a char variable

Something like this; you can then extend the idea to the other group:

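```
data test;
   length z $8;
   a = '41:31:53:52:32:33:34:35:36:54:53:52:51:61:48:47:46:45:61:62';
   b = '48,47,46,45,44,43,42,41,31,32,33,34,35,36';
   group1 = 0;
   /* Walk through the colon-separated tokens until SCAN returns a blank */
   do i = 1 by 1 until (z = '');
      z = scan(a, i, ':');
      /* Count (rather than add up the values of) tokens found in the Group1 list */
      if z ne '' and findw(b, strip(z), ',') then group1 + 1;
   end;
   drop i z;
run;
```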

Frequent Contributor
Posts: 87

Re: count number of specific string in a char variable

This is the solution that works for me on my own data, and it is a very nice and elegant one as well. Thank you very much.

PROC Star
Posts: 618

Re: count number of specific string in a char variable

Hi,

In addition, you could also look at PRXCHANGE combined with COUNT:

DATA test;
string="41:31:53:52:32:33:34:35:36:54:53:52:51:61:48:47:46:45:61:62";
group1="48,47,46,45,44,43,42,41,31,32,33,34,35,36";
group2="55,54,53,52,51,61,62,63,64";
Count_Group1=COUNT(prxchange("s/(48|47|46|45|44|43|42|41|31|32|33|34|35|36)/G1/i",-1,string),"G1");
Count_Group2=COUNT(prxchange("s/(55|54|53|52|51|61|62|63|64)/G2/i",-1,string),"G2");

run;

Thanks,
Suryakiran