## counting occurrences over variables

Hi,

suppose I have the following data:

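| var1 | var2 | var3 | var4 | var5 | var6 |
|------|------|------|------|------|------|
| a    | a    | a    | b    | b    | b    |
| a    | b    | a    | b    | a    | b    |
| b    | b    | b    | a    | a    | a    |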

What I would like to obtain is a new data set containing only those rows of the data set above in which the value "a" occurs at least 2 times across the variables var1 to var4.

So here I would end up with the 1st row, because "a" occurs 3 times within var1 - var4, and the 2nd row, because it occurs 2 times.

Thank you!

Accepted Solutions
Solution
‎01-20-2017 08:25 PM
PROC Star
Posts: 8,169

## Re: counting occurrences over variables

Here is one way to do it:

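```
/* Sample data: six single-character variables */
data have;
input (var1-var6) ($);
cards;
a a a b b b
a b a b a b
b b b a a a
;

data want;
set have;
/* Concatenate var1-var4 and count the 'a' characters in the result */
if countc(catt(of var1-var4),'a') ge 2 then output;
run;
```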

HTH,

Art

All Replies
Super User
Posts: 6,785

## Re: counting occurrences over variables

In real life, is the key string only 1 character long?  If so, Art's solution works just fine.  If not, you may need to rethink your approach.  Another possibility:

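```
data want;
set have;
array vars {6} var1-var6;
/* Reset the counter for every row */
totcount = 0;
/* Only the first four variables are checked */
do _n_=1 to 4;
if vars{_n_}='a' then totcount + 1;
end;
drop totcount;
/* Keep rows where 'a' occurs at least twice */
if totcount >= 2;
run;
```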

Other details might matter.  Does capitalization play a role?  Do partial words count?  For example, if you are searching for "cancer" would "cancerous" count?  The more details you can supply, the better the answer you will receive.
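To make those questions concrete, here is a minimal sketch of how the array approach might be adapted to a longer key string. The data set `have_words`, its values, and the counters `exact` and `partial` are hypothetical and only serve as an illustration. `exact` counts case-insensitive whole-value matches, while `partial` also counts values that merely contain the key (so "cancerous" is included).

```sas
/* Hypothetical sample data with multi-character values */
data have_words;
  input (var1-var4) (:$12.);
  cards;
cancer Cancer benign cancerous
benign cancerous benign benign
CANCER benign cancer benign
;

data want_words;
  set have_words;
  array vars {4} var1-var4;
  exact = 0;    /* whole-value matches, ignoring case */
  partial = 0;  /* values that contain the key anywhere, ignoring case */
  do i = 1 to dim(vars);
    if upcase(vars{i}) = 'CANCER' then exact + 1;
    if find(vars{i}, 'cancer', 'i') > 0 then partial + 1;
  end;
  drop i;
  /* Keep rows with at least two whole-value matches */
  if exact >= 2;
run;
```

Whether to subset on `exact` or on `partial` depends on the answers to the questions above.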

Super User
Posts: 9,599

## Re: counting occurrences over variables

A slight change to @art297's great example should cover words also:

```
data have;
input (var1-var6) ($);
cards;
a a a b b b
a b a b a b
b b b a a a
;
run;

data want;
set have;
sum_of_a=6 - countw(tranwrd(catx(',',of var:),"a",""),",","blank");
run;
```

The idea is to remove every "a" value, count the values that are left, and subtract that count from the total number of variables (6). If you know the other possible values, you could of course go the other way and count those instead.
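As a possible alternative, and not the method used in the posts above, whole values can also be counted directly. The sketch below assumes that no value contains the `|` character. It wraps each of var1 to var4 in `|` delimiters and counts occurrences of `|a|`, so a longer value that merely contains "a" is not counted.

```sas
data have;
  input (var1-var6) ($);
  cards;
a a a b b b
a b a b a b
b b b a a a
;

data want;
  set have;
  /* catx joins the values as a||a||a||b; cats adds the outer delimiters, */
  /* so every value is surrounded by its own pair of '|' characters.      */
  sum_of_a = count(cats('|', catx('||', of var1-var4), '|'), '|a|');
  /* Keep rows where the whole value 'a' occurs at least twice */
  if sum_of_a >= 2;
run;
```

Doubling the delimiter in `catx` gives each value its own opening and closing `|`, which avoids adjacent matches sharing a delimiter.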

☑ This topic is solved.