Solved
Contributor
Posts: 67

# count repeated variables by group

``````herd                     cow_id
10                           101
10                           101
10                           101
10                           101
10                           101
10                           101
10                           101
10                           102
10                           102
10                           102
10                           102
10                           103
10                           104
10                           104
10                           104
10                           105
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           107
11                           107
11                           108
11                           108
11                           108
11                           108
11                           108
11                           108
11                           108 ``````

I just wanted to count the number of cow_id for each herd that I have in my dataset.

Data named: Fatd  and    I have variable herd and coe_id,  I want to delete all herds that have less than 5 cow_ids.

If you look at this data, Herd 10 has 5 cows ID (101, 102, 103, 104, 105). the repetition 7 times of  cow_id 101 mean this cow has 7 times of measurements, so in my question, I don't care about the repetition, I'm just looking for cow_id.   And herd 11 has 3 cows ID (106, 107, 108), so in this case, I just want to keep herd 10 and delete herd 11, it does not matter how much each cow had been repeated.

I just care about the number of cows in each herd not about the number of measurements.  (Repeated cow_id is a different time of measurements).

I hope you get my idea this time

Regards

Ibrahim

Accepted Solutions
Solution
‎06-20-2018 04:49 PM
PROC Star
Posts: 1,287

## Re: count repeated variables by group

Something like this?

``````data have;
input herd cow_id;
datalines;
10 101
10 101
10 101
10 101
10 101
10 101
10 101
10 102
10 102
10 102
10 102
10 103
10 104
10 104
10 104
10 105
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 107
11 107
11 108
11 108
11 108
11 108
11 108
11 108
11 108
;

proc sql;
create table want as
select * from have
where herd in
(select distinct herd from have group by herd having count(distinct cow_id) >= 5);
quit;
``````

All Replies
Solution
‎06-20-2018 04:49 PM
PROC Star
Posts: 1,287

## Re: count repeated variables by group

Something like this?

``````data have;
input herd cow_id;
datalines;
10 101
10 101
10 101
10 101
10 101
10 101
10 101
10 102
10 102
10 102
10 102
10 103
10 104
10 104
10 104
10 105
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 107
11 107
11 108
11 108
11 108
11 108
11 108
11 108
11 108
;

proc sql;
create table want as
select * from have
where herd in
(select distinct herd from have group by herd having count(distinct cow_id) >= 5);
quit;
``````
Contributor
Posts: 67

## Re: count repeated variables by group

My dataset over million, how I can put them in the code, I'm sorry about this but I'm new with SAS.

REGARDS

Super User
Posts: 23,778

## Re: count repeated variables by group

You don't.

You reference your SAS data set. We use cards/datalines to create example data to test the code.

@Barkamih wrote:

My dataset over million, how I can put them in the code, I'm sorry about this but I'm new with SAS.

REGARDS

Regular Contributor
Posts: 167

## Re: count repeated variables by group

proc sort data=....;

by herd cow_id;

run;

data ....

retain count;

set ....;

by herd cow_id;

if first.herd then count=1;

if first.cow_id then count=count+1;

run;

does that work?

--------------
blog: papersandprograms.com
Contributor
Posts: 67

NOT WORKING

Posts: 3,069

## Re: count repeated variables by group

[ Edited ]

@Barkamih wrote:

NOT WORKING

When someone says "NOT WORKING", this gives us NO information on what has gone wrong, and so there's no way for us to help further.

Give us information.

Show us a part of the results that indicate a problem.

--
Paige Miller
Contributor
Posts: 67

## Re: count repeated variables by group

this id the error

1273
1274
1275
1276 data FATD
1277
1278 retain count;
------
56
ERROR 56-185: RETAIN is not allowed in the DATA statement when option DATASTMTCHK=COREKEYWORDS.
Check for a missing semicolon in the DATA statement, or use DATASTMTCHK=NONE.

1279
1280 set FATD;
1281
1282 by herd cow_id;
1283
1284 if first.herd then count=1;
1285
1286 if first.cow_id then count=count+1;
1287
1288 run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

Super User
Posts: 23,778

## Re: count repeated variables by group

You're missing a semicolon.
Posts: 3,069

## Re: count repeated variables by group

@Barkamih wrote:

1276 data FATD
1277
1278 retain count;

Put a semi-colon after data FATD

--
Paige Miller
Contributor
Posts: 67

## Re: count repeated variables by group

Yes I found  that

but Nothing change in data as this log showing

408 proc sort data=FATD;
409
410 by herd cow_id;
411
412 run;

NOTE: There were 1519417 observations read from the data set WORK.FATD.
NOTE: The data set WORK.FATD has 1519417 observations and 55 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 3.11 seconds
cpu time 1.86 seconds

413
414
415
416 data FATD;
417
418 retain count;
419
420 set FATD;
421
422 by herd cow_id;
423
424 if first.herd then count=1;
425
426 if first.cow_id then count=count+1;
427
428 run;

NOTE: There were 1519417 observations read from the data set WORK.FATD.
NOTE: The data set WORK.FATD has 1519417 observations and 55 variables.
NOTE: DATA statement used (Total process time):
real time 9.37 seconds
cpu time 1.71 seconds

PROC Star
Posts: 1,846

## Re: count repeated variables by group

@Barkamih  May i ask why haven't you tried @draycut sql solution in the first place ?

Contributor
Posts: 67

## Re: count repeated variables by group

That what I used to solve my issue!!!!!!

PROC Star
Posts: 1,846

## Re: count repeated variables by group

``````data have;
input herd                     cow_id;
cards;
10                           101
10                           101
10                           101
10                           101
10                           101
10                           101
10                           101
10                           102
10                           102
10                           102
10                           102
10                           103
10                           104
10                           104
10                           104
10                           105
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           106
11                           107
11                           107
11                           108
11                           108
11                           108
11                           108
11                           108
11                           108
11                           108
;

proc sql;
create table want as
select *
from have
group by herd
having  count(distinct cow_id)>=5;
quit;``````
Contributor
Posts: 67

## Re: count repeated variables by group

Thank you every one.

my  best regards to all of you

Brakmih

☑ This topic is solved.