herd cow_id
10 101
10 101
10 101
10 101
10 101
10 101
10 101
10 102
10 102
10 102
10 102
10 103
10 104
10 104
10 104
10 105
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 107
11 107
11 108
11 108
11 108
11 108
11 108
11 108
11 108
I just wanted to count the number of cow_id for each herd that I have in my dataset.
Data named: Fatd and I have variable herd and coe_id, I want to delete all herds that have less than 5 cow_ids.
If you look at this data, Herd 10 has 5 cows ID (101, 102, 103, 104, 105). the repetition 7 times of cow_id 101 mean this cow has 7 times of measurements, so in my question, I don't care about the repetition, I'm just looking for cow_id. And herd 11 has 3 cows ID (106, 107, 108), so in this case, I just want to keep herd 10 and delete herd 11, it does not matter how much each cow had been repeated.
I just care about the number of cows in each herd not about the number of measurements. (Repeated cow_id is a different time of measurements).
I hope you get my idea this time
Regards
Ibrahim
Something like this?
data have;
input herd cow_id;
datalines;
10 101
10 101
10 101
10 101
10 101
10 101
10 101
10 102
10 102
10 102
10 102
10 103
10 104
10 104
10 104
10 105
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 107
11 107
11 108
11 108
11 108
11 108
11 108
11 108
11 108
;
proc sql;
create table want as
select * from have
where herd in
(select distinct herd from have group by herd having count(distinct cow_id) >= 5);
quit;
Something like this?
data have;
input herd cow_id;
datalines;
10 101
10 101
10 101
10 101
10 101
10 101
10 101
10 102
10 102
10 102
10 102
10 103
10 104
10 104
10 104
10 105
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 107
11 107
11 108
11 108
11 108
11 108
11 108
11 108
11 108
;
proc sql;
create table want as
select * from have
where herd in
(select distinct herd from have group by herd having count(distinct cow_id) >= 5);
quit;
My dataset over million, how I can put them in the code, I'm sorry about this but I'm new with SAS.
REGARDS
You don't.
You reference your SAS data set. We use cards/datalines to create example data to test the code.
@Barkamih wrote:
My dataset over million, how I can put them in the code, I'm sorry about this but I'm new with SAS.
REGARDS
proc sort data=....;
by herd cow_id;
run;
data ....
retain count;
set ....;
by herd cow_id;
if first.herd then count=1;
if first.cow_id then count=count+1;
run;
does that work?
NOT WORKING
@Barkamih wrote:
NOT WORKING
When someone says "NOT WORKING", this gives us NO information on what has gone wrong, and so there's no way for us to help further.
Give us information.
Show us your code
Show us your SASLOG
Show us a part of the results that indicate a problem.
Sorry about this, I was so stressful because I have a deadline soon.
this id the error
1273
1274
1275
1276 data FATD
1277
1278 retain count;
------
56
ERROR 56-185: RETAIN is not allowed in the DATA statement when option DATASTMTCHK=COREKEYWORDS.
Check for a missing semicolon in the DATA statement, or use DATASTMTCHK=NONE.
1279
1280 set FATD;
1281
1282 by herd cow_id;
1283
1284 if first.herd then count=1;
1285
1286 if first.cow_id then count=count+1;
1287
1288 run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
@Barkamih wrote:
1276 data FATD
1277
1278 retain count;
Put a semi-colon after data FATD
Yes I found that
but Nothing change in data as this log showing
408 proc sort data=FATD;
409
410 by herd cow_id;
411
412 run;
NOTE: There were 1519417 observations read from the data set WORK.FATD.
NOTE: The data set WORK.FATD has 1519417 observations and 55 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 3.11 seconds
cpu time 1.86 seconds
413
414
415
416 data FATD;
417
418 retain count;
419
420 set FATD;
421
422 by herd cow_id;
423
424 if first.herd then count=1;
425
426 if first.cow_id then count=count+1;
427
428 run;
NOTE: There were 1519417 observations read from the data set WORK.FATD.
NOTE: The data set WORK.FATD has 1519417 observations and 55 variables.
NOTE: DATA statement used (Total process time):
real time 9.37 seconds
cpu time 1.71 seconds
That what I used to solve my issue!!!!!!
data have;
input herd cow_id;
cards;
10 101
10 101
10 101
10 101
10 101
10 101
10 101
10 102
10 102
10 102
10 102
10 103
10 104
10 104
10 104
10 105
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 106
11 107
11 107
11 108
11 108
11 108
11 108
11 108
11 108
11 108
;
proc sql;
create table want as
select *
from have
group by herd
having count(distinct cow_id)>=5;
quit;
Thank you every one.
my best regards to all of you
Brakmih
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.