DATA Step, Macro, Functions and more

Count by group

Accepted Solution Solved
Reply
Regular Contributor
Posts: 181
Accepted Solution

Count by group

I want to calculate maximum number of times a value appear by group. For example, in the table below, 11 appears thrice in a group value of 1 and 22 appears twice in a group value of 1. So i want 1 11 in a row as 11 appears maximum times. Same logic holds for a ID value of 2. In that case, 12 22 should come out in a row. In the final output table, i need 2 rows as explained above.

 

ID ID2
1 11
1 11
1 11
1 22
1 22
2 12
2 12
2 12
2 16

Accepted Solutions
Solution
‎06-17-2016 02:13 PM
Respected Advisor
Posts: 3,124

Re: Count by group

Here is quick & dirty SQL:

proc sql;
select distinct id,id2 from
(select id,id2,count(*) as ct from have group by id,id2) 
group by id
having ct=max(ct)
;
quit;

View solution in original post


All Replies
Super User
Posts: 10,500

Re: Count by group

What do you want to do if two or more values of ID2 appear the same number of times such as:

ID ID2
1 11
1 11
1 11
1 22
1 22
1 22
2 12
2 12
2 16

Would you want 1 11 or 1 22? And the logic between picking the two?

(NOTE: I modified your data so 1 11 and 1 22 both appear 3 times)

Solution
‎06-17-2016 02:13 PM
Respected Advisor
Posts: 3,124

Re: Count by group

Here is quick & dirty SQL:

proc sql;
select distinct id,id2 from
(select id,id2,count(*) as ct from have group by id,id2) 
group by id
having ct=max(ct)
;
quit;
Regular Contributor
Posts: 181

Re: Count by group

@ballardw: I want both the cases if they appear equal number of times. Thanks! @Haikuo: Thanks a ton. Do you suggest the data step method for the same?
Respected Advisor
Posts: 3,124

Re: Count by group

It is possible to achieve the same results using data step, Hash or DOW can be utilized. The following code is an example using 2XDOW:

 

data have;
	input ID ID2;
	cards;
1 11 
1 11 
1 11 
1 22 
1 22 
1 22 
2 12 
2 12 
2 16 
;

data want;
	do until (last.id);
		set have;
		by id id2;

		if first.id then
			call missing (_id,_id2);

		if first.id2 then
			do;
				_id=max(_id,_id2);
				_id2=1;
			end;
		else _id2+1;
	end;

	do until (last.id);
		set have;
		by id id2;

		if first.id2 then
			do;
				_id2=1;
			end;
		else _id2+1;

		if last.id2 and _id2=_id then
			output;
	end;

	drop _:;
run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 311 views
  • 1 like
  • 3 in conversation