Solved: Total count of a variable based on another variable

yellowyellowred · Posted 01-12-2022 05:14 PM

Hi,

I have a dataset. It looks something like this:

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR
789 Marketing
789 Council
123 Marketing
555 HR
123 Artist
555 Director
789 Artist
789 Doctor
;

I want the output to give this:

ID	Number of groups ID belongs in
123	3
555	2
789	4

Note that a given ID may have duplicate group values. ID 123 has "Marketing" and "HR" twice each, but each distinct value contributes only 1 to the "number" total. Can someone show how this can be generated?

Reeza · Posted 01-12-2022 05:17 PM

SQL is the easiest here:

proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;

You could also use a double proc freq or data step.

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas
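
For reference, here is a rough sketch of the two alternatives mentioned above. It is not copied from the linked file, and the dataset names id_group_pairs, want_freq, have_unique and want_ds are made up for illustration.

```sas
/* Alternative 1: double PROC FREQ */
/* First pass: one output row per distinct ID/group combination */
proc freq data=have noprint;
    tables ID*group / out=id_group_pairs;
run;

/* Second pass: count those rows per ID, which is the number of distinct groups */
proc freq data=id_group_pairs noprint;
    tables ID / out=want_freq (drop=percent rename=(count=Number_Groups));
run;

/* Alternative 2: sort with NODUPKEY, then count in a DATA step */
proc sort data=have out=have_unique nodupkey;
    by ID group;
run;

data want_ds;
    set have_unique;
    by ID;
    if first.ID then Number_Groups = 0;  /* reset the counter at the start of each ID */
    Number_Groups + 1;                   /* sum statement: value is retained across rows */
    if last.ID then output;              /* write one row per ID */
    keep ID Number_Groups;
run;
```

Both routes first reduce the data to unique ID/group pairs and then count rows per ID, so they should agree with the count(distinct group) query.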

yellowyellowred · Posted 01-12-2022 05:24 PM

@Reeza thanks. How would I also add one to show the total number of distinct groups? In this case, it is 6. Edit: never mind, I figured it out. It's just

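proc sql;
/* RgstrtnGrpNm is presumably the column name in the real data; in the sample data it is group */
create table total_groups as
select count(distinct RgstrtnGrpNm) as total_count
from have;
quit;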

Reeza · Posted 01-12-2022 05:31 PM

proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID
union 
select "Total" as ID, count(distinct group) as Number_Groups
from have;
quit;

If ID were a character variable, something like this would work.
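
In the sample data ID is read as numeric, so one possible adaptation of the query above is to convert ID to character in the first branch of the union. This is only a sketch: it assumes an 8-character width is enough for the IDs, and ID_Char and want_with_total are illustrative names.

```sas
proc sql;
    create table want_with_total as
    /* Per-ID counts, with the numeric ID converted to left-aligned text */
    select strip(put(ID, 8.)) as ID_Char,
           count(distinct group) as Number_Groups
    from have
    group by calculated ID_Char
    union
    /* One extra row holding the number of distinct groups overall */
    select "Total" as ID_Char,
           count(distinct group) as Number_Groups
    from have;
quit;
```

Converting ID on the fly keeps the original dataset unchanged; the cost is that the ID column in the result is character rather than numeric.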
