Hi,
I have a dataset. It looks something like this:
data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR
789 Marketing
789 Council
123 Marketing
555 HR
123 Artist
555 Director
789 Artist
789 Doctor
;
I want the output to give this:
ID | Number of groups ID belongs in |
123 | 3 |
555 | 1 |
789 | 5 |
Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?
SQL is the easiest here:
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;
You could also use a double proc freq or data step.
https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.
@yellowyellowred wrote:
Hi,
I have a dataset. It looks something like this:data have; input ID group $; datalines; 123 Marketing 123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;I want the output to give this:
ID Number of groups ID belongs in 123 3 555 1 789 5
Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?
SQL is the easiest here:
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;
You could also use a double proc freq or data step.
https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.
@yellowyellowred wrote:
Hi,
I have a dataset. It looks something like this:data have; input ID group $; datalines; 123 Marketing 123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;I want the output to give this:
ID Number of groups ID belongs in 123 3 555 1 789 5
Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?
@Reezathanks. how would I also add one to show the total number of distinct groups? In this case, it is 6. Edit: never mind I figured it out. It's just
proc sql;
create table total_groups as
select count(distinct RgstrtnGrpNm) as total_count
from have
run;
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID
union
select "Total" as ID, count(distinct group) as Number_Groups
from have;
quit;
If ID was character something like this would work
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.