BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yellowyellowred
Obsidian | Level 7

Hi,   
  
I have a dataset. It looks something like this:  

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;

I want the output to give this:

IDNumber of groups ID belongs in
1233
5551
7895

 

Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

SQL is the easiest here:

 

proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;

You could also use a double proc freq or data step. 

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.

 


@yellowyellowred wrote:

Hi,   
  
I have a dataset. It looks something like this:  

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;

I want the output to give this:

ID Number of groups ID belongs in
123 3
555 1
789 5

 

Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?


 

View solution in original post

3 REPLIES 3
Reeza
Super User

SQL is the easiest here:

 

proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;

You could also use a double proc freq or data step. 

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.

 


@yellowyellowred wrote:

Hi,   
  
I have a dataset. It looks something like this:  

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;

I want the output to give this:

ID Number of groups ID belongs in
123 3
555 1
789 5

 

Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?


 

yellowyellowred
Obsidian | Level 7

@Reezathanks. how would I also add one to show the total number of distinct groups? In this case, it is 6. Edit: never mind I figured it out. It's just  

proc sql;
create table total_groups as
select count(distinct RgstrtnGrpNm) as total_count
from have
run;
Reeza
Super User
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID
union 
select "Total" as ID, count(distinct group) as Number_Groups
from have;
quit;

If ID was character something like this would work

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 2631 views
  • 3 likes
  • 2 in conversation