BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yellowyellowred
Obsidian | Level 7

Hi,   
  
I have a dataset. It looks something like this:  

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;

I want the output to give this:

IDNumber of groups ID belongs in
1233
5551
7895

 

Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

SQL is the easiest here:

 

proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;

You could also use a double proc freq or data step. 

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.

 


@yellowyellowred wrote:

Hi,   
  
I have a dataset. It looks something like this:  

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;

I want the output to give this:

ID Number of groups ID belongs in
123 3
555 1
789 5

 

Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?


 

View solution in original post

3 REPLIES 3
Reeza
Super User

SQL is the easiest here:

 

proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;

You could also use a double proc freq or data step. 

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.

 


@yellowyellowred wrote:

Hi,   
  
I have a dataset. It looks something like this:  

data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;

I want the output to give this:

ID Number of groups ID belongs in
123 3
555 1
789 5

 

Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?


 

yellowyellowred
Obsidian | Level 7

@Reezathanks. how would I also add one to show the total number of distinct groups? In this case, it is 6. Edit: never mind I figured it out. It's just  

proc sql;
create table total_groups as
select count(distinct RgstrtnGrpNm) as total_count
from have
run;
Reeza
Super User
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID
union 
select "Total" as ID, count(distinct group) as Number_Groups
from have;
quit;

If ID was character something like this would work

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1789 views
  • 3 likes
  • 2 in conversation