Hi,
I have a dataset. It looks something like this:
data have;
input ID group $;
datalines;
123 Marketing
123 HR
123 HR
789 Marketing
789 Council
123 Marketing
555 HR
123 Artist
555 Director
789 Artist
789 Doctor
;
I want the output to give this:
ID | Number of groups ID belongs in |
123 | 3 |
555 | 1 |
789 | 5 |
Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?
SQL is the easiest here:
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;
You could also use a double proc freq or data step.
https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.
@yellowyellowred wrote:
Hi,
I have a dataset. It looks something like this:data have; input ID group $; datalines; 123 Marketing 123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;I want the output to give this:
ID Number of groups ID belongs in 123 3 555 1 789 5
Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?
SQL is the easiest here:
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID;
quit;
You could also use a double proc freq or data step.
https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas.
@yellowyellowred wrote:
Hi,
I have a dataset. It looks something like this:data have; input ID group $; datalines; 123 Marketing 123 HR
123 HR 789 Marketing 789 Council 123 Marketing
555 HR 123 Artist 555 Director
789 Artist
789 Doctor ;I want the output to give this:
ID Number of groups ID belongs in 123 3 555 1 789 5
Note that for a given ID, they may have duplicate Group values. ID 123 has "Marketing" and "HR" twice, but they only contribute 1 to the "number" total. Can someone show how this can be generated?
@Reezathanks. how would I also add one to show the total number of distinct groups? In this case, it is 6. Edit: never mind I figured it out. It's just
proc sql;
create table total_groups as
select count(distinct RgstrtnGrpNm) as total_count
from have
run;
proc sql;
create table want as
select ID, count(distinct group) as Number_Groups
from have
group by ID
union
select "Total" as ID, count(distinct group) as Number_Groups
from have;
quit;
If ID was character something like this would work
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.