BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kmardinian
Quartz | Level 8

Hi, I have a dataset where I have subject ID and site ID as two separate variables and I would like to count the unique subject IDs for each site id.

 

subject            site

10021             644

10021             644

10256             644

10256             644

56985             733

56985             733

45698             733

12659             733

 

And I would like the output dataset to look like this:

 

site           count

644             2

733             3

 

Any help is much appreciated! Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
proc sql;
create table want as
select site, count(unique subject) as count
from have
group by site;
quit;

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20
proc sql;
create table want as
select site, count(unique subject) as count
from have
group by site;
quit;
kmardinian
Quartz | Level 8

Thank you!! This is exactly what I needed!

ed_sas_member
Meteorite | Level 14

Hi @kmardinian 

 

You can do this:

data have;
	input subject site;
	cards;
10021 644
10021 644
10256 644
10256 644
56985 733
56985 733
45698 733
12659 733
;
run;
proc sql;
	create table want as
	select site, count(distinct subject) as count
	from have
	group by site;
run;
Reeza
Super User
A double proc freq is the other option.

proc freq data=have noprint;
table site*subject / out=temp;
run;

proc freq data=temp;
table site / out=want;
run;
Ksharp
Super User

Reeza ,

no need double, just one shot.

 

data have;
	input subject site;
	cards;
10021 644
10021 644
10256 644
10256 644
56985 733
56985 733
45698 733
12659 733
;

ods output nlevels=want;
proc freq data=have nlevels;
by site;
table subject;
run;
novinosrin
Tourmaline | Level 20

if the data is grouped like your sample suggests, a datastep is convenient too

 

data have;
input subject $            site $;
cards;
10021             644
10021             644
10256             644
10256             644
56985             733
56985             733
45698             733
12659             733
;

data want;
set have;
by site subject notsorted;
if first.site then count=1;
else if first.subject then count+1;
if last.site;
drop subject;
run;
novinosrin
Tourmaline | Level 20

Some HASH teaser 🙂 

 

data have;
input subject $            site $;
cards;
10021             644
10256             644
10021             644
10256             644
56985             733
56985             733
45698             733
12659             733
;


data want;
if _n_=1 then do;
   dcl hash H () ;
   h.definekey  ("subject") ;
   h.definedata ("subject") ;
   h.definedone () ;
 end;
 do until(last.site);
  set have;
  by site;
  h.ref();
 end;
 count=h.num_items;
 h.clear();
 drop subject;
run;

 

 

novinosrin
Tourmaline | Level 20

Key indexing paint brush . This assumes subjectid is numeric or char with only digit characters

 

data have;
input subject $            site $;
cards;
10021             644
10256             644
10021             644
10256             644
56985             733
56985             733
45698             733
12659             733
;

data want;
array t(-1000000:1000000)_temporary_;
do until(last.site);
  set have;
  by site;
  t(input(subject,32.))=1;
end;
count=n(of t(*));
call missing(of t(*));
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 5869 views
  • 2 likes
  • 5 in conversation