BookmarkSubscribeRSS Feed
csessa3
Calcite | Level 5

Hello,

 

I have a dataset that has doctors and the various practices they work in. Each doctor in my dataset works in at least 1 practice but as many as 17 different practices. I would like to know the unique number of doctors each one works with. 

 

Example dataset below. This sample shows that Doctor A is in practices, P1, P3, and P5. Doctor E is in practices P1, P2, and P5,etc.

 

DoctorTot_in_groupgrou_practice
A2P1
E2P1
C2P2
B2P2
A3P3
D3P3
E3P3
E2P5
A2P5

 

From this chart I would want a new column with the total number of unique doctors each one works with. In this case Doctor A works with 2 other doctors (E & D.) However, if I simply grouped by doctor and summed, I find that Doctor A works with 6 Doctors. However this is wrong because it would count Doctor A 3 times (once for each practice he is listed in) AND it would count Doctor E twice (he is in two group practices with Doctor A, P1 & P5)

 

I have ~ 800,000 doctors with ~400,000 group practices making manual methods unfeasible. Does any one have any suggestions on how to get this started?

 

sample data set code

data test;
input doctor $ tot_in_group group_practices $;
datalines;
A 2 P1
E 2 P1
C 3 P2
B 3 P2
E 3 P2
A 2 P3
D 2 P3
E 2 P5
A 2 P5
;
run;

 

 

6 REPLIES 6
novinosrin
Tourmaline | Level 20

Hi @csessa3  Not sure if I understand your requirement thoroughly. Are you after this by any chance?

 



data test;
input doctor $ tot_in_group group_practices $;
datalines;
A 2 P1
E 2 P1
C 3 P2
B 3 P2
E 3 P2
A 2 P3
D 2 P3
E 2 P5
A 2 P5
;
run;

proc sql;
create table want as
select a.doctor,count(distinct b.doctor)-1 as count
from test a left join test b
on a.group_practices=b.group_practices
group by a.doctor;
quit;
doctor count
A 2
B 2
C 2
D 1
E 3

 

csessa3
Calcite | Level 5

Hello, thank you for taking time to review my question.

 

The code below is close as it adjusts the count to not double count the individual Doctor (Doctor E is listed in 3 practices so it will subtract 3 for that.) However, does not adjust for when the same doctors are in multiple practices together.

 

For example, Doctor A only works with 1 other doctor (Doctor E.) It just so happens that Doctor A and Doctor E work in 2 practices together (P1, P2) In contrast, Doctor E works with 2 other doctors, Doctor A and Doctor D.

 

The final output would look like:

DoctorUnique Doctors Work with
A1
B2
C2
D1
E2
novinosrin
Tourmaline | Level 20

Can you explain 

D 2

 

or should it be 

D 1

?

csessa3
Calcite | Level 5

I'm sorry, that was a typo on my part, I will see if I can edit my previous comment.

 

Doctor D should be listed as working with 1 other docotor

novinosrin
Tourmaline | Level 20

Hi @csessa3  No worries. I am afraid perhaps not the best sleep last night or whatever I am not proud of the fact my comprehension is slow and slippery today. Again, not stating it is now 100%, but I am thinking the following should be close-


data test;
input doctor $ tot_in_group group_practices $;
datalines;
A 2 P1
E 2 P1
C 3 P2
B 3 P2
E 3 P2
A 2 P3
D 2 P3
E 2 P5
A 2 P5
;
run;

proc sql;
create table want as
select doctor, count(bd) as unique_count
from
(select a.doctor,b.doctor as bd
from test a left join test b
on a.group_practices=b.group_practices and a.doctor ne b.doctor
group by a.doctor,b.doctor
having count(*)=1)
group by doctor;
quit;
doctor unique_count
A 1
B 2
C 2
D 1
E 2
ballardw
Super User

@csessa3 wrote:

Hello, thank you for taking time to review my question.

 

The code below is close as it adjusts the count to not double count the individual Doctor (Doctor E is listed in 3 practices so it will subtract 3 for that.) However, does not adjust for when the same doctors are in multiple practices together.

 

For example, Doctor A only works with 1 other doctor (Doctor E.) It just so happens that Doctor A and Doctor E work in 2 practices together (P1, P2) In contrast, Doctor E works with 2 other doctors, Doctor A and Doctor D.

 

The final output would look like:

Doctor Unique Doctors Work with
A 1
B 2
C 2
D 1
E 2

I see doctor A works in P3 with doctor D as well as E in two practices. So how does A have a 1 for unique doctors. E share practices with A, B and C. I am afraid that I don't quite understand your definition of "unique" or what you are actually counting.

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1223 views
  • 1 like
  • 3 in conversation