BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yubaraj
Fluorite | Level 6

Hi I want to get surgeon volume within previous 365 days of the operation on a person. 

My objective is to test the hypothesis that if a surgeon performed greater number of operation in the last 1 year then their operation is more likely to succeed.

All I have is records per each service users. The recipient ID can duplicate because if  someone has a operation in first leg then they also can have operation in second leg. I am interested in first operation

data WORK.docs_have(label='Surgery Data');
    input  rcpt_id:32. service_date:date9. physician_id:32.;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 
 1123 13OCT2010 3354 
 1267 18JAN2011 3354 
 1123 15AUG2011 3354
 1145 30NOV2009 2234 
 1323 24JAN2010 2234 
 1122 29JAN2010 2234 
 1122 13SEP2010 2234 

 ;;;;


 data WORK.docs_want(label='Surgery Data_want');
    input  rcpt_id:32. service_date:date9. physician_id:32. surg_vol:12;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 0
 1123 13OCT2010 3354 0
 1267 18JAN2011 3354 1
 1123 15AUG2011 3354 2
 1145 30NOV2009 2234 0
 1323 24JAN2010 2234 1
 1122 29JAN2010 2234 2
 
 
 

 ;;;;

The variables are 

 recipient ID (rcpt_id)  service date (service_date)  and provider ID (physician_id), 
 
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Here's one way, but the results don't match.

The TEMP data set is very close but you have a person 1123 who has been listed twice which you don't want for ID 1122. What's the logic between those two differing criteria? I'm guessing you likely have a different field you need to include in this logic as well. 

 

This code handles them the same. 

 

data WORK.docs_have(label='Surgery Data');

    input  rcpt_id:32. service_date:date9. physician_id:32.;
    ID = _n_;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 
 1123 13OCT2010 3354 
 1267 18JAN2011 3354 
 1123 15AUG2011 3354
 1145 30NOV2009 2234 
 1323 24JAN2010 2234 
 1122 29JAN2010 2234 
 1122 13SEP2010 2234 

 ;;;;
 
proc sql;
create table temp as
select *, 
   (select count(distinct t2.rcpt_Id) 
   from docs_have as t2 
   where t1.physician_id = t2.physician_id 
   and t1.ID ne t2.ID and 
   t2.service_date between t1.service_date - 365 and t1.service_date) as num_procs
from docs_have as t1
order by rcpt_id, ID
;
quit;

data want;
set temp;
by rcpt_id;
if first.rcpt_id;
run;

@yubaraj wrote:

Hi I want to get surgeon volume within previous 365 days of the operation on a person. 

My objective is to test the hypothesis that if a surgeon performed greater number of operation in the last 1 year then their operation is more likely to succeed.

All I have is records per each service users. The recipient ID can duplicate because if  someone has a operation in first leg then they also can have operation in second leg. I am interested in first operation

data WORK.docs_have(label='Surgery Data');
    input  rcpt_id:32. service_date:date9. physician_id:32.;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 
 1123 13OCT2010 3354 
 1267 18JAN2011 3354 
 1123 15AUG2011 3354
 1145 30NOV2009 2234 
 1323 24JAN2010 2234 
 1122 29JAN2010 2234 
 1122 13SEP2010 2234 

 ;;;;


 data WORK.docs_want(label='Surgery Data_want');
    input  rcpt_id:32. service_date:date9. physician_id:32. surg_vol:12;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 0
 1123 13OCT2010 3354 0
 1267 18JAN2011 3354 1
 1123 15AUG2011 3354 2
 1145 30NOV2009 2234 0
 1323 24JAN2010 2234 1
 1122 29JAN2010 2234 2
 
 
 

 ;;;;

The variables are 

 recipient ID (rcpt_id)  service date (service_date)  and provider ID (physician_id), 
 

 data WORK.docs_want(label='Surgery Data_want');
    input  rcpt_id:32. service_date:date9. physician_id:32. surg_vol:12;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 0
 1123 13OCT2010 3354 0
 1267 18JAN2011 3354 1
 1123 15AUG2011 3354 2
 1145 30NOV2009 2234 0
 1323 24JAN2010 2234 1
 1122 29JAN2010 2234 2

View solution in original post

6 REPLIES 6
ballardw
Super User

@yubaraj wrote:

Hi I want to get surgeon volume within previous 365 days of the operation on a person. 

My objective is to test the hypothesis that if a surgeon performed greater number of operation in the last 1 year then their operation is more likely to succeed.

All I have is records per each service users. The recipient ID can duplicate because if  someone has a operation in first leg then they also can have operation in second leg. I am interested in first operation

data WORK.docs_have(label='Surgery Data');
    input  rcpt_id:32. service_date:date9. physician_id:32.;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 
 1123 13OCT2010 3354 
 1267 18JAN2011 3354 
 1123 15AUG2011 3354
 1145 30NOV2009 2234 
 1323 24JAN2010 2234 
 1122 29JAN2010 2234 
 1122 13SEP2010 2234 

 ;;;;


 data WORK.docs_want(label='Surgery Data_want');
    input  rcpt_id:32. service_date:date9. physician_id:32. surg_vol:12;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 0
 1123 13OCT2010 3354 0
 1267 18JAN2011 3354 1
 1123 15AUG2011 3354 2
 1145 30NOV2009 2234 0
 1323 24JAN2010 2234 1
 1122 29JAN2010 2234 2
 
 
 

 ;;;;

The variables are 

 recipient ID (rcpt_id)  service date (service_date)  and provider ID (physician_id), 
 

 

How about walking us through just how the first values of 0, 1 and 2 are derived. Step by step. Because I don't see any thing obvious about why this gets zero

 1356 22NOV2008 3354 0

And this gets 2

 1123 15AUG2011 3354 2

 

yubaraj
Fluorite | Level 6

Hi 

1356 22NOV2008 3354 0

For surgeon_id 3354 surgery on  patient 1356 (rcpt_id) was first so there is no previous surgery by that surgeon  before  365 days of this surgery.

 

1123 13OCT2010 3354 0
This is 0 because the previous surgery by that surgeon was not in last 365 days.
1123 15AUG2011 3354

Sorry, this observation should have deleted on the 'want' database. since this was second surgery on subject 1123.

 so basically, I am interested in first surgery for each subject, and number of first surgeries done in last 365 days by each surgeon.

 

Reeza
Super User

Here's one way, but the results don't match.

The TEMP data set is very close but you have a person 1123 who has been listed twice which you don't want for ID 1122. What's the logic between those two differing criteria? I'm guessing you likely have a different field you need to include in this logic as well. 

 

This code handles them the same. 

 

data WORK.docs_have(label='Surgery Data');

    input  rcpt_id:32. service_date:date9. physician_id:32.;
    ID = _n_;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 
 1123 13OCT2010 3354 
 1267 18JAN2011 3354 
 1123 15AUG2011 3354
 1145 30NOV2009 2234 
 1323 24JAN2010 2234 
 1122 29JAN2010 2234 
 1122 13SEP2010 2234 

 ;;;;
 
proc sql;
create table temp as
select *, 
   (select count(distinct t2.rcpt_Id) 
   from docs_have as t2 
   where t1.physician_id = t2.physician_id 
   and t1.ID ne t2.ID and 
   t2.service_date between t1.service_date - 365 and t1.service_date) as num_procs
from docs_have as t1
order by rcpt_id, ID
;
quit;

data want;
set temp;
by rcpt_id;
if first.rcpt_id;
run;

@yubaraj wrote:

Hi I want to get surgeon volume within previous 365 days of the operation on a person. 

My objective is to test the hypothesis that if a surgeon performed greater number of operation in the last 1 year then their operation is more likely to succeed.

All I have is records per each service users. The recipient ID can duplicate because if  someone has a operation in first leg then they also can have operation in second leg. I am interested in first operation

data WORK.docs_have(label='Surgery Data');
    input  rcpt_id:32. service_date:date9. physician_id:32.;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 
 1123 13OCT2010 3354 
 1267 18JAN2011 3354 
 1123 15AUG2011 3354
 1145 30NOV2009 2234 
 1323 24JAN2010 2234 
 1122 29JAN2010 2234 
 1122 13SEP2010 2234 

 ;;;;


 data WORK.docs_want(label='Surgery Data_want');
    input  rcpt_id:32. service_date:date9. physician_id:32. surg_vol:12;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 0
 1123 13OCT2010 3354 0
 1267 18JAN2011 3354 1
 1123 15AUG2011 3354 2
 1145 30NOV2009 2234 0
 1323 24JAN2010 2234 1
 1122 29JAN2010 2234 2
 
 
 

 ;;;;

The variables are 

 recipient ID (rcpt_id)  service date (service_date)  and provider ID (physician_id), 
 

 data WORK.docs_want(label='Surgery Data_want');
    input  rcpt_id:32. service_date:date9. physician_id:32. surg_vol:12;
   format service_date date9.;
 datalines;
 1356 22NOV2008 3354 0
 1123 13OCT2010 3354 0
 1267 18JAN2011 3354 1
 1123 15AUG2011 3354 2
 1145 30NOV2009 2234 0
 1323 24JAN2010 2234 1
 1122 29JAN2010 2234 2
yubaraj
Fluorite | Level 6
Actually, I was interested in first surgery on a subject. So, when I selected first observation before running your code, it worked.
Thanks
Reeza
Super User
If you do that BEFORE running the code rather than after you may miss surgery counts though. Is that what you want to do, exclude surgeries if a surgeon worked on the same patient multiple times?
yubaraj
Fluorite | Level 6
Yes, I am interested in first surgeries on a patient.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 963 views
  • 2 likes
  • 3 in conversation