Solved: adjusting for a covariate, maybe clustering?

Kyra · Posted 04-15-2020 11:27 PM

Hi,

I am studying the association between an outcome, the dose of medicine prescribed to patients (a continuous variable), and several predictors: age, ethnicity, smoking status, insurance status, and the weekday on which the medicine was prescribed. The main goal is to see whether the prescribed dose is associated with the day of the week, for example whether higher doses are prescribed over the weekend.

I used a linear regression model. Reviewers say that this association might be due to specific surgeons operating on specific days of the week. The dataset includes the patients of 90 surgeons.

Please let me know how I should handle this question.

Thank you very much in advance!

PaigeMiller · Posted 04-16-2020 08:23 AM

I would add surgeon to the model; in a perfect world, this would account for the effect of surgeon on the prescribed amount of a drug. The problem in the imperfect real world is that with 90 surgeons, you need a lot of data to find real effects (and maybe you have a lot of data). But again, in a perfect world, the effect of weekend vs. weekday then has the effect of surgeon removed.

I don't see clustering as a possible solution here.

--
Paige Miller
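
A minimal sketch of what adding surgeon to the model could look like in PROC GLM. The data set name (rx) and all variable names are assumptions for illustration; they do not come from the original post. Note that once surgeon is a fixed CLASS effect, covariates that are constant within a surgeon (for example surgeon gender) can no longer be estimated separately.

```sas
/* Hypothetical data set WORK.RX, one row per prescription.          */
/* All variable names below are assumed, not taken from the thread.  */
proc glm data=rx;
   class weekday ethnicity smoking insurance surgeon_id;
   /* surgeon_id enters as a fixed effect (89 df for 90 surgeons) */
   model dose = weekday age ethnicity smoking insurance surgeon_id
         / solution ss3;
   /* adjusted mean dose per weekday, with pairwise comparisons */
   lsmeans weekday / pdiff adjust=tukey;
run;
quit;
```

The Type III test for weekday in this fit is then adjusted for surgeon, which is the comparison the reviewers appear to be asking about.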

Reeza · Posted 04-16-2020 12:31 AM

What is the N? Can you add surgeon as a factor in the model? Or specifically add indicators for surgeons who do weekend/night versus day shifts or however your shifts are assigned.

SteveDenham · Posted 04-16-2020 08:15 AM

Well, the first thing I would do is look at the distribution of surgeons by day of the week and identify those who are "weekend" surgeons and those who are "weekday" surgeons, which is what @Reeza is suggesting. Then I would add this to the mix of variables that you have in your model. I presume that because some of your variables are categorical, you are using something like GLM or MIXED to do your linear regression. I make this assumption because a regression that treats ethnicity as a continuous variable is probably going to give different results depending on the values assigned. Is this a correct assumption?

SteveDenham
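
One way to look at the distribution of surgeons by day of the week is sketched below. It assumes a hypothetical data set rx with a surgeon identifier (surgeon_id) and a numeric weekday variable coded as the WEEKDAY function codes it (1 = Sunday, 7 = Saturday); none of these names come from the thread.

```sas
/* Cross-tabulate surgeon by weekday; row percentages give each */
/* surgeon's share of prescriptions falling on each day.        */
proc freq data=rx;
   tables surgeon_id*weekday / nopercent nocol;
run;

/* Per-surgeon share of weekend prescriptions, largest first. */
/* The comparison (weekday in (1, 7)) evaluates to 0 or 1.    */
proc sql;
   create table surgeon_mix as
   select surgeon_id,
          count(*) as n_rx,
          mean(weekday in (1, 7)) as pct_weekend format=percent8.1
   from rx
   group by surgeon_id
   order by pct_weekend desc;
quit;
```

Surgeons near the top of surgeon_mix would be candidates for a "weekend surgeon" indicator; where to draw the cutoff is a judgment call.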

Kyra · Posted 04-16-2020 03:33 PM

Hi, I wanted to add one more thing to this.

We have submitted another paper using the same dataset to a different journal. There we looked at the association between the outcome (prescription) and patient characteristics, including age, gender, ethnicity, and history of alcohol abuse, as well as provider characteristics, including the provider's age, gender, and years of practice.

We ran a multivariable linear regression with PROC GLM and found that patient age, gender, and ethnicity and surgeon gender, age, and years in practice were significantly predictive of the amount of opioids prescribed.

We have received a comment from a reviewer saying, 'My major concern in the paper is the way that surgeon characteristics are incorporated in the models. There is likely significant clustering by surgeon and thus inclusion of just the characteristics without adjusting for clustering may incorrectly measure the effect of the surgeon predictors on the outcome. A multi-level model (with patient and surgeon-levels) would be more appropriate and account for clustering effects.'

Please let me know what you think about this. N = 23,000 and the number of surgeons = 90.
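
For reference, a multi-level model of the kind the reviewer describes is commonly fit as a random-intercept model, where patients of the same surgeon share a surgeon-level effect. The sketch below uses PROC MIXED; the data set name (rx) and every variable name are assumptions for illustration, not the authors' actual variables.

```sas
/* Two-level model: prescriptions (level 1) nested in surgeons (level 2). */
/* All data set and variable names are hypothetical.                      */
proc mixed data=rx method=reml covtest;
   class surgeon_id pt_gender ethnicity surg_gender;
   model dose = pt_age pt_gender ethnicity
                surg_age surg_gender surg_years
         / solution ddfm=satterthwaite;
   /* one random intercept per surgeon captures within-surgeon correlation */
   random intercept / subject=surgeon_id;
run;
```

In a model like this, the standard errors of the surgeon-level predictors reflect that there are 90 surgeons rather than 23,000 independent observations, which is usually the point of such a reviewer comment.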

PaigeMiller · Posted 04-16-2020 08:02 PM

"There is likely significant clustering by surgeon"

I don't really know what this means, and I have no background in medical studies, nor do I know if this statement is true or if it is reflected in your data.

So I think you'd have to look into the data and see. Or talk to people in your field who might be able to help. Or both.

--
Paige Miller
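
One possible way to look into the data is to estimate how much of the variation in dose sits between surgeons, using an intercept-only random-intercept model and the intraclass correlation (ICC). This is a sketch under assumed names (data set rx, variables dose and surgeon_id), not something proposed in the thread.

```sas
/* Null model: no predictors, only a random intercept per surgeon. */
proc mixed data=rx method=reml covtest;
   class surgeon_id;
   model dose = / solution;
   random intercept / subject=surgeon_id;
   ods output CovParms=cp;   /* save the variance component estimates */
run;

/* ICC = between-surgeon variance / (between-surgeon + residual variance) */
data icc;
   set cp end=last;
   retain v_surgeon v_resid;
   if CovParm = 'Intercept' then v_surgeon = Estimate;
   if CovParm = 'Residual'  then v_resid   = Estimate;
   if last then do;
      icc = v_surgeon / (v_surgeon + v_resid);
      output;
   end;
   keep v_surgeon v_resid icc;
run;

proc print data=icc noobs;
run;
```

An ICC close to zero would suggest little clustering by surgeon; with clusters this large, even a small ICC can noticeably change standard errors.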

Kyra · Posted 04-16-2020 09:16 PM

Thank you very much for the prompt reply! This community is always a great asset.

Just one request: can you please direct me to SAS code that I can use if clustering is a problem and the outcome is continuous? (I am currently using PROC GLM.) Thanks!
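
One commonly used option for a continuous outcome with clustered observations is to keep the linear model but request cluster-robust standard errors, for example with the CLUSTER statement in PROC SURVEYREG. The sketch below assumes a data set rx and variable names that are purely illustrative.

```sas
/* Linear regression with standard errors that allow for correlation */
/* among prescriptions from the same surgeon.                        */
/* Data set and variable names are hypothetical.                     */
proc surveyreg data=rx;
   cluster surgeon_id;
   class pt_gender ethnicity surg_gender;
   model dose = pt_age pt_gender ethnicity
                surg_age surg_gender surg_years / solution;
run;
```

The coefficient estimates match an ordinary least squares fit; only the standard errors and tests change.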

Reeza · Posted 04-16-2020 09:41 PM

I don't think you'd want clusters; wouldn't you want strata instead? That is, analyze your weekend/evening shifts separately from daytime? It really depends on what you're aiming to do overall.

Kyra · Posted 04-16-2020 11:02 PM

My problem is that our data do not have any information about shifts or weekdays by surgeon, and the data span 5 years. I would assume shifts are not the same over that period and might have changed for a lot of surgeons. The only information I have about the surgeons is their name, age, gender, and years of practice. Thanks.

Reeza · Posted 04-17-2020 12:52 AM

You don't have time of day for the surgery?

Kyra · Posted 04-17-2020 09:58 AM

Thanks for pointing this out to me. I have date of surgery and time of surgery.

I can get weekday of surgery and day, evening, night shift of surgery from there.

I can add day of surgery and shift of surgery to the model.
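
Deriving weekday and shift from the date and time of surgery could look like the sketch below. It assumes surg_date is a SAS date value and surg_time is a SAS time value in a hypothetical data set rx; the shift cutoffs (07:00, 15:00, 23:00) are placeholders and would need to match the hospital's actual schedule.

```sas
data rx2;
   set rx;
   /* WEEKDAY() returns 1 = Sunday through 7 = Saturday */
   dow = weekday(surg_date);
   weekend = (dow in (1, 7));          /* 1 = Saturday or Sunday */

   /* HOUR() extracts the hour (0-23) from a SAS time value */
   hr = hour(surg_time);
   length shift $7;
   if missing(hr) then shift = ' ';
   else if 7 <= hr < 15 then shift = 'Day';
   else if 15 <= hr < 23 then shift = 'Evening';
   else shift = 'Night';
run;
```

Both dow (or weekend) and shift can then go into the CLASS and MODEL statements of the regression.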

Sorry to give so much trouble, but how do I explain to the reviewer that it cannot be explained by clustering? I think they would like an explanation. Thank you!

PaigeMiller · Posted 04-17-2020 06:49 AM

@Kyra wrote:

My problem is that our data do not have any information about shifts or weekdays by surgeon, and the data span 5 years. I would assume shifts are not the same over that period and might have changed for a lot of surgeons. The only information I have about the surgeons is their name, age, gender, and years of practice. Thanks.

Try putting the surgeon's age, gender and years of practice into the model. This could adjust for whatever was meant by "clustering" of surgeons.

--
Paige Miller

Kyra · Posted 04-17-2020 07:50 AM

Surgeon's age, gender, and years of practice are already in the model.

PaigeMiller · Posted 04-17-2020 07:55 AM

@Kyra wrote:
Surgeon's age, gender, and years of practice are already in the model.

Well, if you are using all the information you have about the surgeon, it's hard to see how any type of "clustering" (no matter how it is defined) can improve on that.

--
Paige Miller

PaigeMiller · Posted 04-17-2020 06:48 AM

@Kyra wrote:

Thank you very much for the prompt reply! This community is always a great asset.

Just one request: can you please direct me to SAS code that I can use if clustering is a problem and the outcome is continuous? (I am currently using PROC GLM.) Thanks!

No, I can't, because I can't see how clustering would help here.

However, in thinking about whether you need to modify the model to take account of something that can loosely be called "clustering" of surgeons, this is what I thought: look at the residuals. If there are patterns in the residuals, specifically when you plot the residuals against surgeons (but really you should plot residuals against all variables), this indicates a deficiency of the model (or an incorrect assumption somewhere), and it could be that whatever is going on with the surgeons is causing the pattern (although other things cause patterns too). If there is no pattern in the residuals, then you probably don't need to worry about whatever is meant by "clustering" of surgeons.

--
Paige Miller
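
The residual check described above could be sketched as follows: fit the model without surgeon, save the residuals, and then plot and summarize them by surgeon. The data set name (rx) and variable names are assumptions for illustration.

```sas
/* Fit the model without surgeon and save residuals. */
/* Data set and variable names are hypothetical.     */
proc glm data=rx noprint;
   class weekday ethnicity smoking insurance;
   model dose = weekday age ethnicity smoking insurance;
   output out=resid_out r=resid p=pred;
run;
quit;

/* One box per surgeon; boxes sitting well above or below zero */
/* suggest a surgeon effect the model has not captured.        */
proc sgplot data=resid_out;
   vbox resid / category=surgeon_id;
   refline 0 / axis=y;
   xaxis display=(novalues noticks);   /* 90 labels would overlap */
run;

/* Mean residual and prescription count per surgeon. */
proc means data=resid_out noprint nway;
   class surgeon_id;
   var resid;
   output out=resid_by_surgeon mean=mean_resid n=n_rx;
run;
```

Sorting resid_by_surgeon by mean_resid gives a quick list of the surgeons whose prescriptions the model most over- or under-predicts.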
