Solved: %margins macro to estimate predictive margins

mtohidi · Posted 03-05-2024 03:15 PM

I used the %margins macro to estimate predictive margins for 10-year survival after a particular surgery. I want to estimate the probability of survival for patients with different characteristics (e.g. age, dementia, ASA comorbidity score). For variables that are included in the model but not listed in margins= or at=, does the macro average the predictions over the sample distribution by default? If not, is this something that I can set?

As an example, I have (simplified version of coding):

%margins(data     = x,
         class    = age sex dementia rural frailty ltc asa chf copd,
         response = alive,
         roptions = event='1',
         model    = age sex dementia rural frailty ltc asa chf copd,
         dist     = binomial,
         margins  = age,
         at       = sex dementia,
         options  = cl)

In this case, is the model using the average value/average event probability (in case of binary) for rural, frailty, asa, chf, and copd? If so, is it using the average for the entire cohort, or the average for each level of the margins=/at= variables that I have specified? Ideally, I would like the latter, but I'm not sure if this is what I'm actually estimating.

Thank you so much for your help, and apologies in advance for my lack of knowledge on this matter.

StatDave · Posted 03-05-2024 04:26 PM

Predictive margins do not fix all predictors at specified levels. This distinguishes them from LS-means. So, in each of your SEX*DEMENTIA groups, the margin for each age level is the average predicted probability of ALIVE=1. In each observation in the average, the predicted probability is computed with AGE, SEX and DEMENTIA fixed, but using the actual values of the other predictors in the model. That is, all of the other predictors can have different values in each of the observations used in the average. See the Details section of the Margins macro documentation.
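To make the averaging concrete, here is a minimal sketch in Python (not SAS, and not the macro's actual code). The coefficients, the five-patient cohort, and the variable names `age_old`, `male`, and `chf` are all made up for illustration. It only shows the arithmetic described above: overwrite the fixed variables in every observation, leave the other predictors at their observed values, predict, and average.

```python
import math

# Hypothetical logistic coefficients (assumed, not estimated from the thread's data).
COEF = {"intercept": 1.0, "age_old": -1.2, "male": -0.3, "chf": -0.8}


def predict(row):
    """Predicted probability of ALIVE=1 for one observation."""
    eta = COEF["intercept"] + sum(
        COEF[name] * row[name] for name in ("age_old", "male", "chf")
    )
    return 1.0 / (1.0 + math.exp(-eta))


# Toy cohort: one dict per patient, all predictors coded 0/1.
cohort = [
    {"age_old": 0, "male": 1, "chf": 0},
    {"age_old": 0, "male": 0, "chf": 0},
    {"age_old": 1, "male": 1, "chf": 1},
    {"age_old": 1, "male": 0, "chf": 1},
    {"age_old": 1, "male": 1, "chf": 0},
]


def predictive_margin(data, fixed):
    """Average prediction after overwriting the `fixed` variables in every row.

    Variables not named in `fixed` (here, chf) keep their observed values,
    so they can differ from one observation to the next.
    """
    preds = [predict({**row, **fixed}) for row in data]
    return sum(preds) / len(preds)


# One margin per AGE x SEX combination, each averaged over all five patients.
margins = {
    (age, male): predictive_margin(cohort, {"age_old": age, "male": male})
    for age in (0, 1)
    for male in (0, 1)
}

for key, value in sorted(margins.items()):
    print(key, round(value, 4))
```

Every margin in this sketch is an average over the same five patients, so the margins differ only because of the values assigned to the fixed variables.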

mtohidi · Posted 03-06-2024 01:33 PM

Thank you so much for your reply. I really appreciate it. With stats, I find sometimes I have to read things a different way (or 50 times) to understand what it actually means.

So, am I correctly concluding that the predicted probability is the estimated probability of being alive in a hypothetical cohort with AGE, SEX, and DEMENTIA fixed, but with the other variables at the actual values seen in my cohort?

If so, I wonder if I should be using LS-means instead. The margins macro seems to create an almost arbitrary cohort that may not reflect what is seen in real life, potentially underestimating survival in the younger patient groups. What I am hoping to estimate is the predicted probability of being alive for different "types of patients" (e.g. a 60-year-old male without dementia) in an easy-to-understand format for clinicians. But I would also imagine that a 60-year-old male without dementia in my cohort is less likely than average to have some of the other comorbidities (the cohort is, on average, older and more comorbid).

Thanks again for your help and expertise.

StatDave · Posted 03-06-2024 05:50 PM

I think that Margins probably does what you want. For each combination of AGE, SEX, and DEMENTIA, it uses the observed values of the other predictors to compute predicted probabilities and gives you the average. That is intuitively appealing, assuming that your model is good and that the values of the other predictors in your data are typical of the population of interest.

If you used LS-means, you would have to include the AGE*SEX*DEMENTIA interaction in the model and use that in the LSMEANS statement. As we discussed, this would fix all of the other predictors at one particular setting, which might not be reasonable for each combination. You could improve that by adding the BYLEVEL option to get a different setting of the other predictors for each combination, but that is still a single setting in each case, not an average.

Note that, by default, Margins uses the predicted values from all of the observations for each combination. So, the computed margins differ only because of the differing values of AGE, SEX, and/or DEMENTIA. Since all of your predictors are categorical, if you have several observations in each combination and feel that the values of the other predictors differ substantially among the combinations, then you could use the WITHIN= option so that only the observations in each combination are used to compute the average. Since WITHIN= only allows a single condition, you would have to run the macro once for each combination using the appropriate condition.
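The three approaches discussed here can be compared side by side in a small Python sketch (again not SAS and not the macro's implementation). The coefficients and the eight-patient cohort are invented, and `margin_all`, `margin_within`, and `single_setting` are hypothetical helper names. The last one fixes CHF at its cohort mean purely as one example of "a single setting"; it is not a claim about the coefficients the LSMEANS statement actually uses.

```python
import math

# Hypothetical coefficients for a logistic model with two 0/1 predictors.
COEF = {"intercept": 1.0, "age_old": -1.2, "chf": -0.8}


def predict(age_old, chf):
    """Predicted probability of ALIVE=1 for the given predictor values."""
    eta = COEF["intercept"] + COEF["age_old"] * age_old + COEF["chf"] * chf
    return 1.0 / (1.0 + math.exp(-eta))


# Toy cohort of (age_old, chf) pairs in which younger patients have less CHF.
cohort = [(0, 0), (0, 0), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (1, 1)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def margin_all(age):
    """Default-style margin: fix AGE, average over every patient's observed CHF."""
    return mean(predict(age, chf) for _, chf in cohort)


def margin_within(age):
    """Subset-style margin: average only over patients observed at this AGE."""
    return mean(predict(age, chf) for a, chf in cohort if a == age)


def single_setting(age):
    """One fixed setting: CHF held at its cohort mean (an assumed choice)."""
    chf_bar = mean(chf for _, chf in cohort)
    return predict(age, chf_bar)


for age in (0, 1):
    print(age, margin_all(age), margin_within(age), single_setting(age))
```

In this toy cohort the subset-style average for the younger group is higher than the whole-cohort average, because the younger patients carry less CHF. That is the kind of difference the WITHIN= suggestion is meant to capture.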
