Giampaolo
Obsidian | Level 7

Dear SAS users,

I have a three-level (low, moderate, high) ordinal variable on which I would like to perform a mediation analysis, but PROC CAUSALMED cannot be used for ordinal outcomes. Is there another procedure (or, hopefully, a simple approach) that could be used for this purpose? Thank you!

10 REPLIES
SteveDenham
Jade | Level 19

Would the results make sense if you collapsed the response variable, say into low vs. medium and high together in one analysis, and medium vs. high in another? I haven't done enough causal analysis to know whether that approach is reasonable, but it has the advantage of being simple to do.

 

SteveDenham
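The collapsing described above could be sketched in a DATA step along these lines. Everything named here is an assumption for illustration: the data set `work.npi_study`, the continuous variable `npi`, and the two cutpoints, which are placeholders to be replaced with whatever thresholds are actually in use.

```sas
/* Placeholder cutpoints (assumptions): substitute the thresholds you actually use. */
%let cut1 = 3.4;
%let cut2 = 5.4;

data work.npi_binary;
   set work.npi_study;   /* hypothetical input data set with continuous NPI */

   if not missing(npi) then do;
      /* 1 = low, 2 = moderate, 3 = high */
      npi_cat = 1 + (npi > &cut1) + (npi > &cut2);

      /* Analysis 1: low vs. moderate and high together */
      low_vs_rest = (npi_cat >= 2);

      /* Analysis 2: moderate vs. high, low group excluded (left missing) */
      if npi_cat >= 2 then mod_vs_high = (npi_cat = 3);
   end;
run;

/* Check that each collapsed outcome has the expected cell counts. */
proc freq data=work.npi_binary;
   tables npi_cat*low_vs_rest npi_cat*mod_vs_high / missing norow nocol nopercent;
run;
```

Each binary variable can then serve as the outcome in a separate analysis. Note that the second analysis uses only the moderate and high subjects, so it runs on a smaller sample.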

Giampaolo
Obsidian | Level 7

Hi Steve,

Thank you for your reply. Yes, it would definitely make sense. The problem, though, is that when the dependent variable is dichotomized, the independent variable I would like to test as a possible mediator is no longer significant. Maybe this is because some information is lost when the outcome is treated as dichotomous? The outcome variable, the Nottingham Prognostic Index (NPI, a cancer prognostic score), is continuous but has a very difficult distribution to model. See the histogram.

Best

Giampaolo

[Attached image: histogram of NPI]

SteveDenham
Jade | Level 19

Is the lack of significance found in both dichotomized datasets? If so, then I suspect that you will have to accept the notion that you don't have enough data to drive the effect to a "significant" level. Could you calculate some sort of effect size for the mediator in the two datasets, and compare that to a CID (clinically important difference)?

 

SteveDenham

Giampaolo
Obsidian | Level 7

Sorry if I was confusing. There is only one dataset. In one case I dichotomized the NPI variable. In the other case I converted NPI into an ordinal variable using two thresholds from the literature. I think you are right that the study is not sufficiently powered at this sample size, but I was intrigued by the fact that my predictor was significant with the ordinal outcome and not with the binary one.
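For readers who want to reproduce this kind of comparison, a minimal sketch follows. It is not a mediation analysis and not necessarily the model used above; it only fits the same predictors to the ordinal and the binary version of the outcome so the two sets of tests can be compared side by side. All names (`work.npi_study`, `npi_cat`, `npi_bin`, `exposure`, `mediator_x`) are hypothetical.

```sas
/* Hypothetical names: work.npi_study, npi_cat (1=low, 2=moderate, 3=high),
   npi_bin (0/1), exposure, mediator_x. */

/* Ordinal outcome: with more than two response levels, PROC LOGISTIC
   fits a cumulative logit (proportional odds) model by default. */
proc logistic data=work.npi_study;
   model npi_cat = exposure mediator_x;
run;

/* Binary outcome: same predictors, modeling the probability that npi_bin = 1. */
proc logistic data=work.npi_study;
   model npi_bin(event='1') = exposure mediator_x;
run;
```

The ordinal fit also reports a score test for the proportional odds assumption, which is worth checking before reading much into the difference between the two results.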

 

SteveDenham
Jade | Level 19

You know, that makes sense given the distribution of the predictor variable. What happens if you just leave it as a continuous (well, at least relatively continuous) variable? I think I misinterpreted this situation and confused the response variable with the mediating variable, as far as dichotomous/ordinal goes.

 

So if you want to dichotomize NPI, the cutpoint is going to be critical. What value preserves most of the information provided by the continuous version of the variable? Also, when I look at the pdf for NPI, it looks long-tailed to the right. What sort of distribution do you see if you take the natural log of NPI? Perhaps ln(NPI) has a more distinguishing cutpoint, or perhaps ln(NPI) is more significant as a mediator? Lots of ways to go on this one.

 

SteveDenham
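A quick way to look at the log-transformed distribution suggested above might be the following sketch. The data set `work.npi_study` and the variable `npi` are assumed names; the normal curve overlay is only a visual reference, not a claim that ln(NPI) is normal.

```sas
data work.npi_log;
   set work.npi_study;                    /* hypothetical input data set */
   if npi > 0 then ln_npi = log(npi);     /* LOG is the natural log; nonpositive values stay missing */
run;

/* Side-by-side look at the raw and log-transformed distributions. */
proc univariate data=work.npi_log;
   var npi ln_npi;
   histogram npi ln_npi / normal;         /* overlays a fitted normal density on each histogram */
run;
```

If the log scale looks more symmetric, any candidate cutpoint chosen there can be back-transformed with `exp()` to report it on the original NPI scale.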

Giampaolo
Obsidian | Level 7

I have tried PROC GENMOD using several different distributions with a log link, but the association with continuous NPI was not significant. When I tried the nonparametric Jonckheere-Terpstra test, however, the association of continuous NPI with the predictor was significant. Maybe the problem is that the available distributions do not reflect the data?
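For reference, the two approaches mentioned here could look roughly like the sketch below. The exact setup used in this post is not shown, so the names are assumptions: `work.npi_study` is the data set, `npi` the continuous response, and `exposure_grp` an ordered, numerically coded grouping variable. The gamma distribution is just one of the candidate distributions one might try with a log link.

```sas
/* Parametric: generalized linear model for NPI, gamma distribution with log link. */
proc genmod data=work.npi_study;
   class exposure_grp;
   model npi = exposure_grp / dist=gamma link=log type3;
run;

/* Nonparametric: Jonckheere-Terpstra test for an ordered trend in NPI
   across the levels of exposure_grp (rows = groups, columns = NPI values). */
proc freq data=work.npi_study;
   tables exposure_grp*npi / jt noprint;   /* NOPRINT suppresses the large crosstab, keeps the test */
run;
```

The two tests answer different questions: the GENMOD test depends on the assumed distribution and link for NPI, while the JT test only uses the ordering of the NPI values across the ordered groups.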

SteveDenham
Jade | Level 19

I assume that for the JT test the row variable was based on the binning you presented in the graphic, and the column variables were the observed in the bin and the not observed (total - observed). If that is correct (or even close), then you have done a better job of giving an approximation to the distribution. Also, recall that the distribution options and the canonical links apply to the X'beta matrix, so it includes all of the predictors, and the fit therefore depends much more on the distribution of Y than on the distribution of any single predictor.

 

SteveDenham

Giampaolo
Obsidian | Level 7

Hi Steve,

Thank you for stating the points I needed to remember in using the procedure, and apologies for coming back to this post again after a few days, but there is one thing that has been in my thoughts and that I was hoping to clarify. I understand the link options apply to all the predictors. With respect to the distribution options, though, I am not sure whether I have misunderstood the procedure or misinterpreted your message. I thought that with the distribution option one attempts to model the distribution of the response variable Y. Am I wrong? Partially wrong? Could you please explain whether this interpretation conflicts with your sentence "the distribution options and the canonical links apply to the X'beta matrix, so it includes all of the predictors..."?

Thank you very much!

Giampaolo

SteveDenham
Jade | Level 19

Two sides of the same coin. The X'beta linear predictor, passed through the inverse of the link function, gives the predicted Yhat values. In a generalized linear model that is what is used to predict the dependent variable (Y) values. You are correct in that we generally pick distributions and links based on our knowledge of the dependent variable, but the actual algorithmic processes are done using the X'beta and Y values in combination.

 

SteveDenham
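The relationship described above can be written out in standard generalized linear model notation. The symbols below are the usual textbook ones, not taken from the thread: $g$ is the link function, $\mu_i$ the conditional mean of the response, and $\eta_i$ the linear predictor.

```latex
\begin{aligned}
\eta_i &= x_i'\beta                      && \text{(linear predictor, built from all predictors)} \\
g(\mu_i) &= \eta_i, \quad \mu_i = E[Y_i \mid x_i] && \text{(the link connects the mean of } Y \text{ to } \eta_i\text{)} \\
\hat{y}_i &= g^{-1}(x_i'\hat{\beta})     && \text{(predicted value)} \\
\hat{y}_i &= \exp(x_i'\hat{\beta})       && \text{(special case: log link)}
\end{aligned}
```

In this notation, the distribution option specifies the assumed distribution of $Y_i$ around $\mu_i$, which is what defines the likelihood, while the link specifies how $\mu_i$ depends on $x_i'\beta$. Estimation uses both pieces together, which is the "two sides of the same coin" point.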
