Hi all,
I'm trying to create the equivalent of LS Mean statements in PROC MCMC and one of the variables I'm including is a categorical variable with 19 different responses.
I was able to include it in the model as a random effect (but MCMC treats it as fixed) as the following:
random C ~ normal(0, var=1e6) subject=SIteID zero=19 monitor(C).
However the issue I'm facing is trying to create LSMean statements in the BEGINNODATA/ENDNODATA Statements, as I'm getting warning that C_1, C_2 etc. are not recognised variables. I can't enter them as parms either as those are for fixed effects. Any help would be appreciated!
As a note I know I can use the outpost option and calculate from there, however the issue is I need the 95% HPD intervals and the validator does not have the IML licence to run the %sumint macro.
LS-means as computed by the LSMEANS statement are defined for fixed effects. So, I assume that you consider your categorical variable to be a fixed effect and you are just using the RANDOM statement and ZERO= option to do the equivalent of creating a set of dummy variables which you would then include in the model as described in the ZERO= option description. That probably explains the warning you get since no dummy variables are created.
In any case, I think the way to estimate LS-means involves processing the OUTPOST= data set as you mentioned. You can use a DATA step to compute the linear combination of model parameters that define the LS-mean for each observation. If you need to know the coefficients of that linear combination, you can fit a model with the same effects in MIXED or GLIMMIX and use an LSMEANS statement with the E option. That prints a table showing the coefficients of the linear combination for each LS-mean. With that done, you can easily compute a reasonable confidence interval for the LS-means by using the quantile options in PROC UNIVARIATE. For example, using the example titled "Random-Effects Model" in the Getting Started section of the MCMC documentation, the following computes the point estimates and quantile-based 95% confidence intervals for the Gender LS-means.
data lsm;
set postout;
lsmgf0=b0;
lsmgf1=b0+b1;
run;
proc univariate data=lsm noprint;
var lsmgf0 lsmgf1;
output out=out mean=mngf0 mngf1 pctlpre=gf0p gf1p pctlpts=2.5 97.5;
run;
proc print; run;
If you insist on computing HPD style intervals, then that should be possible with a DATA step (probably using the LAG function after sorting) following the computation of an HPD interval as described in "Summary Statistics > Highest Posterior Density (HPD) Interval" in the "Introduction to Bayesian Analysis Procedures" chapter of the SAS/STAT User's Guide.
Of course, for this simple example of a single fixed effect model, it's easier to just reparameterize the model without an intercept and with separate male and female dummy variables in the data and associated parameters defined in MCMC. Then the estimates and HPD intervals are directly provided in the MCMC Posterior Summaries table.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.