jwb133
Calcite | Level 5

The PREDDIST statement in PROC MCMC generates a dataset of predictive draws of the dependent variable. It is unclear to me exactly how these draws are collected from the MCMC chain. In particular, is the chain thinned in any way to (hopefully) make the draws statistically independent? The documentation doesn't appear to give any details about this, except that if NSIM is not specified it defaults to NMC, the number of iterations in the chain (after burn-in), in which case the draws would not be independent. If one specifies NSIM<NMC, which iterations' draws are used?

 

Many thanks

Jonathan

jwb133
Calcite | Level 5

After discussing with SAS support, the following is my understanding of how PREDDIST works. Suppose PROC MCMC is called with NMC=x, so that x iterations of the MCMC sampler are performed. Suppose also that NSIM=y is specified, requesting y draws from the posterior predictive distribution of the outcome.

 

To produce the y draws from the posterior predictive distribution, PROC MCMC samples y parameter values, with replacement, from the NMC=x samples from the posterior distribution. For each sampled parameter value, it then simulates a value of the outcome from its distribution conditional on that parameter value.
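The resampling scheme described above can be sketched in a few lines of Python. This is only an illustration of my understanding, not SAS's actual implementation: the normal outcome model, the function name `predictive_draws`, and the stand-in "posterior draws" (generated independently here rather than by a real MCMC sampler) are all assumptions made for the example.

```python
import random

def predictive_draws(posterior_draws, nsim, rng):
    """Resample stored (mu, sigma) draws with replacement, then
    simulate one outcome value per resampled parameter draw."""
    # random.Random.choices samples WITH replacement
    picked = rng.choices(posterior_draws, k=nsim)
    # One outcome per picked draw, conditional on that draw (assumed normal model)
    return [rng.gauss(mu, sigma) for mu, sigma in picked]

rng = random.Random(2024)
nmc = 5000  # plays the role of NMC=x

# Hypothetical stand-in for the stored posterior sample of (mu, sigma)
posterior_draws = [
    (rng.gauss(1.0, 0.1), abs(rng.gauss(2.0, 0.05))) for _ in range(nmc)
]

# plays the role of NSIM=y
y_pred = predictive_draws(posterior_draws, nsim=1000, rng=rng)
print(len(y_pred))
```

Note that nothing in this scheme thins the chain: any stored iteration can be picked, and the same iteration can be picked more than once.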

 

This approach would seem to be valid if the y parameter values obtained by drawing with replacement from the NMC=x iterations are i.i.d. However, it would seem that this potentially does not hold. Suppose, for example, that we (perhaps stupidly) choose NMC=10 and NSIM=100000. Then every simulated outcome value is drawn conditional on one of only 10 parameter values. These 10 parameter values are probably correlated to some extent, and moreover many draws of the outcome variable are made conditional on the same parameter value. In this (perhaps contrived) scenario, the 100000 values would not be (I contend) valid draws from the posterior predictive distribution.
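The contrived NMC=10, NSIM=100000 case can be mimicked with a small Python simulation. Everything here is hypothetical: an AR(1) series stands in for a short, autocorrelated chain for a mean parameter, and the outcome is assumed normal with unit standard deviation. It is a sketch of the concern, not a reproduction of what PROC MCMC does internally.

```python
import random

rng = random.Random(1)
nmc, nsim = 10, 100_000

# Hypothetical autocorrelated "chain" for a mean parameter mu (AR(1) stand-in)
chain = []
mu = 0.0
for _ in range(nmc):
    mu = 0.9 * mu + rng.gauss(0.0, 0.1)
    chain.append(mu)

# Resample parameter values with replacement, then simulate one outcome each
picked = rng.choices(chain, k=nsim)
y_pred = [rng.gauss(m, 1.0) for m in picked]

# Number of distinct parameter values behind the 100000 predictive draws
distinct = len(set(picked))
print(distinct, "distinct parameter values for", len(y_pred), "predictive draws")
```

However large NSIM is made, the predictive draws can only reflect the parameter uncertainty contained in those (at most) 10 stored values.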

Yuany
Calcite | Level 5

I have a question.

If nmc=100000, thin=10 and outpred=outpred1, does this mean I will have a dataset named outpred1 that contains 10000 observations?

Thx.






