Solved: Why is PROC MCMC taking too long to finish?

Teketo · Posted 01-24-2019 07:00 PM

Hello,

I am doing a hierarchical Bayesian analysis using PROC MCMC. I have a three-level model; however, PROC MCMC is not working for me. It takes too long to finish, even for the empty model.

I started the analysis with the empty model, and it takes more than 10 minutes to finish. Here is a sample of the code I used:

Proc mcmc data = care seed = 10 nmc = 200000 nbi = 10000 thin = 2 outpost = xcare DIC;

parms beta0 sig2 delta2;

Prior beta0 ~ normal (0, var = 1000);

prior sig2 ~ igamma (shape = 0.1, scale = 0.01);

prior delta2 ~ igamma (shape = 0.1, scale = 0.01);

mu = beta0;

random gamma ~ normal (0, var = sig2) subject = region;

random delta ~ normal (0, var = delta2) subject = clusterXregion; /* clusters are nested within region */

p = logistic(mu + gamma + delta);

model use ~ binary(p);

run;

Moreover, when I include fixed effects and random slopes, the program stops.

I really appreciate your support in this regard.

With kind regards

Teketo

SAS_Rob · Posted 02-01-2019 10:14 AM

It is hard to say for sure without knowing more about the data and the levels of cluster and regions, but initially I would say that NMC=200000 is the likely culprit. Why have you set it so large?


ballardw · Posted 01-24-2019 07:26 PM

The first thing I see is that your NMC and NBI options are orders of magnitude greater than the default of 1000. Did you try with the defaults? How long did that take?

Also from the documentation details on computational resources:


Computational Resources

It is impossible to estimate how long it will take for a general Markov chain to converge to its stationary distribution. It takes a skilled and thoughtful analysis of the chain to decide whether it has converged to the target distribution and whether the chain is mixing rapidly enough. In some cases, you might be able to estimate how long a particular simulation might take. The running time of a program that does not have RANDOM statements is approximately linear to the following factors: the number of samples in the input data set, the number of simulations, the number of blocks in the program, and the speed of your computer. For an analysis that uses a data set of size nsamples, a simulation length of nsim, and a block design of nblocks, PROC MCMC evaluates the log-likelihood function the following number of times, excluding the tuning phase:

\[ \mathit{nsamples} \times \mathit{nsim} \times \mathit{nblocks} \]

The faster your computer evaluates a single log-likelihood function, the faster this program runs. Suppose you have nsamples equal to 200, nsim equal to 55,000, and nblocks equal to 3. PROC MCMC evaluates the log-likelihood function approximately $3.3\times 10^7$ times. If your computer can evaluate the log likelihood for one observation $10^6$ times per second, this program takes approximately half a minute to run. If you want to increase the number of simulations five-fold, the run time increases approximately five-fold.
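The arithmetic in the documentation excerpt can be checked with a few lines of Python. This is only a sketch of the quoted formula: the function names are made up for illustration, and the evaluation rate of $10^6$ per second is the documentation's hypothetical figure, not a measurement.

```python
def loglik_evaluations(nsamples: int, nsim: int, nblocks: int) -> int:
    """Approximate number of log-likelihood evaluations for a program
    without RANDOM statements (tuning phase excluded)."""
    return nsamples * nsim * nblocks


def estimated_seconds(evaluations: int, evals_per_second: float) -> float:
    """Run time implied by an assumed per-observation evaluation rate."""
    return evaluations / evals_per_second


# Figures taken from the documentation excerpt.
doc_evals = loglik_evaluations(nsamples=200, nsim=55_000, nblocks=3)
doc_seconds = estimated_seconds(doc_evals, evals_per_second=1e6)

# Five times as many simulations gives five times the estimated run time.
five_fold_seconds = estimated_seconds(
    loglik_evaluations(nsamples=200, nsim=5 * 55_000, nblocks=3),
    evals_per_second=1e6,
)

print(doc_evals)          # 33000000
print(doc_seconds)        # 33.0
print(five_fold_seconds)  # 165.0
```

The 33 seconds matches the "approximately half a minute" in the excerpt, and the five-fold case shows the linear scaling in nsim.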

Note that the above is without RANDOM statements. Each RANDOM statement adds one pass through the input data at each iteration. So how big is your data set?
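To get a feel for what the posted settings cost relative to the defaults, here is a rough Python sketch. It rests on several assumptions that are not from the documentation: that nsim can be taken as NMC + NBI, that the single PARMS statement forms one block, and that each of the two RANDOM statements costs one extra pass over the data per iteration. The data set size of 1,000 is purely hypothetical, and all names are illustrative.

```python
def rough_evaluations(nsamples: int, nmc: int, nbi: int,
                      nblocks: int, nrandom: int) -> int:
    """Very rough count of per-observation evaluations.

    Assumption: each parameter block and each RANDOM statement
    costs one pass over the input data at every iteration.
    """
    passes_per_iteration = nblocks + nrandom
    return nsamples * (nmc + nbi) * passes_per_iteration


N_OBS = 1_000  # hypothetical; the real data set size was not posted

posted = rough_evaluations(N_OBS, nmc=200_000, nbi=10_000, nblocks=1, nrandom=2)
defaults = rough_evaluations(N_OBS, nmc=1_000, nbi=1_000, nblocks=1, nrandom=2)

print(posted)             # 630000000
print(defaults)           # 6000000
print(posted / defaults)  # 105.0
```

Under these assumptions the ratio does not depend on the data set size: the posted NMC and NBI imply roughly 105 times the work of a default run, so timing a run with the defaults gives a usable baseline for extrapolation.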

