I am working through the book "Statistical Rethinking" by Richard McElreath. The following is a toy example from the book that uses grid approximation to get a posterior distribution then samples from the posterior. My question is what is an equivalent way to obtain the same posterior and similar sample in SAS?
Since I am more familiar with SAS I am doing this for understanding and fun.
The set up is we toss a globe and catch it 9 times. The tip of our right index finger will land on either land (L) or water (W). We observe L= 3 and W= 6. We are trying to determine the proportion of the globe that is water using a bayesian approach.
The R code:
# create a grid of possible parameter values (proportion of globe that is W)
p_grid <- seq(from= 0, to= 1, length.out= 1000)
p_grid[1000]
# Set prior
prob_p <- rep(1, l)
# Liklihood function for grid of possible parameter values
prob_data <- dbinom(6, size = 9, p_grid)
# combine prior with liklihood to get posterior
posterior <- prob_data * prob_p
sum(posterior)
#Standard posterior
posterior <- posterior/sum(posterior)
# Sample from posterior
samples <- sample(p_grid, prob = posterior, size = 1e4, replace = TRUE)
#plot samples
plot(samples)
plot(p_grid, posterior)
Here is what the resulting posterior distribution looks like from the R script.
Here is my attempt to recreate the process in SAS. The tricky part seems to be is how to sample (or run simulations) based on the posterior. I used PROC SURVEYSELECT.
/**
Bayes practice
**/
data _test1;
p_grid= 0; /** Set initial parameter value **/
do i= 1 to 1000;
/* p_grid = rand('uniform') ; */ /** Alternative approach is to just randomly generate a bunch of parameter values **/
prob_p = 1 ; /** Set a prior **/
prob_data = pdf('binomial', 6, p_grid, 9); /** Liklihood function, liklihood of data given parameter value **/
posterior = prob_data * prob_p;
output;
p_grid + .001; /** update p_grid **/
end;
drop i;
run;
proc sql;
select sum(posterior) into: sum_posterior from _test1; /* Sum posterior to standardize */
create table _test2 as
select *, posterior/(&sum_posterior.) as posterior_s from _test1 order by p_grid;
quit;
# create macro to sample from dataset using standard posterior as the size.
%macro sample(i);
proc surveyselect data= _test2 method= pps_wr out= _sample1 n= 1 outhits noprint;
size posterior_s;
run;
proc append base= all_simulations data= _sample1; run;
%mend sample;
proc delete data= all_simulations; run;
data _null_;
%let sim_size = 10000;
do i = 1 to &sim_size;
call execute("%sample("||i||");");
end;
run;
proc sql;
create table _sim_sum1 as
select p_grid, sum(numberhits) as freq, sum(numberhits) / &sim_size. as proportion from all_simulations group by p_grid;
select sum(proportion) as sum_proportion from _sim_sum1;
quit;
proc sgplot data= _sim_sum1;
series x= p_grid y= proportion;
run;
Here is the resulting distribution from the plot: (It doesn't look right)
I acknowledge I may be way off base with this approach. I am trying to understand this material and would appreciate someone showing me a proper way to do this in SAS.
... View more