Statistical Procedures

supp · Posted 12-15-2022 11:57 PM

Is there a way to use a bayesian approach to compare the means of two independent samples? Can proc mcmc handle something like this.

For example, I have two independent samples from the same underlying population:
Sample 1: n = 500, mean = 83.8, standard deviation = 24.7

Sample 2: n = 1000, mean = 86.1, standard deviation = 25.8

Is there a way I can model the difference between the means using proc mcmc without the underlying data?

My only thought is to create a posterior distribution like this:

/* Compute the posterior distributions for the population means */
data post;
	do i = 1 to 10000;
		do x= 0 to 100 by .1;
			sample1  = pdf('Normal', x, 83.8, 24.7);
			sample2  = pdf('Normal', x, 86.1, 25.8);
		output;
		end;
	end;
run;

@Rick_SAS

Rick_SAS · Posted 12-16-2022 06:31 AM

Yes. See this StackExchange article for references and a discussion that shows how to define the likelihood and priors: "Bayesian equivalent of two sample t-test." The article uses R and someone reproduced it in Python.

Ksharp · Posted 12-16-2022 07:03 AM

You could try BAYES statement of PROC GENMOD .

Or calling @StatDave

proc genmod data=sashelp.heart;
class status;
model weight = status / dist=normal link=identity;
lsmeans status/diff cl;
bayes seed=1 coeffprior=normal;
run;

supp · Posted 12-23-2022 10:00 AM

To figure this out here is simulated data. The outcome is score. We will try to estimate the coefficient paramater value for group using a linear regression

/**
Simulate data
**/
data _sim1;
	array test [4] _temporary_ (.03184 .08917 .19745 .68152);
	array control [4] _temporary_ (.02376 .10239 .20877 .66508);

		do i= 1 to 500;
			test_dist= rand('tabled', of test[*]);
			group = 1;
				if test_dist= 1 then score= 0;
				if test_dist= 2 then score= 33.33;
				if test_dist= 3 then score= 66.67;
				if test_dist= 4 then score= 100;
				output;
		end;

		do i= 1 to 10000;
			group = 2;
			control_dist= rand('tabled', of control[*]);
				if control_dist= 1 then score= 0;
				if control_dist= 2 then score= 33.33;
				if control_dist= 3 then score= 66.67;
				if control_dist= 4 then score= 100;
				output;
		end;
run;

Using proc genmode to esitmate the paramater values. Not sure if it is better to treat group as a class variable or just numeric.

proc genmod data= _sim1 ;
/* 	class group; */
	model score = group;
	bayes seed= 1 coeffprior=normal nbi= 1000 nmc= 100000 thin= 10 seed= 1
		out= posterior diagnostics=all;
run;

I think the diagnostics look good?

The final analysis seems reasonable. Using HPD the impact on score of being in the control group is likely somewhere between -3.0138 and 1.55543 with a maximum liklihood point estimate of -0.7640.

I tried to recreate something similar using PROC MCMC but it doesn't seem to be working as well. Can you see what I am doing wrong?

/**
MCMC approach
**/

proc mcmc data= _sim1 seed=1 nbi=1000 nmc=10000 outpost= simout thin=10;
	parms b0 0 b1 0 s2 1;
	prior b: ~ normal(0, var= 100);
	prior s2 ~ igamma(1, scale= 1);
	
	mu = b0 +( b1 * group);
	
	model score ~ normal(mu, var= s2);
run;

The effective sample size if very low and I seem to be getting almost a reversed impact of being in the test group.

Ksharp · Posted 12-24-2022 03:39 AM

Sorry . I am not familiar with PROC MCMC. Maybe @StatDave know more things.
But if you want compare mean between groups, I think you need LSMEAN statement.

lsmeans status/diff cl;

Statistical Procedures

Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Follow Us

What is...

Statistical Procedures

Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Re: Bayesian approach to a two sample t test

Register Today!

Follow Us

What is...