eemrun
Obsidian | Level 7

Rookie here, struggling to write a model specification that runs a regression on many independent variables. The dataset looks like the following:

  

Obs  A1-A11       V0-V10       V11  T1-T19
1    10000000000  10000000000  .    1000000000000000000
2    01000000000  10000000000  .    0100000000000000000
(each digit is one 0/1 column; . marks a missing value)

 

As part of PROC MCMC, I want to specify a model in the following manner:

 

y = a + a_b1*A1 + a_b2 *A2 + a_b3*A3... + b_b1*V1 + b_b2*V2 + ... and so on

 

Is there a shortcut for doing this with large sets of A, V and T variables?

6 REPLIES 6
Reeza
Super User

Is there a pattern here? If so, then you could likely write a macro loop. 

 

You can also look at the | and @ usage in writing a MODEL statement.
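
A macro loop along these lines could generate the repeated terms. This is only a sketch: the macro name linpred and its parameters are made up for illustration, and the coefficient names follow the ab_1, vb_0, tb_1 pattern used later in this thread.

%macro linpred(var, coef, lo, hi);
   %local i;
   /* Emit "+ coef_i * var_i" for each index; no semicolon, so the text */
   /* can be dropped into the middle of an assignment statement.        */
   %do i = &lo %to &hi;
      + &coef._&i * &var&i
   %end;
%mend linpred;

/* Write the generated text to the log to inspect it */
%put %linpred(a, ab, 1, 3);

Inside the procedure, the linear predictor would then be written as p = a %linpred(a, ab, 1, 11) %linpred(v, vb, 0, 11) %linpred(t, tb, 1, 23); with each call expanding to its share of the sum.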

eemrun
Obsidian | Level 7
Hi. The pattern is three variable types: A, V and T. A ranges from 1 to 11, V from 0 to 11, and T from 1 to 23 (e.g. A1, A2, V0, V1, T1, T2). Every variable takes the value 0 or 1.
Rick_SAS
SAS Super FREQ

You don't need a macro. You can compute the mean value (eta) in a loop, as shown on pp. 4-5 of High & ElRayes (2017).
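
The idea of building eta in a loop can be sketched as follows. This is an illustration, not the code from the paper: the data set name mydata, the response y, the covariates x1-x3, the parameter names, and the priors are all assumptions chosen to keep the example small.

proc mcmc data=mydata nmc=10000 seed=1234 outpost=post;
   array x[3]    x1-x3;          /* covariates read from the input data set */
   array beta[3] beta1-beta3;    /* one slope parameter per covariate       */

   parms beta0 0;                /* intercept */
   parms (beta1-beta3) 0;        /* slopes    */
   parms sigma2 1;

   prior beta0 beta1-beta3 ~ normal(mean=0, var=1e6);
   prior sigma2 ~ igamma(shape=2, scale=2);

   /* Start from the intercept and add one term per covariate */
   eta = beta0;
   do i = 1 to 3;
      eta = eta + beta[i] * x[i];
   end;

   model y ~ normal(eta, var=sigma2);
run;

The two arrays must list covariates and coefficients in the same order, because the loop pairs them by position. Scaling up to many covariates only changes the array bounds and the variable lists.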

 

 

eemrun
Obsidian | Level 7

Thanks, that paper is helpful. However, I am still getting an error (probably a mistake on my end!). I am using the following code:

 

proc mcmc DATA=input4 diag=all dic
	propcov=quanew 
	nbi=5000 
	ntu=5000 
	nmc=10000 
	thin=20 
	plots(smooth)=all seed=19680409
	outpost=pdres;


	array od[46] 	a1-a11 v0-v11 t1-t23;
	array bd[46] 	ab_1-ab_11 vb_0-vb_11 tb_1-tb_23;
	

	parms (ab_1-ab_11 vb_0-vb_11 tb_1-tb_23) 0;
	parms sigma2 1;

	/* Prior distribution assumptions */
	prior sigma2 ~ igamma(shape = 100, scale= 76);
	prior ab_1-ab_11 vb_0-vb_11 tb_1-tb_23 ~ normal(mean=0, var=1e6);

	
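	/* Accumulate the linear predictor; a plain assignment inside the loop would keep only the last term */
	p = 0;
	do i = 1 to 46;
		p = p + od[i] * bd[i];
	end;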
	model d ~ normal(p, var=sigma2);
run;

The error says: All observations in the input data set contain missing values. 

 

Any idea where I am going wrong?

Tom
Super User

 

Any idea where I am going wrong?


The error says: All observations in the input data set contain missing values. 

The error means that every observation in your input data has at least one variable with a missing value.

In the example data you posted, the variable V11 was missing on all observations.
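
One way to see which variables are responsible is to count missing values per variable. The sketch below reuses the data set name input4 and the variable names from the posted code, and assumes all of them are numeric:

/* N = non-missing count, NMISS = missing count, one row per variable */
proc means data=input4 n nmiss;
   var d a1-a11 v0-v11 t1-t23;
run;

A variable whose NMISS equals the number of observations is enough on its own to trigger the error, and would have to be fixed in the data or left out of the arrays.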

ballardw
Super User

Unless specifically instructed to allow missing values, all the regression procedures default to dropping any record in which any variable on the MODEL statement is missing. So if at least one variable is missing on each record, there is no data left to model.
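
To see how many records would survive that complete-case rule, a DATA step can count the observations with no missing values. This is a sketch that mimics the rule rather than calling any regression procedure; it again assumes the data set input4 and numeric variables named as in the posted code, and the counter name complete is arbitrary.

data _null_;
   set input4 end=last;
   /* NMISS returns how many of the listed numeric variables are missing */
   if nmiss(of d a1-a11 v0-v11 t1-t23) = 0 then complete + 1;
   if last then put "Complete cases: " complete;
run;

If the log reports zero complete cases, the procedure has nothing to fit, which matches the error message above.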

 

 
