Rookie here, struggling to write a declaration that runs a regression on many independent variables. The dataset looks like the following:
A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 | A10 | A11 | V0 | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | V11 | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 | T10 | T11 | T12 | T13 | T14 | T15 | T16 | T17 | T18 | T19 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
In PROC MCMC, I want to specify a model of the following form:
y = a + a_b1*A1 + a_b2*A2 + a_b3*A3 + ... + b_b1*V1 + b_b2*V2 + ... and so on
Is there a shortcut for doing this for large sets of A, V and T variables?
Is there a pattern here? If so, then you could likely write a macro loop.
You can also look at the | and @ usage in writing a MODEL statement.
You don't need a macro. You can compute the mean value (eta) in a loop, as shown on p 4-5 of High & ElRayes (2017):
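A minimal sketch of the loop idea follows. It is not the code from the paper. The data set toy, the variables x1-x3 and y, and the parameters beta0-beta3 are hypothetical names invented for illustration, and the priors are placeholders. The point to notice is that eta is initialized before the loop and then accumulated; assigning eta = x[i] * beta[i] inside the loop without adding to the running total would keep only the last term.

```sas
/* Hypothetical toy data: three 0/1 predictors and a continuous response */
data toy;
   call streaminit(1);
   do obs = 1 to 200;
      x1 = rand('bernoulli', 0.5);
      x2 = rand('bernoulli', 0.5);
      x3 = rand('bernoulli', 0.5);
      y  = 1 + 2*x1 - x2 + 0.5*x3 + rand('normal', 0, 1);
      output;
   end;
   drop obs;
run;

proc mcmc data=toy nbi=1000 nmc=5000 seed=1 outpost=post;
   array x[3] x1-x3;              /* data set variables */
   array beta[3] beta1-beta3;     /* one coefficient per predictor */
   parms (beta0 beta1-beta3) 0;
   parms sigma2 1;
   prior beta0 beta1-beta3 ~ normal(mean=0, var=1e6);
   prior sigma2 ~ igamma(shape=0.01, scale=0.01);

   eta = beta0;                   /* start from the intercept */
   do i = 1 to 3;
      eta = eta + x[i] * beta[i]; /* accumulate the linear predictor */
   end;
   model y ~ normal(eta, var=sigma2);
run;
```

For a wider design matrix, only the array dimensions, the variable lists and the loop bound should need to change.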
Thanks, that paper is helpful. However, I am still getting an error (probably stupidity on my end!). I am using the following code:
proc mcmc data=input4 diag=all dic
          propcov=quanew
          nbi=5000
          ntu=5000
          nmc=10000
          thin=20
          plots(smooth)=all seed=19680409
          outpost=pdres;
   array od[46] a1-a11 v0-v11 t1-t23;
   array bd[46] ab_1-ab_11 vb_0-vb_11 tb_1-tb_23;
   parms (ab_1-ab_11 vb_0-vb_11 tb_1-tb_23) 0;
   parms sigma2 1;
   /* Prior distribution assumptions */
   prior sigma2 ~ igamma(shape = 100, scale = 76);
   prior ab_1-ab_11 vb_0-vb_11 tb_1-tb_23 ~ normal(mean=0, var=1e6);
   do i = 1 to 46;
      p = od[i] * bd[i];
   end;
   model d ~ normal(p, var=sigma2);
run;
The error says: All observations in the input data set contain missing values.
Any idea where I am going wrong?
The error means that every observation in your input data has at least one variable with a missing value.
In the example data you posted the variable V11 was missing on all observations.
Unless specifically instructed to allow missing values, the regression procedures default to dropping any record in which any variable used in the model is missing. So if at least one variable is missing on every record, there is no data left to model.
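One quick way to see which variables are responsible is to count missing values per variable before running the model. A minimal sketch, assuming the data set is named input4 as in the code posted above:

```sas
/* N = non-missing count, NMISS = missing count, for every numeric variable */
proc means data=input4 n nmiss;
   var _numeric_;
run;
```

Any variable whose NMISS equals the number of observations is missing on every record and would, on its own, cause every record to be dropped.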