How to get posterior parameters used in each imputation with monotone regression method? I wonder how to get the parameters used in each imputation when we apply monotone regression method in multiple imputation procedure. Let us use the example dataset as follows, data Fish1;
title 'Fish Measurement Data';
input Length1 Length2 Length3 @@;
datalines;
23.2 25.4 30.0 24.0 26.3 31.2 23.9 26.5 31.1
26.3 29.0 33.5 26.5 29.0 . 26.8 29.7 34.7
26.8 . . 27.6 30.0 35.0 27.6 30.0 35.1
28.5 30.7 36.2 28.4 31.0 36.2 28.7 . .
29.1 31.5 . 29.5 32.0 37.3 29.4 32.0 37.2
29.4 32.0 37.2 30.4 33.0 38.3 30.4 33.0 38.5
30.9 33.5 38.6 31.0 33.5 38.7 31.3 34.0 39.5
31.4 34.0 39.2 31.5 34.5 . 31.8 35.0 40.6
31.9 35.0 40.5 31.8 35.0 40.9 32.0 35.0 40.6
32.7 36.0 41.5 32.8 36.0 41.6 33.5 37.0 42.6
35.0 38.5 44.1 35.0 38.5 44.0 36.2 39.5 45.3
37.4 41.0 45.9 38.0 41.0 46.5
;
run; Then we apply monotone regression method provided by `proc mi`. proc mi data=Fish1 nimpute=5 seed=769097 out=outex3;
monotone reg(Length2= Length1 / details);
var Length1 Length2;
ods output MonoReg=MonoReg;
run; We can see the result of estimated parameters used in first imputation are -0.044703 (Intercept) and 0.982951 (Length1). According to the supporting document, regression method is based on Rubin's book (1987, Multiple Imputation for Nonresponse in Surveys, pp. 166–167). The specific steps are demonstrated here. I followed the steps with code shown as below, but I can't get the same result of the parameters used in the imputation. I just repeat all the steps one time, so only one result was gotten. Presented by beta_star, my results were -0.04867 (Intercept) and 0.9804691 (Length1). What's wrong with my code? /* Extract records with missing Length2 */
data Length2_missing;
set fish2;
if missing(Length2);
keep Length1;
run;
/* Extract observed records (non-missing Length2) for regression modeling */
data Length2_observed;
set fish2;
where not missing(Length2);
keep Length1 Length2;
run;
/* Fit a regression model: Length2 ~ Length1 using GENMOD */
proc genmod data=Length2_observed; model Length2 = Length1 / dist=normal link=identity; /* Normal distribution with identity link */ ods output ParameterEstimates=beta_est(where=(Parameter='Intercept' | Parameter='Length1')); output out=residuals pred=pred_y /* Predicted values */ resraw=raw_res /* Raw residuals */ reschi=pearson_res /* Pearson residuals */ resdev=deviance_res; /* Deviance residuals */ run;
/* Calculate residual variance (s^2) and degrees of freedom (df) */
proc sql noprint;
select count(*) into :n from Length2_observed; /* Total observed sample size (n) */
select &n - 2 into :df from Length2_observed; /* Degrees of freedom: df = n - p (p=2 parameters: intercept + Length1) */
quit;
/* Bayesian posterior sampling for β using IML (Matrix Language) */
proc iml;
/* Read regression coefficients (β_hat) */
use beta_est;
read all var {'Estimate'} into beta_hat; /* Extract β estimates */
close beta_est;
/* Read residuals to compute Sum of Squared Errors (SSE) */
use residuals;
read all var {'raw_res'} into RAW; /* Raw residuals */
RESIDUAL = RAW##2; /* Squared residuals */
SSE = (1/(&n - 2)) * sum(residual[, 1]); /* Residual variance estimator: s^2 = SSE / df */
close residuals;
/* Construct design matrix X (with intercept) */
use Length2_observed;
read all var {'Length1'} into X; /* Predictor variable */
close Length2_observed;
X = j(nrow(X), 1, 1) || X; /* Add intercept column (1s) */
/* Cholesky decomposition of covariance matrix for sampling */
U = root(inv(X` * X)); /* Cholesky root of (X'X)^(-1) */
/* Set random seed for reproducibility */
call randseed(769097); /* Initialize random number generator */
call streaminit(769097); /* Synchronize random stream */
/* Generate random normal vector Z ~ N(0, I) */
Z = randnormal(1, {0, 0}, {1 0, 0 1}); /* Z: 1x2 vector from standard normal */
/* Sample posterior sigma2 from inverse chi-square distribution */
sigma2 = SSE * &df / rand('chisq', &df); /* sigma2 ~ Inv-Chisq(df, SSE) */
/* Sample β_star from posterior: β_star = β_hat + σ * U * Z */
beta_star = beta_hat + sqrt(sigma2) * U * Z`; /* Posterior draw of coefficients */
print beta_star; /* Display sampled β_star */
/* Save sampled β_star to a dataset */
create beta_sample from beta_star[colname={'beta_star'}];
append from beta_star;
close beta_sample;
quit;
... View more