I want to know some details about monotone regression multiple imputation. According to the example in the SAS documentation, when the DETAILS option is specified, the parameters estimated from the observed data are displayed. The parameters that PROC MI estimates from the observed data are 0.98587 (Length1) and -0.04249 (Intercept). My understanding is that this result is based only on the observed data, similar to the results we usually obtain from linear regression. Therefore, I used PROC GENMOD to fit a linear regression, specifying a normal distribution and an identity link function, as follows.
proc genmod data=fish1;
model Length2 = Length1 / dist=normal link=identity;
run;
I find that the estimates from PROC GENMOD are 1.0880 (for Length1) and 0.1348 (for Intercept), which differ from the observed-data estimates from PROC MI mentioned above.
data Fish1;
title 'Fish Measurement Data';
input Length1 Length2 Length3 @@;
datalines;
23.2 25.4 30.0 24.0 26.3 31.2 23.9 26.5 31.1
26.3 29.0 33.5 26.5 29.0 . 26.8 29.7 34.7
26.8 . . 27.6 30.0 35.0 27.6 30.0 35.1
28.5 30.7 36.2 28.4 31.0 36.2 28.7 . .
29.1 31.5 . 29.5 32.0 37.3 29.4 32.0 37.2
29.4 32.0 37.2 30.4 33.0 38.3 30.4 33.0 38.5
30.9 33.5 38.6 31.0 33.5 38.7 31.3 34.0 39.5
31.4 34.0 39.2 31.5 34.5 . 31.8 35.0 40.6
31.9 35.0 40.5 31.8 35.0 40.9 32.0 35.0 40.6
32.7 36.0 41.5 32.8 36.0 41.6 33.5 37.0 42.6
35.0 38.5 44.1 35.0 38.5 44.0 36.2 39.5 45.3
37.4 41.0 45.9 38.0 41.0 46.5
;
run;
@PanShuyao wrote:
I want to know some details about monotone regression multiple imputation. According to the example in the SAS documentation, when the DETAILS option is specified, the parameters estimated from the observed data are displayed. The parameters that PROC MI estimates from the observed data are 0.98587 (Length1) and -0.04249 (Intercept). My understanding is that this result is based only on the observed data, similar to the results we usually obtain from linear regression.
Compare the color highlighted section above with this from the OVERVIEW of Proc MI:
Instead of filling in a single value for each missing value, multiple imputation replaces each missing value with a set of plausible values that represent the uncertainty about the right value to impute (Rubin 1976, 1987). The multiply imputed data sets are then analyzed by using standard procedures for complete data and combining the results from these analyses.
So Proc MI will "use" cases that Genmod doesn't due to missing values. Did you compare the number of observations used by the two procedures?
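A quick way to make that comparison, as a rough sketch that assumes the data set is the Fish1 data posted above: PROC MEANS with the N and NMISS statistics reports how many values of each variable are observed and how many are missing. PROC GENMOD also prints "Number of Observations Read" and "Number of Observations Used" in its output, so the two counts can be checked against each other.

```sas
/* Count observed (N) and missing (NMISS) values for each variable */
/* in the Fish1 data set from the original post.                   */
proc means data=fish1 n nmiss;
   var Length1 Length2 Length3;
run;
```

Any row with a missing Length2 or Length1 is dropped from the GENMOD fit, so the "Used" count there should equal the number of rows where both are present.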
I don't know why you didn't include the Proc MI code you used so we could see what you actually ran.
There may also be implementation differences within each procedure that will result in slightly different values, simply because the code that performs similar tasks is written differently.
@SAS_Rob hits the nail on the head. The regression parameter estimates displayed in the MI procedure refer to those of the model built after standardization. To see this, run the following code:
/*Standardize all variables in the dataset*/
proc stdize data=fish1 out=fish2;
var Length1 Length2 Length3;
run;
/*Fit the regression model on the standardized data*/
proc genmod data=fish2;
model Length2 = Length1 / dist=normal link=identity;
run;
You will see that the estimates from the GENMOD procedure differ from those reported by the MI procedure only to an extent explainable by rounding error.
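To make the link between the two sets of estimates explicit, here is a short derivation. It assumes each variable is standardized as (value - mean) / standard deviation, with the mean and standard deviation taken over that variable's nonmissing values, which is the PROC STDIZE default (METHOD=STD). The symbols are my own notation, not SAS output labels: a and b are the raw-scale intercept and slope, x-bar and s_x are the mean and standard deviation of Length1, and y-bar and s_y are those of Length2.

```latex
\begin{aligned}
\text{Length2} &= a + b\,\text{Length1} + \varepsilon \\[4pt]
\frac{\text{Length2}-\bar{y}}{s_y}
  &= \underbrace{\frac{a + b\,\bar{x} - \bar{y}}{s_y}}_{a^{*}}
   + \underbrace{\frac{b\,s_x}{s_y}}_{b^{*}}\cdot\frac{\text{Length1}-\bar{x}}{s_x}
   + \frac{\varepsilon}{s_y}
\end{aligned}
```

So the standardized slope is the raw slope rescaled by s_x / s_y, and the least-squares estimates transform the same way. The standardized intercept would be exactly zero only if x-bar and y-bar were computed from the same rows that enter the regression. Here Length2 has missing values, so the Length1 mean is taken over more rows than the fit uses, and a small nonzero intercept is to be expected.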