Solved: Why ESTIMATION with solely observed data in PROC MI using monotone regression is different

PanShuyao · Posted 04-27-2025 04:22 AM

I want to understand some details of monotone regression multiple imputation. According to the example in the SAS documentation, specifying the DETAILS option displays the parameters estimated from the observed data. The parameters that PROC MI estimates from the observed data are 0.98587 (Length1) and -0.04249 (Intercept). My understanding is that these results are based only on the observed data, like the results of an ordinary linear regression. I therefore ran a linear regression with PROC GENMOD, specifying a normal distribution and an identity link function, as follows.

proc genmod data=fish1;
model Length2 = Length1 / dist=normal link=identity;
run;

The PROC GENMOD estimates are 1.0880 (Length1) and 0.1348 (Intercept), which differ from the observed-data estimates reported by PROC MI above.

For convenience, here is the raw data I used.

data Fish1;
   title 'Fish Measurement Data';
   input Length1 Length2 Length3 @@;
   datalines;
23.2 25.4 30.0    24.0 26.3 31.2    23.9 26.5 31.1
26.3 29.0 33.5    26.5 29.0   .     26.8 29.7 34.7
26.8   .    .     27.6 30.0 35.0    27.6 30.0 35.1
28.5 30.7 36.2    28.4 31.0 36.2    28.7   .    .
29.1 31.5   .     29.5 32.0 37.3    29.4 32.0 37.2
29.4 32.0 37.2    30.4 33.0 38.3    30.4 33.0 38.5
30.9 33.5 38.6    31.0 33.5 38.7    31.3 34.0 39.5
31.4 34.0 39.2    31.5 34.5   .     31.8 35.0 40.6
31.9 35.0 40.5    31.8 35.0 40.9    32.0 35.0 40.6
32.7 36.0 41.5    32.8 36.0 41.6    33.5 37.0 42.6
35.0 38.5 44.1    35.0 38.5 44.0    36.2 39.5 45.3
37.4 41.0 45.9    38.0 41.0 46.5
;
run;

SAS_Rob · Posted 04-27-2025 07:46 AM

See the documentation linked below, which states: “Note that all continuous variables are standardized before the imputation process and then are transformed back to the original scale after the imputation process.”
Try standardizing the data before running GENMOD and they should agree.

https://documentation.sas.com/doc/en/statug/15.2/statug_mi_details05.htm

ballardw · Posted 04-27-2025 04:56 AM

@PanShuyao wrote:

I want to know some details about monotone regression multiple imputation. According to the Example provided in SAS Support Document, we can use the DETAILS option and the parameters estimated from the observed data are displayed. We can see that the parameters estimated with observed data using PROC MI are 0.98587 (Length1) and -0.04249 (Intercept). My understanding is that this result is based only on the observed data, similar to the results we usually obtain from linear regression.

Compare the color highlighted section above with this from the OVERVIEW of Proc MI:

Instead of filling in a single value for each missing value, multiple imputation replaces each missing value with a set of plausible values that represent the uncertainty about the right value to impute (Rubin 1976, 1987). The multiply imputed data sets are then analyzed by using standard procedures for complete data and combining the results from these analyses.

So PROC MI will "use" cases that GENMOD doesn't because of missing values. Did you compare the number of observations used by the two procedures?

I don't know why you didn't include the Proc MI code you used so we could see what you actually ran.

There may also be implementation differences within each procedure that result in slightly different values, simply because the code that performs similar tasks is written differently.
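
The PROC MI step itself was never posted in this thread. As a minimal sketch, a call of the following shape requests the regression details that the question refers to. It assumes the Fish1 data set from the first post. The SEED= value and the OUT= data set name (fish_mi) are placeholders chosen here for illustration, not values taken from the thread.

```sas
/* Sketch only: seed and output data set name are hypothetical placeholders */
proc mi data=Fish1 seed=12345 out=fish_mi;
   /* Impute Length2 by regression on the preceding VAR variable(s);  */
   /* DETAILS prints the regression coefficients estimated from the   */
   /* observed data, which is the table discussed in this thread      */
   monotone reg(Length2 / details);
   /* Order matters: the missing pattern must be monotone in this order */
   var Length1 Length2 Length3;
run;
```

The VAR order is part of the model specification here: with a monotone method, each variable is imputed from the variables listed before it.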

Season · Posted 04-27-2025 09:58 AM

@SAS_Rob hits the nail on the head. The regression parameter estimates displayed in the MI procedure refer to those of the model built after standardization. To see this, run the following code:

/* Standardize all variables in the data set. PROC STDIZE defaults to METHOD=STD: subtract the mean, divide by the standard deviation */
proc stdize data=fish1 out=fish2;
   var Length1 Length2 Length3;
run;
/* Fit the linear regression of Length2 on Length1 using the standardized data */
proc genmod data=fish2;
   model Length2 = Length1 / dist=normal link=identity;
run;

You will see that the estimates from the GENMOD procedure differ from those in the MI procedure only to an extent explainable by rounding error.
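
To see how the two sets of estimates relate, here is a short derivation. The symbols are introduced here for illustration and do not come from the SAS documentation: $m_1, s_1$ and $m_2, s_2$ are the mean and standard deviation used to standardize Length1 ($x_1$) and Length2 ($x_2$), and $a, b$ are the intercept and slope fitted on the standardized scale.

```latex
\begin{aligned}
z_1 &= \frac{x_1 - m_1}{s_1}, \qquad z_2 = \frac{x_2 - m_2}{s_2}, \qquad z_2 = a + b\,z_1 \\[4pt]
x_2 &= m_2 + s_2 z_2
     = m_2 + s_2 a + b\,\frac{s_2}{s_1}\,(x_1 - m_1) \\[4pt]
\beta_1 &= b\,\frac{s_2}{s_1}, \qquad
\beta_0 = m_2 + s_2 a - \beta_1 m_1
\end{aligned}
```

With the numbers in this thread, $\beta_1 / b = 1.0880 / 0.98587 \approx 1.104$, which under this reading is the ratio $s_2 / s_1$. The standardized intercept $a$ need not be exactly zero: PROC STDIZE standardizes each variable using all of its nonmissing values, while the regression uses only the records where Length2 is observed, so the standardized Length1 values in those records do not have to average to zero.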

PanShuyao · Posted 04-27-2025 08:59 PM

Yes, it works. Thank you so much!
