Yan_vivien
Calcite | Level 5

I want to use a PMM (pattern-mixture model) to impute missing data, so I used the MODELOBS= option of the MNAR statement to specify the observations used to build the imputation model. However, I found that if the observations that are in the dataset but are NOT specified in MODELOBS= change, the imputation model changes too. This confuses me. From my understanding, the imputation model is derived only from the observations I specified in the option 'modelobs=x', so it should not be affected by the other observations, but that does not seem to be the case.

Here are some examples:

The only difference between the 2 code versions below is the input dataset: in code 2, more observations are included in the input dataset. Both versions use 'model_ob='M4'' to specify the observations used to derive the imputation model.

[Image: Yan_vivien_0-1754023198931.png]

In the SAS output, the parameters estimated from the same model-building observations are different. Why?

PS: I tried to produce the pictures below in the English version of SAS but failed due to an encoding issue. I hope the Chinese characters don't affect your reading.

[Image: Yan_vivien_2-1754023885045.png]
[Image: Yan_vivien_3-1754023888532.png]

 

 

 


Accepted Solutions
SAS_Rob
SAS Employee

The reason for the differences has to do with the fact that all of the variables are standardized using all of the observations prior to fitting the imputation model.  Adding observations changes the mean and variance and thus the standardized values.  These values are then used in the imputation model, which is built only on one of the groups, which leads to slightly different estimates.
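To spell out why the standardization matters, here is a short derivation for the one-predictor case. It is only a sketch, and the symbols are notation introduced here rather than taken from the SAS documentation: $m_0, s_0$ and $m_1, s_1$ are the mean and standard deviation of y0 and y1 computed from all observations, and $a$, $b$ are the least-squares intercept and slope of y1 on y0 fitted on the MODELOBS= subset on the original scale.

```latex
\begin{aligned}
\tilde{y}_0 &= \frac{y_0 - m_0}{s_0}, \qquad \tilde{y}_1 = \frac{y_1 - m_1}{s_1},\\[4pt]
\tilde{y}_1 &= \tilde{a} + \tilde{b}\,\tilde{y}_0, \qquad
\tilde{b} = b\,\frac{s_0}{s_1}, \qquad
\tilde{a} = \frac{a + b\,m_0 - m_1}{s_1}.
\end{aligned}
```

The subset fit $(a, b)$ does not depend on the other observations, but $m_0, s_0, m_1, s_1$ do, so adding observations outside the subset changes $\tilde{a}$ and $\tilde{b}$.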

 

To see this more explicitly, look at the example below.  Notice how the "obs-data" estimates change slightly because all the observations are standardized.  This can be readily verified using Proc STANDARD and Proc REG.

/* Simulate 100 rows in two Trt groups; about 30% of y1 values are set to missing */
data Mono1;
  do Trt=0 to 1;
    do j=1 to 5;
      y0=10 + rannor(999);
      y1= y0 + Trt + rannor(999);
      if (ranuni(999) < 0.3) then y1=.;
      output;
    end;
  end;

  do Trt=0 to 1;
    do j=1 to 45;
      y0=10 + rannor(999);
      y1= y0 + Trt + rannor(999);
      if (ranuni(999) < 0.3) then y1=.;
      output;
    end;
  end;
  drop j;
run;


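/* Impute y1 with a regression model built only from the Trt=0 observations */
proc mi data=Mono1 seed=14823 nimpute=1 out=outex15;
  class Trt;
  monotone reg (/details);
  mnar model( y1 / modelobs= (Trt='0'));
  var y0 y1;
  ods select MonoReg;
run;

/* Check by hand: standardize y0 and y1 using ALL observations in Mono1 */
proc standard data=mono1 mean=0 std=1 out=out1;
  var y0 y1;
run;

/* Then fit the regression on the Trt=0 rows of the standardized data */
proc reg data=out1;
  where trt=0;
  model y1=y0;
  ods select ParameterEstimates;
run;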

/* Add 45 more Trt=1 rows; the Trt=0 rows used by MODELOBS= are unchanged */
data add;
  trt=1;
  do j=1 to 45;
    y0=10 + rannor(1);
    y1= y0 + Trt + rannor(999);
    if (ranuni(999) < 0.3) then y1=.;
    output;
  end;
  drop j;
run;

data mono2;
  set mono1 add;
run;

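/* Same MODELOBS= subset as before, but the input data set is now Mono2 */
proc mi data=Mono2 seed=14823 nimpute=15 out=outex15;
  class Trt;
  monotone reg (/details);
  mnar model( y1 / modelobs= (Trt='0'));
  var y0 y1;
  ods select MonoReg;
run;

/* Check by hand: standardize using ALL observations in Mono2 */
proc standard data=mono2 mean=0 std=1 out=out2;
  var y0 y1;
run;

/* Fit on the same Trt=0 rows; the estimates differ from the Mono1 run */
proc reg data=out2;
  where trt=0;
  model y1=y0;
  ods select ParameterEstimates;
run;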
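
The same effect can be reproduced without PROC MI. The sketch below is an illustration added here, not part of the original reply. It assumes sashelp.class is available, and the output data set names std_all and std_f are hypothetical. It standardizes height and weight once using all rows and once using only the sex='F' rows, then fits the same regression on the sex='F' rows in both cases.

```sas
/* Case 1: standardize using ALL rows, then fit on the sex='F' rows only */
proc standard data=sashelp.class mean=0 std=1 out=std_all;
  var height weight;
run;

proc reg data=std_all;
  where sex='F';
  model height = weight;
  ods select ParameterEstimates;
run;
quit;

/* Case 2: standardize using the sex='F' rows only, then fit on those rows */
proc standard data=sashelp.class(where=(sex='F')) mean=0 std=1 out=std_f;
  var height weight;
run;

proc reg data=std_f;
  model height = weight;
  ods select ParameterEstimates;
run;
quit;
```

Both fits use exactly the same rows, yet the two ParameterEstimates tables should differ, because the standardized values depend on which rows supplied the mean and standard deviation.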

yabwon
Amethyst | Level 16

But you are filtering input data differently. The where clause selects different observations so each proc works on different data. 

It's like you'd do:

proc reg data=sashelp.class(where=(sex in ('F', 'M') ));
  model height = weight age; 
run;

proc reg data=sashelp.class(where=(sex in ('F') ));
  model height = weight age; 
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Yan_vivien
Calcite | Level 5

Yes, the input data are different, but the modelobs used to derive the imputation model are the same (both are modelobs='M4'), so the regression parameters (the numbers in the red boxes in the lower pictures) derived from the same observations should be the same too, right?

yabwon
Amethyst | Level 16

You're right; running this simple example on the Cars dataset shows it:

data cars;
set sashelp.cars;
if _N_ in (6 8 47 99 101) then invoice=.;
run;


proc mi data=cars(where=(origin in ('Asia', 'Europe', 'USA') )) out=imp1 seed=123 nimpute=1;
  class origin; 
  var Weight invoice;
  monotone reg (invoice = Weight / details);
  mnar model(invoice / modelobs=(origin='Europe'));
run;

proc mi data=cars(where=(origin in ('USA', 'Europe') )) out=imp2 seed=123 nimpute=1;
  class origin; 
  var Weight invoice;
  monotone reg (invoice = Weight / details);
  mnar model(invoice / modelobs=(origin='Europe'));
run;

proc mi data=cars(where=(origin in ('Europe') )) out=imp3 seed=123 nimpute=1;
  class origin; 
  var Weight invoice;
  monotone reg (invoice = Weight / details);
  mnar model(invoice / modelobs=(origin='Europe'));
run;

But the bottom of this page says: https://documentation.sas.com/doc/en/statug/15.2/statug_mi_details61.htm

 

Under the MNAR assumption, the following steps are used to impute missing values for each imputed variable in each imputation (when you specify a MONOTONE statement) or in each iteration (when you specify an FCS statement):

1. For each imputed variable, a conditional model, such as a regression model for continuous variables, is fitted using either all applicable observations or a specified subset of observations.
2. A new model is simulated from the posterior predictive distribution of the fitted model.
3. Missing values of the variable are imputed based on the new model, and the imputed values for a specified subset of observations can be adjusted using specified shift and scale parameters.

 

It looks like, after the model from the selected observations is fitted, another one is fitted.

 

That's my best guess.

 

Bart

 




SAS_Rob
SAS Employee

[This reply is the accepted solution shown above.]

whymath
Barite | Level 11

Hi, could you please explain the meaning of the numbers in the "Imputation" column?

[Image: whymath_0-1754360390402.png]

It is produced by your first proc mi.

 

SAS_Rob
SAS Employee

Those are the parameters that are used to generate the first imputed data set.  They are the result of Step 1 and are used in Step 2, as detailed in the documentation for the Monotone Regression Method.

SAS Help Center: Monotone and FCS Regression Methods
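
For reference, the two steps can be written out roughly as follows. This is a paraphrase from memory of the regression method on that documentation page, so please check the page for the exact statement; the notation is an assumption made here. $\hat{\boldsymbol{\beta}}$ and $\hat{\sigma}^2$ are the fitted ("obs-data") estimates, $\mathbf{V} = (\mathbf{X}^{\top}\mathbf{X})^{-1}$, $n$ is the number of observations used in the fit, and $k$ is the number of predictors.

```latex
\begin{aligned}
\text{Step 1:}\quad
\sigma_*^2 &= \hat{\sigma}^2\,\frac{n-k-1}{g}, \qquad g \sim \chi^2_{n-k-1},\\
\boldsymbol{\beta}_* &= \hat{\boldsymbol{\beta}} + \sigma_*\,\mathbf{V}_h^{\top}\mathbf{Z},
\qquad \mathbf{V} = \mathbf{V}_h^{\top}\mathbf{V}_h,\quad
\mathbf{Z} \sim N(\mathbf{0}, \mathbf{I}_{k+1}),\\[4pt]
\text{Step 2:}\quad
y_{\text{imp}} &= \beta_{*0} + \beta_{*1}x_1 + \dots + \beta_{*k}x_k + z\,\sigma_*,
\qquad z \sim N(0,1).
\end{aligned}
```

The "Imputation" column then holds the drawn $\boldsymbol{\beta}_*$ for that imputation, which is why it differs from the "obs-data" column and changes from one imputation to the next.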

Yan_vivien
Calcite | Level 5
It explains the matter. Thank you so much! It helps a lot!
