jgraham1
Calcite | Level 5

Hi all, 

 

(First post, new to SAS, using SAS University edition)

 

Big picture:

I am trying to run multivariate, non-linear regression on this data set, i.e. use multiple predictor variables (power, volume) to develop parametric equations that estimate a dependent variable (mass). However, some data points are missing: not all the information I want is available online, so some systems list mass and power, others list mass and volume, and others list all three. Currently, I am running a complete-case analysis (using only the cases where mass, volume, and power are all known). This greatly shrinks my available data set and leads to wide confidence intervals for the estimated coefficients.

 

Goal: Use multiple imputation to impute the missing values, then re-run the multivariate, non-linear regression code (currently in MATLAB) so that the coefficients in the parametric equations have less variance.

 

Currently using: 

proc mi data=Work.IMPORT nimpute=10 seed=54321 mu0=313.5 219.2 1275 10.14 796 minimum=0 out=mi_mvn;
mcmc chain=multiple displayinit initial=em(itprint);
var PAYLOADPOWER PAYLOADMASS POWER VOLUME DRYMASS;
run;

 

Problem: Some of the imputed dry mass values are larger than the corresponding known wet mass. Is there a way to constrain the imputation so that each imputed dry mass stays below the wet mass? (Does PROC MI just draw random values based on the observed means and standard deviations? I would hope that it uses the known data points to impute the missing ones, but that doesn't seem to be what's happening.)

 

1 REPLY
ballardw
Super User

From the online documentation in the Overview of MI procedure:

 

Multiple imputation does not attempt to estimate each missing value through simulated values. Instead, it draws a random sample of the missing values from its distribution. This process results in valid statistical inferences that properly reflect the uncertainty due to missing values—for example, confidence intervals with the correct probability coverage.

 

You may want to request a larger number of imputations and then filter the generated data, keeping only the records where dry mass < wet mass.
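
A minimal sketch of that approach follows. It assumes the known wet mass is stored in a column named WETMASS, which is a hypothetical name since that column is not shown in the original post. The value nimpute=50 and the output name mi_filtered are also illustrative choices, not recommendations.

```sas
/* Sketch only: WETMASS is an assumed column name for the known wet mass. */
/* Request more imputations than the original 10 so that enough rows      */
/* remain after filtering.                                                 */
proc mi data=Work.IMPORT nimpute=50 seed=54321 minimum=0 out=mi_mvn;
   mcmc chain=multiple initial=em;
   var PAYLOADPOWER PAYLOADMASS POWER VOLUME DRYMASS;
run;

/* Keep a row if its wet mass is unknown (nothing to check against) */
/* or if its dry mass is below its wet mass.                        */
data mi_filtered;
   set mi_mvn;
   if missing(WETMASS) or DRYMASS < WETMASS;
run;

/* Count how many rows survive in each imputed data set. */
proc freq data=mi_filtered;
   tables _Imputation_ / nocum;
run;
```

The OUT= data set carries the other columns of the input data set along with the _Imputation_ index, which is why WETMASS can be used in the filter without being listed in the VAR statement. After filtering, the imputed data sets will generally no longer have the same number of rows, so it is worth checking the PROC FREQ counts before running the regression on each one.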
