About kduggan

kduggan · 02-09-2017

Hi Tom, Thanks so much for your reply! That's what I had figured it was doing (somehow imputing differently depending on the order of observations), but it is helpful to know that others think the same thing is going on. Indeed, that seems inconsequential in the long term: neither set of imputed values should be more or less valid than the other for future analyses. In support of that, the means and SDs of scores among imputed participants only are very close across the two runs (within .01 of each other), even though the scores aren't necessarily comparable for any one participant. Thank you, again, for your help!
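For anyone reading this later, the order dependence described above can be sketched with a toy example. This is not what proc mi does internally; it is a minimal Python illustration, assuming a one-predictor stochastic regression in which the random residual draws are handed out in row order. The data, the `impute` helper, and the variable names are all made up for the illustration; only the seed value is borrowed from my script.

```python
import math
import random
import statistics


def impute(rows, seed=123456):
    """Toy single stochastic regression imputation of y from x.

    Hypothetical helper for illustration only; it does not reproduce PROC MI.
    """
    observed = [(r["x"], r["y"]) for r in rows if r["y"] is not None]
    mean_x = statistics.fmean([x for x, _ in observed])
    mean_y = statistics.fmean([y for _, y in observed])
    sxx = sum((x - mean_x) ** 2 for x, _ in observed)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in observed) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in observed)
    resid_sd = math.sqrt(sse / (len(observed) - 2))

    rng = random.Random(seed)  # same seed, hence the same sequence of draws
    imputed = {}
    for r in rows:  # assumption: draws are consumed in row order
        if r["y"] is None:
            noise = rng.gauss(0.0, resid_sd)
            imputed[r["id"]] = intercept + slope * r["x"] + noise
    return imputed


# Made-up data: ids 2 and 5 are missing y and belong to different race groups.
ys = [0.12, None, 0.35, 0.38, None, 0.66, 0.67, 0.85]
by_id = [
    {"id": i + 1, "race": "A" if i % 2 == 0 else "B", "x": float(i + 1), "y": y}
    for i, y in enumerate(ys)
]
# Stable sort by race: id 5 (race A) now comes before id 2 (race B).
by_race = sorted(by_id, key=lambda r: r["race"])

result_id = impute(by_id)
result_race = impute(by_race)
for pid in sorted(result_id):
    print(pid, round(result_id[pid], 4), round(result_race[pid], 4))
```

In this sketch the fitted regression is essentially the same for both orderings and the same two noise draws are used, but they land on different participants. That would be consistent with individual imputed scores differing while the overall means and SDs stay close.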

kduggan · 02-03-2017

Hi Art, Thank you so much for your feedback. Clearly, I am not an expert on MI either. I have traditionally used other approaches (e.g., maximum likelihood) with missing data, so MI is new to me too. That's a good point about nbiter. I thought it was running one single imputation, and I agree that the "number of burn in iterations" wording is a little unclear. I don't think the nbiter option is causing the different estimates, though. I removed that option and re-ran proc mi, first sorting by ID and then again by race, and once again I get different estimates from the two runs. But still, it was worth a shot, and at least something has been narrowed down!

kduggan · 02-02-2017

Hello,

I am trying to singly impute missing data with stochastic regression, using proc mi in SAS/STAT 14.1. Here is a sample of my script (only VAR1 has any missingness, by sheer luck):

    proc mi data = INPUTFILE out = OUTPUTFILE minimum = 0.00 maximum = 1.00 nimpute = 1 seed = 123456;
       var VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 RACE RISK VAR11 VAR12 VAR13 VAR14 VAR15 VAR16;
       class RACE RISK;
       fcs nbiter = 1 reg (VAR1 = VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 RACE RISK VAR11 VAR12 VAR13 VAR14 VAR15 VAR16);
    run;

I am using a seed so that the procedure generates the same output data file every time. I noticed that when I sort the file by race, without asking SAS to impute by race (so, just proc sort data = inputfile; by race; run; before the proc mi step), I get different imputed values in the output file than when it is sorted by ID. From what I understand, the sorting of the input data set does not matter for proc mi.

At first I thought maybe SAS was imputing the file separately by race, but when I added a "by race;" statement to proc mi, the estimates were different again. So, I don't think it's doing this.

To summarize my estimates:

1. One set of estimates when the file is sorted by ID before proc mi.
2. A second set of estimates when the file is sorted by race before proc mi.
3. A third set of estimates when I add a "by race" statement to proc mi.

How the file is sorted should not matter if I don't have a "by" statement in proc mi, right? For example, the SAS documentation for proc mi (https://support.sas.com/documentation/onlinedoc/stat/141/mi.pdf), on page 5890, says "note that the input data set does not need to be sorted in any order."

Can anyone help me understand why I'm getting different estimates, whether this is consequential, and if so, how consequential it might be? Would the estimates somehow be less valid if the file were sorted by race instead of ID, despite not having a "by" statement within proc mi?

Thank you for your time.

