02-09-2017 04:31 PM

I am trying to run a multiple imputation on a dataset that has participants at 3 visits (data in short format). After setting up and running the basic code in SAS 9.4, I receive the warning "An effect for variable X is a linear combination of other effects. The coefficient of the effect will be set to zero in the imputation." for almost all of my variables (but they aren't all linear combinations). Any ideas why this could be happening?

We are imputing many variables in a single dataset, although the eventual models will use only a subset of them. The 1, 2, or 3 at the very end of a variable name is the visit number for variables that change over time. For pollutant and weather variables, the name is the pollutant or weather characteristic + MN (for mean) + 2, 7, 28, or 365 for the number of days over which the mean is calculated. In the actual analyses after the imputation, a given model would use only one averaging period and one pollutant exposure (e.g., the 2-day mean CO with the 2-day mean temperature, pressure, and dew point), along with the other variables that don't end in 2, 7, 28, or 365. Could the number of variables be causing this? I'm using a book called "Multiple Imputation of Missing Data Using SAS", which suggested that putting the data in short form would suffice for the longitudinal setting, but maybe this dataset is too complex?

proc mi data=data_short nimpute=10 seed=270 out=data_impute;

class hrtarm dmarm cadarm hseduc ethnic smk_statusn1 smk_statusn2 smk_statusn3 alc_statusn1 alc_statusn2 alc_statusn3 center3dn1;

fcs logistic (hseduc smk_statusn1 smk_statusn2 smk_statusn3 alc_statusn1 alc_statusn2 alc_statusn3)

regression (texpwkn1 texpwkn2 texpwkn3 bmin1 bmin2 bmin3

tempmn2n1 tempmn2n2 tempmn2n3 tempmn7n1 tempmn7n2 tempmn7n3 tempmn28n1 tempmn28n2 tempmn28n3 tempmn365n1 tempmn365n2 tempmn365n3

dewpmn2n1 dewpmn2n2 dewpmn2n3 dewpmn7n1 dewpmn7n2 dewpmn7n3 dewpmn28n1 dewpmn28n2 dewpmn28n3 dewpmn365n1 dewpmn365n2 dewpmn365n3

premn2n1 premn2n2 premn2n3 premn7n1 premn7n2 premn7n3 premn28n1 premn28n2 premn28n3 premn365n1 premn365n2 premn365n3

z_score_sumn1 z_score_sumn2 z_score_sumn3

PM10MNMOn1 PM10MNMOn2 PM10MNYRn1 PM10MNYRn2

PM25MNMOn1 PM25MNMOn2 PM25MNYRn1 PM25MNYRn2

PMcMNMOn1 PMcMNMOn2 PMcMNYRn1 PMcMNYRn2

COMN2n1 COMN2n2 COMN2n3 COMN7n1 COMN7n2 COMN7n3 COMN28n1 COMN28n2 COMN28n3 COMN365n1 COMN365n2 COMN365n3

NO2MN2n1 NO2MN2n2 NO2MN2n3 NO2MN7n1 NO2MN7n2 NO2MN7n3 NO2MN28n1 NO2MN28n2 NO2MN28n3 NO2MN365n1 NO2MN365n2 NO2MN365n3

NOXMN2n1 NOXMN2n2 NOXMN2n3 NOXMN7n1 NOXMN7n2 NOXMN7n3 NOXMN28n1 NOXMN28n2 NOXMN28n3 NOXMN365n1 NOXMN365n2 NOXMN365n3

O3MN2n1 O3MN2n2 O3MN2n3 O3MN7n1 O3MN7n2 O3MN7n3 O3MN28n1 O3MN28n2 O3MN28n3 O3MN365n1 O3MN365n2 O3MN365n3

PM10MN2n1 PM10MN2n2 PM10MN2n3 PM10MN7n1 PM10MN7n2 PM10MN7n3 PM10MN28n1 PM10MN28n2 PM10MN28n3 PM10MN365n1 PM10MN365n2 PM10MN365n3

PM25MN2n2 PM25MN2n3 PM25MN7n2 PM25MN7n3 PM25MN28n2 PM25MN28n3 PM25MN365n2 PM25MN365n3

PMcMN2n2 PMcMN2n3 PMcMN7n2 PMcMN7n3 PMcMN28n2 PMcMN28n3 PMcMN365n2 PMcMN365n3

SO2MN2n1 SO2MN2n2 SO2MN2n3 SO2MN7n1 SO2MN7n2 SO2MN7n3 SO2MN28n1 SO2MN28n2 SO2MN28n3 SO2MN365n1 SO2MN365n2 SO2MN365n3) ;

var hrtarm cadarm dmarm ageDSRn1 ethnic CENTER3Dn1 CENTER3Dn2 CENTER3Dn3 q2n1 q3n1 q4n1 bmin1 hseduc alc_statusn1

smk_statusn1 SO2MN28n1 SO2MN7n1 SO2MN2n1 PM10MN28n1 PM10MN7n1 PM10MN2n1 O3MN28n1 O3MN7n1 O3MN2n1 NOXMN28n1 NOXMN7n1

NOXMN2n1 NO2MN28n1 NO2MN7n1 NO2MN2n1 COMN28n1 COMN7n1 COMN2n1 PMcMNYRn1 PMcMNMOn1 PM25MNYRn1 PM25MNMOn1 PM10MNYRn1

PM10MNMOn1 dewpmn28n1 dewpmn2n1 tempmn28n1 tempmn7n1 tempmn2n1 dewpmn7n1 SO2MN365n1 PM10MN365n1 O3MN365n1

NOXMN365n1 NO2MN365n1 COMN365n1 tempmn365n1 dewpmn365n1 SO2MN365n2 SO2MN28n2 SO2MN7n2 SO2MN2n2 PM10MN365n2

PM10MN28n2 PM10MN7n2 PM10MN2n2 O3MN365n2 O3MN28n2 O3MN7n2 O3MN2n2 NOXMN365n2 NOXMN28n2 NOXMN7n2 NOXMN2n2

NO2MN365n2 NO2MN28n2 NO2MN7n2 NO2MN2n2 COMN365n2 COMN28n2 COMN7n2 COMN2n2 q4n2 q3n2 q2n2 tempmn365n2 tempmn28n2

tempmn7n2 tempmn2n2 ageDSRn2 PMcMNYRn2 PMcMNMOn2 PM25MNYRn2 PM25MNMOn2 PM10MNYRn2 PM10MNMOn2 dewpmn365n2 dewpmn28n2

dewpmn7n2 z_score_sumn2 dewpmn2n2 z_score_sumn1 premn28n2 premn28n1 premn7n1 premn7n2 premn2n2 premn2n1 premn365n2

texpwkn2 premn365n1 alc_statusn2 texpwkn1 smk_statusn2 bmin2 PMcMN2n2 PM25MN2n2 PMcMN7n2 PM25MN7n2 PMcMN28n2

PM25MN28n2 PMcMN365n2 PM25MN365n2 q4n3 q3n3 q2n3 ageDSRn3 z_score_sumn3 tempmn28n3 tempmn7n3 tempmn2n3

dewpmn28n3 dewpmn7n3 dewpmn2n3 tempmn365n3 dewpmn365n3 SO2MN365n3 PMcMN365n3 PM25MN365n3 PM10MN365n3 O3MN365n3

NOXMN365n3 NO2MN365n3 COMN365n3 premn28n3 premn7n3 premn2n3 premn365n3 SO2MN28n3 PMcMN28n3 PM25MN28n3 PM10MN28n3

O3MN28n3 NOXMN28n3 NO2MN28n3 COMN28n3 SO2MN7n3 SO2MN2n3 PMcMN7n3 PMcMN2n3 PM25MN7n3 PM25MN2n3 PM10MN7n3 PM10MN2n3

O3MN7n3 O3MN2n3 NOXMN7n3 NOXMN2n3 NO2MN7n3 NO2MN2n3 COMN7n3 COMN2n3 alc_statusn3 texpwkn3 smk_statusn3 bmin3;

run;

Thanks for any ideas!

05-23-2017 11:28 AM

Hi @khollid

I am having the same thing happen when I run PROC MI. Did you ever find out why this happens or if there is a solution?

05-23-2017 12:36 PM

Yes. For me, just one variable was the issue, but the warning is issued for every variable. When I took out the one problematic variable, the warnings all went away. In my case, one variable was perfectly predicted by two others: var1 = var2 - var3. I would try to figure out whether any of your variables can be perfectly predicted by some combination of other variables, keeping in mind that the warnings don't actually mean that every variable is the issue. Then you can take out the problematic variable and run the imputation. After the imputation finished, I calculated an imputed var1 from the imputed var2 and var3 wherever var1 was originally missing. Hope this helps!
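The last step described above can be sketched as a short DATA step. This is only an illustration: var1, var2, and var3 are the placeholder names used in this post, data_impute is the OUT= dataset from the original PROC MI call, and data_final is a hypothetical output name. It assumes var1 is still present in the imputed dataset with its original (unimputed) values.

```sas
/* Minimal sketch: rebuild var1 after imputation.                     */
/* Assumes var1 was left out of the VAR statement, so its missing     */
/* values are still missing in data_impute.                           */
data data_final;                 /* hypothetical output dataset name  */
    set data_impute;
    /* Fill in var1 only where it was originally missing */
    if missing(var1) then var1 = var2 - var3;
run;
```

Because the rule is applied row by row, each of the imputed copies (identified by _Imputation_) gets a var1 that is consistent with that copy's var2 and var3.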

05-28-2017 12:04 PM

I am having the same issue. Is there a quick way of identifying the variable that produced the problem? I am including over a hundred variables in my imputation, so I need an efficient way of identifying the variable(s) that are causing MI to crash.

Cristian

05-28-2017 12:09 PM - edited 05-28-2017 12:10 PM

There are two ways to find linear combinations when there are no categorical variables:

- Run PROC CORR on all of your variables; any pair with a correlation of +1 or –1 is a problem. (This only catches dependence between two variables.)
- Run PROC PRINCOMP on all of your variables; any linear combination of variables with an eigenvalue of zero is a problem.
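As a rough sketch of the two checks, assuming a dataset called mydata and continuous variables x1 through x5 (both placeholder names, not from this thread):

```sas
/* Check 1: pairwise correlations. Look for entries of +1 or -1.      */
proc corr data=mydata nosimple noprob;
    var x1-x5;                   /* placeholder variable list          */
run;

/* Check 2: principal components. Look at the Eigenvalues table for   */
/* values at or near zero, then read the matching column of the       */
/* Eigenvectors table to see which variables are involved.            */
proc princomp data=mydata;
    var x1-x5;
run;
```

One caveat: PROC PRINCOMP drops any observation with a missing value in the VAR list, so with heavily incomplete data the check may run on a small subset of rows.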

If you have categorical variables, then the dummy variables (depending on how you created them) will sum to 1 across each row, so you need to remove one dummy variable for each original categorical variable.

But saying you have the "same" issue obscures many possible differences; if we could see your code, we could give a more definitive answer.

05-28-2017 12:40 PM

I use a lot of macro language to automate my code, so the full code is too long to post; here are the relevant parts. Basically, I use FCS unless I have only continuous data, in which case I use MCMC. My current study has continuous, ordinal, and nominal data, so I am using FCS. I have been able to get the program to run by commenting out the offending variables. However, I do need to impute these variables. Should I use a multiple-pass approach, where I impute everything that does not crash in round 1 and then add the commented-out variables in subsequent rounds until everything is imputed?

Cristian

Part 1: Macro program

Filename Imp "h:\OSU\Teaching\Factor & Cluster Analysis\SAS macros\Impute macro.sas" ;

%Let Continuous = age /*FamInc*/ house /*SOL amount soc:*/ ;

%Let Ordinal = CG1Empl /*No_Jobs*/ CG1Edu /*InsHealth: aid:*/ children adults savings ;

%Let Nominal = CG1Gen CG1Rel CG1Mar CG1Race ;

%Let No_impute = StrEv_1--StrEv_30 Ins_: P_:;

%Let Transform = BoxCox(age/lambda=-.62) /*BoxCox(FamInc/lambda=.38)*/ BoxCox(house/lambda=-.32) /*BoxCox(SOL/lambda=-.47)*/ ;

%Let Filein = VAM;

%Let Fileout = demo_imp;

%Let uniq_ID = ID;

%Let Imp_no = 5 ;

%Let Seed = 123456 ; * Use 0 to create a random seed--but results cannot be replicated afterwards. ;

%Include Imp ; /* Run external SAS code */

Part 2: Snippet of back-end program

%Macro ImputeData;

proc means data=&Filein noprint ;

var &Continuous &Ordinal &Nominal;

output out=_min_ min=&Continuous ;

output out=_max_ max=&Continuous ;

run;

%if "&Continuous"^="" %then %do;

/* Produce macro variables with minimum and maximum values for variable list. */

Proc IML;

use _min_;

read all var {&Continuous} into min;

close _min_;

use _max_;

read all var {&Continuous} into max;

close _max_;

min = compbl(rowcat( char(min) ));

max = compbl(rowcat( char(max) ));

call symputx('min',min);

call symputx('max',max);

run;quit;

%end;

*ods select missPattern;

proc mi data = &Filein

seed=&Seed

nimpute = &Imp_no

out=&Fileout

%if "&Continuous"^="" %then %do;

minimum = &min

maximum = &max

round = 1

%end;

MINMAXITER = 3000;

%if "&Ordinal &Nominal"=" " %then

mcmc IMPUTE=FULL;

%else %do;

class &Ordinal &Nominal ;

fcs nbiter=75

%if "&continuous"^="" %then

REGPMM(&continuous) ;

%if "&ordinal"^="" %then

logistic(&Ordinal) ;

%if "&nominal"^="" %then

discrim(&Nominal / CLASSEFFECTS=INCLUDE) ; ;

%end ; ;

%if "&Transform"^="" %then

Transform &Transform ; ;

var &Continuous &Ordinal &Nominal &No_impute;

run;

%mend;

%ImputeData;

05-28-2017 01:47 PM

It would sure help if you defined what FCS is ... But although I have never used FCS (as far as I know), I nevertheless think your attempt to impute values even in the case of a linear combination of variables is misguided. If the variables are a linear combination of one another in the data that isn't missing, of what value would it be to impute values that destroy the linear combination? I see no value in doing this. If variables are linear combinations of one another in the data that is present, I say eliminate one (or more) of the variables, because there's no value in including it in the analyses.

05-28-2017 02:04 PM

The FCS (fully conditional specification) method is a relatively new statistical procedure for imputing missing data. Originally, the MCMC approach was used, but that method is only appropriate for continuous, normally distributed data. I do not understand the mathematical statistics behind FCS, but according to SAS Help it is the recommended method when your data have an arbitrary missing pattern. The method lets you organize your variables according to whether they are continuous, ordinal, or nominal. It uses a regression model to impute continuous variables, logistic regression to impute ordinal variables, and the discriminant function to impute nominal variables. Best of all, it can do all this in one step. The downside is that the model is prone to crashing.
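To make that concrete, here is a stripped-down illustration of one FCS call with one variable of each type. The dataset and variable names (VAM, age, CG1Edu, CG1Gen) and the option values are borrowed from my macro settings above; treat it as a sketch of the statement layout, not as my actual program.

```sas
/* Illustration only: one continuous, one ordinal, one nominal var    */
proc mi data=VAM nimpute=5 seed=123456 out=demo_imp;
    class CG1Edu CG1Gen;                 /* categorical variables      */
    fcs nbiter=75
        reg(age)                         /* regression: continuous     */
        logistic(CG1Edu)                 /* logistic: ordinal          */
        discrim(CG1Gen / classeffects=include);  /* discriminant: nominal */
    var age CG1Edu CG1Gen;
run;
```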

I agree that one should not impute variables that are linear combinations of other effects. My issue is identifying them and then dropping them. PROC CORR only works for continuous data. What SAS does not say is whether the linear combination is due to the continuous variables, the ordinal variables, the nominal variables, or some combination thereof. Since FCS does everything in one step, my bet is on a combination of all the variables. Hence, in order to confirm that a variable is a linear combination of other effects, I would need to run a single procedure that can use continuous, ordinal, and nominal predictors. I am not sure which procedure can do this. It also feels like a lot of work...but that is a side point.

I am currently working on dropping the variables that SAS indicates are linear effects. However, I am missing something in my macro code: the following code produces an error. SAS is treating **i** in my %Let statement as a literal character rather than reading its numeric value. I am sure there is a simple function that will fix this, but I cannot think of it. Any suggestions?

Cristian

%Macro ExcludeVars;

data _NULL_;

array Var{*} &Continuous ;

array Ex{*} &Exclude;

%Let _drop_= ;

do i=1 to dim(Ex);

if Ex(i) in Var then

%Let _drop_=&_drop_ %sysfunc(scan(&Exclude,i));

end;

%put &=_drop_;

run;

%Mend ExcludeVars;

%ExcludeVars;
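For readers hitting the same error: macro statements such as %Let and %sysfunc are resolved by the macro processor before the DATA step runs, so inside the loop above `i` is just the letter i, not the loop counter, and SCAN receives text where it expects a number. One possible workaround, sketched here under assumptions, avoids the DATA step loop entirely and lets PROC SQL build the reduced list. It assumes &Filein lives in the WORK library, that &Exclude is non-empty, and that every listed variable exists in &Filein; the names _list_, _excl_, and Continuous_kept are made up for this example.

```sas
/* Sketch only: &Filein, &Continuous and &Exclude are the macro       */
/* variables from the posts above; other names are hypothetical.      */
%let Continuous_kept = ;

/* KEEP= expands shorthand lists (soc:, a--b) into real variable      */
/* names; OBS=0 copies the structure without reading any rows.        */
data _list_;
    set &Filein (keep=&Continuous obs=0);
run;

data _excl_;
    set &Filein (keep=&Exclude obs=0);
run;

/* Keep every name in _list_ that does not appear in _excl_           */
proc sql noprint;
    select name into :Continuous_kept separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = '_LIST_'
      and upcase(name) not in
          (select upcase(name)
           from dictionary.columns
           where libname = 'WORK' and memname = '_EXCL_')
    order by varnum;
quit;

%put &=Continuous_kept;
```

The resulting &Continuous_kept can then be used in place of &Continuous in the PROC MI call. Because the lists are expanded by KEEP=, this also works when the original lists use colon or double-hyphen shorthand.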

05-28-2017 02:08 PM

I forgot to add: the reason I am writing a macro to drop variables, rather than commenting them out as I did before, is that I am working with lots of variables. It is too much work to comment out specific variables when the variable lists use shorthand references (colon or hyphen). So I am trying to read in the variables and drop selected ones from a larger list, which is what my previous code attempts.

C.

05-28-2017 02:17 PM

As I said earlier, PROC PRINCOMP will find the linear combinations that are constant; these are the linear combinations that have a zero eigenvalue. In the case of nominal or ordinal variables, you need to provide PROC PRINCOMP with appropriate dummy variables.

05-28-2017 02:21 PM

Unfortunately, this solution would be too inefficient for my data. I have about 175 variables, over 90% of which are ordinal or nominal. Creating dummy codes for all these variables is not worth the cost in programming time. Any suggestions for how to get SAS to read the value of i rather than to treat it as a character variable?

C.

05-28-2017 05:15 PM

PCG wrote:

> Unfortunately, this solution would be too inefficient for my data. I have about 175 variables, over 90% of which are ordinal or nominal. Creating dummy codes for all these variables is not worth the cost in programming time.

PROC GLMMOD makes creation of dummy variables easy.
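A minimal sketch of what that could look like, using the dataset and variable names from the macro settings posted earlier (VAM, age, house, CG1Gen, CG1Rel, CG1Mar, CG1Race). The names _glmmod_in, _design, _parms, and _y are hypothetical, and _y is only a placeholder because the MODEL statement needs a response variable.

```sas
/* Add a constant placeholder response for the MODEL statement */
data _glmmod_in;
    set VAM;
    _y = 0;
run;

/* OUTDESIGN= holds the design matrix (Col1, Col2, ...);              */
/* OUTPARM= maps each column back to its effect and level.            */
proc glmmod data=_glmmod_in outdesign=_design outparm=_parms noprint;
    class CG1Gen CG1Rel CG1Mar CG1Race;
    model _y = age house CG1Gen CG1Rel CG1Mar CG1Race;
run;

/* Inspect the column-to-effect mapping before choosing columns */
proc print data=_parms;
run;
```

The design matrix contains an intercept column and a full set of dummy columns for each CLASS variable, so before passing the Col variables to PROC PRINCOMP you would drop the intercept and one dummy column per CLASS variable; otherwise every CLASS variable produces a zero eigenvalue by construction. It is also worth checking the log to see how many observations made it into the design dataset.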

> Any suggestions for how to get SAS to read the value of i rather than to treat it as a character variable?

I don't know which part of your code you are referring to. Please be specific.