Calcite | Level 5

## Note in Proc GLM Solution Output Means That There is a Problem with the Regression Model?

Hi everyone. I am using Proc GLM with categorical variables as my predictors on number of medications taken.

My code looks like this:

```sas
proc glm data=mydata;
  class Sex AgeGroups Income;
  model Medications = Sex AgeGroups Income AgeGroups*Sex / clparm solution;
run;
quit;
```

When I receive the output the parameter estimates table has a note at the bottom that says:

`The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.`

I have been told that I cannot trust the regression model and cannot report its results because of this error message. I have been reading and trying to understand this error, but all I can gather is that the message always appears when you have a CLASS statement in PROC GLM, because it drops the last category and uses it as the reference.

Is it true that I cannot report the estimates and confidence limits for any significant effects in my output because of the error note? If yes, how can I fix my regression model so that I can report my findings?

5 REPLIES 5
Super User

## Re: Note in Proc GLM Solution Output Means That There is a Problem with the Regression Model?

When a variable is specified in both the CLASS and MODEL statements in PROC GLM, the procedure uses GLM parameterization. This is a less than full-rank parameterization in which a CLASS variable with k levels is represented in the design matrix by a set of k 0,1-coded indicator (or "dummy") variables. If the SOLUTION option in the MODEL statement is also specified, the following note is included in the displayed results below the parameter estimates table:

`NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.`

Note that there are many possible parameterizations, each of which imposes a different interpretation on the model parameters. However, only the GLM parameterization is available in PROC GLM. For more on the parameterizations available in other procedures, see "Parameterization of Model Effects: Other Parameterizations" in the Shared Concepts and Topics chapter of the SAS/STAT User's Guide and this usage note.

The GLM parameterization provides easily interpretable hypotheses about the model parameters but results in an "overparameterized" model — that is, a model with more parameters than degrees of freedom. The above NOTE is displayed to make you aware of the overparameterized model provided by GLM parameterization and does not indicate a problem with the fitted model. Interpretation of the PROC GLM parameter estimates is discussed and illustrated in this usage note.

http://support.sas.com/kb/22/585.html

http://support.sas.com/kb/38/384.html
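To see why the note always appears, here is a minimal sketch for the simplest case: an intercept plus a single two-level CLASS variable (Sex). The symbols ($n_M$, $n_W$, $\mu$, $\beta$) are illustrative notation, not anything printed by SAS.

```latex
% Design matrix columns: intercept, Men indicator, Women indicator.
% Every row is (1, 1, 0) for a man or (1, 0, 1) for a woman, so
% column 1 = column 2 + column 3 and X has rank 2, not 3.
\[
X'X =
\begin{pmatrix}
n   & n_M & n_W \\
n_M & n_M & 0   \\
n_W & 0   & n_W
\end{pmatrix},
\qquad n = n_M + n_W .
\]
% Row 1 equals row 2 + row 3, hence
\[
\det(X'X) = 0 ,
\]
% and the normal equations have infinitely many solutions.
% Setting the last level to zero picks one of them:
\[
\hat\beta_{\text{Women}} = 0, \qquad
\hat\mu = \bar{y}_{\text{Women}}, \qquad
\hat\beta_{\text{Men}} = \bar{y}_{\text{Men}} - \bar{y}_{\text{Women}} .
\]
% Estimable quantities do not depend on which solution is chosen:
\[
\mu + \beta_{\text{Men}}, \qquad
\mu + \beta_{\text{Women}}, \qquad
\beta_{\text{Men}} - \beta_{\text{Women}} .
\]
```

The individual parameters flagged with 'B' depend on the convention used to pick a solution, but the group means and the differences between groups are the same under any choice, which is why the fitted model itself is fine.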


Diamond | Level 26

## Re: Note in Proc GLM Solution Output Means That There is a Problem with the Regression Model?

> I have been told that I cannot trust the regression model and I cannot report the results of it due to this error message.

The opposite is true. The model was fit properly, and you can trust the results. The message tells you that not every level of a categorical (CLASS) variable can be estimated separately; it is the standard message everyone gets when they use a categorical variable. It is not an indication that something is wrong.

> Is it true that I cannot report the estimates and confidence limits for any significant effects in my output because of the error note?

Not true at all.

> If yes, how can I fix my regression model so that I can report my findings?

Nothing to fix; SAS has done the proper job here. What needs to change is your interpretation of the output. When you have a class variable, you will ALWAYS get one level set to zero, with no confidence intervals. That's the way things work with class variables: one of the levels adds no information. For example, if the levels of your class variable were Male and Female and you know the individual is not Male, then knowing the individual is Female tells you nothing additional.

--
Paige Miller
Calcite | Level 5

## Re: Note in Proc GLM Solution Output Means That There is a Problem with the Regression Model?

Thank you so much for your help. I suppose the interpretation piece is the next hardest part.

My results showed the following. Does this show that men take fewer medications than women? I am really not sure how to interpret the interaction effect. Thanks again for any help.

| Parameter | Estimate | Standard Error | t value | p value |
|---|---|---|---|---|
| Men | -0.1138 B | 0.04 | -2.83 | 0.0048 |
| Women | 0 B | . | . | . |
| Sex*AgeGroup Men Young | 0.1178 B | 0.05 | 2.34 | 0.01 |
| Sex*AgeGroup Men Middle Age | 0.0837 B | 0.05 | 1.6 | 0.11 |
| Sex*AgeGroup Men Oldest | 0 B | . | . | . |
| Sex*AgeGroup Women Young | 0 B | . | . | . |
| Sex*AgeGroup Women Middle Age | 0 B | . | . | . |
| Sex*AgeGroup Women Oldest | 0 B | . | . | . |

Diamond | Level 26

## Re: Note in Proc GLM Solution Output Means That There is a Problem with the Regression Model?

As stated above by @data_null__, you can use the LSMEANS statement to get main effects and interaction effects that are much easier to interpret.
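A minimal sketch of what that could look like, reusing the data set and variable names from the original post. The choice of options (CL, PDIFF, SLICE=) is only a suggestion, not the one right way to do it.

```sas
/* Sketch only: same model as in the original post, plus LSMEANS. */
proc glm data=mydata;
  class Sex AgeGroups Income;
  model Medications = Sex AgeGroups Income AgeGroups*Sex / solution clparm;
  /* Adjusted mean for each Sex within each age group, with confidence   */
  /* limits; SLICE= tests the Sex effect separately in each age group.   */
  lsmeans AgeGroups*Sex / cl slice=AgeGroups;
  /* Sex means averaged over the other effects, and their difference.    */
  lsmeans Sex / pdiff cl;
run;
quit;
```

The posted estimates can also be read directly, assuming Women and Oldest are the reference levels (as the zero rows suggest). With an interaction in the model, the Men estimate of -0.1138 is the men-minus-women difference in the Oldest group only. In the Young group the difference is -0.1138 + 0.1178 = 0.0040, and in the Middle Age group it is -0.1138 + 0.0837 = -0.0301. So the table says men take fewer medications than women among the oldest, not necessarily across all age groups.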

--
Paige Miller