someone456
Calcite | Level 5

Hi everyone. I am using Proc GLM with categorical variables as my predictors on number of medications taken.

 

My code looks like this: 
proc glm data=mydata;
class Sex AgeGroups Income;
model Medications=Sex AgeGroups Income AgeGroups*Sex /CLPARM solution;
run; quit;

 

When I receive the output the parameter estimates table has a note at the bottom that says:

The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

 

I have been told that I cannot trust the regression model and cannot report its results because of this error message. I have been reading about the message, but all I can gather is that it always appears when you have a CLASS statement in PROC GLM, because the procedure drops the last category and uses it as the reference.

 

Is it true that I cannot report the estimates and confidence limits for any significant effects in my output because of the error note? If yes, how can I fix my regression model so that I can report my findings?

 

Thanks in advance.

5 REPLIES
Reeza
Super User

When a variable is specified in both the CLASS and MODEL statements in PROC GLM, the procedure uses GLM parameterization. This is a less than full-rank parameterization in which a CLASS variable with k levels is represented in the design matrix by a set of k 0,1-coded indicator (or "dummy") variables. If the SOLUTION option in the MODEL statement is also specified, the following note is included in the displayed results below the parameter estimates table:

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

Note that there are many possible parameterizations, each of which imposes a different interpretation on the model parameters. However, only the GLM parameterization is available in PROC GLM. For more on the parameterizations available in other procedures, see "Parameterization of Model Effects: Other Parameterizations" in the Shared Concepts and Topics chapter of the SAS/STAT User's Guide and this usage note.

The GLM parameterization provides easily interpretable hypotheses about the model parameters but results in an "overparameterized" model — that is, a model with more parameters than degrees of freedom. The above NOTE is displayed to make you aware of the overparameterized model provided by GLM parameterization and does not indicate a problem with the fitted model. Interpretation of the PROC GLM parameter estimates is discussed and illustrated in this usage note.

 

http://support.sas.com/kb/22/585.html

 

http://support.sas.com/kb/38/384.html
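
To see the overparameterization directly, one option is to print the design matrix that the GLM parameterization builds. This is only a sketch: it reuses the data set and variable names from the question, and the output data set names parms and design are made up for illustration.

```sas
/* PROC GLMMOD builds the GLM-parameterized design matrix
   without fitting the model. */
proc glmmod data=mydata outparm=parms outdesign=design;
  class Sex AgeGroups Income;
  model Medications = Sex AgeGroups Income AgeGroups*Sex;
run;

/* One row per design-matrix column: the intercept plus one
   indicator column for every level of every effect. */
proc print data=parms;
run;
```

The indicator columns for each CLASS variable add up to the intercept column, so the columns are linearly dependent. That dependence is what makes X'X singular and triggers the note.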

 



PaigeMiller
Diamond | Level 26

I have been told that I cannot trust the regression model and I cannot report the results of it due to this error message. 

 

The opposite is true. The model was fit properly. You can trust the results. The message is a warning that you can't estimate all of the levels of a categorical (Class) variable, which is the standard message everyone gets when they use a categorical variable. It is not an indication that something is wrong.

 

Is it true that I cannot report the estimates and confidence limits for any significant effects in my output because of the error note?

 

Not true at all.

 

 If yes, how can I fix my regression model so that I can report my findings?

 

Nothing to fix; SAS has done the proper job here. What needs to change is your interpretation of the output. When you have a class variable, one level will ALWAYS be set to zero, with no confidence interval. That's how class variables work: one of the levels adds no information. For example, if the levels of your class variable were Male and Female and you know the individual is not Male, then knowing the individual is Female tells you nothing additional.

--
Paige Miller
someone456
Calcite | Level 5

Thank you so much for your help. I suppose the interpretation piece is the next hardest part.

 

My results showed the following. Does this show that men take fewer medications than women? I am really not sure how to interpret the interaction effect. Thanks again for any help.

Parameter                        Estimate    Standard Error   t value   p value
Men                              -0.1138 B   0.04             -2.83     0.0048
Women                             0      B   .                .         .

Sex*AgeGroup Men Young            0.1178 B   0.05             2.34      0.01
Sex*AgeGroup Men Middle Age       0.0837 B   0.05             1.6       0.11
Sex*AgeGroup Men Oldest           0      B   .                .         .
Sex*AgeGroup Women Young          0      B   .                .         .
Sex*AgeGroup Women Middle Age     0      B   .                .         .
Sex*AgeGroup Women Oldest         0      B   .                .         .
PaigeMiller
Diamond | Level 26

As @data_null__ suggested, you can use the LSMEANS statement to get a much more interpretable summary of the main effects and the interaction.
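
The posted estimates can also be read directly, as a rough sketch. Assuming Women and Oldest are the reference levels (their estimates are set to 0) and that Income does not interact with Sex, the estimated Men minus Women difference within an age group is the Men estimate plus that group's Sex*AgeGroup Men estimate:

```latex
\begin{aligned}
\text{Young:}      &\quad -0.1138 + 0.1178 = 0.0040 \\
\text{Middle Age:} &\quad -0.1138 + 0.0837 = -0.0301 \\
\text{Oldest:}     &\quad -0.1138 + 0 = -0.1138
\end{aligned}
```

So -0.1138 is the estimated difference in the Oldest group only; in the other two groups the estimated difference is much closer to zero. These sums come with no standard errors, so judging whether any of them differs from zero needs LSMEANS or an ESTIMATE statement.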

--
Paige Miller
data_null__
Jade | Level 19

Wouldn't you be more interested in the adjusted means (LSMEANS) of the class effects?
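
A minimal sketch of what that could look like, reusing the data set and variable names from the original post. The particular options (PDIFF, CL, SLICE=) are an assumption about what would be useful here, not something specified in this thread.

```sas
proc glm data=mydata;
  class Sex AgeGroups Income;
  model Medications = Sex AgeGroups Income AgeGroups*Sex / clparm solution;

  /* Adjusted means for each main effect, with confidence limits
     and p-values for pairwise differences. */
  lsmeans Sex AgeGroups / pdiff cl;

  /* Adjusted means for every Sex-by-AgeGroups cell; SLICE= tests
     the Sex effect separately within each age group. */
  lsmeans AgeGroups*Sex / slice=AgeGroups pdiff cl;
run; quit;
```

The LSMEANS tables report an estimate for every level, including the ones shown as 0 in the parameter estimates table, which usually makes them easier to report than the raw coefficients.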
