aminkarimid
Lapis Lazuli | Level 10

Hello everybody,

I want to regress volume on time-based dummy variables, and I use the CLASS statement in PROC GENMOD and PROC GLM to create the dummies automatically.

When I fit the same data with GLM and GENMOD, I get different parameter estimates.

 

Here is my code:

* Regress the normalized (adjusted) volume on the event dummy variables;
proc genmod data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR / param=effect;
   model adjusted_volume = TRD_EVENT_ROUFOR / noscale;
   ods select ParameterEstimates;
run;

* Same analysis by using the CLASS statement;
proc glm data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR;              /* Generates dummy variables internally */
   model adjusted_volume = TRD_EVENT_ROUFOR / solution;
   ods select ParameterEstimates;
quit;

 

Would you please explain why I get different results when I run these two procedures?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

These two PROCs use different model parameterizations, but the results really describe the same model.

 

If you want the results to match exactly, use this in PROC GENMOD:

 

class TRD_EVENT_ROUFOR / param=GLM;
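
For context, here is a sketch of how that CLASS statement would slot into the GENMOD step from the question. It assumes the same data set and variable names as the original post and leaves the other options unchanged.

```sas
/* GENMOD step from the question, with the GLM-style coding requested */
proc genmod data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR / param=glm;   /* same dummy coding that PROC GLM uses */
   model adjusted_volume = TRD_EVENT_ROUFOR / noscale;
   ods select ParameterEstimates;
run;
```

With this coding the last level of TRD_EVENT_ROUFOR acts as the reference level, so its coefficient is reported as zero, as in the PROC GLM solution table.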

 

--
Paige Miller

aminkarimid
Lapis Lazuli | Level 10

Thanks. I find it very helpful.

Parameterization of Model Effects

 

Reeza
Super User

Note that the GLM and EFFECT parameterizations are not what you usually find in textbooks. The REF (reference) parameterization is the one most commonly seen there.

aminkarimid
Lapis Lazuli | Level 10
Would you please explain more precisely?
Which one is better?
Reeza
Super User

There isn't a better one; they are just different.

This is a statistical concept though, not really a SAS concept. 

 

http://support.sas.com/kb/37/273.html

 

Also see the documentation for the CLASS statement of the procedure you are using, e.g. PROC GLM:

http://documentation.sas.com/?docsetId=statug&docsetTarget=statug_glm_syntax04.htm&docsetVersion=14....

 

 

PaigeMiller
Diamond | Level 26

@aminkarimid wrote:
Would you please explain more precisely?
Which one is better?

In the future, could you please indicate which comment you are replying to?

--
Paige Miller
Ksharp
Super User

The biggest difference between GLM and GENMOD is the estimation method.

GLM uses OLS, while GENMOD uses MLE.

PaigeMiller
Diamond | Level 26

@Ksharp wrote:

The biggest difference between GLM and GENMOD is the estimation method.

GLM uses OLS, while GENMOD uses MLE.


Certainly this is true in general, but ... in a simple modelling situation such as this, where there are only dummy variable effects to be estimated, and the errors are iid normally distributed, wouldn't MLE and OLS produce the same model?
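
A short sketch of why the answer is yes, assuming independent normal errors with a common variance. The symbols are notation introduced here, not taken from the thread: y_i is the response, x_i the row of dummy variables for observation i, beta the coefficient vector, sigma^2 the error variance, and n the number of observations.

```latex
\ell(\beta,\sigma^2)
  = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - x_i^{\top}\beta\bigr)^2
\quad\Longrightarrow\quad
\hat\beta_{\mathrm{MLE}}
  = \arg\min_{\beta}\sum_{i=1}^{n}\bigl(y_i - x_i^{\top}\beta\bigr)^2
  = \hat\beta_{\mathrm{OLS}}
```

For any fixed sigma^2, maximizing the log-likelihood over beta is the same as minimizing the residual sum of squares, so the fitted values coincide. What can differ is the variance estimate: maximum likelihood divides the residual sum of squares by n, while the usual OLS estimate divides by the residual degrees of freedom.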

--
Paige Miller
Ksharp
Super User
Sure. MLE and OLS will generate similar parameter estimates, but they are different estimation methods. Don't you agree?

aminkarimid
Lapis Lazuli | Level 10

So, are you saying that the difference between the parameter estimates from GENMOD and GLM arises because the former uses MLE and the latter uses OLS?

PaigeMiller
Diamond | Level 26

As I stated above, I think the coefficients differ because you have chosen a different parameterization of the model than the one PROC GLM uses. But these are the same model: when you predict, you get exactly the same predicted values, and if you combine terms in the models to undo the effect of the different parameterization, you will see that the coefficients agree in both models.
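
One way to check the claim about identical predictions is sketched below. It reuses the data set and variable names from the question; the output data set names (pred_genmod, pred_glm, pred_both), the prediction variable names (p_genmod, p_glm), and the tolerance are made up for illustration.

```sas
/* Effect coding in GENMOD: keep the predicted values */
proc genmod data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR / param=effect;
   model adjusted_volume = TRD_EVENT_ROUFOR / noscale;
   output out=pred_genmod pred=p_genmod;
run;

/* PROC GLM with its own coding: keep the predicted values */
proc glm data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR;
   model adjusted_volume = TRD_EVENT_ROUFOR / solution;
   output out=pred_glm predicted=p_glm;
run;
quit;

/* One-to-one merge; both OUTPUT data sets keep the input row order */
data pred_both;
   merge pred_genmod pred_glm;
run;

/* Compare the two prediction columns within the merged data set */
proc compare base=pred_both method=absolute criterion=1e-8;
   var p_genmod;
   with p_glm;
run;
```

If the two fits describe the same model, PROC COMPARE should report no differences beyond the tolerance. For a one-way model like this one, the coefficients can also be reconciled by hand: under effect coding the intercept is the unweighted average of the level means and each coefficient is a level mean minus that average (the last level's effect is minus the sum of the others), while under GLM coding the intercept is the last level's mean and each coefficient is a level mean minus the last level's mean.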

--
Paige Miller
JacobSimonsen
Barite | Level 11

I agree that the difference in the coefficients is due to the difference in parameterization. The two models specified are the same.

 

But there are quite big differences in how the two procedures work. PROC GENMOD uses numerical methods to maximize the likelihood function. Further, the p-values can differ, because PROC GENMOD uses -2LogQ tests and PROC GLM uses F-tests. If the data are normally distributed, PROC GLM should be used because its tests are exact, while the distributions of the test statistics in PROC GENMOD are based on approximations.
