Posted 12-30-2015 08:23 PM
(902 views)

Hello All:

I am trying to simulate data from a GLM model with two classification variables and an interaction term between them. That is,

y = b1*s + b2*t + b3*t*s + e, where s and t each have 2 levels. I found an example in Wicklin's book Simulating Data with SAS, but for some reason, when I fit the model to the simulated data, the parameter estimates are way off. The number of observations in each level of s is equal.

I appreciate your thoughts in advance.

J

The ANOVA and GLM sections present main-effects models. Look at the section "Linear Models with Interaction and Polynomial Effects," which has an example of a 3x2 GLM model with interaction terms.

It can be tricky (impossible?) to get the parameter estimates for some models to agree with the simulation parameters. The GLM parameterization can be non-intuitive because it "moves around" the coefficient weights. The last level of each main effect and several levels of the interaction effect are set to zero by the GLM parameterization. Consequently, the intercept term and other parameter estimates can be different from the specified values, even though the simulation is correct. This is why the book says (p. 215) "because the design matrix is singular, the parameter estimates found by PROC GLM might not be the same as the parameter values that were used to construct the data."
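
To make the "moving around" concrete for a 2x2 design, here is a sketch of the bookkeeping. It assumes the usual overparameterized GLM model and the convention described above, in which the last level of each effect is set to zero. The symbols (the overall term, the main effects, the interaction terms, and the cell means with two subscripts) are notation introduced here for illustration, not taken from the book.

```latex
% Overparameterized 2x2 model: 9 parameters, but only 4 cell means
\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}, \qquad i, j \in \{1, 2\}

% Constraints imposed by setting the last levels to zero
\alpha_2 = \beta_2 = \gamma_{12} = \gamma_{21} = \gamma_{22} = 0

% The remaining parameters are then these combinations of cell means
\begin{aligned}
\mu         &= \mu_{22} \\
\alpha_1    &= \mu_{12} - \mu_{22} \\
\beta_1     &= \mu_{21} - \mu_{22} \\
\gamma_{11} &= \mu_{11} - \mu_{12} - \mu_{21} + \mu_{22}
\end{aligned}
```

If the simulation coefficients were defined relative to a different baseline (for example, the first level of each factor), the printed estimates target different quantities, even though the four cell means are reproduced correctly.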

When you use the SOLUTION option on the MODEL statement, GLM reports the note

Note: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

That phrase "not uniquely estimable" means "the estimates reported by GLM might not agree with the parameters specified in your simulation."
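
Quantities that are estimable, such as the cell means, do not depend on the parameterization, so they are a safer thing to compare against the simulation. A minimal sketch, assuming a simulated data set named `sim` with classification variables `s` and `t` and response `y` (these names are hypothetical and follow the notation in the question):

```sas
/* Sketch: compare estimable cell means instead of raw parameter estimates. */
/* Assumes a data set SIM with class variables S and T and response Y.      */
proc glm data=sim;
   class s t;
   model y = s|t / solution;   /* SOLUTION prints the flagged 'B' estimates */
   lsmeans s*t;                /* least squares means for each s*t cell     */
run;
quit;
```

If the four cell means from LSMEANS are close to the means implied by the simulation parameters, the simulation is behaving as intended even when the individual estimates look unfamiliar.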

The model is:

y = alpha + beta1*drug + beta2*disease + beta3*drug*disease + error,

with alpha=0, beta1=0, beta2=3, beta3=1.5.
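
For reference, the cell means that these values imply can be worked out by hand. This sketch assumes the coding that the code below produces: the PROC LOGISTIC reference parameterization with the last level as the reference (its default), so the drug and disease columns are indicators for level 1. The two-subscript cell-mean symbol (drug first, disease second) is notation introduced here.

```latex
% Cell means under reference coding with level 2 as the reference
\begin{aligned}
\mu_{11} &= 0 + 0 + 3 + 1.5 = 4.5 \\
\mu_{12} &= 0 + 0 = 0 \\
\mu_{21} &= 0 + 3 = 3 \\
\mu_{22} &= 0
\end{aligned}
```

Comparing the predicted values from a fitted model against these four numbers is a check that does not depend on the parameterization.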

```
/* Skeleton data set: 2 drugs x 2 diseases x 5 subjects; y is a placeholder */
data int;
   y = 0;
   do drug = 1 to 2;
      do disease = 1 to 2;
         do subject = 1 to 5;
            output;
         end;
      end;
   end;
run;

proc print data=int;
run;

/*Design Matrix*/
proc logistic data=int
      outdesignonly outdesign=designref(drop=y);
   class drug disease / param=reference;
   model y = drug|disease;
run;

proc print data=designref;
run;

proc iml;
   call randseed(1);
   use designref;
   read all var _NUM_ into X;
   close designref;             /* close the data set opened by USE */
   beta = {0, 0, 3, 1.5};       /* alpha, beta1, beta2, beta3 */
   eps = j(nrow(X), 1);
   call randgen(eps, "Normal"); /* standard normal errors */
   y = X*beta + eps;
   create Y var {y}; append; close Y;
quit;

/* Attach the simulated response to the class variables */
data d;
   merge y int(drop=y);
run;

proc print data=d;
run;

proc glm data=d;
   class drug disease;
   model y = drug|disease / solution p;
run;
quit;
```

Thanks

You are using a reference parameterization to generate the data, so you need to use the same reference parameterization if you want the parameter estimates to be close to the parameters. You can use PROC GENMOD as follows:

```
proc genmod data=d;
class drug disease /param=ref;
model y=drug|disease;
run;
```

Of course, with only 20 data points, you should not expect a four-parameter model to give estimates that are very close to the parameters, but you will see that the 95% Wald CIs include the parameter values for the data simulated from this random number seed.
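
To see the effect of sample size, one option is to rerun the same model with many more subjects per cell. The sketch below uses a DATA step rather than PROC IML and hard-codes the reference coding (last level as reference), which is an assumption about the coding used above. The data set name `big`, the indicator variables `d1` and `s1`, and the choice of 500 subjects per cell are illustrative and not from the original posts.

```sas
/* Sketch: same parameters (0, 0, 3, 1.5), but 500 subjects per cell */
data big;
   call streaminit(1);
   do drug = 1 to 2;
      do disease = 1 to 2;
         d1 = (drug = 1);       /* indicator: drug at level 1    */
         s1 = (disease = 1);    /* indicator: disease at level 1 */
         do subject = 1 to 500;
            y = 0 + 0*d1 + 3*s1 + 1.5*d1*s1 + rand("Normal");
            output;
         end;
      end;
   end;
run;

proc genmod data=big;
   class drug disease / param=ref;
   model y = drug|disease;
run;
```

With 2,000 observations the standard errors shrink considerably, so the estimates should typically land much closer to the simulation parameters than with 20 observations.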
