renjie
Calcite | Level 5

Basically, I have questions about the model setup of three different PROC GLIMMIX programs.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

Data description:

The dataset "mydata" covers 44 days (Day = 1, 2, 3, ..., 44) for 24 subjects (PID = 1, 2, 3, ..., 24), i.e., 1056 observations in total, and contains only 5 columns (Missed, PID, Day, Q, Z). The data are complete and balanced (every subject has days 1, 2, 3, ..., 44). "Missed" is the binary (0/1) response, and only "Q" and "Z" are continuous. "Z" is a subject-level variable, which means that for a given subject, e.g. PID 1, it has the same value (e.g. -1.05643) on all days.

mydata.png

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

My problems:

(1)

 

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
	random _residual_/ subject= pid(day) type=ar(1);
run;

(2)

 

 

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
	random _residual_/ subject= day(pid) type=ar(1);
run;

 

(3)

 

 

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
	random day / subject= pid type=ar(1) residual;
run;

 

I want to know the mathematical formulas that the above three programs are actually fitting.

 

 

Basically, I am not sure what "R-side" means in the mathematical formulation of a non-Gaussian GLMM.

 

BTW, I can run all of this code (for (1) and (2), the "Covariance Parameter Estimates" table shows no "Standard Error" for "AR(1)"), and I can see that (1) and (2) actually give the same results, while (3) differs. So I am even more confused...

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

For example, if the code is as below, without the R-side:

 

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
run;

Then its mathematical model should be (Y_{j,t} here is the value of Missed):

ex1.png
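
The attached image is not reproduced here. As a hedged sketch (not necessarily identical to what ex1.png showed), the program above has only a G-side random slope for Q per PID and no random intercept, which would correspond to the model below. The symbols \beta_0, \beta_1, \beta_2, b_j and \sigma_b^2 are notation assumed for illustration, and \pi_{j,t} is the probability of whichever level of Missed the procedure reports as the modeled event:

```latex
\begin{aligned}
Y_{j,t} \mid b_j &\sim \mathrm{Bernoulli}(\pi_{j,t}), \qquad j = 1,\dots,24,\; t = 1,\dots,44, \\
\operatorname{logit}(\pi_{j,t}) &= \beta_0 + \beta_1 Q_{j,t} + \beta_2 Z_j + b_j\, Q_{j,t}, \\
b_j &\overset{\text{iid}}{\sim} N(0, \sigma_b^2).
\end{aligned}
```

Given b_j, the observations of one subject are conditionally independent in this model; any within-subject correlation comes only from the shared random slope.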

 

On the other hand, I know the repeated-measures model when it is an LMM (Gaussian):

 

proc glimmix data=mydata1;
	class PID Day;
	model yjt=Q Z/ s dist=normal;
	random q/subject=pid;
	random day / subject= pid type=ar(1) residual;
run;

ex2.png
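
The attached image is not reproduced here. A sketch of the Gaussian model this program specifies, under the assumption that Day is a CLASS variable ordering the 44 observations within each PID; the symbols \sigma_b^2, \sigma^2 and \rho are notation chosen for illustration and may differ from ex2.png:

```latex
\begin{aligned}
y_{j,t} &= \beta_0 + \beta_1 Q_{j,t} + \beta_2 Z_j + b_j\, Q_{j,t} + e_{j,t}, \\
b_j &\sim N(0, \sigma_b^2), \qquad
\mathbf{e}_j = (e_{j,1},\dots,e_{j,44})^\top \sim N(\mathbf{0}, \mathbf{R}_j), \\
[\mathbf{R}_j]_{t,s} &= \sigma^2 \rho^{|t-s|}, \qquad b_j \text{ independent of } \mathbf{e}_j.
\end{aligned}
```

Here the R-side is simply the covariance matrix of the additive error vector, which is why the Gaussian case is easy to write down.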

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

I have been struggling with this for a while and cannot find any answers to these kinds of questions.

 

The questions may be short, but I have attached all the information to avoid any confusion.

 

I am really looking forward to someone helping me and would appreciate it very much.

 

Thank you.

6 REPLIES
Ksharp
Super User

An R-side random effect means that the covariance structure is placed on the residual term

(i.e., in (1) and (2) the residuals of the individual observations are correlated; in (3) the residuals within each stratum are correlated).
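
For a non-Gaussian response, the PROC GLIMMIX documentation expresses the R-side through the conditional variance of the data given the G-side random effects, rather than through an additive error term. A sketch for program (3), where \mu_{j,t}, \phi and \rho are notation assumed for illustration:

```latex
\begin{aligned}
\operatorname{E}[Y_{j,t} \mid b_j] &= \mu_{j,t}, \qquad
\operatorname{logit}(\mu_{j,t}) = \beta_0 + \beta_1 Q_{j,t} + \beta_2 Z_j + b_j\, Q_{j,t}, \\
\operatorname{Var}[\mathbf{Y}_j \mid b_j] &= \mathbf{A}_j^{1/2}\, \mathbf{R}_j\, \mathbf{A}_j^{1/2}, \qquad
\mathbf{A}_j = \operatorname{diag}\bigl\{\mu_{j,t}(1-\mu_{j,t})\bigr\}_{t=1}^{44}, \\
[\mathbf{R}_j]_{t,s} &= \phi\, \rho^{|t-s|}
\;\;\Longrightarrow\;\;
\operatorname{Cov}[Y_{j,t}, Y_{j,s} \mid b_j] = \phi\, \rho^{|t-s|}
\sqrt{\mu_{j,t}(1-\mu_{j,t})\,\mu_{j,s}(1-\mu_{j,s})}.
\end{aligned}
```

With an R-side structure the model specifies only the conditional mean and this variance, not a full Bernoulli likelihood, which is why such models are fitted by pseudo-likelihood.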

 

I think the first and second programs should be the same thing, since

subject= pid(day)

is the same as

subject= day(pid)

 

Both give the same R-side covariance structure.
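
One possible explanation for the identical results, offered as an assumption rather than something confirmed in this thread: with both PID and Day in the CLASS statement, a nested subject effect such as pid(day) or day(pid) defines one subject per PID-Day combination, so each subject holds a single observation. The R-side block of each subject is then 1×1 (\phi and \mu_{j,t} are notation assumed for illustration):

```latex
\mathbf{R}_{(j,t)} = [\,\phi\,]
\quad\Longrightarrow\quad
\operatorname{Var}[Y_{j,t} \mid b_j] = \phi\, \mu_{j,t}(1-\mu_{j,t}), \qquad
\operatorname{Cov}[Y_{j,t}, Y_{j,s} \mid b_j] = 0 \quad (t \neq s).
```

Under that reading the AR(1) correlation never enters the model in (1) and (2), and \phi acts only as an overdispersion scale. That would be consistent with the missing "Standard Error" for "AR(1)" and with (1) and (2) matching each other but not (3).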

 

The third program uses another type of R-side random effect (stratum residual), which means it has a different covariance structure.

E.g., your data should look like

pid day

1 1

1 2

1 3

1 4

2 1

2 2

.

 

 

not like 

pid day

1 1

1 2

1 3

1 4

2 5

2 6

........

 

renjie
Calcite | Level 5

Thank you for your reply!

 

Yes, the first two models give exactly the same results, but the third one does not, and my data look exactly like what you described.

 

Sorry, I believe the GLMM comes from the GLM, so their residuals should be similar, and I only know three kinds of residuals for a binomial GLM: Pearson residuals, deviance residuals, and standardized residuals...

 

What are the obs' residual and the strata residual that you mean?

 

By obs' residual, do you mean:

resid = y_{j,t} - \hat{y}_{j,t}  ?

 

And how about the strata residual?

 

Sorry, I am still confused and need something a little more specific.

 

Hope to hear from you!

 

 

Ksharp
Super User

Actually, I am not an expert on this.

I really suggest you read the documentation for PROC GLIMMIX or PROC MIXED.

 

A GLM assumes the residuals (actual value - predicted value) are uncorrelated,

which leads to a covariance structure of I, the identity matrix,

i.e.

1 0 0

0 1 0

0 0 1

 

 

while a GLMM allows the residuals to be correlated,

which leads to a covariance structure with nonzero off-diagonal elements, e.g.

v v12 v13

v21 v v23

v31 v32 v

Ksharp
Super User

"And how about the strata residual?"

It is usually for repeated measures, just like your third program.
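
The "residual" that an R-side structure acts on can be made concrete through the linearization behind pseudo-likelihood estimation, as described in the PROC GLIMMIX documentation. A sketch, where tildes mark values at the current estimates, \mathbf{A} is the diagonal matrix of variance functions \mu(1-\mu), \mathbf{R} is the R-side matrix (AR(1) within PID in the third program), and the vector notation is an assumption for illustration:

```latex
\begin{aligned}
\mathbf{P} &= \tilde{\boldsymbol{\eta}} + \tilde{\boldsymbol{\Delta}}^{-1}\bigl(\mathbf{Y} - \tilde{\boldsymbol{\mu}}\bigr), \qquad
\tilde{\boldsymbol{\Delta}} = \operatorname{diag}\Bigl\{\frac{\partial \mu}{\partial \eta}\Bigr\}\Big|_{\tilde{\boldsymbol{\eta}}}, \\
\mathbf{P} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\varepsilon}, \qquad
\operatorname{Var}[\boldsymbol{\varepsilon}] =
\tilde{\boldsymbol{\Delta}}^{-1}\, \mathbf{A}^{1/2}\, \mathbf{R}\, \mathbf{A}^{1/2}\, \tilde{\boldsymbol{\Delta}}^{-1}.
\end{aligned}
```

For binary data with the logit link, \partial\mu/\partial\eta = \mu(1-\mu), so the variance reduces to \mathbf{A}^{-1/2}\mathbf{R}\,\mathbf{A}^{-1/2}. In this view the residual is not the raw difference y_{j,t} - \hat{y}_{j,t} but the error of the linearized pseudo-response, roughly (y_{j,t} - \tilde{\mu}_{j,t}) / \{\tilde{\mu}_{j,t}(1-\tilde{\mu}_{j,t})\}, and the R-side defines how those errors covary.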

renjie
Calcite | Level 5

Thank you again for your reply; I do appreciate it. Unfortunately, I think the problem is still not solved.

 

I went through the documentation before but could not find it.

 

Maybe you are right that the documentation already covers it and I just missed it.

 

If so, I hope someone can point to the relevant chapter of the documentation and quote the sentence here, so that I can see the mathematical formula for the residuals. I think that would make it clear.

 

Thanks for your help and your patience.

Ksharp
Super User

Here are the pictures I took from "SAS.Publishing.SAS.for.Mixed.Models.2nd.Edition.Mar.2006".

 

x1.png

x2.png

x3.png

x4.png
