Re: R-side repeated measures PROC GLIMMIX logistic model setup

renjie · Posted 04-20-2017 02:01 PM

Basically, I have questions about the model setup in three different "proc glimmix" programs.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Data description:

The dataset "mydata" has one observation per subject per day: 24 subjects (PID=1,2,3...24) by 44 days (Day=1,2,3...44), with only 5 columns (Missed, PID, Day, Q, Z). The data are complete and balanced (every subject has days 1,2,3...44). "Missed" is the binary (0/1) response, and only "Q" and "Z" are continuous. "Z" is subject-level data, which means it takes the same value on every day for a given subject (e.g. "-1.05643" on all days for "PID 1").

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

My problems:

(1)

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
	random _residual_/ subject= pid(day) type=ar(1);
run;

(2)

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
	random _residual_/ subject= day(pid) type=ar(1);
run;

(3)

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
	random day / subject= pid type=ar(1) residual;
run;

I want to know the mathematical formulas of the models that the above three programs are actually fitting.

Basically, I am not sure what "R-side" means in the mathematical formulation of a non-Gaussian GLMM.

BTW, all three programs run. (1) and (2) do not report a "Standard Error" for "AR(1)" in the "Covariance Parameter Estimates" table, and (1) and (2) give the same results as each other, while (3) does not. So I have become more confused...

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

For example, take the code below, which has no R-side effect:

proc glimmix data=mydata pconv=1e-4;
	class PID Day;
	model Missed=Q Z/s dist=binomial link=logit;
	random Q/subject=pid;
run;

Then its mathematical model should be as follows (Y_{j,t} here is the value of Missed):
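A sketch of the model this program specifies, written in the usual GLMM notation. The symbols \beta_0, \beta_1, \beta_2, b_j, \sigma_b^2 and \pi_{j,t} are assumed names, not taken from the post; j indexes PID and t indexes Day, and \pi_{j,t} is the probability of whichever level of Missed the procedure treats as the event. Note that "random Q/subject=pid" gives a subject-specific slope on Q and no random intercept.

```latex
\begin{aligned}
Y_{j,t} \mid b_j &\sim \mathrm{Bernoulli}(\pi_{j,t}), \\
\operatorname{logit}(\pi_{j,t}) &= \beta_0 + \beta_1 Q_{j,t} + \beta_2 Z_j + b_j\, Q_{j,t}, \\
b_j &\sim N(0, \sigma_b^2) \quad \text{independently for } j = 1,\dots,24 .
\end{aligned}
```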

On the other hand, I know the model for repeated measures when it is an LMM (Gaussian):

proc glimmix data=mydata1;
	class PID Day;
	model yjt=Q Z/ s dist=normal;
	random q/subject=pid;
	random day / subject= pid type=ar(1) residual;
run;
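A sketch of the Gaussian repeated-measures model that this program specifies. The symbols \beta_0, \beta_1, \beta_2, b_j, \sigma_b^2, \sigma^2 and \rho are assumed names, not taken from the post; j indexes PID and t indexes Day, and the AR(1) block is taken over the 44 days within each subject.

```latex
\begin{aligned}
y_{j,t} &= \beta_0 + \beta_1 Q_{j,t} + \beta_2 Z_j + b_j\, Q_{j,t} + e_{j,t}, \\
b_j &\sim N(0, \sigma_b^2), \\
\mathbf{e}_j = (e_{j,1},\dots,e_{j,44})^\top &\sim N(\mathbf{0}, \mathbf{R}_j), \qquad
[\mathbf{R}_j]_{t,s} = \sigma^2 \rho^{|t-s|} .
\end{aligned}
```

Here the R-side term is literally the covariance matrix of the additive errors e_{j,t}, which is what makes the Gaussian case easy to write down.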

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

I have been struggling with this for a while and cannot find any answers to these kinds of questions.

The questions may be short, but I have attached all the information to avoid any confusion.

I am really looking forward to someone helping me and would appreciate it very much.

Thank you.

Ksharp · Posted 04-20-2017 11:42 PM

An R-side random effect means the residual term is treated as a random effect

(i.e. every observation's residual is correlated for (1) and (2); every stratum's residual is correlated for (3)).

I think the first and second programs should be the same thing, since

subject= pid(day)

is the same as

subject= day(pid)

They both produce the same R-side covariance structure.

The third program is another type of R-side random effect (stratum residual), which means it has a different covariance structure.
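One way to see the difference is through the SUBJECT= blocks of the R-side matrix. The sketch below assumes one observation per PID-by-Day cell, as in the data description; \phi and \rho are assumed names for the AR(1) scale and correlation parameters. Under that assumption, a subject defined as pid(day) or day(pid) contains a single observation, so its AR(1) block is 1 x 1 and \rho never enters, which would be consistent with the missing standard error for AR(1) reported for (1) and (2).

```latex
\begin{aligned}
\text{(1), (2):}\quad & \mathbf{R} = \operatorname{blockdiag}(
  \underbrace{\phi,\ \phi,\ \dots,\ \phi}_{24 \times 44 \text{ blocks of size } 1})
  = \phi\, \mathbf{I}_{1056}, \\
\text{(3):}\quad & \mathbf{R} = \operatorname{blockdiag}(\mathbf{R}_1,\dots,\mathbf{R}_{24}), \qquad
  [\mathbf{R}_j]_{t,s} = \phi\, \rho^{|t-s|}, \quad t,s = 1,\dots,44 .
\end{aligned}
```

Read this way, only (3) ties together the 44 days within a subject; (1) and (2) would reduce to a single scale parameter.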

e.g. your data should look like

pid day
1   1
1   2
1   3
1   4
2   1
2   2
.

not like

pid day
1   1
1   2
1   3
1   4
2   5
2   6
........

renjie · Posted 04-21-2017 12:13 AM

Thank you for your reply!

Yes, the first two model fits give exactly the same results, but the third one does not, and my data look exactly like what you described.

Sorry, I believe a GLMM is built on a GLM, with similar residuals, and I only know three kinds of residuals for a binomial GLM: Pearson residuals, deviance residuals and standardized residuals...

What are the obs' residual and the strata residual you mean?

Is the obs' residual you mean:

resid = y_{j,t} - \hat{y}_{j,t} ?

And how about the strata residual?

Sorry, I am still confused and need something a little more specific.

Hope to hear from you!

Ksharp · Posted 04-21-2017 12:28 AM

Actually, I am not an expert on this.

I really suggest you read the documentation for PROC GLIMMIX or PROC MIXED.

GLM assumes the residuals (predicted value - actual value) are not correlated,

which leads to a covariance structure of I, the identity matrix,

i.e.

1 0 0
0 1 0
0 0 1

while GLMM assumes the residuals are correlated,

which leads to a covariance structure of v*N(mu,sigma)

v   v12 v13
v21 v   v23
v31 v32 v
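For a non-Gaussian response, the PROC GLIMMIX documentation writes the conditional variance with the R-side matrix sandwiched between variance-function terms. A sketch for the binary case follows; the symbol names are assumptions, with \mu_{j,t} the conditional mean for subject j on day t and \mathbf{b} collecting the G-side random effects.

```latex
\begin{aligned}
\operatorname{Var}(\mathbf{Y} \mid \mathbf{b}) &= \mathbf{A}^{1/2}\, \mathbf{R}\, \mathbf{A}^{1/2}, \\
\mathbf{A} &= \operatorname{diag}\{\mu_{j,t}(1-\mu_{j,t})\}, \qquad
\mu_{j,t} = E(Y_{j,t} \mid \mathbf{b}) .
\end{aligned}
```

With \mathbf{R} = \mathbf{I} this reduces to the ordinary Bernoulli variance \mu(1-\mu) for each observation; a non-identity \mathbf{R} rescales or correlates the observations on top of that variance function.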

Ksharp · Posted 04-21-2017 12:32 AM

"And how about the strata residual?"

It is usually for repeated measures, just like your third code.

renjie · Posted 04-21-2017 06:49 AM

Thank you again for your reply; I appreciate it. Unfortunately, I think the problem is still not solved.

I went through the documentation before but could not find it.

Maybe you are right that the documentation already covers it and I just missed it.

If so, I hope someone can point to the chapter in the documentation and quote the sentence here, so that I can see the mathematical formula for the residuals. I think that would make it clear.

Thanks for your help and your patience.

Ksharp · Posted 04-21-2017 09:34 AM

Here is the picture I took from "SAS.Publishing.SAS.for.Mixed.Models.2nd.Edition.Mar.2006".