ROBIZ7
Fluorite | Level 6

I have measured 4 dependent variables on each individual: 2 are normally distributed and 2 follow a negative binomial distribution.

I would like to fit a multivariate model in order to gain more power in the analysis. The dataset has one column for the responses (COL1) and one column for the type of distribution (dist).
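
For readers unfamiliar with this layout, a tall dataset of this kind could be built from a wide one roughly as sketched below. This is only an illustration under assumptions: the wide dataset `project.wide`, the response names `y1`-`y4`, the split into two normal and two count responses, and the extra identifier `resp_id` are all hypothetical and are not taken from the actual data.

```sas
/* Sketch only: project.wide, y1-y4 and resp_id are hypothetical names. */
data project.tall;
   set project.wide;            /* assumed: one row per individual and time point */
   length dist $ 8;
   array resp{4} y1-y4;         /* assumed: y1, y2 normal; y3, y4 counts */
   do k = 1 to 4;
      COL1    = resp{k};        /* stacked response value */
      resp_id = k;              /* which of the four responses this row holds */
      if k <= 2 then dist = 'Normal';
      else dist = 'NegBin';     /* distribution name read by DIST=BYOBS(dist) */
      output;
   end;
   drop k y1-y4;
run;
```

Each input row becomes four output rows, one per response, with `dist` naming the distribution of that row. A two-level `dist` alone does not distinguish the two responses that share a distribution, which is why the sketch also keeps `resp_id`.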

I looked at the example in the SAS documentation, so I wrote:

proc glimmix data=project.tall;
class dist tieger_id;
model COL1 = dist dist*FFWeekc / noint s dist=byobs(dist);
random intercept / subject=tieger_id;
run;

but I get the following error:

ERROR: Use the GROUP=DIST option in the RANDOM _RESIDUAL_ statement to accommodate the scale
parameter from each distribution that you specify in the BYOBS=DIST option.

So I tried:

proc glimmix data=project.tall;
class dist tieger_id;
model COL1 = dist dist*FFWeekc / s dist=byobs(dist);
random _residual_ / subject=tieger_id type=chol;
run;

WARNING: The R matrix depends on observation order within subjects. Omitting observations from
the analysis because of missing values can affect this matrix. Consider using a
classification effect in the RANDOM _RESIDUAL_ statement to determine ordering in the
R matrix.
ERROR: Use the GROUP=DIST option in the RANDOM _RESIDUAL_ statement to accommodate the scale
parameter from each distribution that you specify in the BYOBS=DIST option.

and finally:

proc glimmix data=project.tall;
class dist tieger_id;
model COL1 = dist dist*FFWeekc / s dist=byobs(dist);
random _residual_ / subject=tieger_id type=chol group=dist;
run;

and I get the same error message.

Without a RANDOM statement I do get results, but since that model does not take into account the correlation between the responses within each individual, does it still correspond to a multivariate model? And if I do not use "dist" in the model, what would the interpretation be?

Thank you very much

3 REPLIES 3
PaigeMiller
Diamond | Level 26

There is no multivariate version of GLIMMIX that takes into account the correlations among the dependent variables.

The closest you could come to a multivariate model in your case would be something like PROC PLS, which can handle multiple dependent variables and make use of the correlation between them. However, PLS does not make use of any distributional information (normal or negative binomial), and any "hypothesis test" would be based on empirical methods such as cross-validation. Also, PLS does not have a concept of random versus fixed factors; they're all fixed ...

--
Paige Miller
PaigeMiller
Diamond | Level 26

Ooops, I forgot about generalized Partial Least Squares regression, which might be able to do everything you want, taking into account the random factors and the different distributions. I don't know for sure, though; it's been a really long time since I read the paper.

So, you can read the paper and see whether it helps in your situation:

https://cedric.cnam.fr/fichiers/RC906.pdf

--
Paige Miller
ROBIZ7
Fluorite | Level 6

Thanks PaigeMiller, but I think PROC GLIMMIX can also deal with multivariate responses with different distributions (see https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_glimmix_sec...).

The problem is that I do not understand how it works, or why I get an error when I use a RANDOM statement.