Re: How to conduct PROC MIXED

Charlotte22 · Posted 06-15-2019 12:38 PM

For an analytical epidemiology course I have to run a PROC MIXED analysis to see whether cholesterol level changes over time. The cholesterol level is measured at three time points: 1960, 1965 and 1970. We got some tips about how to solve this; I've copied that text below. The dataset is also added as an attachment.

I've made the following syntax:

PROC MIXED DATA=ELEARN.zutphen_broad; 
   CLASS RN; 
   MODEL TOTCHOL1 TOTCHOL2 TOTCHOL3 = RN /Solution; 
   RANDOM intercept/ Subject=RN; 
RUN;

Which gives me the error:

NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

NOTE: The SAS System stopped processing this step because of errors.
298  PROC MIXED DATA=ELEARN.zutphen_broad;
299     CLASS RN;
300     MODEL TOTCHOL1 TOTCHOL2 TOTCHOL3 = RN /Solution;
                       --------
                       73
                       202
ERROR 73-322: Expecting an =.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
301     RANDOM intercept/ Subject=RN;
302  RUN;

What am I doing wrong? I'm not skilled enough with SAS to find out how to solve this. The hints also cover the later questions; right now I'm stuck at question A (with hint A), which just asks for the regression coefficient and the p-value for the change from 1960 to 1970.

a) You can use PROC MIXED to conduct an analysis with a random intercept model.

PROC MIXED DATA=;

CLASS <subject identifier and categorical variables (if any)>;

MODEL <dependent variables> = <independent variables> /Solution;

RANDOM intercept/ Subject= <subject identifier>;

RUN;

In the first part of the assignment, the independent variable is time as a continuous variable.

b) In the output, you can find the slope of the regression under Estimate in the table “Solution for Fixed Effects”.

e) Add TIME(ref=FIRST) to the CLASS statement in order to model TIME as a categorical covariate. (ref=FIRST) makes TIME=0 the reference class.
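
As a rough sketch of hint (e): assuming a long-format data set (hypothetically named chol_long here) with one row per person per measurement and the variables RN, totchol and TIME, where TIME=0 is the first measurement, the categorical-time model might look like this:

```sas
/* Sketch only: chol_long is a hypothetical long-format data set
   with variables RN (subject), totchol (response) and time. */
proc mixed data=chol_long;
    class RN time(ref=first);          /* time is now categorical, lowest level is the reference */
    model totchol = time / solution;
    random intercept / subject=RN;     /* random intercept per subject */
run;
```

With this coding, the fixed-effects solution should show one estimate per non-reference time level, each expressed relative to TIME=0.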

f) Look at the regression results and use your common sense.

g) You can add a quadratic term of time by adding TIME*TIME as an independent variable.

If you do so, the (linear) term TIME should ALSO be in the model. Otherwise, TIME*TIME will pick up both the linear and the quadratic effect.

Testing whether this term is statistically significant, in a correct way, however, is not trivial.

The p-value given in the table “Solution for Fixed Effects” might be too high because TIME and TIME*TIME are strongly correlated.
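
A minimal sketch of the quadratic model from hint (g), again assuming a hypothetical long-format data set chol_long with variables RN, totchol and a continuous TIME (so TIME is deliberately left out of the CLASS statement):

```sas
/* Sketch only: chol_long is a hypothetical long-format data set. */
proc mixed data=chol_long;
    class RN;                                    /* time is NOT listed, so it stays continuous */
    model totchol = time time*time / solution;   /* linear and quadratic term together */
    random intercept / subject=RN;
run;
```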

h) The correct way of testing is a likelihood ratio test, but in SAS you need to perform this test “by hand”. This means running the model with

totchol = TIME,

and also the model with

totchol=TIME TIME*TIME ,

USING MAXIMUM LIKELIHOOD, that is, using

PROC MIXED DATA=... method=ML ;

Find the log likelihoods of the two models in the output, and use these in the next piece of SAS code:

data LRT;

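ll1 = 117xx.x; ** fill in the -2 Log Likelihood value (Fit Statistics table) from model TIME;

ll2 = 117xx.x; ** fill in the -2 Log Likelihood value (Fit Statistics table) from model TIME TIME*TIME;

chi = ll1-ll2; ** difference of the two -2 log likelihoods is the likelihood ratio statistic;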

p = 1-probchi(chi,1);

run;
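
Instead of copying the numbers by hand, the same test could be sketched with ODS OUTPUT. This is only an illustration under assumptions: chol_long is a hypothetical long-format data set with variables RN, totchol and time; the data set names fit_lin, fit_quad and lrt are made up; and the Fit Statistics row is assumed to be labelled '-2 Log Likelihood', as it is when METHOD=ML is used.

```sas
/* Model 1: linear time, fitted by maximum likelihood */
proc mixed data=chol_long method=ml;
    class RN;
    model totchol = time / solution;
    random intercept / subject=RN;
    ods output FitStatistics=fit_lin;     /* save the Fit Statistics table */
run;

/* Model 2: linear plus quadratic time, also by maximum likelihood */
proc mixed data=chol_long method=ml;
    class RN;
    model totchol = time time*time / solution;
    random intercept / subject=RN;
    ods output FitStatistics=fit_quad;
run;

/* Likelihood ratio test with 1 degree of freedom (one extra parameter) */
data lrt;
    set fit_lin (where=(Descr='-2 Log Likelihood') rename=(Value=m2ll_lin));
    set fit_quad(where=(Descr='-2 Log Likelihood') rename=(Value=m2ll_quad));
    chi = m2ll_lin - m2ll_quad;           /* difference in -2 log likelihood */
    p   = 1 - probchi(chi, 1);
run;

proc print data=lrt;
    var m2ll_lin m2ll_quad chi p;
run;
```

Both models must use METHOD=ML here, because the default REML likelihoods are not comparable between models with different fixed effects.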

PaigeMiller · Posted 06-15-2019 12:56 PM

Only one dependent variable is allowed in your PROC MIXED MODEL statement.

You might want to re-arrange the input data so that the three time periods are denoted by a variable (TIMEPERIOD=1 or TIMEPERIOD=2 or TIMEPERIOD=3), and then you have a single response named CHOLESTEROL, repeated measures on the variable TIMEPERIOD.

Examples are discussed here: https://documentation.sas.com/?docsetId=statug&docsetVersion=14.3&docsetTarget=statug_mixed_examples...

--
Paige Miller

Charlotte22 · Posted 06-15-2019 01:03 PM

I thought that could be the case indeed, but how should I take all
three time points into account then? And isn't it strange then that the
hint's example syntax says "variables"?

Thanks for your help! I'll just keep on trying.

PaigeMiller · Posted 06-15-2019 02:55 PM

data re_arrange;
    set ELEARN.zutphen_broad;
    /* write one output row per person per measurement (wide to long) */
    cholesterol = totchol1;
    timeperiod = 1;
    output;
    cholesterol = totchol2;
    timeperiod = 2;
    output;
    cholesterol = totchol3;
    timeperiod = 3;
    output;
run;

--
Paige Miller
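
For question A, the rearrangement and the random intercept model from hint (a) could be sketched together as below. This is a hedged illustration, not the course's official solution: the data set name chol_long and the variables totchol and time are assumptions (chosen to match the names in the hints), and coding time as years since 1960 (0, 5, 10) is one possible choice among several.

```sas
/* Wide to long: one row per person (RN) per measurement */
data chol_long;
    set ELEARN.zutphen_broad;
    array chol{3} totchol1-totchol3;
    do i = 1 to 3;
        totchol = chol{i};
        time = (i - 1) * 5;        /* assumption: years since 1960 */
        output;
    end;
    keep RN totchol time;
run;

/* Random intercept model with time as a continuous covariate */
proc mixed data=chol_long;
    class RN;
    model totchol = time / solution;
    random intercept / subject=RN;
run;
```

With this coding, the estimate for time is the average change in cholesterol per year; multiplying it by 10 gives the estimated change from 1960 to 1970.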

Rick_SAS · Posted 06-17-2019 09:29 AM

Linear regression models (ANOVA, REG, GLM, PLS) permit multiple variables on the left-hand side of the model statement. Generalized linear models (LOGISTIC, GENMOD) and mixed models (FMM, MIXED, GLIMMIX) only support a single variable on the left-hand side.
