emaneman
Pyrite | Level 9

Dear all,

I am trying to replicate an analysis that another researcher carried out in R, and I am lost. I could of course read up on R, but maybe somebody here already has the knowledge and can help me speed things up. The R syntax is below.

correct_recognized is the DV (dichotomous)

cent_trans_ART and emotion_type are the predictors of interest, and I see that their main effects and interaction are included as fixed effects. Then there is a set of random effects (the two parenthesized terms at the end of the formula) that I am not sure how to reproduce in PROC MIXED.

Any help would be most appreciated.

Thank you in advance.

 

Eman

  

```{r}
m = glmer(correct_recognized ~ 1 +
cent_trans_ART +
emotion_type +
cent_trans_ART:emotion_type +
(1 + cent_trans_ART|target_video_num) +
(1 + emotion_type|participant_id),
data = d,
family = binomial)
summary(m)
Anova(m, type = 3)
```

Rick_SAS
SAS Super FREQ

See the article "Hierarchical linear models and lmer". Briefly, both lmer() and glmer() use a vertical bar to specify random effects. A notation such as
(1 + x | subject) 

is a way to specify a random effect where each subject gets its own random intercept and slope.
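
Written out for the model in the question, the formula corresponds roughly to the equations below. This is only a sketch: the symbols are my own notation, with $i$ indexing participants, $j$ indexing videos, $\mathrm{ART}_i$ standing for cent_trans_ART, and $E_j$ standing for whatever numeric coding R applies to emotion_type.

```latex
\begin{aligned}
\operatorname{logit}\Pr(y_{ij}=1) &= \beta_0 + \beta_1\,\mathrm{ART}_i + \beta_2\,E_j + \beta_3\,\mathrm{ART}_i E_j \\
&\quad + (v_{0j} + v_{1j}\,\mathrm{ART}_i) + (u_{0i} + u_{1i}\,E_j), \\
(v_{0j}, v_{1j})^\top &\sim N(\mathbf{0}, \Sigma_{\text{video}}), \qquad
(u_{0i}, u_{1i})^\top \sim N(\mathbf{0}, \Sigma_{\text{participant}})
\end{aligned}
```

Each $\Sigma$ is an unstructured 2x2 covariance matrix, so every grouping factor contributes an intercept variance, a slope variance, and their covariance.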

 

emaneman
Pyrite | Level 9

Thank you, Rick, as always

SteveDenham
Jade | Level 19

Also, you'll want to use PROC GLIMMIX so that you can model the dichotomous response variable.  I follow the R mailing list, but I don't do enough in R to be really helpful.  However, a short description of the design used to collect the data would go a long way to coming up with GLIMMIX code.

 

SteveDenham

emaneman
Pyrite | Level 9

hello Steve,

 

You're right to ask for more clarity on the design. It is a cognitive science study in which 400 participants watched 42 very short videos, each depicting an emotion, and their task was to indicate what the emotion was. The dichotomous criterion is the variable correct_recognized, coded -1 vs 1. The emotions vary and are categorized a priori as simple or complex; this is the repeated-measures factor represented by the variable emotion_type. The other predictor is the continuous variable cent_trans_ART. In addition to their main effects and interaction, the researchers added random factors for target_video_num (which is categorical and has 42 different values) and participant_id (also categorical, of course).

The researchers themselves describe the model tested in the article by saying

 

"Accuracy on each trial of the GERT-S (0 = incorrect, 1 = correct) was regressed on the emotion type presented in the video (0.5 = simple, 0.5 = complex), participant ART score (square rooted and mean centered), and the interaction between the two. By-item and by-participant random intercepts as well as a by-item random slope for participant ART score and a by-participant random slope for emotion type were included in the reported model."

 

Since I am doing additional analyses on the dataset that they kindly shared with me, I wanted to first replicate their findings, but I am a SAS user and not very familiar with random effects in general. Hence my post.

 

Thank you for your time.

 

Eman

 

 

SteveDenham
Jade | Level 19

Hi Eman, I followed it all and thought I could write code, but I ran into an issue with emotion_type as a repeated measure. I know it is repeated within a subject, but is there any ordering to the presentation? If so, is it identical for all subjects? Judging from the R model, I would say that there is no ordering, or else there would be an additional R-side factor covering the temporal effect. So, in the R model emotion_type is a G-side effect with a two-level unstructured covariance structure. I might be tempted to see whether the simpler compound symmetry structure improved the fit, based on corrected AIC. So here is my proposed code:

 

proc glimmix data=d method=laplace;
class emotion_type target_video_num participant_id;
model correct_recognized = emotion_type cent_trans_ART emotion_type*cent_trans_ART/dist=binary;
random emotion_type/subject=participant_id type=chol;
/* and here is the hard part, fitting a continuous variable as a random effect */
random cent_trans_ART/subject=target_video_num;
/* insert any lsmeans (with oddsratio), estimate, or lsmestimate statements here, or use a STORE statement */
run;

This fits an unequal slopes/unequal intercept model for cent_trans_ART.  A reduced model may be appropriate if the emotion_type by cent_trans_ART interaction is not significant.  Oh yeah, no guarantees that this code will converge, etc.

 

SteveDenham

 

emaneman
Pyrite | Level 9

Dear Steve,

 

That is brilliant, thank you for this!

 

Emotion_type, with values "complex" or "simple", depends on which of the 14 specific emotions was displayed in each of the 42 videos: some of the emotions are considered complex, while others are considered simple.

Each of the 14 emotions was presented in three different videos, and the order of presentation of the 42 videos was randomized, hence (in principle) different for each participant.

 

I ran the syntax you sent on the dataset, which I attach here (exportx). I also attach the article. The description of the model is in Experiment 1, but the data that I am working on are those of Experiment 2.

 

A few things:

 

1. This PROC GLIMMIX takes 30 minutes to run in my SAS Studio!

 

2. I get an interaction between the fixed factors (emotion_type and cent_trans_ART), while the authors report not finding an interaction:

"Participants were not better at recognizing one type of emotion (simple/complex) over another, as indicated by similar rates of recognition for simple and complex emotions and a non-significant effect of emotion type, X2(1) = 0.49, p = .48. Furthermore, participants who had higher ART scores were not better at recognizing emotions, overall, as indicated by a non-significant effect of ART score, X2(1) = 2.34, p = .13. However, the interaction between the emotion type and ART score was significant, b = 0.18, X2(1) = 4.24, p < .05, indicating that participants who had higher ART scores were better able to recognize complex emotions."

 

3. From what they write in the paper (which I quoted earlier), it looks like they included four random effects, not two. This might be the reason for the different results.

 

 

emaneman
Pyrite | Level 9
In the dataset above, I forgot that 13 participants should be excluded, as per the article.
This is the syntax to exclude them:

if participant_id in ("R_1OPSLlJlqe3kiC4", "R_1LGxs9MJDOpC4cW",
"R_1onI3J1yaENBuV5", "R_2cjUAsUP9KOJMuf", "R_ahE2LFATHw6jfxv",
"R_3PhnubUyhv5TJwC", "R_2le1UgRGD0H6B6p", "R_2trjDfHePcgEJtO",
"R_1i4HIHYXq68MaPG", "R_1n7xJ3q8CHDfXpo", "R_22F9xpFhoQWmYjZ",
"R_1PbpFkFxoK1xFk7", "R_wRg4AXfUfegZ8tz") then delete;
SteveDenham
Jade | Level 19

In point 2, you have a contradiction.  The interaction they found was significant.

 

You can get the four effects they report by adding intercept to each of the RANDOM statements:

 

proc glimmix data=d method=laplace;
class emotion_type target_video_num participant_id;
model correct_recognized = emotion_type cent_trans_ART emotion_type*cent_trans_ART/dist=binary;
random intercept emotion_type/subject=participant_id type=chol;
/* and here is the hard part, fitting a continuous variable as a random effect */
random intercept cent_trans_ART/subject=target_video_num;
/* insert any lsmeans (with oddsratio), estimate, or lsmestimate statements here, or use a STORE statement */
run;

To get the cent_trans_ART variance components, you may have to create an identical variable (say cent_trans_ART_cat =cent_trans_ART) and include it in the class statement and in the second RANDOM statement.  That is going to push the execution time up immensely and may not converge, even in geological time.  There are other ways, but they aren't going to give you identical output.
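
For comparison, here is a sketch of a specification that tries to stay closer to the glmer call. It is not tested on this dataset and rests on several assumptions: emotion_num and d2 are hypothetical names, emotion_type is assumed to be a character variable with the values "simple" and "complex", and the -0.5/0.5 coding is a guess at what the authors used. Because emotion_num is numeric and not in the CLASS statement, each RANDOM statement yields one intercept and one slope with a 2x2 unstructured covariance, which is what (1 + x | group) requests in R. The event='1' response option asks GLIMMIX to model the probability of a correct answer, since by default it models the first ordered level of a binary response (here -1).

```sas
/* Sketch only: emotion_num and d2 are hypothetical names.           */
/* Assumes emotion_type is character with values simple / complex.   */
data d2;
   set d;
   if emotion_type = "simple" then emotion_num = -0.5;
   else if emotion_type = "complex" then emotion_num = 0.5;
run;

proc glimmix data=d2 method=laplace;
   class target_video_num participant_id;
   /* event='1' models P(correct) when the response is coded -1 / 1; */
   /* chisq adds chi-square tests alongside the default F tests      */
   model correct_recognized(event='1') = cent_trans_ART emotion_num
         cent_trans_ART*emotion_num / dist=binary link=logit solution chisq;
   /* counterpart of (1 + emotion_type | participant_id) */
   random intercept emotion_num / subject=participant_id type=chol;
   /* counterpart of (1 + cent_trans_ART | target_video_num) */
   random intercept cent_trans_ART / subject=target_video_num type=chol;
run;
```

The chi-square tests printed by GLIMMIX need not match the Anova(m, type = 3) output from R exactly, and as with the code above there is no guarantee of convergence.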

 

SteveDenham

 

emaneman
Pyrite | Level 9

Thank you Steve.

It is not a contradiction, because they find the interaction effect in experiment 1, but not in experiment 2, which is the experiment for which I have the data.  But that is a detail. I appreciate your further suggestion!

Eman
