mariebee
Calcite | Level 5

I'm new to SAS, and I am a bit stuck.

 

Roughly here is the design:

60 participants each completed the same comprehension test twice (scored 0-6).

 

Once after reading each text in two different Formats (i.e., Single and Control). 

When they read the different Formats, two topics were presented so participants didn't read the same topic twice. One was based on Animations and the other was based on People. 

(Note: order was counterbalanced and is not a variable of interest).

 

I want to look at the main effects (ME) of Format and of Topic, and at the Format*Topic interaction.

 

Data is set up like this:

ID  Order  Ani1  Single1  Score
 1    1     1      1       5
 6    1     1      1       6
 7    0     1      1       6
 8    0     1      1       2
---
 1    1     0      0       6
 6    1     0      0       6
 7    0     0      0       3
 8    0     0      0       4
---
 5    0     0      1       1
 9    1     0      1       2
 3    1     0      1       3
 4    1     0      1       2
---
 5    0     1      0       4
 9    1     1      0       3
 3    1     1      0       5
 4    1     1      0       2

 

My code looks like this:

 

PROC MIXED DATA=x;

MODEL score = Ani1|Single1 / S DDFM=kr;

REPEATED / SUBJECT = ID TYPE=un; run;

 

However, I am not confident it is correct, since I get the SAME results (below) with and without the Repeated/Subject line in the code. Am I doing something wrong?

 

Solution for Fixed Effects

Effect         Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept       3.8824       0.2745        114    14.14     <.0001
Ani1            1.0776       0.4217        114     2.56     0.0119
Single1        -0.8424       0.4217        114    -2.00     0.0482
Ani1*Single1    0.1471       0.5964        114     0.25     0.8057

 

 

SteveDenham
Jade | Level 19

I think the similarity is due to treating your independent factors as continuous variables (essentially a regression).  As a result, the REPEATED statement doesn't really accomplish what you want to do.  So, perhaps this would help:

 

PROC MIXED DATA=x;
CLASS Ani1 Single1 Order ID;
MODEL score = Ani1|Single1|Order / S DDFM=kr;
REPEATED Order/ SUBJECT = ID TYPE=un;
LSMEANS Ani1 Single1 Ani1*Single1;
 run;

This assumes that Order indexes the order in which a subject gets either Control or Single.  There should be exactly one record for each ID for each combination of Ani1, Single1 and Order.  Order is left out of the LSMEANS statement, so the marginal means are averages over order.  I included Order and all of its interactions in the MODEL statement because it is a design element that should be accommodated, whether you are interested in the means by order or not.  It may turn out that there is a significant order effect or order interaction, which may influence how you interpret the results.
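
One way to check that assumption before fitting is sketched below. It lists any ID whose records do not all carry different values of the order variable. The data set name x and the variable name Order1 are assumptions taken from the code and log elsewhere in this thread, so substitute your own names. In the sample rows posted above, each ID appears to carry the same Order value on both of its records, which is the situation this query would flag.

```sas
/* Sketch: list subjects whose records share an order value.        */
/* Assumes data set x and variable Order1 (names may differ).       */
PROC SQL;
  SELECT ID,
         COUNT(*)               AS n_records,
         COUNT(DISTINCT Order1) AS n_order_levels
  FROM x
  GROUP BY ID
  /* keep only IDs where some order value is repeated within the ID */
  HAVING COUNT(DISTINCT Order1) < COUNT(*);
QUIT;
```

If the query returns no rows, every subject has a distinct order value on each record, which is what REPEATED Order1 / SUBJECT=ID expects.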

 

SteveDenham

 

mariebee
Calcite | Level 5

Thank you, @SteveDenham 

 

The Class line makes sense. And I agree about Order. 

 

I ran the code, and get this in my log file: 

 

18 PROC MIXED DATA=xxx;
19 CLASS Ani1 Single1 Order1 ID;
20 MODEL score = Ani1|Single1|Order1 / S DDFM=kr;
21 REPEATED Order1/ SUBJECT = ID TYPE=un;
22 LSMEANS Ani1 Single1 Ani1*Single1;
23 run;

NOTE: An infinite likelihood is assumed in iteration 0 because of a nonpositive definite
estimated R matrix for ID 21.
NOTE: PROCEDURE MIXED used (Total process time):
real time 0.04 seconds
cpu time 0.01 seconds

 

The output reads like it should, except it doesn't produce results.

 

The SAS System

The Mixed Procedure
Model Information
Data Set                     xxx
Dependent Variable           Score
Covariance Structure         Unstructured
Subject Effect               ID
Estimation Method            REML
Residual Variance Method     None
Fixed Effects SE Method      Kenward-Roger
Degrees of Freedom Method    Kenward-Roger

Class Level Information
Class      Levels   Values
Ani1          2     0 1
Single1       2     0 1
Order1        2     0 1
ID           59     (blinded but accurate)

Dimensions
Covariance Parameters      3
Columns in X              27
Columns in Z               0
Subjects                  59
Max Obs per Subject        2

Number of Observations
Number of Observations Read        118
Number of Observations Used        118
Number of Observations Not Used      0

 

 

mariebee
Calcite | Level 5

@SteveDenham 

 

Also -- ID 21 is actually the first participant in the datafile, not one in the middle. 

SteveDenham
Jade | Level 19

This error comes about when we don't specify the subject correctly, such that there is more than one observation with identical X values for a given subject. In this case, I suspect that there is a set where the IDs are duplicated.  This would explain why the number of subjects=59 when you said that 60 were given the test.  So the first thing I would do is use PROC FREQ to get a full cross-tab of your data, and check that there is the expected number of entries for every combination of your X variables.  Since 118 is not evenly divisible by 8 (2 x 2 x 2 design), something is likely missing/miscoded.  
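
A minimal sketch of that check, assuming the data set is named x and the order variable is Order1 as in the log above. The output data set id_counts is a hypothetical name used only for illustration.

```sas
/* Sketch: cross-tab of the design variables plus a per-ID record count. */
/* Assumes data set x with variables Ani1, Single1, Order1 and ID.       */
PROC FREQ DATA=x;
  /* one row per observed combination of the three design variables */
  TABLES Ani1*Single1*Order1 / LIST MISSING;
  /* save records per ID without printing the long ID table */
  TABLES ID / NOPRINT OUT=id_counts;
RUN;

/* Show only IDs that do not have exactly two records */
PROC PRINT DATA=id_counts;
  WHERE COUNT NE 2;
RUN;
```

The LIST option prints the three-way table as one line per combination, which makes an empty or unexpectedly small cell easy to spot.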

 

Then if the data all look correct, you could try changing the subject to subject=ID*single1.  From the sample data, this looks like it may remove the duplicate issue.

 

SteveDenham

mariebee
Calcite | Level 5

@SteveDenham Thanks and thanks for helping me work through this.

 

First, apologies for the confusion about sample size. I rounded in the first post for simplicity's sake. The actual n = 59.

 

The data appear to be coded correctly. Order was not evenly balanced. This is an analysis on a subsample of participants from a larger study, so while order was balanced across the full study, it is not within this sample of 59.

 

 

The FREQ Procedure
Ani1   Frequency   Percent   Cumulative Frequency   Cumulative Percent
0         59        50.00            59                   50.00
1         59        50.00           118                  100.00

Single1   Frequency   Percent   Cumulative Frequency   Cumulative Percent
0            59        50.00            59                   50.00
1            59        50.00           118                  100.00

Order1   Frequency   Percent   Cumulative Frequency   Cumulative Percent
0           56        47.46            56                   47.46
1           62        52.54           118                  100.00

 

ID table Omitted, but all IDs are present with frequency of 2

SteveDenham
Jade | Level 19

Try subject=ID*Order1, as that looks like the only place pseudo-duplicates could show up.  Otherwise, I think you will have to make the assumption that you mentioned before - that order has no effect, and remove it from the model.  It may be as simple as adding a CLASS statement to your original PROC MIXED code.
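
A sketch of what that last suggestion might look like: the original code from the first post with a CLASS statement and LSMEANS added, and Order left out. The PROC SORT step and the data set name x_sorted are my assumptions, not part of the original code. As I understand it, when no repeated effect is named, PROC MIXED relies on the order of records within each subject, so sorting puts every subject's two records in the same sequence (Control, then Single).

```sas
/* Sketch: original model with CLASS added and Order removed.        */
/* x_sorted is a hypothetical name; the sort is an added assumption. */
PROC SORT DATA=x OUT=x_sorted;
  BY ID Single1;
RUN;

PROC MIXED DATA=x_sorted;
  CLASS Ani1 Single1 ID;
  MODEL score = Ani1|Single1 / S DDFM=kr;
  /* no repeated effect named: position within subject comes from record order */
  REPEATED / SUBJECT=ID TYPE=un;
  LSMEANS Ani1 Single1 Ani1*Single1;
RUN;
```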

 

SteveDenham

mariebee
Calcite | Level 5
ID*Order1 still won't run, but ID*Ani1 does (I tried it before seeing your last reply).

However ...
1) I got this warning: Convergence criteria met but final Hessian is not positive definite.
2) Solution for Fixed Effects Table has many duplicate and blank rows.

Then, when I try to remove Order from the model altogether with this code, it won't run:

PROC MIXED DATA=x;
CLASS Ani1 Single1 ID;
MODEL score = Ani1|Single1 / S DDFM=kr;
REPEATED Order1/ SUBJECT = ID TYPE=un;
LSMEANS Ani1 Single1 Ani1*Single1;
run;
SteveDenham
Jade | Level 19

I would now suggest using a different optimizer, which means moving over to PROC GLIMMIX.

 

PROC GLIMMIX DATA=x;
NLOPTIONS maxiter=5000 tech=nmsimp;
CLASS Ani1 Single1 Order ID;
MODEL score = Ani1|Single1|Order / S DDFM=kr;
RANDOM Order/ SUBJECT = ID*Ani1 TYPE=un residual;
LSMEANS Ani1 Single1 Ani1*Single1;
 run;

See how this behaves.

 

SteveDenham

(I will be looking in occasionally over the holiday, but I am clocking out now.)

 

mariebee
Calcite | Level 5
Thank you! This runs nicely.
There are lots of blank rows in the Solutions for Fixed Effects table, but I believe that is by design, and that my interpretation should be based on the F-values in the Type III Tests of Fixed Effects Table.

I think that answers my questions for now. I want to keep this open while I dig through a bit more. Happy Thanksgiving. And thank you for your time!

mariebee
Calcite | Level 5
If you can help, I am having trouble understanding which output table to interpret. The three main tables yield different results:
1) Solutions for Fixed effects.
2) Type III Tests of Fixed Effects.
3) LSM Tables.
Any resource would be welcome.

Also, do you have any guidance on how to identify which factor was more powerful (Ani1 vs Single1) and how to report Cohen's d?
SteveDenham
Jade | Level 19

Starting at the top:

The solution vector is what is used to create the least squares means.  I find it useful if I need an estimate and standard error of a continuous covariate.  Otherwise, the latter two are more useful.

The Type 3 F tests are testing to see if at least one mean is different from all the others in that effect (main or interaction). This is the primary test of "significance" for an effect.

The LSM (least squares means) tells you what the expected values are for each of the levels of the effects.  Using the diff option allows you to test if one particular mean is "significantly" different from another.

 

I don't know what you mean by more powerful.  Do you mean which had a greater effect on the mean?  That is generally what Cohen's d is all about.  However, mixed models don't really lend themselves to calculating effect sizes. If you really want to look at something like it, add the /diff option to the LSMEANS statement. The results table should present the t values for each comparison.  Each t value is the ratio of the difference to the standard error of the difference.  Cohen's d is the ratio of the difference to a standard deviation (usually the pooled SD; using the reference group's SD gives Glass's delta), so the two should be analogous in direction.
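
As a sketch, here is the GLIMMIX code from earlier in the thread with the diff option added. The variable name Order1 follows the log posted above. The CL option and the ODS SELECT line are my additions, assuming you only want to see the Type III tests, the least squares means and their differences.

```sas
/* Sketch: earlier GLIMMIX model with pairwise differences requested. */
/* ODS SELECT limits the displayed output to the three named tables.  */
ODS SELECT Tests3 LSMeans Diffs;
PROC GLIMMIX DATA=x;
  NLOPTIONS maxiter=5000 tech=nmsimp;
  CLASS Ani1 Single1 Order1 ID;
  MODEL score = Ani1|Single1|Order1 / S DDFM=kr;
  RANDOM Order1 / SUBJECT=ID*Ani1 TYPE=un RESIDUAL;
  /* DIFF: estimate, standard error and t value for each pairwise difference */
  /* CL:   confidence limits for the means and the differences               */
  LSMEANS Ani1 Single1 Ani1*Single1 / DIFF CL;
RUN;
```

The Diffs table reports each difference with its standard error and t value, which is the ratio described above. It is not itself a Cohen's d.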

 

SteveDenham
