StellaPals
Obsidian | Level 7

Hi! I want to know if grades differ by sex (female, male). But students don't have just one final grade: there are 10 semesters, so each student can have 1-10 grades from various semesters. I can reshape the data into a long format like so:

id    semester    grade
1     1           1.2
1     2           3.0
1     3           NA
..
1     8           NA
2     1           5.0
2     2           4.0

and so on. But the question is: what test should I use? I wanted to use a t-test, but I have more than 2 groups and that's a problem. Then I thought about PROC GLM and PROC MIXED. But would they be OK with missing values (most people don't have all the grades, and I don't want to remove most rows because of missing values)? And if I arrange the dataset like the example, then the observations are dependent. Can these procedures handle that? Or what should I use?

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

So you want to "know if grades differ by sex (female, male). " or by semester ?

Your data look like repeated-measures (longitudinal) data, so I would use a mixed model, or a GEE model as StatDave showed.

 

 

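proc mixed data=have covtest;
  class sex semester;
  /* fixed effects: sex and semester; Kenward-Roger df; S prints solutions */
  model grade = sex semester / ddfm=kr s;
  /* repeated measures within student; with no TYPE= the default is VC
     (independent errors), so consider e.g. type=cs or type=ar(1) */
  repeated semester / subject=id;
  lsmeans sex / diff e;
  lsmeans semester / diff e;
run;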


7 REPLIES 7
StellaPals
Obsidian | Level 7
Also there is sex too 😄 forgot to add that to the example data
StellaPals
Obsidian | Level 7
Or is the semester even important? Can I just do the t-test? And will the dependency be a problem?
PaigeMiller
Diamond | Level 26

@StellaPals wrote:

Hi! I want to know if grades differ by sex (female, male).

 

but the question is what test should I use? I wanted to use ttest but I have more than 2 groups and that's a problem.


You have more than two sexes? Perhaps you should explain.

 

Or is the semester even important?

That's really impossible to decide without understanding the experiment in more detail. Is semester 1 for ID = 1 always the same semester in time for all the other IDs? Or can semester 1 for ID = 1 be unrelated to semester 1 for ID = 2, for example for ID = 1 semester 1 is September 2023 and for ID = 2 semester 1 is September 2024? Are the courses taken by each ID the same as well?

 

But really, explain the whole experiment or study in a lot more detail.

 

--
Paige Miller
StellaPals
Obsidian | Level 7
No, I only have 2 groups in that sense: female and male. What I meant is that I have more than two semesters.
ID identifies the person, so if the data is:
ID SEX 1. semester 2. semester ...
1 M 1.2 3.0 ...
then I just converted it into long format like so:
ID SEX Semester grade
1 M 1 1.2
1 M 2 3.0
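
For readers following along, the wide-to-long conversion described above could be sketched roughly as below. The data set names (wide, long) and the wide-format variable names (sem1-sem10) are assumptions for illustration, not taken from the actual data. The sketch also assumes the grades are numeric, so a missing grade is a SAS missing value (.) rather than the text NA.

```sas
/* Hypothetical input: WIDE has one row per student with
   variables id, sex, and sem1-sem10 (assumed names). */
data long;
  set wide;
  array g{10} sem1-sem10;        /* the ten semester grades */
  do semester = 1 to 10;
    grade = g{semester};         /* stays missing (.) if no grade */
    output;                      /* one row per student-semester */
  end;
  keep id sex semester grade;
run;
```

Rows with a missing grade can be left in place; the modeling procedures suggested in this thread simply skip observations whose response is missing.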
StatDave
SAS Super FREQ

Since you have repeated values for each ID, you could consider fitting a Generalized Estimating Equations model using PROC GEE. It is tolerant of the missing values. You will need to decide what distribution your response (GRADE) has. You would need to use the long format of the data, like in your original post, with one observation for each individual grade so that all grades are in a single variable, GRADE. For example, assuming that you use the normal distribution, these statements would fit the model:

proc gee;
class sex;
model grade=sex / dist=normal link=identity;
repeated subject=id;
run;

The test for the SEX effect is a test of whether the sexes differ with respect to grades.
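
As a possible variation on the code above (a sketch under assumptions, not part of the original suggestion), semester could be named as the within-subject effect and an exchangeable working correlation requested. The data set name have is borrowed from elsewhere in this thread, and the choice of TYPE=EXCH is only one option among several.

```sas
/* Sketch only: data set name HAVE and the exchangeable
   working correlation are assumptions. */
proc gee data=have;
  class id sex semester;
  model grade = sex / dist=normal link=identity;
  /* WITHIN= tells the procedure which semester each grade
     belongs to, so gaps from missing semesters are aligned */
  repeated subject=id / within=semester type=exch;
run;
```

The test of SEX is read the same way as before; only the assumed correlation among a student's grades changes.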

SteveDenham
Jade | Level 19

I would make a couple minor changes to @Ksharp 's PROC MIXED code, in case there is a difference over time for the two sexes:

proc mixed data=have;
  class sex semester;
  model grade = sex semester sex*semester / ddfm=kr2 s;
  repeated semester / subject=id type=ar(1);
  lsmeans sex semester / diff e;
  /* This should probably be modified to look at the simple effect of sex
     for each semester, and the simple effect of semester for each sex,
     by using the SLICE option */
  lsmeans sex*semester / diff e;
run;
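
The SLICE modification mentioned in the comment could look roughly like this. It is a minimal sketch that reuses the same model and the data set name have, not necessarily the exact code the poster had in mind.

```sas
proc mixed data=have;
  class sex semester;
  model grade = sex semester sex*semester / ddfm=kr2 s;
  repeated semester / subject=id type=ar(1);
  /* simple effect of sex within each semester */
  lsmeans sex*semester / slice=semester;
  /* simple effect of semester within each sex */
  lsmeans sex*semester / slice=sex;
run;
```

Each SLICE= request produces one F test per level of the slicing variable, for example a female-versus-male test for every semester.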

There is at least one other thing to consider as well: whether separate variance-covariance estimates should be fit for each sex, to handle any heterogeneity between them. If that is the case, you may need to change to PROC GLIMMIX to check on that.
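
One way this check might be set up in PROC GLIMMIX is sketched below. This is an assumption about how the idea could be implemented, not code from the original reply. GROUP=SEX fits separate AR(1) covariance parameters for each sex, and the COVTEST statement with the HOMOGENEITY keyword tests whether those parameters are equal across the groups.

```sas
/* Sketch only: data set name HAVE is assumed, as above. */
proc glimmix data=have;
  class id sex semester;
  model grade = sex semester sex*semester / ddfm=kr2 solution;
  /* R-side AR(1) structure within student, estimated
     separately for each sex */
  random semester / subject=id type=ar(1) residual group=sex;
  covtest 'Equal covariance parameters across sexes' homogeneity;
run;
```

If the homogeneity test gives no evidence of a difference, the simpler model with a single set of covariance parameters is probably adequate.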

 

SteveDenham

 

 

 
