Hello all!
I feel like I am missing something obvious and could use some help. I am running the following paired sample t-test looking to see if there is a difference between the two variables. I would like to add in a binary variable that allows me to control for male / female. The binary variable is called "gender". How do I put "gender" in this correctly? I tried VAR, GROUP and other things but I'm getting errors.
I was thinking of splitting my file to create 1 male file and 1 female file but that seems rather silly... I figured there is a better way.
My code:
PROC TTEST DATA=temp.IBES_STATpers_QTRS_8 ALPHA=.05;
PAIRED MEANEST_O_YE*MEANEST_N_YE;
RUN;
You could sort the data by GENDER and then perform PROC TTEST with a BY GENDER; statement. This would give you a paired t-test for males and a paired t-test for females, but does not provide the ability for you to test if males are different than females. If you need the ability to test to see if males are different than females, plus also if there is a difference between MEANEST_O_YE and MEANEST_N_YE, you can do this in PROC GLM. Let us know if you want this.
Paige,
To clarify my ask, I would like to know if the difference between before and after is due to gender. Can you provide some guidance on what the PROC GLM would look like?
Please, whenever you get errors, copy from the log the entire procedure code with all the messages, notes, warnings and errors. Then paste it into a code box opened on the forum with the </> icon to preserve the formatting.
If you add BY Gender; (which requires sorting by that variable first), you will get a separate t-test for each level of the gender variable.
Best is also to include example data in the form of a data step so we can test your code or see if the data needs some changes.
I did not provide the error message because I've never written it before with a control variable in it. As such, I didn't think it would be helpful to provide a log of my own random things.
You noted "BY Gender". Where would that typically go?
The BY statement is supported by almost every procedure and processes the data separately for each value of the variables listed. The data must be sorted by those variables before the procedure runs:
Proc sort data=have;
by somevariable;
run;
Proc whatever data=have;
by somevariable;
run;
Most of the procedures that do not support BY processing are those that involve reading or writing to the system like Proc Import, Proc Export.
Some procedures' documentation will have details on how a BY statement affects the specific processing of that procedure.
Like this:
proc ttest data=...;
by gender;
paired ...;
run;
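Filled in with the data set and variable names from the original post, the whole thing might look like the sketch below. The output data set name work.ibes_sorted is just a placeholder made up here so that the permanent data set is not overwritten by the sort.

```sas
/* Sort a copy by gender; work.ibes_sorted is a hypothetical name */
proc sort data=temp.IBES_STATpers_QTRS_8 out=work.ibes_sorted;
   by gender;
run;

/* One paired t-test per level of gender */
proc ttest data=work.ibes_sorted alpha=.05;
   by gender;
   paired MEANEST_O_YE*MEANEST_N_YE;
run;
```

This produces separate output for each gender. It does not compare the genders to each other.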
You said:
To clarify my ask, I would like to know if the difference between before and after is due to gender.
I'm afraid I am struggling with your wording. Wouldn't the difference between before and after be due to before and after (as well as other possible things)? Nevertheless, this code will test to see if males have a different difference before and after than females have. If that isn't exactly what you are asking, please explain further.
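data diff;
   set temp.IBES_STATpers_QTRS_8;
   /* paired difference for each observation */
   diff = MEANEST_O_YE - MEANEST_N_YE;
run;
proc glm data=diff;
   class gender;
   /* tests whether the mean difference depends on gender */
   model diff=gender;
   /* mean difference for each gender, with pairwise t tests */
   means gender/t;
run;
quit;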
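Since gender has only two levels, the same comparison can also be sketched as a two-sample t-test on the differences. This assumes the diff data set from the data step above and that gender really has exactly two non-missing levels, which the CLASS statement in PROC TTEST requires.

```sas
/* Two-sample t-test: is the mean of diff the same for both genders? */
proc ttest data=diff alpha=.05;
   class gender;
   var diff;
run;
```

With two groups, the square of the pooled (equal variance) t statistic equals the F statistic from the one-way model, so the pooled line of this output should lead to the same conclusion as the PROC GLM test of gender.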
I'm going to try what you provided.
Yes, the difference between before and after would be due to the before and after and possibly other things. I am trying to see if gender is one of those other things.
Perhaps I see that the before and after for males is minimal, but the before and after for women is big.
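A quick way to look at that pattern is to summarize the differences by gender. This is only a sketch and assumes the diff data set created in the data step earlier in the thread.

```sas
/* Mean before/after difference within each gender.          */
/* T and PROBT test whether each gender's mean diff is zero. */
proc means data=diff n mean std stderr t probt;
   class gender;
   var diff;
run;
```

The T and PROBT columns here answer the same question as a paired t-test run separately for each gender. Whether the two genders differ from each other is still the job of the model with gender in it.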