anweinbe
Quartz | Level 8

Hello all!

I feel like I am missing something obvious and could use some help. I am running the following paired sample t-test looking to see if there is a difference between the two variables. I would like to add in a binary variable that allows me to control for male / female. The binary variable is called "gender". How do I put "gender" in this correctly? I tried VAR, GROUP and other things but I'm getting errors.

 

I was thinking of splitting my file to create 1 male file and 1 female file but that seems rather silly... I figured there is a better way.

 

My code:

PROC TTEST DATA=temp.IBES_STATpers_QTRS_8 ALPHA=.05;
PAIRED MEANEST_O_YE*MEANEST_N_YE;
RUN;

 

 

 

PaigeMiller
Diamond | Level 26

You could sort the data by GENDER and then perform PROC TTEST with a BY GENDER; statement. This would give you a paired t-test for males and a paired t-test for females, but does not provide the ability for you to test if males are different than females. If you need the ability to test to see if males are different than females, plus also if there is a difference between MEANEST_O_YE and MEANEST_N_YE, you can do this in PROC GLM. Let us know if you want this.

--
Paige Miller
anweinbe
Quartz | Level 8

Paige,

 

To clarify my ask, I would like to know if the difference between before and after is due to gender. Can you provide some guidance on what the PROC GLM would look like?

ballardw
Super User


Please, whenever you get errors, copy the entire procedure code from the log along with all the messages, notes, warnings and errors. Then paste it into a code box opened on the forum with the </> icon to preserve the formatting.

 

If you have BY Gender; (which requires sorting by that variable first), you will get a separate t-test for each level of the gender variable.

 

 

It is also best to include example data in the form of a data step so we can test your code or see if the data needs some changes.
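As a rough illustration of what such a data step could look like, here is a minimal sketch. The data set name have, the character coding of gender (M/F), and every value are assumptions invented for illustration; only the variable names come from the original post.

```sas
/* Hypothetical example data: all values are invented for illustration only. */
/* gender is assumed to be a character variable coded M / F.                 */
data have;
    input gender $ MEANEST_O_YE MEANEST_N_YE;
    datalines;
M 1.20 1.15
M 0.85 0.90
M 2.10 1.95
F 1.40 1.10
F 0.95 0.70
F 1.75 1.50
;
run;
```

A handful of rows per group is usually enough for others to run the code and reproduce an error.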

anweinbe
Quartz | Level 8

I did not provide the error message because I've never written it before with a control variable in it. As such, I didn't think it would be helpful to provide a log of my own random things.

 

You noted "BY Gender". Where would that typically go?

ballardw
Super User

The BY statement is supported by almost every procedure; it makes the procedure process the data separately for each level of the variables listed. The data must be sorted by those variables before it is used in the procedure:

 

Proc sort data=have;
   by somevariable;
run;

Proc whatever data=have;
    by somevariable;
run;

 

Most of the procedures that do not support BY processing are those that involve reading or writing to the system like Proc Import, Proc Export.

Some procedures' documentation has details on how a BY statement affects the specific processing of that procedure.

anweinbe
Quartz | Level 8
I understand the sorting as you noted, but then how will the t-test know to separate by gender?
Do I need to include a By statement in the PROC TTEST code? I would assume yes... Where does it go?

Sorry for the back and forth. Trying to learn.
PaigeMiller
Diamond | Level 26

Like this:

 

proc ttest data=...;
by gender;
paired ...;
run;
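Filled in with the names from the original post, the sort plus the BY-group test might look like the sketch below. The output data set name work.sorted is an assumption, chosen so the original data set is left untouched.

```sas
/* Sort a copy of the data by gender; work.sorted is an assumed name */
proc sort data=temp.IBES_STATpers_QTRS_8 out=work.sorted;
    by gender;
run;

/* One paired t-test is produced for each level of gender */
proc ttest data=work.sorted alpha=.05;
    by gender;
    paired MEANEST_O_YE*MEANEST_N_YE;
run;
```

This gives separate results for each gender but does not compare the genders with each other.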

You said:

"To clarify my ask, I would like to know if the difference between before and after is due to gender."

I'm afraid I am struggling with your wording. Wouldn't the difference between before and after be due to before and after (as well as other possible things)? Nevertheless, this code will test to see if males have a different difference before and after than females have. If that isn't exactly what you are asking, please explain further.

 

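/* Compute the paired difference for each observation */
data diff;
    set temp.IBES_STATpers_QTRS_8;
    diff = MEANEST_O_YE - MEANEST_N_YE;
run;
/* Test whether the mean difference varies by gender */
proc glm data=diff;
    class gender;
    model diff=gender;
    means gender/t;
run;
quit;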
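Since gender has only two levels, the same comparison can also be sketched as a two-sample t-test on the differences. This is an alternative, not part of the code above, and it assumes the diff data set created in the data step above.

```sas
/* Two-sample t-test comparing the mean paired difference between genders. */
/* Assumes work.diff from the data step above and exactly two gender levels. */
proc ttest data=diff alpha=.05;
    class gender;
    var diff;
run;
```

The pooled t-test in this output is equivalent to the F-test for gender in the one-way PROC GLM model (F equals t squared). The Satterthwaite row gives a version that does not assume equal variances in the two groups.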
--
Paige Miller
anweinbe
Quartz | Level 8

I'm going to try what you provided.

 

Yes, the difference between before and after would be due to the before and after and possibly other things. I am trying to see if gender is one of those other things.

Perhaps I see that the before and after for males is minimal, but the before and after for women is big.
