ggollehon_OD
Calcite | Level 5

Hello,

 

I am trying to figure out a way to run t-tests on race to determine if there are any pay discrepancies. I have code that runs now, but I built it so that I specify which races to compare and create a data set for each possible combination of races. Is there a more efficient way to accomplish this?

 

proc ttest data = mydata.client;
where group_n >=30 and race in ("A", "B");
class race;
by location job_group;
var annl_salary;
run;

 

What I am hoping to get is one data set that has each combination of race for each job location.

 

Job_group   RACE1   RACE2   T-stat   PVALUE   DF
1c          A       B
1c          A       H
...
1c          B       H
1c          B       I
...
2a          A       B
...

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

This is a job for PROC GLM, which will do t-tests on all possible pairs. It also allows for multiple comparison adjustments, so that if you are now comparing 15 different pairs, the tests are adjusted appropriately. These are called Bonferroni t-tests and are provided by the BON option.

 

proc glm data=mydata.client(where=(group_n>=30));
    class race;
    model annl_salary=race;
    means race / t bon;
run;
quit;
--
Paige Miller
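
The original question ran the test separately for each location and job group. A minimal sketch of keeping that split in the PROC GLM version, assuming the same location and job_group variables from the question; the sorted copy client_sorted is a hypothetical name. BY processing expects the input to be sorted by the BY variables, hence the PROC SORT step.

```sas
/* Sort a working copy by the BY variables (client_sorted is a hypothetical name) */
proc sort data=mydata.client(where=(group_n>=30)) out=client_sorted;
    by location job_group;
run;

/* One set of pairwise comparisons per location and job group */
proc glm data=client_sorted;
    by location job_group;
    class race;
    model annl_salary=race;
    means race / t bon;
run;
quit;
```

Any data set captured with ODS OUTPUT from a step like this should carry the BY variables along, so the results stay in one table.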

ggollehon_OD
Calcite | Level 5

This seems to work perfectly. The only issue I am having is getting the actual t-values included in the output. Any help there?

PaigeMiller
Diamond | Level 26

I think this is it (but didn't actually test it to see if it works):

 

ods output means=glmmeans;
proc glm ...;
...
run;
quit;
--
Paige Miller
ggollehon_OD
Calcite | Level 5

When I submit the below code:


ods output means = mydata.glmmeans;
proc glm data = mydata.christus(where =(group_n>=30));
class race;
model annl_salary = race;
means race / t bon;
run;
quit;

 

 

I get this warning: "WARNING: Output 'means' was not created. Make sure that the output object name, label, or
path is spelled correctly. Also, verify that the appropriate procedure options are
used to produce the requested output object. For example, verify that the NOPRINT
option is not used."

It seems that any output data set from PROC GLM is just the input data set plus whatever additional measures are requested in the OUTPUT statement. I apologize for all the questions, I am still fairly new at SAS.

ballardw
Super User

When discussing errors or warnings, it helps to include the entire procedure code and all messages related to that code, copied from the log and pasted into a code box (opened using the forum's {I} icon) to preserve the formatting of any error or warning messages.

 

Without the log we now ask questions like the following:

Did the Proc GLM run successfully?  If the procedure errored out and did not run then the output would not be created.

 

What ODS destination do you have open? If no ODS destination is open, the Means table would not be generated, and that would produce this warning.

Were there any messages involving the MEANS statement? If this statement has an issue then the Means output table would not display and you would get this warning.
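
One way to settle questions like these is to let SAS report the table names itself. The sketch below uses ODS TRACE, which writes the name, label, and path of every output object a procedure creates to the log. The data set and variables are the ones used earlier in the thread.

```sas
/* Write the name of each output object to the log as it is created */
ods trace on;

proc glm data=mydata.client(where=(group_n>=30));
    class race;
    model annl_salary=race;
    means race / t bon;
run;
quit;

/* Stop tracing once the names have been collected */
ods trace off;
```

Each entry in the log has a Name: line. That name is what goes on the left-hand side of an ODS OUTPUT statement, as in ods output name=dataset;.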

ggollehon_OD
Calcite | Level 5

Here is my output from the log file.


"742 ods output ttest = mydata.ttest;
743
744 proc glm data = mydata.client (where=(group_n>=30));
745 class race;
746 model annl_salary = race/ p ;
747 means race / t bon ;
748 run;

WARNING: ODS graphics with more than 5000 points have been suppressed. Use the
PLOTS(MAXPOINTS= ) option in the PROC GLM statement to change or override the cutoff.
749 quit;

WARNING: Output 'ttest' was not created. Make sure that the output object name, label, or
path is spelled correctly. Also, verify that the appropriate procedure options are
used to produce the requested output object. For example, verify that the NOPRINT
option is not used."

 

Both the t and bon tables are calculated, so the procedure ran correctly.

 

I'm not quite sure what is meant when you say ODS destination?  

PaigeMiller
Diamond | Level 26

So, it would appear that GLM does not provide the t-values, but it does provide the confidence intervals. So you could compute the t-values from the confidence intervals and related information, using this formula: https://goo.gl/images/cjz12Q

 

ods output cldiffsinfo=cldiffsinfo cldiffs=cldiffs;
proc glm data = mydata.christus(where =(group_n>=30));
    class race;
    model annl_salary = race;
    means race / t bon;
run;
quit;

 

 

--
Paige Miller
ggollehon_OD
Calcite | Level 5
Thanks. The issue is still getting the output saved. The resulting data set either does not save or does not have any of the figures from the output.
PaigeMiller
Diamond | Level 26

@ggollehon_OD wrote:
Thanks. The issue is still getting the output saved. The resulting data set either does not save or does not have any of the figures from the output.

Words like "doesn't save" don't help us understand what is wrong. Show us the code you are using and show us the SAS log.

--
Paige Miller
PaigeMiller
Diamond | Level 26

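I see the problem now, it's my mistake 😞 The MEANS statement needs the CLDIFF option so that the confidence limits for the pairwise differences are produced: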

 

 means race / t bon cldiff;
--
Paige Miller
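
Putting the correction together, here is a minimal sketch of the full step, followed by a DATA step that backs the t-values out of the confidence limits. For an interval of the form difference ± critical value × standard error, the standard error is (upper - lower) / (2 × critical value), and t is the difference divided by that standard error. Several things here are assumptions rather than anything confirmed in the thread: only the T option is requested, so that every row of CLDiffs uses the same unadjusted critical value; the columns are assumed to be named LowerCL, Difference and UpperCL (worth checking with PROC CONTENTS); and the macro variable error_df is a hypothetical placeholder for the error degrees of freedom shown in the GLM output.

```sas
/* Capture the pairwise confidence limits requested by CLDIFF */
ods output cldiffs=cldiffs;
proc glm data=mydata.client(where=(group_n>=30));
    class race;
    model annl_salary=race;
    means race / t cldiff;
run;
quit;

/* Hypothetical placeholder: replace 100 with the error DF reported by PROC GLM */
%let error_df = 100;

data tvalues;
    set cldiffs;
    /* Critical t for the default 95% limits of the unadjusted T option */
    tcrit = tinv(0.975, &error_df);
    /* Standard error recovered from the half-width of the interval */
    se = (UpperCL - LowerCL) / (2 * tcrit);
    tvalue = Difference / se;
    /* Two-sided p-value, not adjusted for multiple comparisons */
    pvalue = 2 * (1 - probt(abs(tvalue), &error_df));
run;
```

Adding the BON option back would likely put Bonferroni rows, which use a wider critical value, into the same table, so the single tcrit above would no longer apply to every row.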
