Epi_Stats
Obsidian | Level 7

Hi,

 

I have a dataset (sample below) which reports the number of years worked by teachers in the same school.

 

Among teachers with a Bachelor's degree, I want to look at the independent effect of male sex on the number of years worked.

 

The outcome (number of years worked) is continuous, and the independent variables (Bachelor's degree and male sex) are binary (coded 0/1; no/yes).

 

data results;
input ID bachelors male years_worked;
datalines;
1 0 1 10
2 1 1 17
3 1 0 16
4 0 0 8
5 1 1 27
6 1 0 10
7 0 0 9
8 0 1 4
9 0 1 6
10 1 0 12
;
run;

 

I'm wondering if it's correct to use PROC GENMOD for this?

 

 

proc genmod data=results;
class bachelors(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma;
run;

 

Among teachers with a Bachelor's degree, I want to be able to interpret the effect of male sex on the number of years worked, and state if there is any difference.

 

I'm not sure if I have specified the PROC GENMOD step correctly, or how I should interpret the output.

 

Also how can I see the difference in number of years worked among teachers with a Bachelor's degree who are male vs those who are not male?

 

I'd really appreciate any help, thank you.


6 REPLIES 6
StatDave
SAS Super FREQ

If your goal as stated means that you want to assess the difference between males and females in the bachelors group, then you can use the LSMEANS statement to do that. In the differences table from the following, the first row makes that comparison.

proc genmod data=results;
class bachelors(ref='0') male(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma link=log;
lsmeans bachelors*male / diff exp cl plots=none;
run;
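
If only the comparison within the Bachelor's group is of interest, one possible alternative is the SLICE statement, which partitions the interaction LS-means by one of its factors. The following is a sketch only, assuming the same `results` data set and model as above and that SLICEBY= is used with the `bachelors` effect; the slice for bachelors=1 should correspond to the male vs. not-male comparison among teachers with a Bachelor's degree.

```sas
proc genmod data=results;
class bachelors(ref='0') male(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma link=log;
/* Compare levels of male separately within each level of bachelors */
/* (assumed usage: SLICEBY= names the effect that defines the slices) */
slice bachelors*male / sliceby=bachelors diff exp cl;
run;
```

As with LSMEANS, the estimates are on the log scale, and EXP adds the exponentiated values (ratios of means).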
Epi_Stats
Obsidian | Level 7

Thank you very much @StatDave for your help.

 

From the output, how should I interpret the difference in the number of years worked between teachers with a Bachelor's degree who are male and those who are not male? In the differences table, the estimate is 0.5521, so do I interpret this as 0.55 more years for males compared to teachers with a Bachelor's degree who are not male?

StatDave
SAS Super FREQ

No, since the model uses the log link, that estimate is the difference in log means, or equivalently the log ratio of means. So the exponentiated estimate, 1.7368, is the estimate of the mean ratio. From the LSMEANS table, you can see that the difference is 22-12.667=9.333. If you want to directly estimate and test the difference in means rather than the log ratio and ratio of means, you can save the model and LSMEANS coefficients table and use the NLMeans macro. For example:

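proc genmod data=results;
class bachelors(ref='0') male(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma link=log;
/* E displays the LS-means coefficients; ODS OUTPUT saves them as data set c */
lsmeans bachelors*male / diff exp cl plots=none e;
ods output coef=c;
/* STORE saves the fitted model for use by the macro */
store gam;
run;
/* The NLMeans macro must be defined in the session before it is called */
%nlmeans(instore=gam, coef=c, link=log,
     diff=all, title=bach-male diffs)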
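
The arithmetic linking the log-scale estimate, the mean ratio, and the mean difference can be checked by hand. This is a small sketch using only the numbers quoted above (0.5521, 22, and 12.667); the variable names are illustrative, and the results are written to the log.

```sas
data _null_;
  log_ratio = 0.5521;        /* estimate from the differences table (log scale) */
  ratio = exp(log_ratio);    /* back-transform to a ratio of means */
  mean_male = 22;            /* LS-mean: bachelors=1, male=1 */
  mean_not_male = 12.667;    /* LS-mean: bachelors=1, male=0 */
  ratio_from_means = mean_male / mean_not_male;  /* should agree with ratio up to rounding */
  diff = mean_male - mean_not_male;              /* difference on the original scale, in years */
  put ratio= ratio_from_means= diff=;
run;
```

The two ratios agree only up to rounding because the inputs are rounded values taken from the output tables.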
Epi_Stats
Obsidian | Level 7

Thank you @StatDave, so in this example, teachers with a Bachelor's degree who are male worked on average 9.33 more years than those who are not male?

StatDave
SAS Super FREQ
Correct. Since the model is saturated, that difference is the same as the difference between the raw cell means, which you could compute using, say, PROC MEANS. But you need GENMOD to get a standard error and confidence limits.
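
A minimal sketch of that check, assuming the `results` data set from the original question: restrict to teachers with a Bachelor's degree and request the mean of years_worked for each level of male.

```sas
proc means data=results n mean;
  where bachelors = 1;   /* teachers with a Bachelor's degree only */
  class male;            /* one row per level of male (0 = not male, 1 = male) */
  var years_worked;
run;
```

The difference between the two reported means is the raw difference that the saturated model reproduces; PROC MEANS does not provide a standard error or confidence limits for that difference.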
Epi_Stats
Obsidian | Level 7

Thanks again, really appreciate your help! 🙂
