Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Stat Procs
- /
- GENMOD to test difference between groups?

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 02-13-2024 10:46 AM
(594 views)

Hi,

I have a dataset (sample below) which reports the number of years worked by teachers in the same school.

Among teachers with a Bachelor's degree, I want to look at the independent effect of male teachers on the number of years worked.

The outcome (number of years worked) is continuous, and the independent variables (Bachelor's degree and male sex) are binary (1,2; yes/no).

```
data results;
input ID bachelors male years_worked;
datalines;
1 0 1 10
2 1 1 17
3 1 0 16
4 0 0 8
5 1 1 27
6 1 0 10
7 0 0 9
8 0 1 4
9 0 1 6
10 1 0 12
;
run;
```

I'm wondering if it's correct to use PROC GENMOD for this?

```
proc genmod data=results;
class bachelors(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma;
run;
```

Among teachers with a Bachelor's degree, I want to be able to interpret the effect of male sex on the number of years worked, and state if there is any difference.

I'm not sure if I have the GENMOD statement specified correctly, and how I should interpret this?

Also how can I see the difference in number of years worked among teachers with a Bachelor's degree who are male vs those who are not male?

I'd really appreciated any help, thank you

- Tags:
- Difference
- GEE
- genmod

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

No, since the model uses the log link, that estimate is the difference in log means, or equivalently the log ratio of means. So the exponentiated estimate, 1.7368, is the estimate of the mean ratio. From the LSMEANS table, you can see that the difference is 22-12.667=9.333. If you want to directly estimate and test the difference in means rather than the log ratio and ratio of means, you can save the model and LSMEANS coefficients table and use the NLMeans macro. For example:

```
proc genmod data=results;
class bachelors(ref='0') male(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma link=log;
lsmeans bachelors*male / diff exp cl plots=none e;
ods output coef=c;
store gam;
run;
%nlmeans(instore=gam, coef=c, link=log,
diff=all, title=bach-male diffs)
```

6 REPLIES 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

If your goal as stated means that you want to assess the difference between males and females in the bachelors group, then you can use the LSMEANS statement to do that. In the differences table from the following, the first row makes that comparison.

```
proc genmod data=results;
class bachelors(ref='0') male(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma link=log;
lsmeans bachelors*male / diff exp cl plots=none;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you very much @StatDave for your help.

From the output results, how should I interpret the difference in the number of years worked by teachers with a Bachelor's degree who are male vs those who are not male? - In the differences table, the estimate = 0.5521, so do I interpret this as 0.55 more years compared to the same teachers who are not male?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

No, since the model uses the log link, that estimate is the difference in log means, or equivalently the log ratio of means. So the exponentiated estimate, 1.7368, is the estimate of the mean ratio. From the LSMEANS table, you can see that the difference is 22-12.667=9.333. If you want to directly estimate and test the difference in means rather than the log ratio and ratio of means, you can save the model and LSMEANS coefficients table and use the NLMeans macro. For example:

```
proc genmod data=results;
class bachelors(ref='0') male(ref='0');
model years_worked= bachelors male bachelors*male / dist=gamma link=log;
lsmeans bachelors*male / diff exp cl plots=none e;
ods output coef=c;
store gam;
run;
%nlmeans(instore=gam, coef=c, link=log,
diff=all, title=bach-male diffs)
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Correct. Since the model is saturated, you'll notice that that is the same as the difference in those means if you compute them using, say, PROC MEANS. But you need GENMOD to get a standard error and confidence limits.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks again, really appreciate your help! 🙂

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.