Hi, I would like to check for an interaction between two categorical variables. (The outcome is drug prescription; I want to see whether any variable interacts with the 'gender' variable.)
So I came up with the following code:
proc logistic data = hira.new_161718_outp_drugcode8;
model drug_code2 (event = '1') = gender age institution region mcode year gender*age;
run;
I will then do the same for the other variables (gender*institution, gender*region, gender*mcode, gender*year).
Is my approach correct?
Or is there a way to see all of these at once?
Would this code allow me to do that?
proc logistic data = hira.new_161718_outp_drugcode8;
model drug_code2 (event = '1') = gender age institution region mcode year gender*age gender*institution gender*region gender*mcode gender*year;
run;
When I run the code, I get the following result (I attached a screenshot of the output).
Please check whether my understanding is correct:
I interpret this to mean that there is a weak interaction between gender and age, because the estimate is -0.00001.
Thank you.
Accepted Solutions
@waterjelly wrote:
Hi, I would like to check for an interaction between two categorical variables. (The outcome is drug prescription; I want to see whether any variable interacts with the 'gender' variable.)
When I run the code, I get the following result (I attached a screenshot of the output).
Please check whether my understanding is correct:
I interpret this to mean that there is a weak interaction between gender and age, because the estimate is -0.00001.
Thank you.
In addition to the point made above, that you are not treating age and sex as categorical variables (which seems to be what you said you were trying to do), the -0.0001 does not imply a weak interaction. It is a slope, and its size depends on the scale of the (numeric) sex*age product, which we don't really know. Because of that scale dependence, you can't compare it with the coefficients of the other variables, so you can't tell from this output alone whether -0.0001 is weak. I would recommend creating some graphical output (such as the effect plot) from PROC LOGISTIC to help you see what is really happening; or, if you treat sex and age as categorical variables, you could look at the LSMEANS to determine the magnitude of the effects.
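As a rough sketch of that suggestion (not tested against the poster's data), something like the following would request an interaction plot and the least-squares means. It assumes gender and age are stored as category codes, and it leaves out the other covariates for brevity. LSMEANS in PROC LOGISTIC requires PARAM=GLM coding on the CLASS statement.

```sas
/* Sketch only: assumes gender and age are category codes.       */
/* Other covariates from the original model are omitted here.    */
ods graphics on;

proc logistic data = hira.new_161718_outp_drugcode8;
  /* PARAM=GLM is needed for the LSMEANS statement */
  class gender age / param = glm;
  model drug_code2 (event = '1') = gender age gender*age;

  /* Predicted probability by age category, one line per gender */
  effectplot interaction (x = age sliceby = gender) / clm;

  /* Cell means on the probability scale with confidence limits */
  lsmeans gender*age / ilink cl;
run;
```

If the lines in the plot are roughly parallel on the logit scale, that suggests little interaction; clearly diverging or crossing lines suggest the opposite.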
Paige Miller
Please include the screen capture of your output in your reply by clicking on the "Insert Photos" icon (and not as a file attachment).
Paige Miller
Don't they need a CLASS statement for the categorical variables? Also, a STRATA statement would simplify the code if interactions between gender and all other variables are intended.
@pmbrown wrote:
Don't they need a CLASS statement for the categorical variables? Also, a STRATA statement would simplify the code if interactions between gender and all other variables are intended.
Good point. As written, the code treats age and gender as continuous variables. This may not matter for gender, which could be coded 0 or 1. But age is also treated as continuous, which does not seem to be what the original post intended.
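For illustration, here is a minimal sketch of the single-model version with a CLASS statement. It assumes all six predictors are category codes; any that are truly continuous should be dropped from the CLASS list. It spells out the interactions in the MODEL statement rather than using STRATA.

```sas
/* Sketch only: assumes every predictor is a category code. */
proc logistic data = hira.new_161718_outp_drugcode8;
  class gender age institution region mcode year / param = ref;
  model drug_code2 (event = '1') =
        gender age institution region mcode year
        gender*age gender*institution gender*region
        gender*mcode gender*year;
run;
```

With CLASS variables in the model, the "Type 3 Analysis of Effects" table reports one joint Wald test per effect, so all the gender interactions can be read from a single run.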
Paige Miller
Actually, you would still need a CLASS statement for age. The only way to avoid a CLASS statement would be to derive multiple indicator variables, i.e. c-1 indicator variables for c categories. In any case, any statistician would say not to categorise continuous age; perhaps the data were collected that way, though, and you're stuck with it.
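A minimal sketch of the indicator-variable route, assuming age is coded 0, 1, 2 with 0 as the reference category and gender is a numeric 0/1 variable. The names work.outp_dummies, age1 and age2 are hypothetical.

```sas
/* Sketch only: age assumed coded 0/1/2, gender numeric 0/1. */
data work.outp_dummies;                 /* hypothetical output data set */
  set hira.new_161718_outp_drugcode8;
  /* Leave the indicators missing when age is missing, so those   */
  /* rows are not silently counted in the reference category.     */
  if not missing(age) then do;
    age1 = (age = 1);                   /* 1 if age category 1, else 0 */
    age2 = (age = 2);                   /* 1 if age category 2, else 0 */
  end;
run;

proc logistic data = work.outp_dummies;
  model drug_code2 (event = '1') = gender age1 age2
                                   gender*age1 gender*age2;
run;
```

This should give the same fit as putting age on a CLASS statement with reference coding and 0 as the reference level; the CLASS statement is simply less work.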
Coding age as 0 1 2 still makes it continuous, not class. The only way to use class variables is to put them in a CLASS statement.
Paige Miller
No, it makes it categorical. They have said they categorised age, so only a CLASS statement makes sense here. A categorical variable shouldn't be treated as continuous even if it's ordinal; it's a coding error for sure, but they can do whatever they want. Age should never be categorised. That analysis would not pass review unless the information loss is inherent in the data.
@pmbrown wrote:
No, it makes it categorical. They have said they categorised age, so only a CLASS statement makes sense here. A categorical variable shouldn't be treated as continuous even if it's ordinal; it's a coding error for sure, but they can do whatever they want. Age should never be categorised. That analysis would not pass review unless the information loss is inherent in the data.
Yes, I agree.
Paige Miller