Hi all,
I am a new SAS user working on a sensitivity analysis in which I predict COVID infection from water, sanitation and hygiene (WASH) data. I am using PROC GENMOD with LINK=LOG to produce risks instead of odds because of my study design. I am also testing an interaction between district and my primary predictor variable, a WASH composite score. I have three districts (Chiradzulu, Chikwawa, and Blantyre) and would like to produce the risk for each district at composite scores of 1, 5 and 10. I also have two covariates, disability_num (binary 0/1) and num_members (count). Can someone help me? Here is what I have so far. I know I have to use an ESTIMATE statement but am struggling with what to enter!
PROC GENMOD DATA = CAPSTONE.FINAL DESCENDING;
CLASS DISTRICT (REF = 'Chikwawa')/ PARAM = REF ;
MODEL SENSITIVITY (EVENT='1')= SUM_COMP DISABILITY_NUM NUM_MEMBERS DISTRICT DISTRICT|SUM_COMP/ DIST = BIN LINK = LOG;
estimate 'chiradzulu 1' sum_comp 1 district 1/exp;
estimate 'chiradzulu 10' sum_comp 10 district 1/exp;
estimate 'blantyre 1' sum_comp 1 district -1/exp;
estimate 'blantyre 10' sum_comp 10 district -1/exp;
estimate 'chikwawa 1' sum_comp 1 district 0/exp;
estimate 'chikwawa10' sum_comp 10 district 0/exp;
RUN;
My design matrix (Class Level Information):
| Class | Value | Design Variables | |
|---|---|---|---|
| District | Blantyre | 1 | 0 |
| | Chikwawa | 0 | 0 |
| | Chiradzulu | 0 | 1 |
I would appreciate some help! It is my first time fitting an interaction with a three-level categorical variable, so please be kind. Thank you.
The log-linked binomial model very often cannot be fit successfully because the log link does not ensure that the predicted values are valid binomial means on the probability scale (0 to 1). Instead, use a regular logistic model (logit link). If you want to estimate the risk (event probability) for each combination of your district and score variables, use the LSMEANS statement. Do not use ESTIMATE statements: constructing them properly is difficult, and the LSMEANS statement can do this for you.
To use the LSMEANS statement, remove the PARAM=REF option from the CLASS statement - it is not needed in order to use the REF= option to set reference levels. Assuming that sum_comp is your score variable, add it to the CLASS statement (assuming it is not actually continuous with many values) and use the following statement instead of your ESTIMATE statements. The ILINK and CL options will add columns labeled "Mean" that provide estimates, with confidence limits, for each combination on the mean (event probability or risk) scale.
lsmeans district*sum_comp / ilink cl;
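Put together, the advice above might look like the sketch below. It reuses the data set and variable names from the question. The first step follows the reply and assumes sum_comp has only a few distinct values, so it can go in the CLASS statement. The second step is an alternative that is not part of the reply: if sum_comp is really continuous, the AT option of the LSMEANS statement can request district estimates at chosen score values (1, 5 and 10 here). In both steps the LSMEANS estimates hold the other covariates (disability_num and num_members) at their means by default.

```sas
/* Step 1 (as suggested in the reply): logistic model with sum_comp
   treated as a CLASS variable. Assumption: sum_comp takes only a few
   distinct values. PARAM=REF is dropped because LSMEANS needs the
   default GLM parameterization. */
proc genmod data=capstone.final;
   class district (ref='Chikwawa') sum_comp;
   model sensitivity(event='1') = district sum_comp district*sum_comp
                                  disability_num num_members
         / dist=bin link=logit;
   /* ILINK reports each estimate on the probability (risk) scale;
      CL adds confidence limits */
   lsmeans district*sum_comp / ilink cl;
run;

/* Step 2 (alternative, an assumption beyond the reply): keep sum_comp
   continuous and evaluate each district at specific score values. */
proc genmod data=capstone.final;
   class district (ref='Chikwawa');
   model sensitivity(event='1') = district sum_comp district*sum_comp
                                  disability_num num_members
         / dist=bin link=logit;
   lsmeans district / at sum_comp=1  ilink cl;
   lsmeans district / at sum_comp=5  ilink cl;
   lsmeans district / at sum_comp=10 ilink cl;
run;
```

In either step, the "Mean" column of the LSMEANS table is the estimated event probability for that district and score, so no hand-built ESTIMATE coefficients are needed.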