krueg314
Calcite | Level 5

proc surveylogistic data=data;
class TARGET b c;
model TARGET (event='1') = a b c d e f g / clparm;
strata STRATA;
cluster PSU;
weight WEIGHT;
run;

 

proc surveyfreq data=data;
strata STRATA;
cluster PSU;
weight WEIGHT;
tables (a b c d e f g)*TARGET / RelRisk clparm;
run;

 

I am interested in the odds ratio for the effect of variable b on TARGET. I see different odds ratios from proc surveylogistic and proc surveyfreq, and the proc surveyfreq value matches what I get when I compute it manually from the weighted values. Why am I seeing different odds ratios, and what can I do to fix it? At the very least, can I see the relative risk in proc surveyfreq, as that's the model I'm using?

7 REPLIES 7
ballardw
Super User

Without your data it is hard to tell exactly.

 

One possible cause is that SURVEYFREQ and SURVEYLOGISTIC treat missing values a bit differently. If any variable on the MODEL statement is missing (unless the MISSING option is included on the CLASS statement), the entire record is excluded from the model, which is common to most of the modeling procedures. Read the diagnostics to see how many records are in the data set and how many were actually used for the model.
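A rough way to see how many records are affected is sketched below. It assumes the variable and data set names from the original post; the `check` data set and the `any_missing` flag are hypothetical names introduced only for illustration.

```sas
/* Flag records with a missing value in any analysis or design variable. */
/* CMISS counts missing values in both numeric and character arguments.  */
data check;
   set data;
   any_missing = (cmiss(TARGET, a, b, c, d, e, f, g, STRATA, PSU, WEIGHT) > 0);
run;

/* Unweighted count of flagged records (any_missing = 1). */
proc freq data=check;
   tables any_missing;
run;
```

The count with `any_missing = 0` can then be compared with the number of observations used that SURVEYLOGISTIC reports in its output.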

krueg314
Calcite | Level 5

OK, I think the key is to drop all records with missing values before running either procedure, because proc surveyfreq only excludes records where b or TARGET is missing, while the regression also drops records that are missing any of the other model variables.

krueg314
Calcite | Level 5
I have tried this to delete all records with missing values before running surveyfreq, but the odds ratio is still a bit different 😕

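/* Keep only the analysis and design variables, then drop any record */
/* with a missing numeric value (NMISS ignores character variables). */
data byemiss;
   set data (keep=TARGET a b c d e f g STRATA PSU WEIGHT);
   if nmiss(of _numeric_) > 0 then delete;
run;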
StatDave
SAS Super FREQ

Beyond the issue of missing values, the results will still differ since the odds ratio estimates for any variable provided by SURVEYLOGISTIC are adjusted for the effects of the other variables in the model. The estimates from SURVEYFREQ are not adjusted for the other variables. 
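To see the effect of adjustment directly, one option is to fit an unadjusted model with b as the only predictor. This is a minimal sketch using the names from the original post, and it assumes b is a two-level variable.

```sas
/* Unadjusted model: b is the only predictor, so its odds ratio is not */
/* adjusted for a, c, d, e, f, or g.                                   */
proc surveylogistic data=data;
   class b / param=ref;
   model TARGET (event='1') = b / clodds;
   strata STRATA;
   cluster PSU;
   weight WEIGHT;
run;
```

If both procedures use the same observations, the odds ratio point estimate from this model should agree with the SURVEYFREQ value for the b*TARGET table, possibly as its reciprocal depending on how the levels are ordered. Any remaining gap between this estimate and the one from the full model reflects adjustment for the other variables.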

krueg314
Calcite | Level 5

After accounting for the missing values, this must be the reason. Is there any way I can still get the relative risk from proc surveylogistic, since I want to account for the other variables' effects?

StatDave
SAS Super FREQ

Use the STORE, LSMEANS, and ODS OUTPUT statements in SURVEYLOGISTIC followed by the NLMeans macro as illustrated (using PROC LOGISTIC) in this note.
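A sketch of how those pieces might fit together is shown below. It assumes the NLMeans macro (and the NLEST macro it relies on) has already been downloaded from SAS Support and compiled in the session. The macro parameter names are written as recalled from the note and should be checked against the macro's documentation; `logmod` and `coeffs` are hypothetical names.

```sas
/* Assumes the NLMeans macro (and the NLEST macro it calls) was  */
/* downloaded and compiled earlier in this SAS session.          */
proc surveylogistic data=data;
   class b c / param=glm;     /* LSMEANS requires GLM parameterization */
   model TARGET (event='1') = a b c d e f g;
   strata STRATA;
   cluster PSU;
   weight WEIGHT;
   lsmeans b / e ilink;       /* E displays the LS-means coefficients  */
   ods output coef=coeffs;    /* save those coefficients for NLMeans   */
   store out=logmod;          /* save the fitted model                 */
run;

/* Parameter names are an assumption based on the SAS note;      */
/* verify them against the macro documentation before use.       */
%NLMeans(instore=logmod, coef=coeffs, link=logit, options=ratio,
         title=Relative risk for b)
```

With `options=ratio`, the macro is expected to report the ratio of the predicted event probabilities for the levels of b, evaluated at the LS-means settings of the other covariates, which is the adjusted relative risk.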

SteveDenham
Jade | Level 19

I really, really, really wish that this macro had been around back when I first used PROC LOGISTIC and GENMOD.  I was happily including ORs in stuff that went to study management, but they wanted everything expressed as relative risk, since that is what PROC FREQ generates and that is what they were used to.

 

SteveDenham

