sophiec
Obsidian | Level 7

Hello all, 

I'm using PROC SURVEYLOGISTIC to estimate odds of all-cause mortality using NHANES data. However, it appears that SAS is not including all available individuals in the model. Can anyone explain what's happening here? Let's just say for the purpose of this example the only exposure variables are age and sex. The overall N for my domain (pop=1, which represents inclusion criteria) is N = 31,304.

 

When I use PROC SURVEYFREQ for my outcome variable (all-cause mortality) within the domain that I am using (my inclusion criteria), I get N=5,022 deceased and N=26,229 alive. 

 

When I use PROC SURVEYLOGISTIC to model all-cause mortality= age sex, the "Domain Analysis" box shows "Number of observations in Domain" as 31,304. However, the "Response Profile" box shows only N=1,326 deceased and N=12,086 alive. What is causing the discrepancy here? 

 

Thank you! 

Sophie 

P.S. the same thing happened with my Cox model 

 

proc surveyfreq data=work.have; 
	weight WTMEC8YR;
	cluster sdmvpsu;
	strata sdmvstra;
	table pop*alldeath;
	run;

proc surveylogistic data=work.have ORDER=INTERNAL varmethod=taylor nomcar;
	weight WTMEC8YR;
	cluster sdmvpsu;
	strata sdmvstra;
	domain pop;
	class riagendr (REF=LAST)/ param=ref ORDER=INTERNAL;
	model alldeath (desc)= riagendr ridageyr; 
run;

Attachments: proc_surveyfreq output.jpg, domain_summary.jpg, response_profile.jpg

 

ballardw
Super User

By default, SAS drops an observation from the model when any variable on the CLASS statement, on the DOMAIN statement, or on the right side of the = sign in the MODEL statement has a missing value.

Class or Domain statements that set MISSING as a valid level of the variable will keep them in the model but interpretation may be difficult depending on just what those variables represent.
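
For illustration, here is a minimal sketch of what keeping missing levels could look like. It assumes that in PROC SURVEYLOGISTIC the MISSING option is given on the PROC statement and applies to categorical variables only (CLASS, STRATA, CLUSTER and DOMAIN variables); please confirm that against the documentation for your SAS release. It would not rescue observations with a missing continuous predictor, response or weight.

```sas
/* Sketch only: MISSING is assumed to make a missing value a valid level */
/* for the categorical variables (CLASS, STRATA, CLUSTER, DOMAIN).       */
proc surveylogistic data=work.have missing;
	weight WTMEC8YR;
	cluster sdmvpsu;
	strata sdmvstra;
	domain pop;
	class riagendr (ref=last) / param=ref;
	model alldeath (desc) = riagendr ridageyr;
run;
```

If the "Response Profile" counts do not change with this option, the dropped observations are probably missing something other than a categorical variable.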

 

So you need to check every variable in the analysis for missing values, not just POP.

 

With the survey procs this extends to the sample design variables as well. A missing strata, cluster or weight value means the observation is excluded.
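
One way to check this is to count the missing values of each variable the model touches and then count the complete cases inside the domain. The sketch below uses the variable names from the posted code and assumes they are all numeric. The data set WORK.CHECK and the flag COMPLETE are hypothetical names introduced only for this example. The weight is also tested for being positive, since the survey procedures exclude nonpositive weights.

```sas
/* Missing-value counts for each analysis and design variable, by level of POP */
proc means data=work.have n nmiss;
	class pop / missing;
	var alldeath riagendr ridageyr WTMEC8YR sdmvpsu sdmvstra;
run;

/* Flag observations in the domain that have everything the model needs */
data work.check;
	set work.have;
	where pop = 1;
	/* COMPLETE = 1 when no variable is missing and the weight is positive */
	complete = (nmiss(alldeath, riagendr, ridageyr,
	                  WTMEC8YR, sdmvpsu, sdmvstra) = 0
	            and WTMEC8YR > 0);
run;

/* Unweighted counts to compare against the "Response Profile" table */
proc freq data=work.check;
	tables complete*alldeath / missing;
run;
```

If the COMPLETE = 1 row roughly matches the 1,326 deceased and 12,086 alive in the "Response Profile", the NMISS column of the PROC MEANS output should show which variable is responsible.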

sophiec
Obsidian | Level 7

Thanks for your response @ballardw! Unfortunately I still don't quite understand. 

 

The exact same variables are being used in both PROCs with regards to the stratum/cluster/domain. There are no missing values for the CLASS variable "RIAGENDR" or the explanatory variable "RIDAGEYR". The outcome variable "ALLCAUSE" in the MODEL statement only has 200 missing values. 

 

Shouldn't then the logistic regression model be using all the available observations from "ALLCAUSE" that are counted using PROC FREQ? I'm not clear on where additional missing values could be from. 

 

Thanks again for your time!

Sophie 

ballardw
Super User

 

You may have to share the LOG with the code and all messages. Copy the code and all the notes or warnings from the log, open a text box on the forum and paste all the text.

 

 
