Hi everyone.
I am wondering if I have understood a problem correctly.
I have missing values in my predictor and outcome, and even more in my control variables.
SAS only excludes observations that have missing values on the variables included in a given step. So in the first step (predictor/outcome), observations with missing values on the control variables are still included. Then, in the next step (predictor + control/outcome), SAS excludes more observations.
I suspect this is the wrong procedure, and that I have to exclude or impute all observations with missing values before any of the regression steps?
I think any comparison of models or of coefficients has to be done on the same data, so it would be the data with every observation that has a missing value on any predictor or confounder removed.
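A minimal sketch of that complete-case approach, using the data set and variable names from the code posted further down in this thread. It assumes &dep and &v resolve to plain, space-separated variable names (no interaction terms), and the data set name mf_cc is made up for illustration.

```sas
/* Keep only rows with no missing value on ANY variable used in either model. */
/* CMISS counts missing values among numeric and character arguments alike.   */
data mf_cc;                          /* mf_cc is a hypothetical name */
   set u.mf;
   if cmiss(of &dep &v nærtilb2 nærtilb3) = 0;
run;

/* Both models now run on the same set of observations. */
title "Step 1 (complete cases)";
proc logistic data=mf_cc;
   class ses(ref='1') kjønn klasse / param=ref;
   model &dep(event='1') = &v / clodds=pl;
run;

title "Step 2 (complete cases)";
proc logistic data=mf_cc;
   class ses(ref='1') kjønn klasse nærtilb2 nærtilb3 / param=ref;
   model &dep(event='1') = &v nærtilb2 nærtilb3 / clodds=pl;
run;
```

Filtering once in a DATA step, rather than relying on each procedure's own listwise deletion, is what keeps the two fits comparable. Whether deleting or imputing is the better choice depends on why the values are missing.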
Can you post the code, or better, the log, to show what you are doing? Copy the log of all the steps you are concerned with, including the code and all messages, notes, or warnings, and paste it into a code box opened on the forum to maintain legibility and preserve the formatting of any diagnostic information.
There is not enough detail to tell what you are attempting or which tools may help.
Many of the modeling procedures that use categorical variables have a CLASS statement and you can usually specify that missing is a valid level for those.
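For PROC LOGISTIC, that is the MISSING option on the CLASS statement. The sketch below borrows the variable names from the code posted later in this thread, and assumes the control variables are categorical, as their appearance on the CLASS statement there suggests.

```sas
proc logistic data=u.mf;
   /* MISSING treats a missing value as its own level of each CLASS variable, */
   /* so rows are no longer dropped because a CLASS variable is missing.      */
   class ses(ref='1') kjønn klasse nærtilb2 nærtilb3 / param=ref missing;
   model &dep(event='1') = &v nærtilb2 nærtilb3 / clodds=pl;
run;
```

Rows with a missing response, or with a missing value on a continuous predictor, are still excluded. Whether "missing" is a meaningful category is a substantive question that the option does not answer.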
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
70
71
72 title "Step 1";
73 proc logistic data=u.mf plots (only)= (effect oddsratio);
74 class ses (ref='1') kjønn klasse /param=ref;
75 model &dep(event='1')=&v/ clodds=pl;
76 run;
NOTE: PROC LOGISTIC is modeling the probability that akt_med=1.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 3463 observations read from the data set U.MF.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.43 seconds
user cpu time 0.28 seconds
system cpu time 0.03 seconds
memory 22273.12k
OS Memory 50544.00k
Timestamp 19.09.2020 10:43:41 PM
Step Count 91 Switch Count 6
Page Faults 0
Page Reclaims 4889
Page Swaps 0
Voluntary Context Switches 540
Involuntary Context Switches 1
Block Input Operations 0
Block Output Operations 1064
77
78
79 title "Step 2";
80 proc logistic data=u.mf plots (only)= (effect oddsratio);
81 class ses (ref='1') kjønn klasse
82 nærtilb2 nærtilb3 /param=ref;
83 model &dep(event='1')=&v nærtilb2 nærtilb3/ clodds=pl ;
84 run;
NOTE: PROC LOGISTIC is modeling the probability that akt_med=1.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 3463 observations read from the data set U.MF.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.49 seconds
user cpu time 0.35 seconds
system cpu time 0.02 seconds
memory 8719.37k
OS Memory 52268.00k
Timestamp 19.09.2020 10:43:41 PM
Step Count 92 Switch Count 6
Page Faults 0
Page Reclaims 1406
Page Swaps 0
Voluntary Context Switches 571
Involuntary Context Switches 1
Block Input Operations 0
Block Output Operations 768
85
86
87 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
98
title "Step 1";
proc logistic data=u.mf plots(only)=(effect oddsratio);
   class ses(ref='1') kjønn klasse / param=ref;
   model &dep(event='1') = &v / clodds=pl;
run;

title "Step 2";
proc logistic data=u.mf plots(only)=(effect oddsratio);
   class ses(ref='1') kjønn klasse
         nærtilb2 nærtilb3 / param=ref;
   model &dep(event='1') = &v nærtilb2 nærtilb3 / clodds=pl;
run;
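The log notes above only report how many observations were read, not how many each model actually used. One way to check is to capture the "Number of Observations" table from each step, as sketched here. The output data set names nobs_step1 and nobs_step2 are hypothetical.

```sas
/* Capture the observations read/used table (ODS table NObs) from each step. */
ods output NObs=nobs_step1;          /* hypothetical data set name */
proc logistic data=u.mf;
   class ses(ref='1') kjønn klasse / param=ref;
   model &dep(event='1') = &v / clodds=pl;
run;

ods output NObs=nobs_step2;          /* hypothetical data set name */
proc logistic data=u.mf;
   class ses(ref='1') kjønn klasse nærtilb2 nærtilb3 / param=ref;
   model &dep(event='1') = &v nærtilb2 nærtilb3 / clodds=pl;
run;

/* Compare the counts side by side. */
title "Observations read and used, step 1";
proc print data=nobs_step1; run;
title "Observations read and used, step 2";
proc print data=nobs_step2; run;
```

If the "used" counts differ between the two steps, the models were fitted on different subsets of the data.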
I do not understand why this is being done via two sequential logistic regressions. What do you gain from doing this as two separate regressions, instead of one combined regression? What does the first one tell you?
Thank you for helping me.