Hello,
I've been using an existing SAS EG job to process some survey information and am having a problem with a part of the job that does Logistic Regression.
I have used the in-built 'Logistic Regression' wizard menu in SAS EG and put the results of Question 1 in as the primary ('dependent' I think it's called?) variable and then all the other questions as secondary and followed the process through and it's showing me the p-values, weighted frequencies etc perfectly.
This has worked with nearly all the survey questions (Q2 vs all, Q3 vs all etc) but when I try plotting Question 14 vs all the LR step doesn't run and the log displays the message:
ERROR: There are no valid observations
The data for this question is exactly the same as all the others (the responses to the survey are recorded as numbers 1 to 6) and the data quality is the same, no blanks, no NULL, no errors etc so does anyone know why it might be displaying this error for me when I'm using the LR wizard in exactly the same way? Q19 and Q20 are also not running, but also look fine in the raw data.
Thank you for any help you can offer!
James
So, does q14a_weight have missing values?
Do all of the X-variables have values? Does the Y-variable have values?
Or could one or more of these X-variables be entirely missing?
What happens if you run PROC SUMMARY and count the non-missing values (N) for all of your X-variables, the Y-variable and q14a_weight?
/* work.tmpmod is the data set created by the task's DATA step. */
/* N= counts non-missing values; the VAR list must be numeric.  */
proc summary data=work.tmpmod;
    var q14a q14a_weight MethodCollection Gender AgeBand_1_2
        Ethnicity Translated_recoded Sexuality_recode Religion PSR
        SupportSetting MechanismDelivery Q1Std Q1ER Q1Comb Q2Std
        Q2ER Q2Comb Q2b Q2c Q3a Q3b Q4a Q4b Q5a Q5b Q6a Q6b
        Q7a Q7b Q8a Q8b Q9a Q9b Q10 Q11 Q12 Q12sub
        Q12_recode Q13 Q13_recode Q14b Q15a Q15b Q15c Q15d
        Q16a Q16b Q16c Q16d Q17 Q18 Q19a Q19b
        Q19c Q20a Q20b Q20c Q21 Q22a Q22b Q22c Q22d
        Q22e Q22f Q22Flag Q22bsub Q22csub Q22dsub Q22esub Q22fsub;
    /* AUTONAME writes one column per variable, named <variable>_N */
    output out=_stats_ n= / autoname;
run;
When you look at the results in data set _stats_, is N equal to zero for any of the variables?
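A related check, offered only as a sketch: as far as I know, PROC LOGISTIC uses an observation only when the response, every explanatory variable and the weight are all non-missing and the weight is positive. The step below counts how many rows of WORK.TMPMod meet that condition, assuming the data set is still available in your session. The names modelvars and n_usable are made up for this example; the variable list is copied from the MODEL statement.

```sas
/* Hypothetical helper: explanatory variables copied from the MODEL statement */
%let modelvars = MethodCollection Gender AgeBand_1_2 Ethnicity
    Translated_recoded Sexuality_recode Religion PSR SupportSetting
    MechanismDelivery Q1Std Q1ER Q1Comb Q2Std Q2ER Q2Comb Q2b Q2c
    Q3a Q3b Q4a Q4b Q5a Q5b Q6a Q6b Q7a Q7b Q8a Q8b Q9a Q9b
    Q10 Q11 Q12 Q12sub Q12_recode Q13 Q13_recode Q14b
    Q15a Q15b Q15c Q15d Q16a Q16b Q16c Q16d Q17 Q18
    Q19a Q19b Q19c Q20a Q20b Q20c Q21
    Q22a Q22b Q22c Q22d Q22e Q22f Q22Flag
    Q22bsub Q22csub Q22dsub Q22esub Q22fsub;

data _null_;
    set work.tmpmod end=last;
    /* CMISS counts missing values across numeric and character arguments */
    if cmiss(of __RESPONSE Q14a_Weight &modelvars) = 0
       and Q14a_Weight > 0 then n_usable + 1;
    /* Write the running total to the log after the final row */
    if last then put "Usable observations: " n_usable;
run;
```

If the log shows zero here, the problem is in the data rather than in the wizard, and the N counts from PROC SUMMARY should point to the variable responsible.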
As an unrelated side issue, I think this is the exact situation where any type of stepwise regression (in your case backwards) can fail miserably and generate poor or nonsensical results. Please go to your favorite internet search engine and type in:
problems with stepwise regression
Please show us the LOG (not just the ERROR message, but the entire LOG of the part where you run PROC LOGISTIC).
IMPORTANT: Click on the {i} icon and paste the log into the window that appears. Do not show us the log any other way.
ERROR: There are no valid observations
The data for this question is exactly the same as all the others (the responses to the survey are recorded as numbers 1 to 6) and the data quality is the same, no blanks, no NULL, no errors etc so does anyone know why it might be displaying this error for me when I'm using the LR wizard in exactly the same way? Q19 and Q20 are also not running, but also look fine in the raw data.
These two statements are contradictory, and in cases where SAS says one thing, and the user says the opposite, I believe SAS. I suspect either you are not looking at the proper data set, or you have mis-programmed this somehow.
Hi Paige, thank you for replying. I will paste the log as you suggest. Cheers
real time 0.01 seconds
2 The SAS System 07:49 Thursday, July 4, 2019
cpu time 0.01 seconds
49
50 DATA WORK.TMPMod;
51 SET WORK.SORTTempTableSorted;
52 length __RESPONSE $ 10;
53 IF Q14a=1 THEN __RESPONSE="01: 1";
54 IF Q14a=2 THEN __RESPONSE="02: 2";
55 IF Q14a=3 THEN __RESPONSE="03: 3";
56 RUN;
NOTE: There were 68746 observations read from the data set WORK.QUESTION_DATA_AND_STRATA.
NOTE: MVA_DSIO.OPEN_CLOSE| _DISARM| STOP| _DISARM| 2019-07-04T08:45:25,802+01:00| _DISARM| WorkspaceServer| _DISARM| SAS|
_DISARM| | _DISARM| 68746| _DISARM| 22904832| _DISARM| 11| _DISARM| 11| _DISARM| 166839623| _DISARM| 10854296320| _DISARM|
0.156250| _DISARM| 0.203000| _DISARM| 1877845525.600000| _DISARM| 1877845525.803000| _DISARM| 0.062500| _DISARM| | _ENDDISARM
NOTE: There were 68746 observations read from the data set WORK.SORTTEMPTABLESORTED.
NOTE: MVA_DSIO.OPEN_CLOSE| _DISARM| STOP| _DISARM| 2019-07-04T08:45:25,819+01:00| _DISARM| WorkspaceServer| _DISARM| SAS|
_DISARM| | _DISARM| -1| _DISARM| 22904832| _DISARM| 11| _DISARM| 11| _DISARM| 166839629| _DISARM| 10854296603| _DISARM|
0.171875| _DISARM| 0.219000| _DISARM| 1877845525.600000| _DISARM| 1877845525.819000| _DISARM| 0.078125| _DISARM| | _ENDDISARM
NOTE: The data set WORK.TMPMOD has 68746 observations and 72 variables.
NOTE: MVA_DSIO.OPEN_CLOSE| _DISARM| STOP| _DISARM| 2019-07-04T08:45:25,819+01:00| _DISARM| WorkspaceServer| _DISARM| SAS|
_DISARM| | _DISARM| 68746| _DISARM| 22904832| _DISARM| 11| _DISARM| 11| _DISARM| 167036755| _DISARM| 10854494020| _DISARM|
0.171875| _DISARM| 0.219000| _DISARM| 1877845525.600000| _DISARM| 1877845525.819000| _DISARM| 0.078125| _DISARM| | _ENDDISARM
NOTE: PROCEDURE| _DISARM| STOP| _DISARM| 2019-07-04T08:45:25,819+01:00| _DISARM| WorkspaceServer| _DISARM| SAS| _DISARM| |
_DISARM| 340316160| _DISARM| 22904832| _DISARM| 11| _DISARM| 11| _DISARM| 167202693| _DISARM| 10854494298| _DISARM| 0.187500|
_DISARM| 0.235000| _DISARM| 1877845525.584000| _DISARM| 1877845525.819000| _DISARM| 0.078125| _DISARM| | _ENDDISARM
NOTE: DATA statement used (Total process time):
real time 0.23 seconds
cpu time 0.18 seconds
57
58 TITLE;
59 TITLE1 "Logistic Regression Results";
60 FOOTNOTE;
61 FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at
61 ! %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
62 PROC LOGISTIC DATA=WORK.TMPMod
63 PLOTS(ONLY)=ALL
64 ;
65 WEIGHT Q14a_Weight;
66 MODEL __RESPONSE=MethodCollection Gender AgeBand_1_2 Ethnicity Translated_recoded Sexuality_recode Religion PSR
66 ! SupportSetting MechanismDelivery Q1Std Q1ER Q1Comb Q2Std Q2ER Q2Comb Q2b Q2c Q3a Q3b Q4a Q4b Q5a Q5b Q6a Q6b Q7a Q7b Q8a
66 ! Q8b Q9a Q9b Q10 Q11 Q12 Q12sub Q12_recode Q13 Q13_recode Q14b Q15a Q15b Q15c Q15d Q16a Q16b Q16c Q16d Q17 Q18 Q19a Q19b
66 ! Q19c Q20a Q20b Q20c Q21 Q22a Q22b Q22c Q22d Q22e Q22f Q22Flag Q22bsub Q22csub Q22dsub Q22esub Q22fsub /
67 SELECTION=BACKWARD
68 SLS=0.05
69 INCLUDE=0
70 COVB
71 RSQUARE
72 LINK=LOGIT
73 CLPARM=BOTH
74 CLODDS=BOTH
75 ALPHA=0.05
76 ;
77 RUN;
3 The SAS System 07:49 Thursday, July 4, 2019
ERROR: There are no valid observations.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 68746 observations read from the data set WORK.TMPMOD.
NOTE: MVA_DSIO.OPEN_CLOSE| _DISARM| STOP| _DISARM| 2019-07-04T08:45:25,944+01:00| _DISARM| WorkspaceServer| _DISARM| SAS|
_DISARM| | _DISARM| 68746| _DISARM| 22904832| _DISARM| 11| _DISARM| 11| _DISARM| 40191768| _DISARM| 10894753102| _DISARM|
0.062500| _DISARM| 0.094000| _DISARM| 1877845525.850000| _DISARM| 1877845525.944000| _DISARM| 0.046875| _DISARM| | _ENDDISARM
NOTE: PROCEDURE| _DISARM| STOP| _DISARM| 2019-07-04T08:45:25,944+01:00| _DISARM| WorkspaceServer| _DISARM| SAS| _DISARM| |
_DISARM| 340316160| _DISARM| 22904832| _DISARM| 11| _DISARM| 11| _DISARM| 40258326| _DISARM| 10894753384| _DISARM| 0.062500|
_DISARM| 0.094000| _DISARM| 1877845525.850000| _DISARM| 1877845525.944000| _DISARM| 0.046875| _DISARM| | _ENDDISARM
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.09 seconds
cpu time 0.06 seconds
78 QUIT;
79
80 /* -------------------------------------------------------------------
81 End of task code
82 ------------------------------------------------------------------- */
83 RUN; QUIT;
84 %_eg_conditional_dropds(WORK.SORTTempTableSorted,
85 WORK.TMPMod);
My script in the {i} window keeps getting bounced as spam, so I've notified the moderator as per the site's instructions.
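One thing that stands out in the log: the DATA step sets __RESPONSE only when Q14a is 1, 2 or 3, so any row with another value is left with a blank response. Whether that accounts for the error is not something the log alone can confirm. A quick cross-check, sketched on the assumption that WORK.TMPMod is re-created first (the task drops it at the end), is to tabulate the raw answer against the derived response and to summarise the weight:

```sas
/* MISSING keeps blank/missing levels in the table; LIST prints one row per combination */
proc freq data=work.tmpmod;
    tables Q14a*__RESPONSE / missing list;
run;

/* N and NMISS show how many weights are present; MIN shows whether any are zero or negative */
proc means data=work.tmpmod n nmiss min max;
    var Q14a_Weight;
run;
```

Rows where __RESPONSE is blank, or where the weight is missing or not positive, would not count as valid observations for the model.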