Michelle_AD
Calcite | Level 5

Hi everyone, I am new here and need some help. I keep encountering this error:

 proc surveylogistic data=nhis29;
 cluster ppsu;
 strata pstrat;
 weight wtfa;
 class SRVY_YR(ref='2021') edu(ref='1') pov(ref='1') sex(ref='1')/param=ref;
 model ft=srvy_yr edu pov sex/expb;
 run;

ERROR: Invalid reference value for SRVY_YR.

Yet for the same variable and reference everything seems fine as shown below:

 proc surveylogistic data=nhis29;
 cluster ppsu;
 strata pstrat;
 weight wtfa;
 class SRVY_YR(ref='2021')/param=ref;
 model ft(event='1')=srvy_yr/expb;
 run;

NOTE: PROC SURVEYLOGISTIC is modeling the probability that ft=1.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: PROCEDURE SURVEYLOGISTIC used (Total process time):
      real time           0.20 seconds
      cpu time            0.11 seconds

I can't think of what I missed, and any assistance would be appreciated.

I tried PROC SURVEYFREQ, and 2021 shows up as a category there and works fine.

sbxkoenk
SAS Super FREQ

Your 2nd model (where the reference level '2021' is accepted) is a lot more parsimonious / succinct than the 1st model. The 1st one has many more independent variables (IVs).

By including extra IVs, you risk having more observations excluded from the analysis because of missing values. Compare the number of observations used in the 1st PROC SURVEYLOGISTIC run with the number used in the 2nd; I bet the 2nd number is much higher.

Among all complete-case observations remaining in the first PROC SURVEYLOGISTIC run, there are, in my opinion, none left that still have '2021' for the SRVY_YR class variable. Please check!
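
One way to run that check, as a minimal sketch: the step below counts, per survey year, the observations that would survive as complete cases in the first model. It assumes the procedure drops any observation with a missing response, predictor, stratum, or cluster value, or with a missing or nonpositive weight, and that wtfa is numeric. The data set and variable names are taken from the code in the question.

```sas
/* Sketch: count complete cases per survey year for the first model. */
/* Assumption: an observation is dropped if any of these variables   */
/* is missing, or if the weight is missing or not positive.          */
proc freq data=nhis29;
   where not missing(ft)
     and not missing(edu)
     and not missing(pov)
     and not missing(sex)
     and not missing(ppsu)
     and not missing(pstrat)
     and wtfa > 0;              /* also excludes missing weights */
   tables srvy_yr / missing;    /* one row per remaining year    */
run;
```

If 2021 does not appear in the resulting table, the reference level no longer exists in the data the first model actually uses, which would explain the error.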

Usage Note 37108: Setting reference levels for CLASS predictor variables
https://support.sas.com/kb/37/108.html

 

Good luck with your analysis.

Koen

ballardw
Super User

You can examine @sbxkoenk's suggestion of possible problems with the multiple independent variables using code like this in PROC FREQ:

 

Proc freq data=nhis29;
   tables srvy_yr * edu * pov * sex / list missing;
run;

If ALL of the Srvy_yr=2021 observations have missing values for one or more of the other variables, it will be easy to spot.

What the LIST option does is place all the values on one line, so the output is relatively easy to read, and the MISSING option means missing values appear in the body of the table, so you can find how many there are and with which variables they occur. This is probably not as useful with multiple continuous variables, but with your variables the result shouldn't be too long.
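
For the continuous-variable case, or simply for a more compact view, a missing-value count per survey year may be easier to scan. This is only a sketch, and it assumes ft, edu, pov and sex are stored as numeric variables (PROC MEANS does not accept character analysis variables).

```sas
/* Sketch: non-missing (N) and missing (NMISS) counts by survey year. */
/* Assumption: ft, edu, pov and sex are numeric in nhis29.            */
proc means data=nhis29 n nmiss;
   class srvy_yr;
   var ft edu pov sex;
run;
```

A year in which N is 0 for any one of these variables cannot contribute complete cases to the model.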
