Hi, I am trying to compare flu vaccination (coded 1 or 0) between immigrant and non-immigrant populations. My variables are current age category, income quintile, urban/rural residence, gender, and years since migration. Years since migration has 5 categories:

1. more than 20 years ago
2. 10-20 years ago
3. 3-10 years ago
4. within the last 2 years
5. non-migrants

I run a logistic regression model using the following command:

proc logistic data=surv1 descending;
  class agecat (param=ref ref='3')
        GENDER (param=ref ref='M')
        rural (param=ref ref='0')
        ses (param=ref ref='4')
        imm_cat (param=ref ref='5');
  model doc_visit(ref='0') = agecat GENDER rural ses imm_cat
        / risklimits lackfit selection=stepwise slentry=0.1 slstay=0.05 details;
run;

It gives me nice output with all the variables showing an association, but the Hosmer and Lemeshow goodness-of-fit test shows p<0.0001. I included an interaction term for age category and imm_cat. The interaction is highly significant and slightly improved the model fit, but the goodness-of-fit test still shows p<0.0001.

When I look at the cross tabulation of age category and doctor visit, the relationship among immigrants is an inverted U shape: you are more likely to get the vaccine in middle age and less likely if you are younger (12-18 years) or 65+. Among non-immigrants, however, age and vaccination show a 'J' shaped relationship: you have the highest chance of vaccination if you are in the oldest group and the lowest chance in middle age.

I was wondering whether this reversed relationship between age category and the outcome among migrants and non-migrants is the reason the logistic model does not fit.

Someone suggested that I include a spline effect for age (age as a continuous variable), so I included "agesp" in the model, but the model still does not fit:

effect agesp = spline(ageyrs / naturalcubic basis=tpf(noint) knotmethod=percentiles(5));

I also tried fitting general linear models but got the same problem. I am using SAS 9.4. Apologies, I cannot share the data.
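For reference, here is a minimal sketch of how the two variations described above could be written, reusing the dataset and variable names from the command above. Two things here are assumptions for illustration and not necessarily what was actually run: the stepwise selection options are left out to keep the example short, and in the second step agesp replaces agecat.

```sas
/* Variation 1: main effects plus the age category by immigration
   category interaction. Dropping SELECTION= is an assumption. */
proc logistic data=surv1 descending;
  class agecat (param=ref ref='3')
        GENDER (param=ref ref='M')
        rural (param=ref ref='0')
        ses (param=ref ref='4')
        imm_cat (param=ref ref='5');
  model doc_visit(ref='0') = agecat GENDER rural ses imm_cat agecat*imm_cat
        / lackfit;
run;

/* Variation 2: continuous age (ageyrs) entered as a natural cubic spline.
   Using agesp in place of agecat is an assumption. */
proc logistic data=surv1 descending;
  class GENDER (param=ref ref='M')
        rural (param=ref ref='0')
        ses (param=ref ref='4')
        imm_cat (param=ref ref='5');
  effect agesp = spline(ageyrs / naturalcubic basis=tpf(noint)
                        knotmethod=percentiles(5));
  model doc_visit(ref='0') = agesp GENDER rural ses imm_cat
        / lackfit;
run;
```

In both steps the LACKFIT option requests the Hosmer and Lemeshow test, which is the statistic that stays at p<0.0001.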
It would be nice to hear your suggestions.

Thanks,
Yuba