Hi everyone,

I am working on a group research project exploring the research question "Is greater length of residence in the US among immigrant mothers associated with increased risk of preterm delivery and low birth weight?" Our primary predictor is continuous (length of residence in months), and each of our outcomes is dichotomous. We are using multivariable logistic regression for the analysis and are unsure whether we should use proc logistic, proc genmod, or a different option to calculate relative risk.

We are also having trouble defining our reference group. Ideally, we want to compare immigrant mothers to non-immigrant mothers; however, in our dataset the values for length of residence are blank/missing for non-immigrant mothers. In this case, is it better to:

1. create a new dichotomous variable for immigrant status and use "no" as the reference group in the class statement (as done in the code below), or
2. recode length of residence so that missing values (non-immigrant mothers) are set to an unrealistic month value (e.g. '999999') and use that as our reference group?

So far, we have tried both proc genmod and proc logistic (code below):

    proc genmod data = temp descending;
        class imgrt (ref = 'N') / param = ref;
        model preterm = LORMonths IMGRT / dist = poisson link = log;
    run;

    proc logistic data = temp descending;
        class IMGRT (ref = 'N') / param = ref;
        model preterm = LORMonths IMGRT;
    run;

If anyone could explain which procedure is best in this situation for obtaining relative risk and/or point us to understandable documentation, we would really appreciate it!
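
To make the first option concrete, here is a minimal sketch of how an immigrant-status indicator could be derived from the length-of-residence variable. It assumes that a missing LORMonths always means a non-immigrant mother; the names temp2 and IMGRT_FLAG are hypothetical and only for illustration, not taken from our actual code.

```sas
/* Sketch only: derive an immigrant-status flag from length of residence. */
/* Assumption: a missing LORMonths always means a non-immigrant mother.   */
/* temp2 and IMGRT_FLAG are hypothetical names used for illustration.     */
data temp2;
    set temp;
    length IMGRT_FLAG $1;
    if missing(LORMonths) then IMGRT_FLAG = 'N';
    else IMGRT_FLAG = 'Y';
run;

/* Check the recode: the missing row for LORMonths should line up */
/* with IMGRT_FLAG = 'N' and nothing else.                         */
proc freq data = temp2;
    tables IMGRT_FLAG * LORMonths / missing list;
run;
```

One thing we noticed while thinking this through: SAS modeling procedures exclude observations with a missing value for any model variable, so with LORMonths left missing for non-immigrant mothers, those rows would presumably be dropped from both models above.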