BookmarkSubscribeRSS Feed
cjchung
Calcite | Level 5

 

Hello, everyone, I want to explore the association between PM2.5 (X) and COPD (Y).

I defined the variable, PM2.5 as non-linear and treated with a natural cubic spline.

 

COPD.match_newcopd is my data file. Here is the code....

 

proc sort data=COPD.match_newcopd 
out=cubic (keep=copd pm25a mari_g diab hear Arthritis asthma cancer smk_g drk_g nut_g);
by pm25a;
run;
ods select Model ANOVA ParameterEstimates SplineKnots;
proc logistic data=cubic outdesign=SplineBasis;
effect spl = spline(pm25a / naturalcubic basis=tpf(noint) details knotmethod=EQUAL(5)) ; 
model COPD (event ='1')= spl /SELECTION = none;
output out=cubicOut5 predicted=Fit upper=u lower=l; /* output predicted values for graphing */
effectplot fit;
quit;

data a; set cubicout;
Fitexp=exp(Fit);
uexp=exp(u);
lexp=exp(l);
data copd.multipm; set work.a;run;

proc sort data=work.a; by pm25a; 
title "Restricted Cubic Spline Regression";
ods graphics/height=6.5in width=10in imagename="Figure2" border=off;
proc sgplot data=a noautolegend;
where Fitexp ^=.;
band x = pm25a lower=Lexp upper=Uexp;
xaxis label='PM2.5 (ug/m3)';
yaxis label='Risk of COPD' ;
series x=pm25a y=Fitexp / curvelabel = "Predicted value";
series x=pm25a y=uexp / curvelabel = "Upper CL";
series x=pm25a y=lexp / curvelabel = "Lower CL";
run;

 

Question 1: When I add the confounders (such as marriage) in the logistic model, I will get the terrible figure...... 

 

like: model COPD (event ='1')= spl marriage /SELECTION = none;

 

Is anything wrong with the above code?

 

Question 2: How I choose the best model with knots=3 or 5 by comparing the different AIC values?  How I write the SAS code?

 

Thank youFigure231.png

3 REPLIES 3
PGStats
Opal | Level 21

Just a guess. You might have more than two levels for COPD. What happens if you add the statement

 

where COPD in (0, 1);

 

in your call to proc logistic?

PG
cjchung
Calcite | Level 5

Hi, thanks for your reply. I checked it and there were just two levels of COPD, Yes and No. Therefore I still did not know the problem.

Thank you.

PGStats
Opal | Level 21

OK, I think the problem is the superposition of predicted values for diffrent marriage groups. Try sgpanel:

 

proc sort data=work.a; 
by marriage pm25a; 
run;

title "Restricted Cubic Spline Regression";
ods graphics/height=6.5in width=10in imagename="Figure2" border=off;
proc sgpanel data=a noautolegend;
where Fitexp ^=.;
panelby marriage;
band x = pm25a lower=Lexp upper=Uexp;
colaxis label='PM2.5 (ug/m3)';
rowaxis label='Risk of COPD' ;
series x=pm25a y=Fitexp / curvelabel = "Predicted value";
series x=pm25a y=uexp / curvelabel = "Upper CL";
series x=pm25a y=lexp / curvelabel = "Lower CL";
run;

or call proc sgplot with a BY marriage statement.

PG