ak2011
Fluorite | Level 6

 

Hello,
I am trying to find the association between exposure to agents a1, a2, a3 and a4 and lung cancer. Exposed = 1,
unexposed = 0. My reference group (refA) is the IDs unexposed to any of the agents.

a. I am not sure whether my logistic regression model in step 3b is right. I am modelling refA with the lung-a1
association, then refA with the lung-a2 association, etc.
b. Should refA be treated as continuous or categorical (as I have done in step 3b of the model)?
Thanks in advance.
ak.






/* Logistic test ref group test*/
data agents_exp;
input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
datalines;
os1  1 0 0 1 ca case  45424
os2  1 1 0 0 ca case  52877
os3  0 0 0 0 pop cont 25600
os4  1 0 0 1 pop cont 14888
os5  0 0 0 0 ca case  41036
os6  0 0 0 0 ca case  20365
os7  1 0 1 1 pop cont 16988
os8  0 0 0 0 ca case  100962
os9  1 0 1 0 pop cont 11230
os10 0 0 1 0 ca case  35850
os11 0 1 0 0 pop cont 28700
os12 0 0 0 0 pop cont 46320
os13 1 1 1 1 pop cont 24897
os14 0 0 0 0 pop cont 18966
os15 1 0 0 1 ca case  20540
os16 0 0 1 0 pop cont 150600
os17 1 1 1 1 pop cont 24897
os18 0 0 0 0 pop cont 17999
os19 0 0 0 0 pop cont 22540
os20 0 0 0 0 pop cont 158600
os21 0 0 0 0 pop cont 187365
os22 1 0 1 0 ca case  30580
;
run;

/*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
proc freq data= agents_exp(where=(sum(a1,a2,a3,a4)=0));
tables lung;
title 'Table 1:Subjects unexposed to any of the 4 agents';
run;
/*Step 2:Using subjects unexposed to any of agents as a ref. group*/

proc sql;
create table t as
select
id, a1, a2, a3,a4,lung, income,
sum(a1,a2,a3,a4)=0 as refB
from agents_exp
;
quit;

proc print data=t;
title 'Table 2: original variables and ref group';
run;

/*proc freq data=t;
tables lung* refB lung*a1;
title 'Table 3: freq of ca case and pop cont for ref group';
run;*/

/*Step 3a: Finding odds ratio estimates for variables including ref.group*/

data logtest; set t;

if lung in ('ca case','pop cont');
run;


/* Step 3b:*/
proc logistic data=logtest;
class refb (param=ref ref ='0');
model lung(event='ca case') = a1 refb;
Title 'Table 3b: Estimates for ref. group';
run;

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73
74 /* Logistic test ref group test*/
75 data agents_exp;
76 input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
77 datalines;
 
NOTE: The data set WORK.AGENTS_EXP has 22 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
100 ;
101 run;
102
103 /*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
104 proc freq data= agents_exp(where=(sum(a1,a2,a3,a4)=0));
105 tables lung;
106 title 'Table 1:Subjects unexposed to any of the 4 agents';
107 run;
 
NOTE: There were 10 observations read from the data set WORK.AGENTS_EXP.
WHERE SUM(a1, a2, a3, a4)=0;
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.21 seconds
cpu time 0.20 seconds
 
 
108 /*Step 2:Using subjects unexposed to any of agents as a ref. group*/
109
110 proc sql;
111 create table t as
112 select
113 id, a1, a2, a3,a4,lung, income,
114 sum(a1,a2,a3,a4)=0 as refB
115 from agents_exp
116 ;
NOTE: Table WORK.T created, with 22 rows and 8 columns.
 
117 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
 
 
118
119 proc print data=t;
120 title 'Table 2: original variables and ref group';
121 run;
 
NOTE: There were 22 observations read from the data set WORK.T.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.38 seconds
cpu time 0.38 seconds
 
 
122
123 /*proc freq data=t;
124 tables lung* refB lung*a1;
125 title 'Table 3: freq of ca case and pop cont for ref group';
126 run;*/
127
128 /*Step 3a: Finding odds ratio estimates for variables including ref.group*/
129
130 data logtest; set t;
131
132 if lung in ('ca case','pop cont');
133 run;
 
NOTE: There were 22 observations read from the data set WORK.T.
NOTE: The data set WORK.LOGTEST has 22 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
 
 
134
135
136 /* Step 3b:*/
137 proc logistic data=logtest;
138 class refb (param=ref ref ='0');
139 model lung(event='ca case') = a1 refb;
140 Title 'Table 3b: Estimates for ref. group';
141 run;
 
NOTE: PROC LOGISTIC is modeling the probability that lung='ca case'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 22 observations read from the data set WORK.LOGTEST.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.53 seconds
cpu time 0.49 seconds
 
 
142
143 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
155
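On question (b), a rough sketch of one way to check this directly, assuming WORK.LOGTEST from the code above already exists. The variable in the code is refB, not refA. For a 0/1 variable, a CLASS statement with param=ref and ref='0' builds the same 0/1 design column as using the variable as a plain numeric predictor, so the two fits below would be expected to give the same estimate for refB. This is only a comparison of the two codings, not a statement about whether the model itself is appropriate.

```sas
/* Sketch only: assumes WORK.LOGTEST (with refB) created as in the post. */

/* Fit 1: refB declared categorical, reference level 0 (as in step 3b). */
proc logistic data=logtest;
  class refB (param=ref ref='0');
  model lung(event='ca case') = a1 refB;
  title 'refB as CLASS variable (param=ref, ref=0)';
run;

/* Fit 2: refB left as a numeric 0/1 predictor (no CLASS statement). */
proc logistic data=logtest;
  model lung(event='ca case') = a1 refB;
  title 'refB as numeric 0/1 predictor';
run;
```

Note that without param=ref (the CLASS default is effect coding), the parameter estimate for refB would be scaled differently even though the odds ratio is the same.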

 





 

PaigeMiller
Diamond | Level 26

I am trying to find the association between exposure to agents a1, a2, a3 and a4 and lung cancer. Exposed = 1, unexposed = 0. My reference group (refA) is the IDs unexposed to any of the agents.

a. I am not sure whether my logistic regression model in step 3b is right. I am modelling refA with the lung-a1 association, then refA with the lung-a2 association, etc.
b. Should refA be treated as continuous or categorical (as I have done in step 3b of the model)?

 

Your code does not contain refA, so I cannot answer your questions. Can you clarify these questions and link them to the code?

--
Paige Miller
ak2011
Fluorite | Level 6
Hello Paige,
I have corrected the questions I asked.
Just to clarify, I am creating a reference group (refA) defined as the IDs unexposed to any of the agents a1, a2, a3 and a4. Having created refA, I need to include it in the logistic regression and find the OR between refA and a1.
I am not sure how to place refA in the logistic model after creating it. The event is ca case.
Thanks in advance.
ak.
PaigeMiller
Diamond | Level 26

It sounds like you are saying that this RefA variable is the response in the Logistic regression. What is the predictor variable(s)?

--
Paige Miller
ak2011
Fluorite | Level 6
Hello Paige,
The response variable is lung, and the event is ca case. The predictor variables are a1, a2, a3 and a4, with refA as the reference, i.e. the IDs unexposed to any of the agents (a1, a2, a3 and a4).
Thank you.
ak.
PaigeMiller
Diamond | Level 26

So I think you want

 

model lung(event='ca case') = a1 a2 a3 a4 refA;

although I have a concern that refA is correlated with a1 a2 a3 a4.

--
Paige Miller
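
The concern about refA being correlated with a1 a2 a3 a4 could be examined with something like the sketch below. It assumes the WORK.LOGTEST data set from the original post, where the flag is named refB rather than refA. Because refB is computed as sum(a1,a2,a3,a4)=0, it can only equal 1 when every agent is 0, and the cross-tabulation makes that visible. The second step is one possible alternative, not necessarily what the original poster needs: with only the four 0/1 exposure indicators in the model, subjects with all four equal to 0 serve as the implicit reference, so no separate reference variable is required.

```sas
/* Sketch only: assumes WORK.LOGTEST (with refB) from the original post. */

/* Cross-tabulate refB against each agent. */
/* Cells with refB=1 and agent=1 should be empty by construction. */
proc freq data=logtest;
  tables refB*(a1 a2 a3 a4) / norow nocol nopercent;
  title 'refB by each exposure indicator';
run;

/* Exposure indicators only: the all-zero pattern (unexposed to every */
/* agent) is the baseline that each odds ratio is compared against.   */
proc logistic data=logtest;
  model lung(event='ca case') = a1 a2 a3 a4;
  title 'Exposure indicators only, unexposed as implicit reference';
run;
```

With only 22 observations, any estimates from either model are likely to be imprecise.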
