Modelling created reference group with exposure-outcome variable in proc logistics

ak2011 · Posted 09-03-2020 03:45 AM

Hello,
I am looking for the association between exposure to agents a1, a2, a3 and a4 and lung cancer. Exposed = 1,
unexposed = 0. My reference group (refA) is the IDs unexposed to any of the agents.

a. I am not sure whether my logistic regression model in Step 3b is right. I am modelling refA with the lung-a1
association, then refA with the lung-a2 association, etc.
b. Should refA be treated as continuous or categorical (as I have done in Step 3b of the model)?
Thanks in advance.
ak.






/* Logistic test ref group test*/
 data agents_exp;
input id$ a1 a2 a3 a4  lung$ 14-21 income 23-29;
datalines;
os1  1 0 0 1 ca case  45424
os2  1 1 0 0 ca case  52877
os3  0 0 0 0 pop cont 25600 
os4  1 0 0 1 pop cont 14888
os5  0 0 0 0 ca case  41036
os6  0 0 0 0 ca case  20365
os7  1 0 1 1 pop cont 16988
os8  0 0 0 0 ca case  100962
os9 1 0 1 0  pop cont 11230
os10 0 0 1 0 ca case  35850
os11 0 1 0 0 pop cont 28700
os12 0 0 0 0 pop cont 46320
os13 1 1 1 1 pop cont  24897
os14 0 0 0 0 pop cont  18966
os15 1 0 0 1 ca case  20540
os16 0 0 1 0 pop cont 150600
os17 1 1 1 1 pop cont  24897
os18 0 0 0 0 pop cont  17999
os19 0 0 0 0 pop cont  22540
os20 0 0 0 0 pop cont 158600
os21 0 0 0 0 pop cont 187365
os22 1 0 1 0 ca case  30580
;
run;

/*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
proc freq data=agents_exp(where=(sum(a1,a2,a3,a4)=0));
    tables lung;
    title 'Table 1:Subjects unexposed to any of the 4 agents';
run;

/*Step 2:Using subjects unexposed to any of agents as a ref. group*/
proc sql;
    create table t as
    select
        id, a1, a2, a3, a4, lung, income,
        sum(a1,a2,a3,a4)=0 as refB
    from agents_exp;
quit;

proc print data=t;
    title 'Table 2: original variables and ref group';
run;

/*proc freq data=t;
    tables lung*refB lung*a1;
    title 'Table 3: freq of ca case and pop cont for ref group';
run;*/

/*Step 3a: Finding odds ratio estimates for variables including ref.group*/
data logtest;
    set t;
    if lung in ('ca case','pop cont');
run;

/* Step 3b:*/
proc logistic data=logtest;
    class refb (param=ref ref='0');
    model lung(event='ca case') = a1 refb;
    title 'Table 3b: Estimates for ref. group';
run;

1    OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73
74   /* Logistic test ref group test*/
75   data agents_exp;
76   input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
77   datalines;

NOTE: The data set WORK.AGENTS_EXP has 22 observations and 7 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

100  ;
101  run;
102
103  /*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
104  proc freq data= agents_exp(where=(sum(a1,a2,a3,a4)=0));
105  tables lung;
106  title 'Table 1:Subjects unexposed to any of the 4 agents';
107  run;

NOTE: There were 10 observations read from the data set WORK.AGENTS_EXP.
      WHERE SUM(a1, a2, a3, a4)=0;
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.21 seconds
      cpu time 0.20 seconds

108  /*Step 2:Using subjects unexposed to any of agents as a ref. group*/
109
110  proc sql;
111  create table t as
112  select
113  id, a1, a2, a3,a4,lung, income,
114  sum(a1,a2,a3,a4)=0 as refB
115  from agents_exp
116  ;

NOTE: Table WORK.T created, with 22 rows and 8 columns.

117  quit;

NOTE: PROCEDURE SQL used (Total process time):
      real time 0.01 seconds
      cpu time 0.02 seconds

118
119  proc print data=t;
120  title 'Table 2: original variables and ref group';
121  run;

NOTE: There were 22 observations read from the data set WORK.T.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.38 seconds
      cpu time 0.38 seconds

122
123  /*proc freq data=t;
124  tables lung* refB lung*a1;
125  title 'Table 3: freq of ca case and pop cont for ref group';
126  run;*/
127
128  /*Step 3a: Finding odds ratio estimates for variables including ref.group*/
129
130  data logtest; set t;
131
132  if lung in ('ca case','pop cont');
133  run;

NOTE: There were 22 observations read from the data set WORK.T.
NOTE: The data set WORK.LOGTEST has 22 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.02 seconds

134
135
136  /* Step 3b:*/
137  proc logistic data=logtest;
138  class refb (param=ref ref ='0');
139  model lung(event='ca case') = a1 refb;
140  Title 'Table 3b: Estimates for ref. group';
141  run;

NOTE: PROC LOGISTIC is modeling the probability that lung='ca case'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 22 observations read from the data set WORK.LOGTEST.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.53 seconds
      cpu time 0.49 seconds

142
143  OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
155
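
On question (b), a minimal sketch rather than a definitive answer: the reference-group flag that the code actually creates (refB) only takes the values 0 and 1. Listing it in a CLASS statement with PARAM=REF REF='0' and entering it as a plain numeric term both yield the same 0/1 design column, so the two fits below should report the same odds ratio for refb. The sketch assumes the logtest data set from Step 3a already exists.

```sas
/* Fit 1: refb as a CLASS variable with reference coding (as in Step 3b) */
proc logistic data=logtest;
    class refb (param=ref ref='0');
    model lung(event='ca case') = a1 refb;
    title 'refb as categorical (reference coding, ref = 0)';
run;

/* Fit 2: refb entered as a plain numeric 0/1 term (no CLASS statement) */
proc logistic data=logtest;
    model lung(event='ca case') = a1 refb;
    title 'refb as numeric 0/1';
run;
```

The distinction between continuous and categorical only starts to matter for variables with more than two levels, or when a different parameterization (for example effect coding) is used in the CLASS statement.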

PaigeMiller · Posted 09-03-2020 07:12 AM

I am looking for the association between exposure to agents a1, a2, a3 and a4 and lung cancer. Exposed = 1, unexposed = 0. My reference group (refA) is the IDs unexposed to any of the agents.

a. I am not sure whether my logistic regression model in Step 3b is right. I am modelling refA with the lung-a1 association, then refA with the lung-a2 association, etc.
b. Should refA be treated as continuous or categorical (as I have done in Step 3b of the model)?

Your code does not contain refA, so I cannot answer your questions. Can you clarify these questions and link them to the code?

--
Paige Miller

ak2011 · Posted 09-05-2020 08:46 PM

Hello Paige,
I have corrected the questions I asked.
Just to clarify, I am creating a reference group (refA) defined as IDs unexposed to any of the agents a1, a2, a3 and a4. Having created refA, I need to include it in the logistic regression and find the OR between refA and a1.
I am not sure how to place refA in the logistic model after creating it. The event is ca case.
Thanks in advance.
ak.

PaigeMiller · Posted 09-06-2020 07:18 AM

It sounds like you are saying that this RefA variable is the response in the Logistic regression. What is the predictor variable(s)?

--
Paige Miller

ak2011 · Posted 09-06-2020 11:38 AM

Hello Paige,
The response variable is lung, and the event is ca case. The predictor variables are a1, a2, a3 and a4, with refA being the reference, i.e. IDs unexposed to any of the agents (a1, a2, a3 and a4).
Thank you.
ak.

PaigeMiller · Posted 09-06-2020 11:54 AM

So I think you want

model lung(event='ca case') = a1 a2 a3 a4 refA;

although I have a concern that refA is correlated with a1 a2 a3 a4.

--
Paige Miller
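
The suggested MODEL statement can be tried with a sketch like the one below. It assumes refA is built the same way refB is built in Step 2 (1 = unexposed to all four agents, 0 = exposed to at least one); the data set name logtest_refA is hypothetical. PROC CORR is included only as one possible way to look at the correlation concern, since refA is by construction 0 whenever any of a1 to a4 is 1.

```sas
/* Assumption: refA is defined like refB in the original post */
data logtest_refA;                        /* hypothetical data set name */
    set agents_exp;
    refA = (sum(a1, a2, a3, a4) = 0);     /* 1 = unexposed to all four agents */
run;

/* Check how strongly refA is correlated with the individual exposures */
proc corr data=logtest_refA;
    var a1 a2 a3 a4 refA;
run;

/* Model statement as suggested in the reply above */
proc logistic data=logtest_refA;
    class refA (param=ref ref='0');
    model lung(event='ca case') = a1 a2 a3 a4 refA;
run;
```

With only 22 observations and five predictors, the fit may be unstable, so any estimates from this toy data set are best treated as a syntax check rather than as results.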

ak2011 · Posted 09-07-2020 02:56 PM

Thanks, Paige. It helps.
