ak2011

 

Hello,

This is a corrected version of my last post: I am modelling refA, not refB.

I am estimating the association between exposure to agents a1, a2, a3 and a4 and lung cancer (exposed = 1, unexposed = 0). My reference group (refA) is the ids unexposed to all four agents.

a. I am not sure whether my logistic regression model in step 3b is right. I am modelling the lung-a1 association together with refA, then the lung-a2 association together with refA, and so on.
b. Should refA be treated as continuous or categorical (as I have done in step 3b)?

Thanks in advance.

ak.

 




/* Logistic test ref group test*/
data agents_exp;
input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
datalines;
os1  1 0 0 1 ca case  45424
os2  1 1 0 0 ca case  52877
os3  0 0 0 0 pop cont 25600
os4  1 0 0 1 pop cont 14888
os5  0 0 0 0 ca case  41036
os6  0 0 0 0 ca case  20365
os7  1 0 1 1 pop cont 16988
os8  0 0 0 0 ca case  100962
os9  1 0 1 0 pop cont 11230
os10 0 0 1 0 ca case  35850
os11 0 1 0 0 pop cont 28700
os12 0 0 0 0 pop cont 46320
os13 1 1 1 1 pop cont 24897
os14 0 0 0 0 pop cont 18966
os15 1 0 0 1 ca case  20540
os16 0 0 1 0 pop cont 150600
os17 1 1 1 1 pop cont 24897
os18 0 0 0 0 pop cont 17999
os19 0 0 0 0 pop cont 22540
os20 0 0 0 0 pop cont 158600
os21 0 0 0 0 pop cont 187365
os22 1 0 1 0 ca case  30580
;
run;

/*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
proc freq data= agents_exp(where=(sum(a1,a2,a3,a4)=0));
tables lung;
title 'Table 1:Subjects unexposed to any of the 4 agents';
run;
/*Step 2:Using subjects unexposed to any of agents as a ref. group*/

proc sql;
create table t as
select
id, a1, a2, a3,a4,lung,
sum(a1,a2,a3,a4)=0 as refA
from agents_exp
;
quit;

proc print data=t;
title 'Table 2: original variables and ref group';
run;

/*proc freq data=t;
tables lung* refA lung*a1;
title 'Table 3: freq of ca case and pop cont for ref group';
run;*/

/*Step 3a: Keeping only cases and controls before estimating odds ratios*/

data logtest; set t;

if lung in ('ca case','pop cont');
run;


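/* Step 3b: Logistic model of lung cancer on a1 and refA; ref='0' makes refA=0 (exposed to at least one agent) the reference level */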
proc logistic data=logtest;
class refA (param=ref ref ='0');
model lung(event='ca case') = a1 refA;
Title 'Table 3b: Estimates for ref. group';
run;

Log:

73 /* Logistic test ref group test*/
74 data agents_exp;
75 input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
76 datalines;
 
NOTE: The data set WORK.AGENTS_EXP has 22 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
 
 
99 ;
100 run;
101
102 /*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
103 proc freq data= agents_exp(where=(sum(a1,a2,a3,a4)=0));
104 tables lung;
105 title 'Table 1:Subjects unexposed to any of the 4 agents';
106 run;
 
NOTE: There were 10 observations read from the data set WORK.AGENTS_EXP.
WHERE SUM(a1, a2, a3, a4)=0;
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.23 seconds
cpu time 0.20 seconds
 
 
107 /*Step 2:Using subjects unexposed to any of agents as a ref. group*/
108
109 proc sql;
110 create table t as
111 select
112 id, a1, a2, a3,a4,lung,
113 sum(a1,a2,a3,a4)=0 as refA
114 from agents_exp
115 ;
NOTE: Table WORK.T created, with 22 rows and 7 columns.
 
116 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.02 seconds
cpu time 0.02 seconds
 
 
117
118 proc print data=t;
119 title 'Table 2: original variables and ref group';
120 run;
 
NOTE: There were 22 observations read from the data set WORK.T.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.49 seconds
cpu time 0.47 seconds
 
 
121
122 /*proc freq data=t;
123 tables lung* refA lung*a1;
124 title 'Table 3: freq of ca case and pop cont for ref group';
125 run;*/
126
127 /*Step 3a: Finding odds ratio estimates for variables including ref.group*/
128
129 data logtest; set t;
130
131 if lung in ('ca case','pop cont');
132 run;
 
NOTE: There were 22 observations read from the data set WORK.T.
NOTE: The data set WORK.LOGTEST has 22 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
 
 
133
134
135 /* Step 3b:*/
136 proc logistic data=logtest;
137 class refA (param=ref ref ='0');
138 model lung(event='ca case') = a1 refA;
139 Title 'Table 3b: Estimates for ref. group';
140 run;
 
NOTE: PROC LOGISTIC is modeling the probability that lung='ca case'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 22 observations read from the data set WORK.LOGTEST.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.56 seconds
cpu time 0.50 seconds
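
One possible reading of question (a), offered only as a sketch and not as a confirmed answer: to compare subjects exposed to a single agent against the never-exposed group, each model could be restricted to rows where that agent is 1 or refA is 1, with the agent as the only predictor. The macro name fit_agent is hypothetical; logtest, refA and the agent variables come from steps 2 and 3a above. With only 22 observations the estimates would be very unstable, so this only illustrates the structure.

```sas
/* Sketch only: one model per agent, comparing subjects exposed to that agent */
/* with the never-exposed group (refA=1). fit_agent is a hypothetical name.   */
%macro fit_agent(agent);
proc logistic data=logtest(where=(&agent=1 or refA=1));
/* After the WHERE filter, agent=0 means "unexposed to all four agents" */
model lung(event='ca case') = &agent;
title "Sketch: &agent exposed vs. never exposed (refA=1)";
run;
%mend fit_agent;

/* Repeat the same model for each of the four agents */
%fit_agent(a1)
%fit_agent(a2)
%fit_agent(a3)
%fit_agent(a4)
```

In this layout refA does not enter the MODEL statement at all; it only defines which rows form the comparison group, so the continuous-versus-categorical question in (b) would not arise for it.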
 
 