Modelling created reference group with exposure-outcome variable in pr...

Posted 09-03-2020 03:45 AM
(740 views)

Hello,

I am investigating the association between exposure to agents a1, a2, a3 and a4 and lung cancer. Exposed = 1, unexposed = 0. My reference group (refA) is the ids unexposed to any of the agents.

a. I am not sure whether my logistic regression model in step 3b is right. I am modelling refA with the lung-a1 association, then refA with the lung-a2 association, etc.

b. Should refA be treated as continuous or categorical (as I have done in step 3b of the model)?

Thanks in advance.

ak.

/* Logistic test ref group test */
/* lung and income use column input, so they must sit in columns 14-21 and 23-29 */
data agents_exp;
   input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
   datalines;
os1 1 0 0 1  ca case  45424
os2 1 1 0 0  ca case  52877
os3 0 0 0 0  pop cont 25600
os4 1 0 0 1  pop cont 14888
os5 0 0 0 0  ca case  41036
os6 0 0 0 0  ca case  20365
os7 1 0 1 1  pop cont 16988
os8 0 0 0 0  ca case  100962
os9 1 0 1 0  pop cont 11230
os10 0 0 1 0 ca case  35850
os11 0 1 0 0 pop cont 28700
os12 0 0 0 0 pop cont 46320
os13 1 1 1 1 pop cont 24897
os14 0 0 0 0 pop cont 18966
os15 1 0 0 1 ca case  20540
os16 0 0 1 0 pop cont 150600
os17 1 1 1 1 pop cont 24897
os18 0 0 0 0 pop cont 17999
os19 0 0 0 0 pop cont 22540
os20 0 0 0 0 pop cont 158600
os21 0 0 0 0 pop cont 187365
os22 1 0 1 0 ca case  30580
;
run;

/* Step 1: Finding number of cases and controls unexposed to agents (a1, a2, a3 and a4) */
proc freq data=agents_exp(where=(sum(a1,a2,a3,a4)=0));
   tables lung;
   title 'Table 1:Subjects unexposed to any of the 4 agents';
run;

/* Step 2: Using subjects unexposed to any of the agents as a ref. group */
proc sql;
   create table t as
   select
      id, a1, a2, a3, a4, lung, income,
      sum(a1,a2,a3,a4)=0 as refB
   from agents_exp
   ;
quit;

proc print data=t;
   title 'Table 2: original variables and ref group';
run;

/*proc freq data=t;
   tables lung*refB lung*a1;
   title 'Table 3: freq of ca case and pop cont for ref group';
run;*/

/* Step 3a: Finding odds ratio estimates for variables including ref. group */
data logtest;
   set t;
   if lung in ('ca case','pop cont');
run;

/* Step 3b: */
proc logistic data=logtest;
   class refb (param=ref ref='0');
   model lung(event='ca case') = a1 refb;
   title 'Table 3b: Estimates for ref. group';
run;

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73
74 /* Logistic test ref group test*/
75 data agents_exp;
76 input id$ a1 a2 a3 a4 lung$ 14-21 income 23-29;
77 datalines;

NOTE: The data set WORK.AGENTS_EXP has 22 observations and 7 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

100 ;
101 run;
102
103 /*Step 1: Finding number of cases and controls unexposed to agents(a1,a2,a3 and a4)*/
104 proc freq data= agents_exp(where=(sum(a1,a2,a3,a4)=0));
105 tables lung;
106 title 'Table 1:Subjects unexposed to any of the 4 agents';
107 run;

NOTE: There were 10 observations read from the data set WORK.AGENTS_EXP.
      WHERE SUM(a1, a2, a3, a4)=0;
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.21 seconds
      cpu time 0.20 seconds

108 /*Step 2:Using subjects unexposed to any of agents as a ref. group*/
109
110 proc sql;
111 create table t as
112 select
113 id, a1, a2, a3,a4,lung, income,
114 sum(a1,a2,a3,a4)=0 as refB
115 from agents_exp
116 ;

NOTE: Table WORK.T created, with 22 rows and 8 columns.

117 quit;

NOTE: PROCEDURE SQL used (Total process time):
      real time 0.01 seconds
      cpu time 0.02 seconds

118
119 proc print data=t;
120 title 'Table 2: original variables and ref group';
121 run;

NOTE: There were 22 observations read from the data set WORK.T.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.38 seconds
      cpu time 0.38 seconds

122
123 /*proc freq data=t;
124 tables lung* refB lung*a1;
125 title 'Table 3: freq of ca case and pop cont for ref group';
126 run;*/
127
128 /*Step 3a: Finding odds ratio estimates for variables including ref.group*/
129
130 data logtest; set t;
131
132 if lung in ('ca case','pop cont');
133 run;

NOTE: There were 22 observations read from the data set WORK.T.
NOTE: The data set WORK.LOGTEST has 22 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.02 seconds

134
135
136 /* Step 3b:*/
137 proc logistic data=logtest;
138 class refb (param=ref ref ='0');
139 model lung(event='ca case') = a1 refb;
140 Title 'Table 3b: Estimates for ref. group';
141 run;

NOTE: PROC LOGISTIC is modeling the probability that lung='ca case'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 22 observations read from the data set WORK.LOGTEST.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.53 seconds
      cpu time 0.49 seconds

142
143 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
155

6 REPLIES

I am investigating the association between exposure to agents a1, a2, a3 and a4 and lung cancer. Exposed = 1, unexposed = 0. My reference group (refA) is the ids unexposed to any of the agents.

a. I am not sure whether my logistic regression model in step 3b is right. I am modelling refA with the lung-a1 association, then refA with the lung-a2 association, etc.

b. Should refA be treated as continuous or categorical (as I have done in step 3b of the model)?

Your code does not contain refA, so I cannot answer your questions. Can you clarify these questions and link them to the code?

--

Paige Miller

Hello Paige,

I have corrected the questions I asked.

Just to clarify, I am creating a reference group (refA) defined as the ids unexposed to any of the agents a1, a2, a3 and a4. Having created refA, I need to include it in the logistic regression and find the OR between refA and a1.

I am not sure how to place the refA in the logistic model after creating it. The event is ca case.

Thanks in advance.

ak.

It sounds like you are saying that this RefA variable is the response in the Logistic regression. What is the predictor variable(s)?

--

Paige Miller

Hello Paige,

The response variable is lung, and the event is ca case. The predictor variables are a1, a2, a3 and a4, with refA as the reference, i.e. the ids unexposed to any of the agents (a1, a2, a3 and a4).
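The setup described here can be sketched as follows. This is only an illustration, assuming the logtest data set from the original post and a main-effects model; it is not necessarily the model the thread settles on. With a1 to a4 coded 0/1, an id with all four equal to 0 contributes only the intercept, so the never-exposed group serves as the baseline without a separate reference variable, and each odds ratio compares exposed with unexposed for that agent while the other agents are held fixed.

```sas
/* Sketch: a1-a4 entered as numeric 0/1 predictors.                 */
/* When a1=a2=a3=a4=0 the linear predictor is just the intercept,   */
/* so the never-exposed ids act as the implicit baseline.           */
proc logistic data=logtest;
   model lung(event='ca case') = a1 a2 a3 a4;
   title 'Sketch: a1-a4 with the never-exposed group as implicit baseline';
run;
```

With only 22 observations the estimates from such a model would be very imprecise, so this is meant to show the structure rather than to produce usable odds ratios.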

Thank you.

ak.

So I think you want

`model lung(event='ca case') = a1 a2 a3 a4 refA;`

although I have a concern that refA is correlated with a1 a2 a3 a4.
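One way to look at that concern, as a rough sketch: in the posted code the reference indicator (named refB there) is computed as sum(a1,a2,a3,a4)=0, so it is fully determined by the four exposure variables. The cross-tabulation below makes this visible. The data set name check and the variable n_exposed are hypothetical names introduced only for this illustration; t is the table created in Step 2 of the original post.

```sas
/* Sketch: count how many agents each id was exposed to.            */
/* check and n_exposed are illustrative names, not from the thread. */
data check;
   set t;
   n_exposed = sum(a1, a2, a3, a4);
run;

/* By construction refB=1 occurs only when n_exposed=0, so refB     */
/* carries no information beyond what a1-a4 already contain.        */
proc freq data=check;
   tables refB*n_exposed / norow nocol nopercent;
   title 'Sketch: refB against number of agents exposed to';
run;
```

If every id with refB=1 falls in the n_exposed=0 column and nowhere else, the indicator is redundant with a1 to a4, which is the dependence the concern above points at.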

--

Paige Miller

Thanks, Paige. It helps.
