ak2011
Fluorite | Level 6


I would be grateful if someone could help me with the code to find an overall p-value for heterogeneity of association by type of breast cancer (bct).

Using polytomous (multinomial) logistic regression, I am looking for a single p-value that tests for heterogeneity across 3 groups: breast cancer type 1 (bct1), bct2, and bct3. The variable histo records the histological differences among the breast cancer types.

histo: 1=bct1, 2=bct2, and 3=bct3. Under each bct variable, 1=case and 0=control.

My main independent variable is agents_exp (0=unexposed, 1=exposed).

My code, log, and results for finding histological differences are below:

data ht2; 
input id$ 1-7 agents_exp 8-9 histo 10-11 bct1 12-13 bct2 14-15 bct3 16-17;
datalines;
OSaa01 0 . 0 0 0
OSaa06 0 . 1 1 1
OSaa11 0 . 0 0 0
OSaa12 0 . 1 1 1
OSaa13 1 1 . 1 .
OSaa14 0 2 1 . .
OSaa15 0 1 . 1 .
OSaa19 0 . 1 1 1
OSaa21 0 . 0 0 0
OSaa22 0 . 1 1 1
OSaa23 0 . 0 0 0
OSaa24 0 1 . 1 .
OSaa29 1 . 1 1 1
OSaa30 1 2 1 . .
OSaa31 0 . 1 1 1
OSaa36 0 . 1 1 1
OSaa46 0 1 . 1 .
OSaa52 0 . 0 0 0
OSaa54 0 . 0 0 0
OSaa55 0 . 1 1 1
OSaa56 0 . . . .
OSaa58 0 . 1 1 1
OSaa63 0 2 1 . .
OSaa69 0 1 . 1 .
OSaa70 0 . 1 1 1
OSaa72 0 . 1 1 1
OSaa73 0 . 0 0 0
OSaa75 0 . 1 1 1
OSaa84 0 . 1 1 1
OSaa86 1 . 1 1 1
OSaa93 1 . 0 0 0
OSaa99 0 . 1 1 1
OSab00 0 . 1 1 1
OSab04 0 . 1 1 1
OSab12 0 . 1 1 1
OSab16 0 3 . . 1
OSab17 0 1 . 1 .
OSab19 0 . 1 1 1
OSab20 1 1 . 1 .
OSab24 0 . 1 1 1
OSab33 0 . 1 1 1
OSab37 0 . 1 1 1
OSab38 0 . 1 1 1
OSab39 0 . 1 1 1
OSab46 0 . 1 1 1
OSab50 0 . 0 0 0
OSab54 0 . 1 1 1
OSab58 0 . 0 0 0
OSab68 0 . 1 1 1
OSab70 0 . 1 1 1
OSab71 0 1 . 1 .
OSab73 0 1 . 1 .
OSab79 0 . 1 1 1
OSab84 0 . 1 1 1
OSab86 0 . 1 1 1
OSab89 0 . 1 1 1
OSab97 0 . 1 1 1
OSac02 0 . 1 1 1
OSac04 0 . 1 1 1
OSac07 1 . 0 0 0
OSac08 0 . 1 1 1
OSac13 0 . 1 1 1
OSac16 0 . 1 1 1
OSac17 1 . 1 1 1
OSac33 1 1 . 1 .
OSac34 0 . 1 1 1
OSac35 0 2 1 . .
OSac42 0 . 1 1 1
OSac43 0 . 0 0 0
OSac47 0 . 0 0 0
OSac49 0 2 1 . .
OSac52 0 . 1 1 1
OSac53 0 . 0 0 0
OSac58 0 . 1 1 1
OSac67 0 . 1 1 1
OSac74 1 1 . 1 .
OSac76 0 1 . 1 .
OSac80 0 . 0 0 0
OSac86 0 . 0 0 0
OSac87 0 . . . .
OSac88 0 . 0 0 0
OSac91 1 3 . . 1
OSac93 0 . 1 1 1
OSac97 0 . 0 0 0
OSad01 0 . 1 1 1
;

proc print data=ht2;
var id agents_exp histo bct1 bct2 bct3;
run;

proc logistic data=ht2;
class histo (ref ='2') /param=ref;
model histo = agents_exp/link=glogit;
run;
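
For comparison, the reference level of a response variable can also be set with the response option in the MODEL statement rather than through a CLASS statement. The following is only a sketch of that alternative form, assuming the same ht2 data set; it is expected to fit the same generalized logit model with histo=2 as the reference category.

```sas
/* Sketch: same generalized logit model, with the response reference */
/* level given as a response option in the MODEL statement.          */
proc logistic data=ht2;
  model histo(ref='2') = agents_exp / link=glogit;
run;
```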

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 data ht2;
74 input id$ 1-7 agents_exp 8-9 histo 10-11 bct1 12-13 bct2 14-15 bct3 16-17;
75 datalines;
 
NOTE: The data set WORK.HT2 has 85 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
161 ;
162
163 proc print data=ht2;
164 var id agents_exp histo bct1 bct2 bct3;
165 run;
 
NOTE: There were 85 observations read from the data set WORK.HT2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.75 seconds
cpu time 0.75 seconds
 
 
166
167 proc logistic data=ht2;
168 class histo (ref ='2') /param=ref;
169 model histo = agents_exp/link=glogit;
170 run;
 
NOTE: PROC LOGISTIC is fitting the generalized logit model. The logits modeled contrast each response category against the
reference category (histo=2).
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 85 observations read from the data set WORK.HT2.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.63 seconds
cpu time 0.57 seconds
 
 
171
172
173 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
185

Obs id agents_exp histo bct1 bct2 bct3
1 OSaa01 0 . 0 0 0
2 OSaa06 0 . 1 1 1
3 OSaa11 0 . 0 0 0
4 OSaa12 0 . 1 1 1
5 OSaa13 1 1 . 1 .
6 OSaa14 0 2 1 . .
7 OSaa15 0 1 . 1 .
8 OSaa19 0 . 1 1 1
9 OSaa21 0 . 0 0 0
10 OSaa22 0 . 1 1 1
11 OSaa23 0 . 0 0 0
12 OSaa24 0 1 . 1 .
13 OSaa29 1 . 1 1 1
14 OSaa30 1 2 1 . .
15 OSaa31 0 . 1 1 1
16 OSaa36 0 . 1 1 1
17 OSaa46 0 1 . 1 .
18 OSaa52 0 . 0 0 0
19 OSaa54 0 . 0 0 0
20 OSaa55 0 . 1 1 1
21 OSaa56 0 . . . .
22 OSaa58 0 . 1 1 1
23 OSaa63 0 2 1 . .
24 OSaa69 0 1 . 1 .
25 OSaa70 0 . 1 1 1
26 OSaa72 0 . 1 1 1
27 OSaa73 0 . 0 0 0
28 OSaa75 0 . 1 1 1
29 OSaa84 0 . 1 1 1
30 OSaa86 1 . 1 1 1
31 OSaa93 1 . 0 0 0
32 OSaa99 0 . 1 1 1
33 OSab00 0 . 1 1 1
34 OSab04 0 . 1 1 1
35 OSab12 0 . 1 1 1
36 OSab16 0 3 . . 1
37 OSab17 0 1 . 1 .
38 OSab19 0 . 1 1 1
39 OSab20 1 1 . 1 .
40 OSab24 0 . 1 1 1
41 OSab33 0 . 1 1 1
42 OSab37 0 . 1 1 1
43 OSab38 0 . 1 1 1
44 OSab39 0 . 1 1 1
45 OSab46 0 . 1 1 1
46 OSab50 0 . 0 0 0
47 OSab54 0 . 1 1 1
48 OSab58 0 . 0 0 0
49 OSab68 0 . 1 1 1
50 OSab70 0 . 1 1 1
51 OSab71 0 1 . 1 .
52 OSab73 0 1 . 1 .
53 OSab79 0 . 1 1 1
54 OSab84 0 . 1 1 1
55 OSab86 0 . 1 1 1
56 OSab89 0 . 1 1 1
57 OSab97 0 . 1 1 1
58 OSac02 0 . 1 1 1
59 OSac04 0 . 1 1 1
60 OSac07 1 . 0 0 0
61 OSac08 0 . 1 1 1
62 OSac13 0 . 1 1 1
63 OSac16 0 . 1 1 1
64 OSac17 1 . 1 1 1
65 OSac33 1 1 . 1 .
66 OSac34 0 . 1 1 1
67 OSac35 0 2 1 . .
68 OSac42 0 . 1 1 1
69 OSac43 0 . 0 0 0
70 OSac47 0 . 0 0 0
71 OSac49 0 2 1 . .
72 OSac52 0 . 1 1 1
73 OSac53 0 . 0 0 0
74 OSac58 0 . 1 1 1
75 OSac67 0 . 1 1 1
76 OSac74 1 1 . 1 .
77 OSac76 0 1 . 1 .
78 OSac80 0 . 0 0 0
79 OSac86 0 . 0 0 0
80 OSac87 0 . . . .
81 OSac88 0 . 0 0 0
82 OSac91 1 3 . . 1
83 OSac93 0 . 1 1 1
84 OSac97 0 . 0 0 0
85 OSad01 0 . 1 1 1

The LOGISTIC Procedure

 
Model Information
Data Set WORK.HT2
Response Variable histo
Number of Response Levels 3
Model generalized logit
Optimization Technique Newton-Raphson
 
Number of Observations Read 85
Number of Observations Used 19
 
Response Profile
Ordered Value   histo   Total Frequency
1               1       12
2               2       5
3               3       2

Logits modeled use histo=2 as the reference category.

Note: 66 observations were deleted due to missing values for the response or explanatory variables.

 
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
 
Model Fit Statistics
Criterion Intercept Only Intercept and Covariates
AIC 37.384 40.738
SC 39.273 44.516
-2 Log L 33.384 32.738
 
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 0.6459 2 0.7240
Score 0.6415 2 0.7256
Wald 0.6137 2 0.7358
 
Type 3 Analysis of Effects
Effect       DF   Wald Chi-Square   Pr > ChiSq
agents_exp   2    0.6137            0.7358
 
Analysis of Maximum Likelihood Estimates
Parameter    histo   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept    1       1    0.6931     0.6124           1.2812            0.2577
Intercept    3       1    -1.3863    1.1180           1.5374            0.2150
agents_exp   1       1    0.6931     1.2748           0.2957            0.5866
agents_exp   3       1    1.3863     1.8028           0.5913            0.4419
 
Odds Ratio Estimates
Effect       histo   Point Estimate   95% Wald Confidence Limits
agents_exp   1       2.000            0.164     24.328
agents_exp   3       4.000            0.117     136.958
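
As a quick sanity check on the output above, the DATA step below recomputes the likelihood ratio chi-square, its p-value, and the two odds ratios from the printed values. It is only a sketch: the variable names (lr_chisq, p_lr, or_1, or_3) are illustrative, and the numbers are copied by hand from the tables above.

```sas
/* Sketch: recompute summary quantities from the printed PROC LOGISTIC output. */
data _null_;
  /* Likelihood ratio chi-square: difference in -2 Log L between the */
  /* intercept-only model and the model with agents_exp.             */
  lr_chisq = 33.384 - 32.738;
  /* Upper-tail chi-square probability with 2 degrees of freedom */
  p_lr = 1 - probchi(lr_chisq, 2);
  /* Odds ratios are the exponentiated agents_exp estimates */
  or_1 = exp(0.6931);   /* histo=1 versus histo=2 */
  or_3 = exp(1.3863);   /* histo=3 versus histo=2 */
  put lr_chisq= p_lr= or_1= or_3=;
run;
```

The values written to the log should agree, up to rounding, with the Likelihood Ratio row of the global test and with the Odds Ratio Estimates table.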

Finding a single p-value for heterogeneity is where I am stuck.

I checked the solution in "Heterogeneity test for multinomial logistic regression", posted on 04-09-2012, but I am still stuck on my question.

 

Please help. Thanks.

ak.

1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

I would hate to try to answer this without better understanding these unusual-looking data. Avoiding any subject matter terminology, which will only further obfuscate, why is it that any observation with a nonmissing HISTO value has exactly 1 nonmissing value in the BCT variables? And why does any observation with missing HISTO have all three nonmissing BCT values... and why are they always all the same: either all 0, all 1, or all missing? How do the three values of HISTO relate to the BCT values, and what do they mean (again, without subject matter terminology)? Note that with so few usable observations (only 19 with nonmissing HISTO), any analysis results might be considered unreliable.
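
The missing-value pattern described above can be inspected directly with a cross-tabulation that treats missing values as a level. This is a minimal sketch, assuming the ht2 data set from the question; it only lists the observed combinations of HISTO and the three BCT variables with their counts and does not by itself answer the heterogeneity question.

```sas
/* Sketch: list every observed combination of histo and the bct    */
/* variables, counting missing values as their own level (MISSING) */
/* and printing one row per combination (LIST).                     */
proc freq data=ht2;
  tables histo*bct1*bct2*bct3 / list missing;
run;
```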

