JanetSultana
Fluorite | Level 6

Hello! 

 

I need to conduct a test for trend on percentages to get a p-value. I found the following code on a SAS community website, which uses data similar to mine:

 

data test(keep=year case log_n);
infile datalines;
input year case n;
log_n=log(n);
output test;
datalines;
2000 100 50000
2005 75 60000
2010 50 75000
;
run;

 

proc genmod data=test;
model case=year / dist=poisson link=log offset=log_n;
run;

 

A screenshot of the output is attached. 

  Output_screenshot.png

 

Is it correct to say that there is a significant difference between the proportions (cases/n) in the 3 years based on Pr > ChiSq < 0.001? I'm not convinced by this output because the DF for year is 1. Also, I don't understand why one would take the log of the denominator but not of the cases.

 

Many thanks!

 

Janet

3 REPLIES
StatDave
SAS Super FREQ

See this note about testing trend in proportions. As shown there, you don't need to fit a model to get a test - nonmodeling approaches via PROC FREQ and PROC MULTTEST are available when there are no covariates. You can also use a modeling approach with a logistic model in PROC LOGISTIC as shown or PROC GENMOD, though your Poisson model with offset in GENMOD is also reasonable. To fit a logistic model to your summarized data, specify MODEL CASE/N=YEAR; in PROC LOGISTIC. The 1 df result from your GENMOD model is a test of linear trend. If you want to test for differences among the years, rather than trend, then add a CLASS statement with YEAR specified in it which will result in a 2 df test in your case. See this note about testing for differences (not trend) among proportions.
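
As a minimal sketch of the two modeling options described above (written for this thread, not taken from the linked notes), the code below fits the events/trials logistic model for a 1 df linear trend test and then the Poisson model with YEAR as a CLASS variable for a 2 df test of any difference among years. The data set name and the TYPE3 option are choices made for illustration. The comment on the offset also addresses the question about logging only the denominator.

```sas
/* Summarized data: one row per year with case count and denominator */
data test;
input year case n;
log_n = log(n);
datalines;
2000 100 50000
2005 75 60000
2010 50 75000
;
run;

/* 1 df test of linear trend: logistic model using events/trials syntax */
proc logistic data=test;
model case/n = year;
run;

/* 2 df test of any difference among the years: YEAR treated as CLASS.  */
/* OFFSET=log_n fixes the coefficient of log(n) at 1, so                */
/*   log(E[case]) = log(n) + b0 + (year effects)                        */
/* which is the same as modeling log(E[case]/n), the log of the rate.   */
/* The log link already puts the cases on the log scale, so only the    */
/* denominator has to be logged by hand.                                */
proc genmod data=test;
class year;
model case = year / dist=poisson link=log offset=log_n type3;
run;
```

With only three rows and YEAR as a CLASS variable the Poisson model is saturated, so the Type 3 test for YEAR is the quantity of interest rather than the fit statistics.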

JanetSultana
Fluorite | Level 6

Hi, 

 

Thanks for your feedback. I've tried using:

 

Code 

 

data test;
infile datalines;
input year case n;
output test;
datalines;
2000 100 50000
2005 75 60000
2010 50 75000
;
run;


data test;
set test;
perc=round(((case/n)*100),0.01);
run;

 

proc freq data = test;
table perc*year / exact trend;
run;

 

The p-value that I'm looking for seems to be the Mantel-Haenszel Chi-Square in the output (attached).

 

 

I'm curious as to why the p-value from PROC GENMOD is significant while the p-value from PROC FREQ is not:

 

Code:

 

data test(keep=year case log_n);
infile datalines;
input year case n;
log_n=log(n);
output test;
datalines;
2000 100 50000
2005 75 60000
2010 50 75000
;
run;

proc genmod data=test;
model case=year / dist=poisson link=log offset=log_n;
run;

 

SAS Output for proc genmod attached. 

 

Do you have any idea why the p-values are different? I would like to be sure which one is correct to use / what the correct interpretation is.  

 

Thanks,

Janet

Attachments: Proc freq output, Proc genmod output

StatDave
SAS Super FREQ

Your trend analysis in PROC FREQ is not correct since it doesn't use the actual sample sizes. The following code uses the observed sample sizes. Because your sample sizes are so large, there is no need for an exact test.

 

data test;
input year case n;
w=case; y=1; output;
w=n-case; y=0; output;
datalines;
2000 100 50000
2005 75 60000
2010 50 75000
;
proc freq;
weight w;
table year*y/trend;
run;

 

A similar analysis can be done in PROC GENMOD or PROC LOGISTIC as I mentioned. Note that the square root of the Score chi-square is the same as the Cochran-Armitage statistic in PROC FREQ. 
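
For reference, one common way to write the Cochran-Armitage statistic for a binary response shows why it matches the score test: the numerator is the logistic score for the slope evaluated at slope 0, and the denominator is its standard error under the null. The notation below is chosen for illustration and is not PROC FREQ's own. PROC FREQ may report Z with the opposite sign depending on which column of the table it treats as the response, so it is the squared statistic that lines up with the Score chi-square.

```latex
% x_i = score for group i (here, the year)
% r_i = number of cases in group i, n_i = sample size of group i
\[
  N = \sum_i n_i, \qquad
  \bar{x} = \frac{\sum_i n_i x_i}{N}, \qquad
  \bar{p} = \frac{\sum_i r_i}{N}
\]
\[
  Z = \frac{\sum_i r_i \,(x_i - \bar{x})}
           {\sqrt{\bar{p}\,(1 - \bar{p}) \sum_i n_i \,(x_i - \bar{x})^2}},
  \qquad
  Z^2 = \chi^2_{\text{score}}
\]
```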

 

proc logistic;
freq w;
model y(event="1")=year;
run;

 
