Posted 09-27-2018 04:14 AM

Hello!

I need to run a test for trend on percentages to get a p-value. I found the following code on a SAS community site; its data are similar to mine:

data test(keep=year case log_n);

infile datalines;

input year case n;

log_n=log(n);

output test;

datalines;

2000 100 50000

2005 75 60000

2010 50 75000

;

run;

proc genmod data=test;

model case=year / dist=poisson link=log offset=log_n;

run;

A screenshot of the output is attached.

Is it correct to say that there is a significant difference between the proportions (cases/log_n) in the 3 years based on Pr > ChiSq < 0.001? I'm not convinced by this output because the DF for year is 1. Also, I don't understand why one would take the log of the denominator but not of the cases.

Many thanks!

Janet

Hi,

Thanks for your feedback. I've tried using:

**Code:**

data test;

infile datalines;

input year case n;

output test;

datalines;

2000 100 50000

2005 75 60000

2010 50 75000

;

run;

data test;

set test;

perc=round(((case/n)*100),0.01);

run;

proc freq data = test;

table perc*year / exact trend;

run;

The p-value that I'm looking for seems to be the Mantel-Haenszel Chi-Square in the output (attached).

I'm curious as to why the p-value from PROC GENMOD is significant while the p-value from PROC FREQ is not:

**Code:**

data test(keep=year case log_n);

infile datalines;

input year case n;

log_n=log(n);

output test;

datalines;

2000 100 50000

2005 75 60000

2010 50 75000

;

run;

proc genmod data=test;

model case=year / dist=poisson link=log offset=log_n;

run;

SAS Output for proc genmod attached.

Do you have any idea why the p-values are different? I would like to be sure which one is correct to use / what the correct interpretation is.

Thanks,

Jane

Your trend analysis in PROC FREQ is not correct since it doesn't use the actual sample sizes. The following code uses the observed sample sizes. Because your sample sizes are so large, there is no need for an exact test.

data test;

input year case n;

w=case; y=1; output;

w=n-case; y=0; output;

datalines;

2000 100 50000

2005 75 60000

2010 50 75000

;

proc freq;

weight w;

table year*y/trend;

run;
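
As a rough cross-check on what the TREND option computes, the Cochran-Armitage statistic can be written out by hand. The notation is my own, not SAS output: r_i is the number of cases, n_i the sample size, and s_i the year in group i, with N the total sample size and p-bar the pooled proportion of cases.

```latex
\[
Z \;=\; \frac{\sum_i s_i\,\bigl(r_i - n_i\,\bar p\bigr)}
             {\sqrt{\bar p\,(1-\bar p)\,\sum_i n_i\,(s_i-\bar s)^2}},
\qquad
\bar p = \frac{\sum_i r_i}{N},
\qquad
\bar s = \frac{\sum_i n_i\,s_i}{N}
\]
% For the three rows above: N = 185000, \sum_i r_i = 225,
% so \bar p = 225/185000 \approx 0.001216.
% Hand arithmetic gives |Z| \approx 6.6; treat this as approximate.
```

Plugging in the three rows by hand gives |Z| of roughly 6.6, which points to a clearly decreasing trend. The sign SAS prints depends on which level of y falls in the first column of the table, so compare magnitudes against the actual output rather than relying on this hand calculation.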

A similar analysis can be done in PROC GENMOD or PROC LOGISTIC as I mentioned. Note that the square root of the Score chi-square equals the Cochran-Armitage Z statistic in PROC FREQ in absolute value.

proc logistic;

freq w;

model y(event="1")=year;

run;
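
On the two questions about the original PROC GENMOD code, here is a minimal sketch, not a definitive analysis. The dataset name `rates` is made up for illustration. With a log link the model is log(E[case]) = log(n) + b0 + b1*year, which is the same as log(E[case]/n) = b0 + b1*year. That is why only the denominator is logged: the link function already puts the mean count on the log scale. Year has 1 DF because it enters as a single continuous slope (a trend test). Declaring it in a CLASS statement would instead test for any difference among the three years, with 2 DF.

```sas
/* Hypothetical dataset name "rates": one row per year. */
data rates;
  input year case n;
  log_n = log(n);  /* offset must be on the scale of the log link */
  datalines;
2000 100 50000
2005 75 60000
2010 50 75000
;
run;

/* Trend: year is continuous, so its test has 1 DF. */
proc genmod data=rates;
  model case = year / dist=poisson link=log offset=log_n;
run;

/* Any difference among years: 3 CLASS levels, so the Type 3 test has 2 DF. */
proc genmod data=rates;
  class year;
  model case = year / dist=poisson link=log offset=log_n type3;
run;

/* Binomial alternative using events/trials syntax; no offset is needed. */
proc genmod data=rates;
  model case/n = year / dist=binomial link=logit;
run;
```

Because the proportions here are very small, the Poisson and binomial trend tests should give similar answers, but check that against your own output.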

