
Comparing prevalence rates across years

New Contributor
Posts: 4

Hi!

I've just started using SAS and am trying to figure out how to compare prevalence rates across years and see average percent change (95% CI).

For example, if allergy rates to Allergen A are:

2012 - 2.5%
2013 - 5.6%
2014 - 5.9%

What is the best way to determine the average percent change per annum?

Also, if allergy rates to Allergen B are:

2012 - 1.7%
2013 - 4.7%
2014 - 0.7%

Is there a way to determine if there is a significant difference between rates for allergen A and B for the same year?

Thank you so much!

Sherry

Super User
Posts: 18,583

Re: Comparing prevalence rates across years

Do you have the raw data that creates those proportions? If you don't have the N's then it's difficult to

If so, you can consider using the Cochran-Armitage trend test via PROC FREQ:

Base SAS(R) 9.2 Procedures Guide: Statistical Procedures, Third Edition
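For illustration, here is a minimal sketch of the Cochran-Armitage trend test in PROC FREQ. The counts are hypothetical: they assume 1,000 people tested per year, chosen only so the proportions match the Allergen A rates in the question. The dataset and variable names (counts, allergic, count) are also assumptions. With real person-level data you would drop the WEIGHT statement and point PROC FREQ at the raw table.

```sas
/* Hypothetical counts: N = 1000 per year is an assumption,
   picked so that 25/56/59 positives give 2.5%, 5.6%, 5.9%. */
data counts;
   input year allergic count;
   datalines;
2012 1 25
2012 0 975
2013 1 56
2013 0 944
2014 1 59
2014 0 941
;
run;

/* TREND requests the Cochran-Armitage test; the table must be 2 x C or R x 2. */
proc freq data=counts;
   weight count;                    /* each row stands for COUNT subjects */
   tables allergic*year / trend;
run;
```

The test asks whether the proportion allergic changes linearly across the ordered years. It does not by itself give an average percent change with a confidence interval.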

EDIT: I think I misunderstood your question, but my question still stands - do you have the raw data?

New Contributor
Posts: 4

Re: Comparing prevalence rates across years

Yes, I do have the raw data that created these proportions.

Valued Guide
Posts: 858

Re: Comparing prevalence rates across years

Here is a solution for the first part. If there is more than one record per year, you would add group by year at the end. As written, it averages the year-over-year differences (in percentage points) across all years.

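data have;
infile cards;
input year rate;
cards;
2012 2.5
2013 5.6
2014 5.9
;
run;

/* lag() returns the previous row's rate; it is missing for the first year */
data prep;
set have;
by year;
lrate = lag(rate);
diff_rate = rate-lrate;
run;

/* avg() ignores the missing first difference */
proc sql;
create table part1 as
select *,avg(diff_rate) as avg
from prep;
quit;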

Valued Guide
Posts: 858

Re: Comparing prevalence rates across years

If all you want is the difference between the two averages, this will get you there. If you need more information, you can tweak the code. Let me know if this helps:

/* Allergen A */
data have;
infile cards;
input year ratea;
cards;
2012 2.5
2013 5.6
2014 5.9
;
run;

data prep;
set have;
by year;
lratea = lag(ratea);
diff_ratea = ratea-lratea;
run;

proc sql;
create table part1 as
select *,avg(diff_ratea) as avga
from prep;
quit;

/* Allergen B */
data have2;
infile cards;
input year rateb;
cards;
2012 1.7
2013 4.7
2014 0.7
;
run;

data prep2;
set have2;
by year;
lrateb = lag(rateb);
diff_rateb = rateb-lrateb;
run;

proc sql;
create table prep_diff as
select *,avg(diff_rateb) as avgb
from prep2;
quit;

/* Difference between the two average yearly changes */
proc sql;
create table want as
select distinct a.avga-b.avgb as avg_diff
from  part1 a inner join
      prep_diff b on
a.year=b.year;
quit;

New Contributor
Posts: 4

Re: Comparing prevalence rates across years

Thank you! This got me the average difference between each year exactly like I asked. Is there any way to determine if the rates for allergen A, for example, are statistically significantly different between years? (e.g. is 2.5% different from 5.9% with a 95% CI and p-value?)

In your code above, the output shows the average change between years (avga and avgb). Is there a way to see if these averages are statistically different from each other?

Thanks again,

Sherry

Valued Guide
Posts: 858

Re: Comparing prevalence rates across years

Are you referring to the p-value, or is there another value that is significant for your purposes? I'm not sure I'll be able to help much more.

New Contributor
Posts: 4

Re: Comparing prevalence rates across years

I'm referring to a p-value. The code that you provided gives me an annual avg percent change and I'm just looking to get a confidence interval for this change, if possible.

Thanks!

Sherry

Super User
Posts: 18,583

Re: Comparing prevalence rates across years

With 3 data points, you don't have enough data to generate a P-Value and/or CI.

Respected Advisor
Posts: 2,655

Re: Comparing prevalence rates across years

Since the OP has the original data, they can do this. Any of the following could be adapted: FREQ (using Cochran-Armitage), MULTTEST (Cochran-Armitage), CATMOD, GENMOD, GLIMMIX, LOGISTIC. Personally, I would probably use GENMOD or GLIMMIX with a binomial distribution. I would not fit year as a repeated measure unless I had evidence that the same individuals were measured each year. This rules out GEE as a method (and procedure).

Steve Denham
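As a rough sketch of the GENMOD route with a binomial distribution, the two models below are one way it might look. The data are hypothetical: the totals of 1,000 per year are an assumption made only to reproduce the Allergen A rates, and the names yearly, events, and total are made up for the example. This is not necessarily the exact model the poster had in mind.

```sas
/* Hypothetical yearly counts: total = 1000 is an assumption. */
data yearly;
   input year events total;
   datalines;
2012 25 1000
2013 56 1000
2014 59 1000
;
run;

/* Year as a continuous predictor: one slope, i.e. an average
   change in log-odds per year. EXP reports it as an odds ratio
   per year with confidence limits. */
proc genmod data=yearly;
   model events/total = year / dist=binomial link=logit;
   estimate 'Odds ratio per year' year 1 / exp;
run;

/* Year as a CLASS effect: pairwise year-to-year comparisons.
   The differences and their limits are on the logit scale. */
proc genmod data=yearly;
   class year;
   model events/total = year / dist=binomial link=logit;
   lsmeans year / diff cl;
run;
```

The first model answers the "average change per year" question with a confidence interval. The second answers "is 2012 different from 2014" with a p-value for each pair of years. Both treat each year as an independent sample, consistent with not fitting year as a repeated measure.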
