Solved
Contributor
Posts: 33

Analyzing case counts across years

I'm looking at case counts for various diseases from 2001-2013.  I can plot the data (cases by year) but I'm having a hard time figuring out how to measure the increase or decrease (or no change) over that time span.  I considered using regression (proc reg) but I'm not sure year meets the continuous assumption.  I'm also not sure I can use proc reg for what is basically summary data.  This seems like a pretty standard thing to do but my google search keywords aren't giving me what I need.  Maybe I just don't know the right language.  Any help is appreciated.  Below is an example of the data for one of the diseases:

Year   COUNT
2001   1,489
2002   1,612
2003   2,039
2004   2,537
2005   2,837
2006   3,031
2007   2,943
2008   2,384
2009   2,394
2010   4,430
2011   5,214
2012   4,142
2013   3,298

Accepted Solutions
Solution
‎12-30-2014 12:01 AM
Super User
Posts: 20,716

Re: Analyzing case counts across years

You have no covariates?

This is time series data, so you can google time series analysis, but a basic plot and proc reg are good places to start. A basic test is whether the slope is different from zero.
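For anyone who wants to check the arithmetic outside SAS, here is a minimal Python sketch of the slope test on the counts posted in the question. It is only an illustration of what a simple linear regression of count on year computes, not the PROC REG implementation. The function name `slope_test` is made up for this example, and the critical value 2.201 is the two-sided 5% point of a t distribution with 11 degrees of freedom (13 years minus 2 parameters). The test assumes independent errors, which yearly counts may not satisfy.

```python
import math

# Yearly counts as posted in the question (2001-2013).
years = list(range(2001, 2014))
counts = [1489, 1612, 2039, 2537, 2837, 3031, 2943,
          2384, 2394, 4430, 5214, 4142, 3298]


def slope_test(x, y):
    """Ordinary least squares of y on x; returns slope, its standard error, and t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares, with n - 2 degrees of freedom.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, se_slope, slope / se_slope


slope, se_slope, t_stat = slope_test(years, counts)

# Two-sided 5% critical value for t with 11 degrees of freedom.
T_CRIT = 2.201
print(f"slope = {slope:.2f} cases/year, SE = {se_slope:.2f}, t = {t_stat:.2f}")
print("slope differs from zero at the 5% level:", abs(t_stat) > T_CRIT)
```

If the absolute t statistic exceeds the critical value, the data are inconsistent with a flat trend under the model's assumptions. PROC REG reports the same slope, standard error, and t value in its parameter estimates table.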

All Replies

Posts: 2,655

Re: Analyzing case counts across years

Go with Reeza's suggestion of PROC REG as a first approximation.  Be sure to look at the Durbin-Watson statistic (looking for autocorrelation).
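As a rough illustration of what the Durbin-Watson statistic measures, the Python sketch below fits the same straight line to the posted counts and computes the statistic from the residuals. The helper names are made up for this example. The statistic lies between 0 and 4; values near 2 suggest no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation. With only 13 points, treat the result as a screening aid rather than a firm conclusion.

```python
# Yearly counts as posted in the question (2001-2013).
years = list(range(2001, 2014))
counts = [1489, 1612, 2039, 2537, 2837, 3031, 2943,
          2384, 2394, 4430, 5214, 4142, 3298]


def ols_residuals(x, y):
    """Residuals from an ordinary least squares fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


resid = ols_residuals(years, counts)
dw = durbin_watson(resid)
print(f"Durbin-Watson statistic = {dw:.2f}")
```

In SAS the same statistic is requested with the DW option on the MODEL statement of PROC REG.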

Just eyeballing the data, it appears that there is both a trend and a cyclicity (period approximately 7 years) for this data.  To get at that, you are probably going to have to use some of the time series procedures in SAS/ETS.

Steve Denham.

Contributor
Posts: 33

Re: Analyzing case counts across years

Thanks Reeza and Steve.  I'm just doing some preliminary analyses on 60+ diseases.  From the results of the preliminary analyses we'll pick some to look at more closely (i.e. time series, possibly multivariate).  From your responses it sounds like I can use proc reg and look at the slope as a basic test.

I'm getting a warning "WARNING: The range of variable year is so small relative to its mean that there may be loss of accuracy in the computations."  I was worried that the format of the data (summary data), one row for each year, turns year into a categorical variable of sorts.  When I look at an example of proc reg work I've done in school, we used a dataset with cases listing age and SBP, and the interpretation of the results (e.g.  For every year increase in age we see an increase in SBP of 0.73) is more obvious to me than it is for my current project.

Do you have any recommendations on time-series resources?  I've read a little but it seems complex enough that I might need a class to gain enough understanding to use comfortably.

Thanks again.

Posts: 2,655

Re: Analyzing case counts across years

The warning has to do with a range of 12 for a variable with a mean of about 2007.  To get around it, select a common year as a baseline (2000, for example), and subtract that from all the year values in a data step.  Then use the elapsed time since baseline (1, 2, 3, ... with a baseline of 2000) as the right-hand-side variable.
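The baseline shift can be sketched in Python as follows, using the counts posted in the question. The baseline of 2000 and the helper name `fit_line` are assumptions for illustration. Shifting the year changes only the intercept, which becomes the fitted count at the baseline year; the slope is unchanged and still reads as the average change in cases per one-year step. For this disease that is roughly 222 additional cases per year.

```python
# Yearly counts as posted in the question (2001-2013).
years = list(range(2001, 2014))
counts = [1489, 1612, 2039, 2537, 2837, 3031, 2943,
          2384, 2394, 4430, 5214, 4142, 3298]

BASELINE = 2000  # assumed baseline year; any fixed year works
elapsed = [year - BASELINE for year in years]  # 1, 2, ..., 13


def fit_line(x, y):
    """Ordinary least squares intercept and slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept_raw, slope_raw = fit_line(years, counts)
intercept_shifted, slope_shifted = fit_line(elapsed, counts)

print(f"slope with raw year:      {slope_raw:.2f} cases/year")
print(f"slope with elapsed years: {slope_shifted:.2f} cases/year")
print(f"fitted count at baseline {BASELINE}: {intercept_shifted:.1f}")
```

Because year is treated as a numeric regressor here, one row per year does not make it categorical; the regression simply has 13 observations.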

As far as time series goes, a course would be useful.  Check out the offerings by SAS under Forecasting and Econometrics on their training pages.  It is a field where hands-on learning is essential in the early stages.  The later, more theoretical, parts are not as data driven, but do require thinking differently than you ordinarily would about designed experiments or surveys.

Steve Denham
