Analyzing case counts across years

12-29-2014 06:46 PM

I'm looking at case counts for various diseases from 2001-2013. I can plot the data (cases by year) but I'm having a hard time figuring out how to measure the increase or decrease (or no change) over that time span. I considered using regression (proc reg) but I'm not sure year meets the continuous assumption. I'm also not sure I can use proc reg for what is basically summary data. This seems like a pretty standard thing to do but my google search keywords aren't giving me what I need. Maybe I just don't know the right language. Any help is appreciated. Below is an example of the data for one of the diseases:

Year | Count
2001 | 1,489
2002 | 1,612
2003 | 2,039
2004 | 2,537
2005 | 2,837
2006 | 3,031
2007 | 2,943
2008 | 2,384
2009 | 2,394
2010 | 4,430
2011 | 5,214
2012 | 4,142
2013 | 3,298

Accepted Solutions

Solution

12-30-2014 12:01 AM

You have no covariates?

This is time series data, so you can Google time series analysis, but a basic plot and proc reg are good places to start. A basic test is whether the slope differs from zero.
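A minimal sketch of that starting point, using the counts from the question. The dataset name `cases` and the variable names `year` and `count` are assumptions chosen for illustration, and the counts are entered without thousands separators.

```sas
/* Case counts from the original post; @@ reads several records per line */
data cases;
    input year count @@;
    datalines;
2001 1489 2002 1612 2003 2039 2004 2537 2005 2837 2006 3031 2007 2943
2008 2384 2009 2394 2010 4430 2011 5214 2012 4142 2013 3298
;
run;

/* Basic plot of cases by year */
proc sgplot data=cases;
    series x=year y=count / markers;
run;

/* Straight-line fit; the t test on the year row tests slope = 0 */
proc reg data=cases;
    model count = year;
run;
quit;
```

In the Parameter Estimates table, the estimate for `year` is the average change in cases per year, and its p-value is the test of whether that slope differs from zero.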

All Replies


12-30-2014 09:36 AM

Go with Reeza's suggestion of PROC REG as a first approximation. Be sure to look at the Durbin-Watson statistic to check for autocorrelation.
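As a rough sketch, the Durbin-Watson statistic can be requested with the `dw` option on the MODEL statement, and `dwprob` adds p-values for it. The dataset and variable names (`cases`, `year`, `count`) are assumptions for illustration.

```sas
/* Case counts from the original post; @@ reads several records per line */
data cases;
    input year count @@;
    datalines;
2001 1489 2002 1612 2003 2039 2004 2537 2005 2837 2006 3031 2007 2943
2008 2384 2009 2394 2010 4430 2011 5214 2012 4142 2013 3298
;
run;

/* Rows must be in time order for the Durbin-Watson statistic to be meaningful */
proc sort data=cases;
    by year;
run;

/* dw prints the Durbin-Watson statistic; dwprob adds its p-values */
proc reg data=cases;
    model count = year / dw dwprob;
run;
quit;
```

A statistic near 2 suggests little first-order autocorrelation in the residuals, while values well below 2 point to positive autocorrelation.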

Just eyeballing the data, there appears to be both a trend and a cycle (period approximately 7 years). To get at that, you are probably going to have to use some of the time series procedures in SAS/ETS.

Steve Denham.

12-30-2014 12:12 PM

Thanks Reeza and Steve. I'm just doing some preliminary analyses on 60+ diseases. From the results of the preliminary analyses we'll pick some to look at more closely (i.e. time series, possibly multivariate). From your responses it sounds like I can use proc reg and look at the slope as a basic test.

I'm getting a warning "WARNING: The range of variable year is so small relative to its mean that there may be loss of accuracy in the computations." I was worried that the format of the data (summary data), one row for each year, turns year into a categorical variable of sorts. When I look at an example of proc reg work I've done in school, we used a dataset with cases listing age and SBP, and the interpretation of the results (e.g. For every year increase in age we see an increase in SBP of 0.73) is more obvious to me than it is for my current project.

Do you have any recommendations on time-series resources? I've read a little but it seems complex enough that I might need a class to gain enough understanding to use comfortably.

Thanks again.

12-30-2014 12:43 PM

The warning arises because year has a range of only 12 but a mean of 2007. To get around it, pick a common year as a baseline (2000, for example) and subtract it from all the year values in a DATA step. Then use the elapsed time since baseline (1, 2, 3, ... for a 2000 baseline) as the right-hand-side variable.
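A minimal sketch of that rescaling, assuming a dataset named `cases` with variables `year` and `count`. The new dataset `cases_t` and the variable `t` are hypothetical names introduced here.

```sas
/* Case counts from the original post; @@ reads several records per line */
data cases;
    input year count @@;
    datalines;
2001 1489 2002 1612 2003 2039 2004 2537 2005 2837 2006 3031 2007 2943
2008 2384 2009 2394 2010 4430 2011 5214 2012 4142 2013 3298
;
run;

/* Elapsed years since the 2000 baseline: 2001 becomes 1, 2013 becomes 13 */
data cases_t;
    set cases;
    t = year - 2000;
run;

/* Same straight-line fit, with t in place of year on the right-hand side */
proc reg data=cases_t;
    model count = t;
run;
quit;
```

Shifting year by a constant leaves the slope and its test unchanged. Only the intercept changes, and it now estimates the expected count at the baseline year rather than at year 0.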

As far as time series goes, a course would be useful. Check out the offerings by SAS under Forecasting and Econometrics on their training pages. It is a field where hands-on learning is essential in the early stages. The later, more theoretical parts are not as data driven, but they do require thinking differently than you ordinarily would about designed experiments or surveys.

Steve Denham

12-30-2014 12:53 PM

Thanks Steve. Much appreciated.