I'm comparing infection rates per 1000 device days for two sets of data.  I want to determine whether a new treatment reduced the infection rate. 


My question is how to compare the RATES.  The overall rate should be calculated as sum(all infections)/sum(all device days), NOT as the mean of the monthly rates.  But I am not sure how to perform a Wilcoxon test and report the results using the rates.
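To make the distinction concrete, here is a minimal sketch that computes both quantities side by side. The data set name, variable names, and all values are made up for illustration; they are not the real data.

```sas
/* Hypothetical monthly data: every value below is invented for illustration */
data infections;
  input group $ month infections device_days;
  /* monthly rate per 1000 device days */
  rate = 1000 * infections / device_days;
  datalines;
before 1 4 1200
before 2 2 450
before 3 5 1500
after 1 1 900
after 2 3 1600
after 3 0 300
;
run;

proc sql;
  select group,
         sum(infections)  as total_infections,
         sum(device_days) as total_device_days,
         /* pooled rate: total infections over total device days */
         1000 * sum(infections) / sum(device_days) as pooled_rate,
         /* unweighted mean of the monthly rates, shown for comparison */
         avg(rate) as mean_of_monthly_rates
  from infections
  group by group;
quit;
```

The two columns differ whenever device days vary from month to month, because the unweighted mean gives a low-exposure month the same weight as a high-exposure one.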


I originally ran the test on the number of monthly infections, but decided the rate would be more appropriate because the number of device days varies widely each month.  However, when I run the test on the rates, the mean is calculated by averaging the rates. 


Any suggestions are greatly appreciated.

One approach is to use PROC GENMOD to model the rates and include the grouping variable as a CLASS variable. You can then use the LSMEANS statement to estimate the ratio of the rates in the two groups. If the confidence interval for the rate ratio includes 1, the data do not show a statistically significant difference between the group rates. If the CI does not include 1, you can conclude that the rates differ.
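A minimal sketch of that approach, assuming a Poisson model with a log link and the log of the exposure as an offset. The data set, variable names, and values are hypothetical and would need to be replaced with your own.

```sas
/* Hypothetical monthly data: every value below is invented for illustration */
data infections;
  input group $ month infections device_days;
  /* offset: log of exposure, in units of 1000 device days */
  log_days = log(device_days / 1000);
  datalines;
before 1 4 1200
before 2 2 450
before 3 5 1500
after 1 1 900
after 2 3 1600
after 3 0 300
;
run;

proc genmod data=infections;
  class group;
  /* counts are modeled; the offset turns the model into one for rates */
  model infections = group / dist=poisson link=log offset=log_days;
  /* ILINK: estimated rate per 1000 device days for each group      */
  /* DIFF EXP CL: exponentiated difference = rate ratio, with a CI  */
  lsmeans group / ilink diff exp cl;
run;
```

With the offset defined this way, the ILINK column gives each group's estimated rate per 1000 device days, and the exponentiated difference is the rate ratio whose confidence limits you compare with 1. If the counts look overdispersed, changing DIST=POISSON to DIST=NEGBIN fits a negative binomial model with the same statements.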


There is a SAS Knowledge Base article that has data, example code, and a discussion. It uses a Poisson model for the rates, but you can also use a negative binomial or another model.


I think most researchers use the ratio of rates to do the comparison, but if for some reason you need to test the DIFFERENCE in rates, you can do that too.

Thank you!

This may fit into the class of data considered 'rare events'.  SAS/QC recently added PROC RAREEVENTS to produce rare events control charts based on the geometric distribution.  Though not a statistical comparison per se, the graph may be able to illustrate the difference much better than a p-value.


The most recent SUGI proceedings had a nice article on it.


