


I need help with how to weight observations in a regression. I am trying to estimate the expected number of hospitalizations in a year for older individuals who receive home care provided by the municipality. Individuals can receive care for one month, two months, three months, and so on during the year, and only hospitalizations that occur during the months of care are included in the regression. The problem is that some people (e.g., those receiving care in all 12 months) are more exposed to hospitalization than those receiving care for fewer months.


I've tried PROC GENMOD with a Poisson distribution and the WEIGHT statement. A person who receives 12 months of care gets weight 1, a person who receives one month of care gets weight 1/12, and so on. I'm not sure this is the right way to go about it. Can anyone help?


Fun fact: Fewer than 50% receive care the whole year.




I don't think I understand the design of your study, but if the length of time spent in a hospital is an important predictor, I would create a variable named "Duration" or "DaysInHospital" that includes that information. Then include that variable in the model. 


Durations are important when you are using Poisson regression to predict the RATE at which something happens. The duration often enters into the model as an offset term. For example, there is a classic example in McCullagh and Nelder (1989) that models the rate of damage incidents for ships in a fleet. The number of months that each ship has been in service is an important variable. For an analysis of this problem, see


Whenever you model a count response and subjects have differing levels of exposure, like the differing amounts of care you refer to, then you can model the rate rather than the count itself. That is, you model the ratio of the count to the amount of care, which adjusts for the differing care amounts. To do this, you specify the log of the rate denominator (amount of care for each subject) in the OFFSET= option in the MODEL statement in PROC GENMOD. See this note
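As a rough sketch of the OFFSET= approach described above: the data set name hospital and the model variables are taken from the code posted later in this thread, while months_care (months of care during the year, 1 to 12) and the data set hospital_rate are hypothetical names used only for illustration.

```sas
/* Assumption: months_care holds the number of months (1-12) each person received care */
data hospital_rate;
   set hospital;
   ln_exposure = log(months_care / 12);   /* log of exposure, measured in years */
run;

proc genmod data=hospital_rate;
   class sex agegroup health;
   /* OFFSET= adds ln_exposure to the linear predictor with its coefficient fixed at 1 */
   model hospitalization = sex agegroup health
         / dist=poisson link=log offset=ln_exposure;
   output out=estimated_values p=pred;
run;
```

As far as I understand, the P= value from the OUTPUT statement is then the expected count for each person's actual months of care, so dividing it by months_care / 12 should give the expected number of hospitalizations per year.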


Thanks! This seems to give reasonable results, so I can use it :-). But what does it actually do, for instance compared to something like this:


proc genmod data=hospital;
   class sex agegroup partner kids education health;
   /* ln_exposure enters as an ordinary predictor, so its coefficient is estimated */
   model hospitalization = sex agegroup health ln_exposure / dist=poisson link=log;
   output out=estimated_values p=pred;
run;



where ln_exposure is the log of the number of months of care divided by 12, i.e., the proportion of the year during which the person was exposed.


In order to interpret the model in terms of the rate (number of hospitalizations PER YEAR), you need to use the OFFSET option approach I mentioned. Including the exposure (or the log of it) as a predictor does not, in general, result in a model with that interpretation. How the offset results in a rate interpretation is fully described in the note I referred to.
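A short sketch of the difference, under notation that is assumed here rather than taken from the note: let \mu_i be the expected number of hospitalizations for person i, t_i the exposure (months of care divided by 12), x_i the covariates, and \gamma the coefficient estimated for log exposure when it is entered as a predictor.

```latex
\begin{align*}
\text{Offset:}\quad & \log \mu_i = \log t_i + x_i^\top \beta
  \;\Longleftrightarrow\; \log\!\left(\frac{\mu_i}{t_i}\right) = x_i^\top \beta \\
\text{Predictor:}\quad & \log \mu_i = \gamma \log t_i + x_i^\top \beta
  \;\Longleftrightarrow\; \mu_i = t_i^{\gamma}\, \exp\!\left(x_i^\top \beta\right)
\end{align*}
```

With the offset, the coefficient on \log t_i is fixed at 1, so \exp(x_i^\top \beta) is the rate per year. With log exposure as a predictor, the two models agree only if the estimated \gamma happens to equal 1; otherwise the expected count is not proportional to the exposure and \exp(x_i^\top \beta) is not a rate per year.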

