## PROC genmod is appropriate model

For my research, I am using a sample of 7,000 patients. These patients are divided into 3 categories:

Cat1 (Diabetes + High BP)

Cat2 (Diabetes + Low BP)

Cat3 (Diabetes + No BP)

My research objective is to estimate the mean total healthcare costs in these groups and compare the costs between them.

I analyzed the cost data and found it to be highly skewed, with 24 patients having zero total costs. After doing the Box-Cox test, I found that I should use a generalized linear model with a gamma distribution and a log link function.

My understanding is that I can drop the zero costs and use only the positive costs in PROC GENMOD.

My dependent variable (total costs) is continuous. Eight of my independent variables are categorical (nominal) and two are continuous. Is it appropriate to use GENMOD?

## Re: PROC genmod is appropriate model

You can fit a gamma model in GENMOD. However, zero is not a valid value for that distribution and observations with zero response will be ignored. A possibly better alternative for data that has a mass at zero but is otherwise positive is the Tweedie distribution. You can try modeling all of your data, including the zero response observations, by fitting the model in GENMOD using the DIST=TWEEDIE option. There is more information on the Tweedie model available in the Details section of the GENMOD documentation.
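As a rough sketch of what that could look like, assuming a SAS/STAT release in which GENMOD supports DIST=TWEEDIE: the data set name `costs` and the variables `total_cost`, `group`, `sex`, and `age` are hypothetical placeholders, not names from the original post. Categorical predictors go on the CLASS statement, and continuous predictors appear only on the MODEL statement.

```sas
/* Hypothetical names: data set COSTS, response TOTAL_COST,          */
/* GROUP = Cat1/Cat2/Cat3, SEX = a nominal covariate,                */
/* AGE = a continuous covariate. Replace with your own variables.    */
proc genmod data=costs;
   /* Categorical (nominal) predictors are listed on CLASS */
   class group sex;

   /* Zero-cost patients stay in the data; Tweedie allows a mass at zero */
   model total_cost = group sex age / dist=tweedie link=log type3;

   /* ILINK reports adjusted group means on the cost scale;          */
   /* DIFF compares groups on the log scale, EXP turns those         */
   /* differences into ratios of mean cost; CL adds confidence limits */
   lsmeans group / ilink diff exp cl;
run;
```

Changing `dist=tweedie` to `dist=gamma` gives the gamma model with a log link, in which case the observations with zero cost are left out of the fit.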

## Re: PROC genmod is appropriate model

Just being a smart alec but doesn't "Cat3 ( Diabetes+ No BP) " actually mean Dead (no blood pressure)? Do you mean "no recorded BP" or similar?

## Re: PROC genmod is appropriate model

Sorry for the confusion. By BP I mean Hypertension. So, the No BP group represents patients with Diabetes only.
