Calcite | Level 5

## Two part model for health care costs

Hi,

I am analyzing prescription drug cost data from insurance claims. Health care cost data are heavily skewed, with lots of 0's and a few really high-cost patients (a long right tail).

I need to get the mean cost per patient in cohort 1 vs cohort 2. I have the total cost for each individual patient. Instead of just taking the log of the cost, running proc means, and transforming back, I have been asked to use a two-part model (proc genmod with a log link). I have no idea how to do this. Any references or examples?

I have a basic analytic file:

patid, cohort, covariates 1-6, drugs 1-5 count, drugs 1-5 cost. The zero values are currently stored as missing (just a ".").

Thanks!
Chris

Tourmaline | Level 20

## Re: Two part model for health care costs

The example here uses proc genmod with a gamma-distributed response variable and a log link function:

https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect...
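
To tie that example back to the two-part question, here is a rough sketch of how the pieces could fit together. It is only one way to set it up, not a definitive recipe. Every name in it is an assumption made for illustration: the data set `claims`, the per-patient total `total_cost`, the covariates `cov1`-`cov6`, and the output names. It assumes `total_cost` is missing (".") for zero-cost patients, as described in the original post. Part 1 models whether a patient has any cost, part 2 models the cost among patients who have one (gamma, log link), and the two predictions are multiplied to get an expected cost per patient.

```sas
/* All data set and variable names below are hypothetical. */

/* Flag patients with any cost; missing compares as less than 0, so they get 0. */
data claims2;
  set claims;
  any_cost = (total_cost > 0);   /* 1 = some drug cost, 0 = none */
  cost_pos = total_cost;         /* stays missing for zero-cost patients */
run;

/* Part 1: probability of having any cost (logistic model). */
proc genmod data=claims2 descending;
  class cohort;
  model any_cost = cohort cov1 cov2 cov3 cov4 cov5 cov6
        / dist=binomial link=logit;
  output out=part1 pred=p_any;
run;

/* Part 2: mean cost given cost > 0 (gamma, log link).                    */
/* Patients with missing cost_pos are left out of the fit, but they still */
/* receive a predicted value because their covariates are present.        */
proc genmod data=part1;
  class cohort;
  model cost_pos = cohort cov1 cov2 cov3 cov4 cov5 cov6
        / dist=gamma link=log;
  output out=part2 pred=cond_mean;
run;

/* Combine: expected cost = P(cost > 0) * E(cost | cost > 0). */
data twopart;
  set part2;
  exp_cost = p_any * cond_mean;
run;

/* Model-based mean cost per patient in each cohort. */
proc means data=twopart mean;
  class cohort;
  var exp_cost;
run;
```

Note that neither fit on its own gives a standard error or p-value for the difference in the combined means; resampling the whole sequence (for example, a bootstrap) is one option people use for that.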

Calcite | Level 5

## Re: Two part model for health care costs

Thanks for the response. This makes sense from a regression point of view. I guess I am just stuck on not knowing what to do with that information. At the end of the day, I need a table that says Cohort A spent \$650 on drugs and Cohort B spent \$500, and that the difference was significant. I am not sure how to go about getting that sort of output.

Thanks!
Chris

Quartz | Level 8

## Re: Two part model for health care costs

I wrote a paper on this

https://support.sas.com/resources/papers/proceedings15/3600-2015.pdf

Maybe that will help.

Obsidian | Level 7

## Re: Two part model for health care costs

I have used the syntax referred to in this article for my project, but I don't understand the purpose of this step:

proc genmod data=data2;
class x2 x2 x5;
model costp = x1 x2 x5 x3 x4 x4*x1 /dist =normal link=log;
output out= y_hatC pred= condpred ;
run;