Programming the statistical procedures from SAS

Two part model for correlated health care costs

Posts: 24


I have healthcare cost data (inpatient hospitalization and emergency department visit costs) for patients taking two different drugs, and the costs contain a large number of zeros. I have also used propensity score matching to match patients taking the two drugs on a list of covariates. Since there are so many zeros, I want to use a two-part model for the costs. Could someone please point me to where I can find SAS code for this, or help me with the basic code if they already have it?

Thank you very much in advance.


SAS Employee
Posts: 89

Re: Two part model for correlated health care costs

Hello Pooja,

The model you describe, of healthcare costs associated with a hospitalization or emergency department visit, sounds as if it might be framed as a censored or truncated regression, depending on how much is unobserved for each unit of observation.

Going off your description, it sounds as if you might estimate your model using some variant of a Tobit model that you would find in PROC QLIM. The following links to the QLIM documentation that discusses these models.

Example code for a simple censoring problem might look like this:

/*-- Tobit Model: cost is censored from below at 0 --*/

proc qlim data=subset;
   model cost = x1 x2;
   endogenous cost ~ censored(lb=0);
run;

Without knowing more information as to what your data look like and what you are interested in learning from your data, I can only point you to the documentation and the samples within the doc.  If you are interested in sharing additional information about the research question and perhaps a model, I would be happy to work with you to specify the regression in PROC QLIM.

Thanks for your question. -Ken

Posts: 24

Re: Two part model for correlated health care costs

Hello Ken,

Thank you very much for your reply and willingness to help.

I am looking at the healthcare costs and utilization of patients initiated on four different medications for Attention Deficit Hyperactivity Disorder (ADHD). I took one drug as the control and used propensity score matching to match patients initiated on that drug to patients on each of the three other drugs (I used three separate models). For the inpatient and ED costs, a large proportion of the patients have zero cost because they did not have these events. I am interested in the difference in cost between the control drug and each of the three case drugs after controlling for the propensity score and the covariates that I used to obtain it. From a brief review of the literature, I saw that it is possible to use a two-part model for excess zeros, where the first part is a logistic regression and the second part could be a gamma regression, a normal regression, etc. However, I am not familiar with such models. You have suggested a variant of the Tobit model, and I am not familiar with the differences between Tobit and logit models.
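For reference, the two-part structure described above (a logistic regression for whether any cost occurs, then a gamma regression on the positive costs) could be sketched roughly as follows. This is only a minimal sketch under assumptions: the dataset name MATCHED, the cost variable COST, the treatment indicator DRUG (0 = control, 1 = case drug), the propensity score PS, and the covariates X1 and X2 are all hypothetical placeholders, and the log link in the second part is one common choice rather than a requirement.

```sas
/* Hypothetical names throughout: MATCHED, COST, DRUG, PS, X1, X2 */

/* Flag patients with any inpatient/ED cost */
data matched2;
   set matched;
   anycost = (cost > 0);   /* 1 if cost is positive, 0 otherwise */
run;

/* Part 1: logistic regression for the probability of any cost */
proc logistic data=matched2;
   class drug (ref='0') / param=ref;
   model anycost(event='1') = drug ps x1 x2;
run;

/* Part 2: gamma regression (log link) on patients with positive cost only */
proc genmod data=matched2;
   where cost > 0;
   class drug (ref='0') / param=ref;
   model cost = drug ps x1 x2 / dist=gamma link=log;
run;
```

The two parts combine into an expected cost per patient as E[cost] = Pr(cost > 0) x E[cost | cost > 0]. The sketch does not account for the correlation within matched sets; whether and how to do that (for example with a REPEATED statement in PROC GENMOD) is a separate modeling decision.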

Please do let me know if you need more information. Any help will be greatly appreciated.



New Contributor
Posts: 2

Re: Two part model for correlated health care costs

Hi Pooja,

I think you refer to the Heckman model (or Heckit) as described in

I've estimated that type of model using this SAS code:

You might also want to look at

Let me know if this helps!


Respected Advisor
Posts: 2,655

Re: Two part model for correlated health care costs

Another possibility is PROC FMM, if you are running SAS/STAT 12.1. As I read the documentation (especially Example 37.1), this procedure lets you specify a mixing probability function (say, a binomial) and a model distribution (say, the gamma). I have to add the caveat that I have not yet run any real data through PROC FMM, so I will be curious to find out whether you are able to use the procedure.
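A specification along those lines might look something like the sketch below. It is untested and rests on assumptions: the dataset MATCHED and the variables COST, DRUG, PS, X1, and X2 are hypothetical placeholders, and pairing a gamma component with a constant (point mass at zero) component is one reading of how the zeros could be handled, not something confirmed against real data.

```sas
/* Untested sketch; MATCHED, COST, DRUG, PS, X1, X2 are hypothetical names */
proc fmm data=matched;
   class drug;
   model cost = drug ps x1 x2 / dist=gamma;   /* component for positive costs */
   model + / dist=constant;                   /* degenerate component at zero */
   probmodel drug ps x1 x2;                   /* model for the mixing probability */
run;
```

Here the PROBMODEL statement plays the role of the first (binary) part of a two-part model, and the gamma component plays the role of the second part.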

Steve Denham
