Hi,
I am analyzing healthcare costs, and the data contain many zero values, so I would like to use a two-part model. Is anyone familiar with how to code this model in SAS?
My outcome is the insurance copay, and my covariate would be the plan ID. I want to investigate the relationship between the copay values and the covariates.
Thanks!
You may want to use a regression model suited to zero-inflated data: ZIP or ZINB, where P and NB stand for the Poisson and negative binomial distributions.
This example using PROC GENMOD may be helpful:
https://stats.idre.ucla.edu/sas/dae/zero-inflatedpoisson-regression/
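For orientation, a zero-inflated fit in PROC GENMOD might look roughly like the sketch below. The data set name CLAIMS and the variables COPAY and PLAN_ID are hypothetical placeholders, not names from the original post. ZIP and ZINB are count models, so the sketch assumes the outcome is a non-negative integer (for example, copay in whole dollars).

```sas
/* Minimal sketch. Hypothetical names: data set CLAIMS, outcome COPAY, covariate PLAN_ID. */
/* Assumes COPAY holds non-negative integer values, as ZIP/ZINB are count models.        */
proc genmod data=claims;
  class plan_id;
  /* Count part: switch dist=zip to dist=zinb to allow for overdispersion */
  model copay = plan_id / dist=zip;
  /* Zero-inflation part: logit model for the probability of an excess zero */
  zeromodel plan_id / link=logit;
run;
```

The ZEROMODEL statement can use a different set of covariates from the MODEL statement if the zeros are thought to arise from a different process.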
Hi,
Thank you for the response. Is there any difference between ZIP and ZINB as the distribution type of the model? Could we use PROC FMM (probit + gamma/log) instead?
Thanks!
Visually I don't see a lot of difference between ZIP and ZINB, but I have not done extensive work where it ultimately mattered (only some exploratory analysis). A Google search will return a lot of information on this.
I am not familiar with PROC FMM. If it explicitly handles these distributions, it may be worth a shot. Regardless of the PROC or method, modeling ZI data should be a two-part process as you describe: first estimate whether or not the value is zero, and then, if it is not, estimate the non-zero value.
A similar concept exists in forecasting methods for intermittent count data (where zeroes often occur). I have more experience with this, but not in SAS, as I do not have SAS Forecasting Server, which is where those procedures live.
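The two-part process described above could be sketched with standard procedures, without PROC FMM. This is only one possible setup, not a definitive implementation: the data set CLAIMS, the variables COPAY and PLAN_ID, and the derived indicator ANY_COPAY are all hypothetical names, and the gamma distribution with a log link is an assumption taken from the question.

```sas
/* Minimal sketch. Hypothetical names: CLAIMS, COPAY, PLAN_ID, ANY_COPAY. */
data claims2;
  set claims;
  if missing(copay) then delete;   /* otherwise missing would be coded as zero below */
  any_copay = (copay > 0);         /* 1 if copay is positive, 0 otherwise */
run;

/* Part 1: probability that copay is non-zero.                      */
/* Add / link=probit to the MODEL statement for a probit first part. */
proc logistic data=claims2;
  class plan_id / param=ref;
  model any_copay(event='1') = plan_id;
run;

/* Part 2: size of the copay, fitted to the non-zero records only */
proc genmod data=claims2;
  where copay > 0;
  class plan_id;
  model copay = plan_id / dist=gamma link=log;
run;
```

The expected copay for a plan is then the predicted probability from part 1 multiplied by the predicted mean from part 2.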