## How to compare proportions of zero-inflated variables?

Hi all,

I have a problem with the dataset I'm working on. I want to compare the mean values of two variables. I can't use a t-test because the variables are highly zero-inflated. The variables represent the number of security breaches in different time blocks, recorded over a period of 4 years (e.g., variables X1 and X2 are the numbers of security breaches that occurred in the 7pm-9pm and 10pm-12am blocks). I don't think I can use a chi-square test of proportions either, since most of the recorded values are below 5. In addition, the variables are overdispersed (e.g., mean = 8.36 and variance = 331.66). What should I do in this case? Thank you!

Regards,

Yazdan

## Re: How to compare proportions of zero-inflated variables?

1) Use a nonparametric method:

PROC NPAR1WAY

2) Use resampling (data simulation):

http://blogs.sas.com/content/iml/2014/11/21/resampling-in-sas.html
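A minimal sketch of option 1, assuming a data set named BREACHES (hypothetical) with one row per day and the two block counts in columns X1 and X2 as described in the question. Note that the Wilcoxon test compares the location of the two distributions rather than their means, and it treats the two samples as independent. Because X1 and X2 are recorded on the same days, a paired test on the daily differences may be the better fit, so both are shown.

```sas
/* Assumption: WORK.BREACHES has one row per day with counts X1 and X2. */

/* Stack the two variables into one column with a grouping variable. */
data stacked;
  set breaches;
  length block $2;
  block = 'X1'; count = x1; output;
  block = 'X2'; count = x2; output;
  keep block count;
run;

/* Wilcoxon rank-sum test; many tied zeros make the asymptotic
   p-value doubtful, so a Monte Carlo exact p-value is requested too. */
proc npar1way data=stacked wilcoxon;
  class block;
  var count;
  exact wilcoxon / mc seed=12345;
run;

/* Paired alternative: sign and signed-rank tests on daily differences
   appear in the "Tests for Location" table. */
data diffs;
  set breaches;
  d = x1 - x2;
run;

proc univariate data=diffs;
  var d;
run;
```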

## Re: How to compare proportions of zero-inflated variables?

Thank you for your response. I also have another problem. Please consider this example: I have a table of the number of incidents that occurred in two-hour blocks (e.g., 0-2, 2-4, etc.). The data for each block were recorded over 4 years and are heavily inflated with zeros. So I have a table like:

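| Day      | [0-2] | [2-4] | [4-6] | ... | [10-0] |
|----------|-------|-------|-------|-----|--------|
| March 1  | 0     | 1     | 1     | ... | 13     |
| March 2  | 1     | 2     | 0     | ... | 2      |
| ...      | ...   | ...   | ...   | ... | ...    |
| March 30 | 0     | 10    | 0     | ... | 2      |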

How can I compare the proportions of attacks that occurred during the different time blocks in March? I would appreciate a hand with the code as well. Thank you!

Yazdan

## Re: How to compare proportions of zero-inflated variables?

In my opinion, you may need a GLMM (generalized linear mixed model).
Check PROC GLIMMIX.

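A rough sketch of what such a GLMM could look like, assuming the wide table has first been rearranged into a long data set named LONG (hypothetical) with one row per day and block, and columns DAY, BLOCK, COUNT and LOG_RANGE (the log of the block length in hours). The negative binomial distribution is my choice here, to allow for the overdispersion mentioned in the first post. The random intercept for each day is likewise an assumption about how the correlation should be modeled.

```sas
/* Assumption: WORK.LONG has columns DAY, BLOCK, COUNT, LOG_RANGE. */
proc glimmix data=long;
  class day block;
  /* log-linear model for the counts, with block length as exposure */
  model count = block / dist=negbin link=log offset=log_range solution;
  /* counts recorded on the same day share a random intercept */
  random intercept / subject=day;
  /* pairwise block comparisons; ILINK reports means on the count scale */
  lsmeans block / diff ilink;
run;
```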
## Re: How to compare proportions of zero-inflated variables?

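Maybe you could use a GEE model. First arrange the data in long format, one row per day and time block:

| date | count | range | block  |
|------|-------|-------|--------|
| Mar1 | 0     | 2     | [0-2]  |
| Mar1 | 1     | 2     | [2-4]  |
| ...  | ...   | ...   | ...    |
| Mar1 | 13    | 10    | [10-0] |
| Mar2 | 1     | 2     | [0-2]  |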
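Getting from the wide table in the question to this long layout can be sketched with a DATA step and an array. The column names B0_2 through B10_0 are made up for illustration, and only the four blocks and three days visible in the question are typed in. Every block is assumed to be two hours long, as the question describes. LOG_RANGE is added because a log-linear model expects the offset on the log scale.

```sas
/* Hypothetical wide table: values copied from the rows shown in the question. */
data wide;
  input day :$10. b0_2 b2_4 b4_6 b10_0;
  datalines;
Mar1 0 1 1 13
Mar2 1 2 0 2
Mar30 0 10 0 2
;
run;

/* One output row per day and time block. */
data long;
  set wide;
  length block $6;
  array b{4} b0_2 b2_4 b4_6 b10_0;
  array lbl{4} $6 _temporary_ ('0-2' '2-4' '4-6' '10-0');
  do i = 1 to dim(b);
    block     = lbl{i};
    count     = b{i};
    range     = 2;            /* assumed block length in hours */
    log_range = log(range);   /* offset on the log scale */
    output;
  end;
  keep day block count range log_range;
run;
```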

Use RANGE as the offset variable (on the log scale, since these are log-linear models), and fit the GEE with PROC GENMOD or PROC GEE. See these documentation examples:

PROC GENMOD, Example 44.7: Log-Linear Model for Count Data

PROC GEE, Example 43.2: Log-Linear Model for Count Data
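A minimal sketch of such a GEE in PROC GENMOD, assuming a long data set named LONG (hypothetical) with columns DAY, BLOCK, COUNT and LOG_RANGE, where LOG_RANGE is the log of the block length. The exchangeable working correlation is an assumption, and other structures may suit the data better. The GEE's empirical standard errors give some protection against the overdispersion mentioned in the first post.

```sas
/* Assumption: WORK.LONG has columns DAY, BLOCK, COUNT, LOG_RANGE. */
proc genmod data=long;
  class day block;
  /* Poisson log-linear model with block length as exposure */
  model count = block / dist=poisson link=log offset=log_range;
  /* counts from the same day are treated as correlated */
  repeated subject=day / type=exch;
  /* pairwise comparisons between time blocks */
  lsmeans block / diff ilink;
run;
```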

Also consider using the LSMEANS statement to compare the blocks and the ZEROMODEL statement to handle the excess zeros.
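As a sketch of the ZEROMODEL idea, the statement is used with the zero-inflated distributions in PROC GENMOD (DIST=ZIP or DIST=ZINB). The example below again assumes a long data set named LONG (hypothetical) with columns BLOCK, COUNT and LOG_RANGE. Choosing ZINB over ZIP and letting the zero-inflation probability depend on BLOCK are both assumptions on my part. This is an ordinary zero-inflated model rather than a GEE, so it does not account for correlation between blocks of the same day.

```sas
/* Assumption: WORK.LONG has columns BLOCK, COUNT, LOG_RANGE. */
proc genmod data=long;
  class block;
  /* count part: zero-inflated negative binomial, log link */
  model count = block / dist=zinb link=log offset=log_range;
  /* zero part: probability of an excess zero may differ by block */
  zeromodel block / link=logit;
run;
```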

Here is how to compare the difference between two proportions (i.e., move the OFFSET variable to the left side of the model):

http://support.sas.com/kb/24/188.html

