How to compare proportions of zero-inflated variab...

04-13-2017 01:20 PM

Hi all,

I have a problem with the database I'm working on. I want to compare the mean values of two variables, but I can't use a t-test because both variables are highly zero-inflated. The variables are counts of security breaches in different time blocks, recorded over a period of 4 years (e.g., X1 and X2 are the numbers of security breaches that occurred in the 7pm-9pm and 10pm-12am blocks). I don't think I can use a chi-square test of proportions either, since most of the recorded values are below 5. In addition, the variables are overdispersed (e.g., mean = 8.36 and variance = 331.66). What should I do in this case? Thank you!

Regards,

Yazdan

Accepted Solutions

Solution

04-19-2017 10:03 AM


Posted in reply to Yazdan

04-14-2017 10:24 AM

1) Use a nonparametric method:

PROC NPAR1WAY

2) Use data simulation (resampling):

http://blogs.sas.com/content/iml/2014/11/21/resampling-in-sas.html
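As a rough illustration of option 1, here is a minimal sketch of a Wilcoxon rank-sum comparison with PROC NPAR1WAY. The data set names (wide, long), the variable names (day, x1, x2, block, count), and the data values are all assumptions made up for the example; they are not from the original post. PROC NPAR1WAY expects one response variable plus a grouping variable, so the two columns are stacked first.

```sas
/* Toy data: one row per day; x1 and x2 are breach counts for two
   time blocks. Names and values are made up for illustration only. */
data wide;
  input day x1 x2;
  datalines;
1 0 3
2 0 0
3 5 0
4 0 12
5 1 0
6 0 0
;
run;

/* Stack the two columns into one response (count) and one
   grouping variable (block), as PROC NPAR1WAY requires. */
data long;
  set wide;
  length block $2;
  block = 'X1'; count = x1; output;
  block = 'X2'; count = x2; output;
  keep day block count;
run;

/* Wilcoxon rank-sum test comparing the two time blocks */
proc npar1way data=long wilcoxon;
  class block;
  var count;
run;
```

Note that this treats the two blocks as independent samples. Because X1 and X2 are recorded on the same days, a paired approach on the daily difference (for example, the signed-rank test that PROC UNIVARIATE reports for a difference variable) may be worth considering instead.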

All Replies



Posted in reply to Ksharp

04-17-2017 10:32 AM

Thank you for your response. I also have another problem. Please consider this example: I have a table of the number of incidents that occurred in two-hour blocks (e.g., 0-2, 2-4, etc.). The data for each block were recorded over 4 years and are heavily inflated with zeros. So I have a table like this:

Day        [0-2]  [2-4]  [4-6]  .......  [10-0]
March 1      0      1      1    .......    13
March 2      1      2      0    .......     2
...
March 30     0     10      0    .......     2

How can I compare the proportions of attacks that occurred during the different time blocks in March? I would also appreciate some help with the code. Thank you!

Yazdan


Posted in reply to Yazdan

04-18-2017 12:50 AM

In my opinion, you may need a generalized linear mixed model (GLMM). Check PROC GLIMMIX. I am not an expert on GLMMs, so I cannot help much further.


Posted in reply to Yazdan

04-19-2017 10:33 PM

Maybe you could use a GEE model.

Arrange your data like this:

date  count  range
Mar1    0      2    <-- [0-2]
Mar1    1      2    <-- [2-4]
...
Mar1   13     10    <-- [10-0]
Mar2    1      2    <-- [0-2]

Use RANGE as the offset variable, and fit a GEE with PROC GENMOD or PROC GEE. Check these documentation examples:

PROC GENMOD, Example 44.7: Log-Linear Model for Count Data

PROC GEE, Example 43.2: Log-Linear Model for Count Data

Also consider using the LSMEANS statement, and the ZEROMODEL statement for the excess zeros.

Here is how to compare the difference between two proportions (i.e., move the OFFSET variable to the left side of the model):

http://support.sas.com/kb/24/188.html
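To make the GEE suggestion a little more concrete, here is a minimal sketch using PROC GENMOD with a REPEATED statement. Everything named here is an assumption for illustration: the data set names (wide, long), the column names (day, b0_2, b2_4, b4_6), the toy values, and the choice of a Poisson distribution with an exchangeable working correlation. With a log link, the OFFSET= variable should hold the log of the block length, so the sketch stores log(2) for the two-hour blocks.

```sas
/* Toy wide table: one row per day, one column per two-hour block.
   Names and values are made up for illustration only. */
data wide;
  input day b0_2 b2_4 b4_6;
  datalines;
1 0 1 1
2 1 2 0
3 0 0 4
4 0 10 0
;
run;

/* Reshape to one row per day and block, and add the log offset. */
data long;
  set wide;
  array b{3} b0_2 b2_4 b4_6;
  length block $8;
  do i = 1 to dim(b);
    block    = vname(b{i});   /* block label taken from the column name */
    count    = b{i};
    logrange = log(2);        /* assumed: every block spans 2 hours */
    output;
  end;
  keep day block count logrange;
run;

/* GEE: counts within the same day are treated as correlated. */
proc genmod data=long;
  class day block;
  model count = block / dist=poisson link=log offset=logrange;
  repeated subject=day / type=exch;
  lsmeans block / diff;       /* pairwise block differences, log scale */
run;
```

The DIFF option reports pairwise differences between blocks on the log scale, so exponentiating a difference gives an estimated rate ratio. This sketch does not model the excess zeros explicitly; whether the GEE's empirical standard errors are adequate for data this overdispersed is something to check against the real data.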