Contributor
Posts: 29

# any ideas for distribution analysis?

Hi, I have some interesting data to analyse. There are two datasets (only a sample of each is shown):

1) -1 -1 -1 0 1 1 0 -1 1 1 0 -1 -1 0 -1 1 1 -1 0 0 1 -1 1 1 ...

2) ... 0.99 0.8 1.11 -1 -1 0.8 0 -1 0 0 1 -1 0.78 0 0.76 0 ...

Basically, each dataset has three different types of values:

-1, 0, 1 in the first dataset

and -1, 0, or a positive value in the interval [0.5; 1.2] in the second dataset.

Each dataset has ~500 observations.

What I need to find is the mean and its confidence limits.

Does anybody have an idea how to evaluate the distribution and its parameters for such data? The main task is to get a lower confidence limit for the mean, and the minimum size of dataset that should be analysed to get a reliable result.
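On the minimum-size part of the question, a rough planning calculation is possible under a normal approximation: find the smallest n for which the one-sided lower limit, mean - z * sd / sqrt(n), stops being negative. The sketch below is only that, a sketch. The function name `required_n`, the 95% one-sided level, and the input values are all assumptions for illustration and would need to be replaced by estimates from a pilot sample.

```python
import math
from statistics import NormalDist

def required_n(mean, sd, confidence=0.95):
    """Smallest n for which mean - z * sd / sqrt(n) is no longer below zero.

    Assumes the sample mean is approximately normal and that `mean` and `sd`
    are reasonable guesses for the true per-game mean and standard deviation.
    """
    z = NormalDist().inv_cdf(confidence)  # one-sided critical value
    return math.ceil((z * sd / mean) ** 2)

# Illustrative inputs only, not estimates from the real data.
n_min = required_n(mean=0.22, sd=0.92)
print(n_min)
```

The required size grows with the square of sd / mean, so a small average return per game needs far more games than a large one.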

Posts: 2,655

## Re: any ideas for distribution analysis?

What do the values represent?  With some idea of the process that generated these values, it might be easier to come up with an answer.  I think of the first as ordinal categories, while the second looks like a mixture of some sort.  With a better understanding of how these values were generated, we might be able to give a better answer.

Steve Denham

Contributor
Posts: 29

## Re: any ideas for distribution analysis?

Thank you for your interest. The data represent the outcomes of a gambling game. Both datasets come from the same observed games.

1) Simplified dataset: if you lose, your result is -1. If it is a draw, the result is 0. If you win, the result is 1. So we take 500 games, we win more often than we lose, and we have some draws. I am trying to calculate the return on investment (the average result).

2) The second dataset is based on the same games, but the difference is that you pay 1 unit to enter the game. If you lose the game, you lose that 1 unit, so the profit is -1. If it is a draw, you get the 1 unit refunded, so the profit is 0. If you win, you win 1 multiplied by some coefficient from the interval [0.5; 1.2], and the profit is equal to the coefficient.

For example, we know that of the first 500 observations we won 280, lost 170, and drew 50, and the final result is that the average profit per game is 0.05 (5 percent). Because the profit on a win is typically less than 1 while a loss always costs 1, the profit is small even though we win much more often than we lose.
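The arithmetic behind those figures can be checked in a few lines. This is a minimal sketch using only the counts and the 0.05 average profit quoted in the post; the variable names are made up for illustration, and the "implied average coefficient" is a back-calculation, not a value reported in the thread.

```python
# Counts quoted in the post for the first 500 games.
wins, losses, draws = 280, 170, 50
n = wins + losses + draws

# Dataset 1: outcomes coded +1 / -1 / 0, so the mean is (wins - losses) / n.
mean_simplified = (wins - losses) / n

# Dataset 2: total profit = (sum of winning coefficients) - losses.
# Solving avg_profit * n = wins * avg_coef - losses for avg_coef:
avg_profit = 0.05
implied_avg_coef = (avg_profit * n + losses) / wins

print(mean_simplified, round(implied_avg_coef, 3))
```

The implied average coefficient falls inside the stated interval [0.5; 1.2], so the quoted counts and the 5 percent figure are consistent with each other.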

Does my explanation help?

Contributor
Posts: 29

Any ideas?

Posts: 2,655

## Re: any ideas for distribution analysis?

The -1, 0, 1 data can be modeled with two processes.  The first gives the probability of not tying, the second the probability of a win given that the game was not a tie.  The expected value, based on your data, would be (number of non-zero trials/number of trials) * (2 * number of wins/number of non-zero trials - 1), which simplifies to (number of wins - number of losses)/number of trials.  The variance could be calculated using the delta method, applied to the product of the two binomial proportions.
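A minimal sketch of the two-process estimate for outcomes coded -1, 0, 1, using the counts quoted earlier in the thread (280 wins, 170 losses, 50 draws). The mean is written as P(not a tie) * (2 * P(win | not a tie) - 1), and the standard error comes from a first-order delta-method expansion that treats the two proportions as uncorrelated. The function name and the one-sided 95% level are assumptions for illustration.

```python
import math
from statistics import NormalDist

def two_stage_mean_se(wins, losses, draws):
    """Mean and delta-method standard error for outcomes coded +1 / -1 / 0."""
    n = wins + losses + draws
    decided = wins + losses            # games that were not a tie
    p_decided = decided / n            # process 1: P(not a tie)
    q_win = wins / decided             # process 2: P(win | not a tie)

    mean = p_decided * (2 * q_win - 1)

    # Binomial variances of the two estimated proportions.
    var_p = p_decided * (1 - p_decided) / n
    var_q = q_win * (1 - q_win) / decided

    # Delta method: squared partial derivatives times the variances.
    grad_p = 2 * q_win - 1
    grad_q = 2 * p_decided
    se = math.sqrt(grad_p ** 2 * var_p + grad_q ** 2 * var_q)
    return mean, se

mean, se = two_stage_mean_se(wins=280, losses=170, draws=50)
z = NormalDist().inv_cdf(0.95)         # one-sided 95% critical value
lower_limit = mean - z * se
print(round(mean, 4), round(se, 4), round(lower_limit, 4))
```

For this three-outcome case the delta-method variance works out to the same value as the plain variance of the mean, (P(win) + P(loss) - mean^2) / n, which is a handy cross-check.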

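However, possibly the best way to estimate the mean and variance for these kinds of mixtures would be bootstrapping.  Randomly sample 100 observations, with replacement, from each dataset and calculate the raw mean.  Repeat this about 1000 times, and calculate the overall mean and standard error from the sample means.  Note that the standard error obtained this way applies to samples of the size drawn (here 100); resampling all ~500 observations gives the standard error for the full dataset.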
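The resampling idea can be sketched in plain Python. Two things here are assumptions rather than anything from the thread: the data are rebuilt from the counts quoted earlier (280 wins, 170 losses, 50 draws) instead of the real game-by-game records, and each resample draws all n observations with replacement (the usual nonparametric bootstrap) rather than 100. The function name, the seed, and the percentile-based lower limit are also illustrative choices.

```python
import random
import statistics

def bootstrap_mean(data, n_resamples=1000, seed=0):
    """Bootstrap the sample mean: returns (mean, std error, 5th percentile)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    n = len(data)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n))   # resample with replacement
        for _ in range(n_resamples)
    )
    lower_5 = means[int(0.05 * n_resamples)]       # one-sided 95% lower limit
    return statistics.fmean(means), statistics.stdev(means), lower_5

# Dataset 1 rebuilt from the quoted counts; order does not matter here.
data = [1] * 280 + [-1] * 170 + [0] * 50
boot_mean, boot_se, lower_limit = bootstrap_mean(data)
print(round(boot_mean, 3), round(boot_se, 3), round(lower_limit, 3))
```

The same function works unchanged on the second dataset, since it only needs the list of per-game profits and makes no assumption about how the winning coefficients are distributed.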

I think the second distribution would be nearly intractable to any other analysis, as the fractional payoff depends on how it is distributed over the interval [0.5, 1.2].  I doubt very much that the distribution is uniform on that interval; it is probably a non-linear decreasing function like a gamma distribution, but truncated, so the moments would be almost impossible to calculate.  A perfect place to use bootstrapping.

Steve Denham

Contributor
Posts: 29

## Re: any ideas for distribution analysis?

Thank you so much. I found some ready-made macros for bootstrapping and will test them with my data in a few days. I will get back with the results. Thanks again.
