Contributor
Posts: 21

# Generating data from Bernoulli distribution

Hello Rick. My question is:
What is the difference between generating data from a Bernoulli distribution in which the binary response takes the value 0 with probability p and the value 1 with probability (1-p), and generating data from a Bernoulli distribution in which the response takes the value 1 with probability p and the value 0 with probability (1-p)?
Thank you

Accepted Solutions
Solution
08-01-2016 07:37 PM
SAS Super FREQ
Posts: 3,837

## Re: Generating data from Bernoulli distribution

If you use the statement

b = RAND("Bernoulli", p);

then b gets the value 1 with probability p.

If you want b to have the value 0 with probability p, you would use

b = RAND("Bernoulli", 1-p);
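A quick empirical check can make the two cases concrete. The following is only a sketch: the data set name `BernCheck`, the value p = 0.3, the seed, and the sample size are arbitrary choices for illustration, not part of the original answer.

```sas
/* Sketch: empirical check of which value receives probability p.
   p = 0.3, the seed, and the sample size are arbitrary assumptions. */
data BernCheck;
   call streaminit(12345);          /* make the random stream reproducible */
   p = 0.3;
   do i = 1 to 100000;
      b1 = rand("Bernoulli", p);    /* P(b1 = 1) = p,     P(b1 = 0) = 1 - p */
      b0 = rand("Bernoulli", 1-p);  /* P(b0 = 1) = 1 - p, P(b0 = 0) = p     */
      output;
   end;
   keep b1 b0;
run;

proc freq data=BernCheck;
   tables b1 b0 / nocum;            /* compare observed percentages with p */
run;
```

In the PROC FREQ output, the percentage of 1s for `b1` and the percentage of 0s for `b0` should both be close to 30%.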

All Replies
SAS Super FREQ
Posts: 3,837

## Re: Generating data from Bernoulli distribution

There is no mathematical difference. In the first case the "event" is 0; in the second case the "event" is 1.
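One way to see this is to relabel the outcomes. The symbols Y and Z below are notation introduced here for illustration; they do not appear in the original reply.

$$
Y \sim \mathrm{Bernoulli}(p): \quad P(Y=1) = p, \quad P(Y=0) = 1-p
$$

$$
Z = 1 - Y \;\Longrightarrow\; P(Z=0) = P(Y=1) = p, \quad P(Z=1) = P(Y=0) = 1-p, \quad \text{so } Z \sim \mathrm{Bernoulli}(1-p)
$$

The two schemes therefore describe the same random experiment with the labels 0 and 1 swapped.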

If this is part of simulating data from a logistic regression model, then you can control the value of the event that you are modeling. By default PROC LOGISTIC models the "event" as the first ordered category, which would be 0 for a 0/1 response variable.  You can model 1 by using the following MODEL statement:

MODEL y(event='1') = x1 x2 x3 ...;

In a simulation context, if you want the parameter estimates to match the parameters in the simulation, you need to make sure that the event that you are modeling matches the response variable that has probability p.
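As a rough illustration of this point, the sketch below simulates a response with P(y = 1) = p and then fits the model under both event codings. The data set name `Sim`, the seed, the sample size, and the coefficients (-1, 0.5, 2) are hypothetical values chosen only for the example.

```sas
/* Sketch: simulate y with P(y = 1) = p, then fit both event codings.
   Data set name, seed, sample size, and coefficients are hypothetical. */
data Sim;
   call streaminit(2016);
   do i = 1 to 10000;
      x1 = rand("Normal");
      x2 = rand("Normal");
      eta = -1 + 0.5*x1 + 2*x2;      /* "true" parameters: -1, 0.5, 2 */
      p = 1 / (1 + exp(-eta));       /* inverse logit: P(y = 1)       */
      y = rand("Bernoulli", p);
      output;
   end;
run;

/* Event matches the simulation: estimates should be near (-1, 0.5, 2) */
proc logistic data=Sim;
   model y(event='1') = x1 x2;
run;

/* Default event for a 0/1 response is 0: estimates should be near (1, -0.5, -2) */
proc logistic data=Sim;
   model y = x1 x2;
run;
```

The second fit models P(y = 0) = 1 - p, whose logit is the negative of the linear predictor, so each estimate changes sign.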

Contributor
Posts: 21

## Re: Generating data from Bernoulli distribution

You mentioned that:

To simulate logistic data, you need to do the following:

(1) Assign the design matrix (X) of the explanatory variables. This step is done once. It establishes the values of the explanatory variables in the (simulated) study.

(2) Compute the linear predictor, η = X β, where β is a vector of parameters. The parameters are the "true values" of the regression coefficients.

(3) Transform the linear predictor by the logistic (inverse logit) function.  The transformed values are in the range (0,1) and represent probabilities for each observation of the explanatory variables.

(4) Simulate a binary response vector from the Bernoulli distribution, where each 0/1 response is randomly generated according to the specified probabilities from Step 3.
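The four steps quoted above could be sketched in a single DATA step as follows. This is only an illustration: the data set name `LogisticSim`, the seed, the sample size, the distributions of `x1` and `x2`, and the parameter values (2, -4, 1) are all assumptions. The variable `y0` shows the alternative coding in which 0, rather than 1, occurs with probability p.

```sas
/* Sketch of steps (1)-(4); all names and numeric values are hypothetical. */
data LogisticSim;
   call streaminit(54321);
   do i = 1 to 5000;
      x1 = rand("Uniform");            /* (1) explanatory variables           */
      x2 = rand("Normal");
      eta = 2 - 4*x1 + 1*x2;           /* (2) linear predictor, beta=(2,-4,1) */
      p = 1 / (1 + exp(-eta));         /* (3) inverse logit, 0 < p < 1        */
      y  = rand("Bernoulli", p);       /* (4) y = 1 with probability p        */
      y0 = rand("Bernoulli", 1 - p);   /* alternative: y0 = 0 with prob. p    */
      output;
   end;
run;

/* The mean of y should be close to the mean of p;
   the mean of y0 should be close to 1 minus the mean of p. */
proc means data=LogisticSim mean;
   var p y y0;
run;
```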

My question is:

For step (4), how can we distinguish between generating data from a Bernoulli distribution in which the binary response takes the value 0 with probability p and the value 1 with probability (1-p), and generating data from a Bernoulli distribution in which the response takes the value 1 with probability p and the value 0 with probability (1-p)?

I mean, how can we guarantee that the simulated data come from a Bernoulli distribution in which the response takes the value 0 with probability p and the value 1 with probability (1-p)?

Thank you

Contributor
Posts: 21

## Re: Generating data from Bernoulli distribution

I used simulation to build confidence intervals for an unknown parameter (the coefficient of variation) under different combinations of the true parameters (mu and sigma) that are needed to construct the intervals, and I got the same coverage probability for every combination. Is this correct? What is the explanation?

Thank you

☑ This topic is solved.