timtimalo
Obsidian | Level 7

Hello Rick. My question is:
What is the difference between generating data from a Bernoulli distribution in which the binary response takes the value 0 with probability p (and the value 1 with probability 1-p), and generating data from a Bernoulli distribution in which the response takes the value 1 with probability p (and the value 0 with probability 1-p)?
Thank you


Rick_SAS
SAS Super FREQ

There is no mathematical difference. In the first case the "event" is 0; in the second case the "event" is 1.

 

If this is part of simulating data from a logistic regression model, then you can control which value of the response is treated as the event that you are modeling. By default, PROC LOGISTIC models the first ordered category as the "event", which is 0 for a 0/1 response variable. To model 1 instead, use the following MODEL statement:

MODEL y(event='1') = x1 x2 x3 ...;

 

In a simulation context, if you want the parameter estimates to match the parameters used in the simulation, make sure that the event you are modeling is the response value that was generated with probability p.
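
As a minimal sketch of that point, the following program simulates a 0/1 response in which y=1 has probability p and then fits the model with event='1'. The data set name, seed, sample size, and coefficient values (-0.5 and 1.2) are assumptions chosen only for illustration.

```sas
/* Sketch only: data set name, seed, N, and coefficients are illustrative assumptions */
data Sim;
   call streaminit(2024);
   do i = 1 to 1000;
      x1  = rand("Normal");
      eta = -0.5 + 1.2*x1;          /* linear predictor with assumed "true" coefficients */
      p   = 1 / (1 + exp(-eta));    /* inverse logit: P(y = 1) */
      y   = rand("Bernoulli", p);   /* y = 1 with probability p */
      output;
   end;
   keep x1 y;
run;

proc logistic data=Sim;
   model y(event='1') = x1;         /* model the same event that was simulated with probability p */
run;
```

With event='1', the estimates should be close to the simulation parameters, up to sampling error. If the default event (0) is modeled instead, the estimates should have the opposite signs, because logit(1-p) = -logit(p).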

 

 

timtimalo
Obsidian | Level 7

You mentioned that:

To simulate logistic data, you need to do the following:

(1) Assign the design matrix (X) of the explanatory variables. This step is done once. It establishes the values of the explanatory variables in the (simulated) study.

(2) Compute the linear predictor, η = X β, where β is a vector of parameters. The parameters are the "true values" of the regression coefficients.

(3) Transform the linear predictor by the logistic (inverse logit) function. The transformed values are in the range (0,1) and represent probabilities for each observation of the explanatory variables.

(4) Simulate a binary response vector from the Bernoulli distribution, where each 0/1 response is randomly generated according to the specified probabilities from Step 3.
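
The four steps above can be sketched in a single DATA step. This is only an illustration: the data set name, seed, sample size, the distributions of x1 and x2, and the coefficient values are all assumptions, and the explanatory variables are generated in the same loop as the response for brevity.

```sas
/* Sketch only: every name and numeric value here is an illustrative assumption */
data SimLogistic;
   call streaminit(1);
   beta0 = -1;  beta1 = 0.5;  beta2 = -0.25;   /* assumed "true" regression coefficients */
   do i = 1 to 500;
      /* Step 1: values of the explanatory variables */
      x1 = rand("Normal");
      x2 = rand("Uniform");
      /* Step 2: linear predictor */
      eta = beta0 + beta1*x1 + beta2*x2;
      /* Step 3: inverse logit maps eta into (0,1) */
      mu = 1 / (1 + exp(-eta));
      /* Step 4: binary response; y = 1 with probability mu */
      y = rand("Bernoulli", mu);
      output;
   end;
   keep x1 x2 y;
run;
```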

My question is:

For step (4), how can we distinguish between generating data from a Bernoulli distribution in which the binary response is 0 with probability p (and 1 with probability 1-p) and generating data from a Bernoulli distribution in which the response is 1 with probability p (and 0 with probability 1-p)?

 

I mean, how can we guarantee that the data come from a Bernoulli distribution in which the binary response is 0 with probability p and 1 with probability 1-p?

Thank you

Rick_SAS
SAS Super FREQ

If you use the statement

b = RAND("Bernoulli", p);

then b gets the value 1 with probability p.

 

If you want b to have the value 0 with probability p, you would use

b = RAND("Bernoulli", 1-p);
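
One way to convince yourself of this is to generate both variables for a fixed p and look at the observed frequencies. The sketch below assumes p = 0.3 and 10,000 draws; the data set name, variable names, and seed are arbitrary.

```sas
/* Sketch only: p = 0.3, the seed, and the number of draws are illustrative assumptions */
data BernCheck;
   call streaminit(12345);
   p = 0.3;
   do i = 1 to 10000;
      b1 = rand("Bernoulli", p);       /* b1 = 1 with probability p */
      b0 = rand("Bernoulli", 1 - p);   /* b0 = 0 with probability p */
      output;
   end;
   keep b1 b0;
run;

proc freq data=BernCheck;
   tables b1 b0;                       /* compare the percentages of 0s and 1s */
run;
```

In the PROC FREQ output, roughly 30% of the b1 values should be 1 and roughly 30% of the b0 values should be 0, up to sampling variation.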

 

timtimalo
Obsidian | Level 7

Thanks for your help.

I used simulation to build confidence intervals for an unknown parameter (the coefficient of variation), using different combinations of the true parameters (mu and sigma) that are needed to construct the intervals. I got the same coverage probability for each combination. Is this correct? What is the explanation?

Thank you
