Silver77
Calcite | Level 5
Data _null_;
a1=pdf("binomial",60,0.5,100);
a2=1-cdf("binomial",59,0.5,100);
put a1 a2;
run;

I am trying to use a simple binomial function without having to specify "above" or "below" a certain number, and I am very lost.
This is the question that I am trying to answer. I don't need the answer, just some more clues on how to do it.
For a particular infectious disease, 15% of non-vaccinated individuals will become infected with the disease in one year versus
5% of vaccinated individuals. Simulate a virtual randomized trial in which you vaccinate 1000 individuals with the real vaccine and
1000 individuals with a placebo vaccine and follow the groups for one year. (Hint: Generate values from random binomial functions
with N=1000, p=.15 and N=1000, p=.05; then subtract). How many more infections occurred in the placebo group than in the vaccine group in this single virtual trial? 

PaigeMiller
Diamond | Level 26

"Simulate" usually does not involve pdf or cdf. I would imagine it involves the RAND function, such as this:

 

x=RAND('BINOMIAL', 0.15, 1000);
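
As a hint at how the pieces might fit together, here is a minimal sketch of one virtual trial built on the same RAND call. The data set name, the variable names, and the seed value are assumptions chosen for illustration; CALL STREAMINIT only makes the run repeatable.

```sas
data trial;
   call streaminit(12345);                  /* arbitrary seed (assumption) for a repeatable run */
   placebo = rand('BINOMIAL', 0.15, 1000);  /* infections among 1000 placebo recipients */
   vaccine = rand('BINOMIAL', 0.05, 1000);  /* infections among 1000 vaccinated individuals */
   excess  = placebo - vaccine;             /* extra infections in the placebo group */
   put placebo= vaccine= excess=;           /* write the three values to the log */
run;
```

Each different seed gives a different virtual trial, so the difference will vary from run to run.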
--
Paige Miller
ballardw
Super User

Depending on the purpose of the exercise and what comes next, there are at least two ways to approach this.

One way is to create a group total of occurrences. An approach:

data group;
   status='Vaccination';
   disease = rand('Binomial',0.05,1000);
   output;
   status='Placebo';
   disease = rand('Binomial',0.15,1000);
   output;
run;

And you might look at the results:

Proc print data=group noobs;
   var status disease;
run;
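
The exercise then asks for the difference between the two groups. One possible way to get it from the two-row GROUP data set above is sketched here; the data set name diff and the variables vaccinated, placebo, and excess are hypothetical names, not part of the original exercise.

```sas
data diff;
   set group end=last;
   retain vaccinated placebo;               /* carry the counts across the two rows */
   if status = 'Vaccination' then vaccinated = disease;
   else if status = 'Placebo' then placebo = disease;
   if last then do;
      excess = placebo - vaccinated;        /* extra infections in the placebo group */
      output;                               /* write one summary row */
   end;
   keep vaccinated placebo excess;
run;

proc print data=diff noobs;
run;
```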

Another approach would be to simulate individual records with vaccination status and disease result for each. A Bernoulli distribution models a single trial with two outcomes. A binomial distribution counts the successes across multiple independent Bernoulli trials that share the same probability.

So perhaps:

data individuals;
   Status='Vaccination';
   do id=1 to 1000;
      disease = rand('Bernoulli',0.05);
      output;
   end;
   Status='Placebo';
   do id=1001 to 2000;
      disease = rand('Bernoulli',0.15);
      output;
   end;
run;
proc means data=individuals sum;
   class status;
   var disease;
run;
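
Because disease is coded 0/1, its mean within each group is the observed infection rate. As a quick sanity check, and only as a sketch, the same PROC MEANS step can request N and MEAN alongside SUM so the simulated rates can be compared with the 0.05 and 0.15 used to generate them:

```sas
proc means data=individuals n sum mean;
   class status;     /* one row per vaccination status */
   var disease;      /* 0/1 indicator, so MEAN is the proportion infected */
run;
```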

Which approach is "better" really depends on what comes next. Consider whether you also want to include another random factor such as age, ethnicity, or hair color. To use the group approach you would need some idea of the rate for each of those in combination with vaccination status.

But if I am given some other information, such as that 20% of the population is category 1, 30% is category 2 and 50% is category 3, we can create a random sample of individuals using one of the other random functions:

data individuals;
   Status='Vaccination';
   do id=1 to 1000;
      disease = rand('Bernoulli',0.05);
      /* Table probabilities must sum to 1 (20%, 30%, 50%); */
      /* any shortfall would go to an extra category 4.     */
      category = rand('Table',0.2,0.3,0.5);
      output;
   end;
   Status='Placebo';
   do id=1001 to 2000;
      disease = rand('Bernoulli',0.15);
      category = rand('Table',0.2,0.3,0.5);
      output;
   end;
run;
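
A frequency table is one way to check that the simulated categories roughly follow the intended proportions. This is only a sketch against the INDIVIDUALS data set created above; the choice of tables is an assumption about what you would want to inspect.

```sas
proc freq data=individuals;
   tables category;            /* overall share of each category */
   tables status*category;     /* category mix within each vaccination group */
run;
```

If a category you did not intend shows up in the output, the probabilities passed to the Table distribution probably do not sum to 1.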
Silver77
Calcite | Level 5

Thank you, this was very helpful.
