Can this question be answered using the code below

Silver77 · Posted 08-21-2023 05:09 PM

Data _null_;
a1=pdf("binomial",60,0.5,100);
a2=1-cdf("binomial",59,0.5,100);
put a1 a2;
run;

I am trying to use a simple binomial function without having to specify "above" or "below" a certain number, and I am very lost.
This is the question that I am trying to answer. I don't need the answer, just some more clues on how to do it.
For a particular infectious disease, 15% of non-vaccinated individuals will become infected with the disease in one year versus
5% of vaccinated individuals. Simulate a virtual randomized trial in which you vaccinate 1000 individuals with the real vaccine and
1000 individuals with a placebo vaccine and follow the groups for one year. (Hint: Generate values from random binomial functions
with N=1000, p=.15 and N=1000, p=.05; then subtract). How many more infections occurred in the placebo group than in the vaccine group in this single virtual trial?

PaigeMiller · Posted 08-21-2023 05:50 PM

"Simulate" usually does not involve pdf or cdf. I would imagine it involves the RAND function, such as this:

x=RAND('BINOMIAL', 0.15, 1000);

--
Paige Miller

ballardw · Posted 08-21-2023 08:08 PM

Depending on the purpose of the exercise and what comes next, there are at least two ways to approach this.

One way is to create group totals of occurrences. An approach:

data group;
   status='Vaccination';
   disease = rand('Binomial',0.05,1000);
   output;
   status='Placebo';
   disease = rand('Binomial',0.15,1000);
   output;
run;

And you might look at the results:

Proc print data=group noobs;
   var status disease;
run;
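
If only the single difference from the hint is needed, both counts can be drawn in one DATA step and subtracted. This is a minimal sketch, not the exercise's official solution: the seed passed to CALL STREAMINIT and the variable names placebo, vaccine and difference are arbitrary choices made here for illustration.

```sas
data _null_;
   /* arbitrary seed (an assumption), used only so the run is repeatable */
   call streaminit(12345);
   /* number infected among 1000 placebo recipients, p = 0.15 */
   placebo = rand('Binomial', 0.15, 1000);
   /* number infected among 1000 vaccinated individuals, p = 0.05 */
   vaccine = rand('Binomial', 0.05, 1000);
   /* excess infections in the placebo group for this one virtual trial */
   difference = placebo - vaccine;
   put placebo= vaccine= difference=;
run;
```

Note the argument order: RAND('Binomial', p, n) takes the probability first, whereas PDF and CDF take the count first, as in pdf("binomial", m, p, n).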

Another approach would be to simulate individual records with vaccination status and disease result for each. The Bernoulli distribution describes a single trial with two outcomes; a binomial count is the number of successes across multiple independent Bernoulli trials that share the same probability.

So perhaps:

data individuals;
   Status='Vaccination';
   do id=1 to 1000;
      disease = rand('Bernoulli',0.05);
      output;
   end;
   Status='Placebo';
   do id=1001 to 2000;
      disease = rand('Bernoulli',0.15);
      output;
   end;
run;
proc means data=individuals sum;
   class status;
   var disease;
run;

Which approach is "better" really depends on what comes next. Consider if you want to also include another random factor such as age, ethnicity, or hair color. To use the group approach you would need some idea of the disease rate for each combination of that factor and vaccination status.

But if I am given some other information, such as 20% of the population is category 1, 30% is category 2 and 50% is category 3, we can create a random sample of individuals using one of the other random distributions, here the Table distribution:

data individuals;
   Status='Vaccination';
   do id=1 to 1000;
      disease = rand('Bernoulli',0.05);
      /* category probabilities follow the 20/30/50 split and sum to 1 */
      category = rand('Table',0.2,0.3,0.5);
      output;
   end;
   Status='Placebo';
   do id=1001 to 2000;
      disease = rand('Bernoulli',0.15);
      category = rand('Table',0.2,0.3,0.5);
      output;
   end;
run;
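
A cross-tabulation is a quick way to see how many categories were actually generated and in what proportions, which helps confirm that the probabilities passed to the Table distribution add up to 1 (if they sum to less than 1, RAND assigns the remainder to one extra category). This is only a sketch, assuming the individuals data set from the step above exists; the choice of options is illustrative.

```sas
proc freq data=individuals;
   /* one row per status; row percentages show the category split in each group */
   tables status*category / nocol nopercent;
run;
```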

Silver77 · Posted 08-22-2023 02:39 PM

Thank you, this was very helpful.
