LABRADOR
Obsidian | Level 7

How do I use a SAS function to calculate, from the hypergeometric distribution, the probability that more than nine females are surveyed in a sample of twenty?

 

In a large university, 40% of the students are female.

If a random sample of twenty students is selected, what is the probability that the sample will contain more than nine female students? (Round your answer to four decimal places.)

 

sample #of females surveyed:  
0 3.6561584400629733e-05
1 0.000487487792008398
2 0.0030874226827198492
3 0.012349690730879413
4 0.034990790404158215
5 0.0746470195288711
6 0.12441169921478513
7 0.1658822656197136
8 0.17970578775468962
9 0.1597384780041684
10 0.11714155053639005
11 0.07099487911296365
12 0.03549743955648174
13 0.01456305212573616
14 0.004854350708578719
15 0.0012944935222876583
16 0.00026968615047659553
17 4.2303709878681673e-05
18 4.70041220874241e-06
19 3.2985348833280015e-07
20 1.0995116277760013e-08

I generated the table above using Python, so I imagine this data is likely inconsistent. However, it gave me the correct probability for exactly four females, which was 0.0350.
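For readers who want to rebuild this table in SAS: the values above match binomial probabilities with n = 20 and p = 0.4 (the first row equals 0.6 to the 20th power). The following is a minimal sketch under that assumption, not the code the original poster ran. The variable names k and p are chosen here for illustration.

```sas
/* Sketch: rebuild the table as Binomial(n=20, p=0.4) probabilities */
data _null_;
   do k = 0 to 20;
      p = pdf("Binomial", k, 0.4, 20);   /* P(exactly k females in the sample) */
      put k= p= best12.;
   end;
run;
```

The row for k=4 should round to 0.0350, the value checked in the post.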

 

I am currently trying to write a function for the excerpt below:

 If a random sample of twenty students is selected, what is the probability that the sample will contain more than nine female students?

Reeza
Super User
Please clarify your column headings? How is this binomial?
Reeza
Super User
You sure it's hypergeometric? I think you were right initially at binomial...
sbxkoenk
SAS Super FREQ

Hello,

 

Start here :

SAS® 9.4 and SAS® Viya® 3.5 Programming Documentation | SAS 9.4 / Viya 3.5
Functions and CALL Routines
CDF Hypergeometric Distribution https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/p1x9o3ozc5ft8yn1kcn0p6yg4aw...

 

Maybe you need the QUANTILE function.
The QUANTILE function returns the quantile from a distribution that you specify.
The QUANTILE function is the inverse of the CDF function.

 

Other functions that SAS has for this distribution (and MANY other distributions) :

  • LOGCDF Function
  • LOGPDF Function
  • LOGSDF Function
  • PDF Hypergeometric Distribution Function
  • QUANTILE Function
  • SDF Function
  • SQUANTILE Function

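/* Uses a population of 10000 students, 4000 of them female (40%) */
data _NULL_;
   /* PROBHYPR(N, K, n, x): P(X <= 9) in a sample of 20 */
   y=probhypr(10000,4000,20,9);
   put y= percent7.2;
run;

data _NULL_;
   /* CDF('HYPER', x, N, R, n): the same P(X <= 9) */
   y=cdf('HYPER', 9, 10000, 4000, 20);
   put y= percent7.2;
run;
/* end of program */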
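Both calls above return the probability of nine or fewer females, and the population of 10000 is not stated in the question. The sketch below is one possible way to get "more than nine" by taking the complement, and to see how much the result depends on the assumed population size. The sizes 50, 500, and 10000 and the variable names are hypothetical choices for illustration.

```sas
/* Sketch: P(X > 9) under a hypergeometric model for several assumed populations */
data _null_;
   n = 20;                                /* sample size, from the question */
   do popsize = 50, 500, 10000;           /* hypothetical population sizes */
      females = round(0.4 * popsize);     /* 40% female, from the question */
      pLE9 = cdf('HYPER', 9, popsize, females, n);   /* P(X <= 9) */
      pGT9 = 1 - pLE9;                    /* P(X > 9) is the complement */
      put popsize= pGT9= 6.4;
   end;
run;
```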

 

Koen

LABRADOR
Obsidian | Level 7

@Ksharp @sbxkoenk @Reeza

Thanks to all of you. I greatly appreciate the help.

Rick_SAS
SAS Super FREQ

Clearly, this is an assignment, so let me provide some hints rather than the solution:

1. The hypergeometric distribution is used when you want the probabilities for a small finite population where the size of the population is known. Assuming that the university is large, I suspect the instructor intends you to use a binomial distribution. To use the hypergeometric distribution, you must know the total number of students at the university and the number of females. For large populations (like 500 or more students), the hypergeometric and the binomial distributions are similar.

2. There are two related concepts here. If you want to know the probability that EXACTLY k females appear in a sample of size 20, you can use the PDF for the binomial distribution:

/* prob of exactly 4 females in a sample of size 20 */
data _null_;
   p4 = pdf("Binomial",
            4,        /* number of females in the sample */
            0.4,      /* Population has 40% female */
            20);      /* sample size */
   put p4=;
run;

This appears to be what you did in Python. If you want to know the probability that there will be 4 OR FEWER females in the sample, you use the CDF function:

/* prob of 4 or fewer females in a sample of size 20 */
data _null_;
   pLE4 = cdf("Binomial",
            4,        /* largest number of females counted (X <= 4) */
            0.4,      /* Population has 40% female */
            20);      /* sample size */
   put pLE4=;
run;

3. Use the example above to compute the probability that 9 or fewer females are in the sample.

4. The problem asked for the probability of more than 9 females. How do you get that probability from the answer in 3? (Hint: maybe use subtraction....)
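Hints 3 and 4 can be put together as in the sketch below. It assumes the binomial model with p = 0.4 and n = 20, and the variable names pLE9 and pGT9 are illustrative rather than taken from the thread.

```sas
/* Sketch: P(more than 9 females) as the complement of P(9 or fewer) */
data _null_;
   pLE9 = cdf("Binomial", 9, 0.4, 20);   /* hint 3: P(X <= 9) */
   pGT9 = 1 - pLE9;                      /* hint 4: P(X > 9) by subtraction */
   put pLE9= 6.4 pGT9= 6.4;
run;
```

As a cross-check, the rows for 10 through 20 in the original poster's table add up to about 0.2447, so pGT9 should agree with that to four decimal places.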

LABRADOR
Obsidian | Level 7

@Rick_SAS 

Thank you!
