Consider the hypothetical setup:
X denotes counties in a city. We can have X1, X2,..., Xi.
Y denotes hospitals in the city. So let's say for county X1, we have Y11, Y12,..., Y1j, and so on for other counties.
Z denotes number of nurses in these hospitals. So for county X1, we have Z11 for hospital Y11, Z12 for hospital Y12,... Z1j for hospital Y1j, and so on.
We take a sample of hospitals, so not all of them are included in the sample. Let's say:
x denotes counties in the sample, x1, x2,..., x(i)
y denotes hospitals in the sample. For x1 we may have y1, y2,..., y(u), and so on.
z denotes number of nurses in the sample's hospitals. For x1 we have z1, z2,..., z(u).
I want to use the data from the sample to estimate the total number of nurses in the target population (the city). Using the sample mean in a quick-and-dirty fashion, I can think of two possible approaches:
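Approach 1 (pooled mean): Z = (mean of z over all sampled hospitals) * (total number of hospitals Y in the city)
Approach 2 (county by county): Z1 = (z1 + z2 + ... + z(u)) / (number of sampled hospitals in county x1) * (number of hospitals in county X1), and likewise for every other county.
After that: Z = Z1 + Z2 + ... + Zi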
It turns out that the two methods produce different results most of the time. I wonder which one gives the better estimate.
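To make the difference between the two approaches concrete, here is a minimal Python sketch. The county names, hospital counts, and nurse counts are made-up numbers for illustration only, and the function names `pooled_estimate` and `stratified_estimate` are hypothetical labels for the two approaches described above, not anything from a library.

```python
# Hypothetical data: county -> (number of hospitals in the county,
#                               nurse counts in the sampled hospitals)
sample = {
    "X1": (10, [50, 70]),         # 2 of 10 hospitals sampled
    "X2": (4, [200, 220, 240]),   # 3 of 4 hospitals sampled
}

def pooled_estimate(data):
    """One overall sample mean times the total number of hospitals in the city."""
    total_hospitals = sum(n_hosp for n_hosp, _ in data.values())
    all_z = [z for _, zs in data.values() for z in zs]
    return sum(all_z) / len(all_z) * total_hospitals

def stratified_estimate(data):
    """County sample mean times county hospital count, summed over counties."""
    return sum(sum(zs) / len(zs) * n_hosp for n_hosp, zs in data.values())

print(pooled_estimate(sample))      # 2184.0
print(stratified_estimate(sample))  # 1480.0

# Same sampling fraction (20%) in every county: the two estimates coincide.
proportional = {
    "X1": (10, [50, 70]),
    "X2": (15, [200, 220, 240]),
}
print(pooled_estimate(proportional), stratified_estimate(proportional))
```

In this made-up example the pooled mean is pulled upward because the county with large hospitals is sampled much more heavily than the county with small ones. When every county is sampled at the same fraction, the two calculations reduce to the same number, which suggests the disagreement comes from unequal sampling fractions across counties.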
What do you mean by hospital sizes? Is it the total staff working at those hospitals, or some other measure?