Consider the hypothetical setup:
X denotes counties in a city. We can have X1, X2,..., Xi.
Y denotes hospitals in the city. So let's say for county X1, we have Y11, Y12,..., Y1j, and so on for other counties.
Z denotes number of nurses in these hospitals. So for county X1, we have Z11 for hospital Y11, Z12 for hospital Y12,... Z1j for hospital Y1j, and so on.
We take a sample of hospitals, so not all of them are included in the sample. Let's say:
x denotes counties in the sample, x1, x2,..., x(i)
y denotes hospitals in the sample. For x1 we may have y1, y2,..., y(u), and so on.
z denotes number of nurses in the sample's hospitals. For x1 we have z1, z2,..., z(u).
I want to use the data from the sample to estimate the total number of nurses in the target population (the city). Using the sample mean in a quick-and-dirty fashion, I can think of two potential approaches:
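Approach 1 (pooled mean): Z = (mean of z over all sampled hospitals) * (total number of hospitals in the city)
Approach 2 (county by county): Z1 = (z1 + z2 + ... + z(u)) / (number of sampled hospitals in county x1) * (number of hospitals in county X1), and likewise for the other counties.
After that: Z = Z1 + Z2 + ... + Zi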
It turns out that the two methods produce different results most of the time. I wonder which gives the better estimate.
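To make the comparison concrete, here is a minimal Python sketch of the two calculations. The county names, hospital counts, and nurse counts are made-up illustration values, and the function names `pooled_estimate` and `stratified_estimate` are hypothetical labels, not anything from the original setup.

```python
# Hypothetical data: county -> (total hospitals in the county,
#                               nurse counts at the sampled hospitals)
counties = {
    "X1": (10, [20, 30]),
    "X2": (40, [100, 120, 110, 90]),
}


def pooled_estimate(data):
    """Approach 1: overall sample mean times total hospitals in the city."""
    all_z = [z for _, sample in data.values() for z in sample]
    total_hospitals = sum(n for n, _ in data.values())
    return sum(all_z) / len(all_z) * total_hospitals


def stratified_estimate(data):
    """Approach 2: per-county sample mean times that county's hospital
    count, summed over all counties."""
    return sum(n * sum(sample) / len(sample) for n, sample in data.values())


print(pooled_estimate(counties))      # roughly 3916.7
print(stratified_estimate(counties))  # 4450.0
```

In this toy data the sampling fraction differs by county (2 of 10 hospitals in X1, 4 of 40 in X2), which is why the two numbers disagree: the pooled mean over-weights the county that was sampled more heavily. If every county were sampled at the same fraction, the two formulas would give the same total.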
What do you mean by hospital sizes? Is that the total staff working at those hospitals, or some other measure?