<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Inferring data about a large population from a small sample in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473954#M121698</link>
    <description>&lt;P&gt;These numbers are the expected costs for each unit in 1 year, so yes, I suppose that they could be considered "guaranteed". We have this data projected out for 10 years (so 10 identical tables to the one I posted, one for every year from 2018-2027). Basically, assuming that the estimates are correct, we are looking for an estimated total repair cost in each year, as well as a total over 10 years, with a 95% confidence interval.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Not sure if that helps or confuses, but thanks for having a think about this with me.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Mike&lt;/P&gt;</description>
    <pubDate>Thu, 28 Jun 2018 02:08:13 GMT</pubDate>
    <dc:creator>righcoastmike</dc:creator>
    <dc:date>2018-06-28T02:08:13Z</dc:date>
    <item>
      <title>Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473949#M121695</link>
      <description>&lt;P&gt;Hi all,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a question that might be as much about stats as it is about SAS programming. I'm hoping that you folks can help.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a sample of 35 records from a population of 9635 that looks like this:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data have;
/* expandtabs turns the tab separators into blanks so list input works */
infile datalines expandtabs;
/* the comma informat reads values such as 10,277.00 */
Input Unit Repair_cost :comma12.;
datalines;
1	10,277.00
2	33,615.00
3	23,442.00
4	11,220.00
5	41,321.00
6	40,801.00
7	20,896.00
8	44,753.00
9	28,659.00
10	19,753.00
11	28,760.00
12	24,537.00
13	20,536.00
14	20,959.00
15	5,693.00
16	8,290.00
17	28,715.00
18	41,550.00
19	18,459.00
20	49,197.00
21	28,955.00
22	46,149.00
23	25,273.00
24	45,867.00
25	24,716.00
26	43,519.00
27	27,884.00
28	37,714.00
29	8,001.00
30	42,151.00
31	43,197.00
32	27,245.00
33	31,736.00
34	9,503.00
35	14,946.00
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I figure I can calculate the SD and 95% confidence limits for the sample by using:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods select BasicIntervals;
proc univariate data=have cibasic;
   var Repair_cost;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;That should give me the mean repair cost per unit and its 95% confidence interval. My question is, can I then multiply the mean, upper and lower limits by the total population (9635) to get an expected total repair cost and associated confidence limits? It makes intuitive sense to me, but I've found that in stats, my intuition isn't always correct.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I can't do it this way, can someone suggest the best way to get a predicted total repair cost and associated confidence interval for the entire population of 9635 based on the sample of 35 I have above?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help is much appreciated.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks so much&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Mike&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 01:34:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473949#M121695</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2018-06-28T01:34:20Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473950#M121696</link>
      <description>&lt;P&gt;Is there a guarantee that all units need to be repaired at some point? This is what I would call a back-of-the-napkin estimate.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 01:48:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473950#M121696</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-06-28T01:48:41Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473954#M121698</link>
      <description>&lt;P&gt;These numbers are the expected costs for each unit in 1 year, so yes, I suppose that they could be considered "guaranteed". We have this data projected out for 10 years (so 10 identical tables to the one I posted, one for every year from 2018-2027). Basically, assuming that the estimates are correct, we are looking for an estimated total repair cost in each year, as well as a total over 10 years, with a 95% confidence interval.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Not sure if that helps or confuses, but thanks for having a think about this with me.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Mike&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 02:08:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473954#M121698</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2018-06-28T02:08:13Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473956#M121700</link>
      <description>&lt;P&gt;No, you can't. It depends on how the sample was drawn from the population.&lt;/P&gt;
&lt;P&gt;Calling&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;. Maybe he can shed some light.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 02:16:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473956#M121700</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2018-06-28T02:16:55Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473957#M121701</link>
      <description>&lt;P&gt;If it helps, my sample should be considered as a simple random sample.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 02:35:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473957#M121701</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2018-06-28T02:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473958#M121702</link>
      <description>If I am right, then your sample estimator is BLUE (best linear unbiased estimator),
i.e. the sample mean should be almost the same as the population mean, and likewise for the mean's confidence limits.</description>
      <pubDate>Thu, 28 Jun 2018 02:37:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473958#M121702</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2018-06-28T02:37:15Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473959#M121703</link>
      <description>&lt;P&gt;******UPDATE********* I think proc surveymeans might be what I am looking for, but I'm still not sure how to get an expected total repair cost with 95% confidence intervals for the entire population (9635 units), based on the data in the sample (35 units).&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 02:40:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473959#M121703</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2018-06-28T02:40:54Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473962#M121705</link>
      <description>&lt;P&gt;If&amp;nbsp;it's a simple random sample you can use the method you initially suggested.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If it were a sample in which the machines do not reflect your population of machines, and each one has a specific weight attached to it to match the total population, then that would call for a weighted analysis.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
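&lt;P&gt;For the PROC SURVEYMEANS route, a minimal sketch might look like the following, assuming a simple random sample of 35 units drawn without replacement from 9635. The data set name have_w and the variable SamplingWeight are illustrative names, not something from the original post. TOTAL= supplies the population size for the finite population correction, and each sampled unit is weighted by 9635/35 so that the SUM and CLSUM keywords report the estimated population total and its 95% confidence limits.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumption: HAVE holds the 35 sampled units with a numeric Repair_cost */
data have_w;
   set have;
   /* each sampled unit stands in for 9635/35 units of the population */
   SamplingWeight = 9635 / 35;
run;

/* TOTAL= gives the population size; SUM and CLSUM request the estimated
   population total and its confidence limits (alpha defaults to 0.05) */
proc surveymeans data=have_w total=9635 mean clm sum clsum;
   var Repair_cost;
   weight SamplingWeight;
run;&lt;/CODE&gt;&lt;/PRE&gt;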
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 03:33:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/473962#M121705</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-06-28T03:33:58Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474026#M121734</link>
      <description>&lt;P&gt;Thanks Reeza, much appreciated.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Mike&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 11:28:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474026#M121734</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2018-06-28T11:28:34Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474031#M121736</link>
      <description>&lt;P&gt;This is an interesting&amp;nbsp;question.&amp;nbsp; I think the confidence interval will depend on the assumed distribution of the prices. For example, the&amp;nbsp;sum of&amp;nbsp;IID exponential random variables has a gamma distribution.&amp;nbsp; The sum of IID normal variables is normal.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Assuming&amp;nbsp;a simple random sample,&amp;nbsp;the expected sum is N*XBar, where XBar is the sample mean and N=9635. However, I don't think multiplying the lower/upper limits by N gives the correct CI. I think that interval is too conservative (that is, wider than it needs to be).&amp;nbsp; If you want a ballpark figure, you can use it.&amp;nbsp;&lt;/P&gt;
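&lt;P&gt;As a rough sketch of that ballpark calculation, the sample mean and its 95% limits can be scaled by N. The names stats, total_est, N_pop, xbar, se and t are illustrative, and the sketch assumes the data set have from the original post was read in correctly with a numeric Repair_cost.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* sample size, mean and standard error of the mean */
proc means data=have noprint;
   var Repair_cost;
   output out=stats n=n mean=xbar stderr=se;
run;

data total_est;
   set stats;
   N_pop = 9635;                      /* population size from the thread */
   t = quantile('T', 0.975, n - 1);   /* two-sided 95% t critical value */
   Total = N_pop * xbar;              /* expected sum, N*XBar */
   Lower = N_pop * (xbar - t * se);   /* lower limit of the mean times N */
   Upper = N_pop * (xbar + t * se);   /* upper limit of the mean times N */
run;

proc print data=total_est noobs;
   var Total Lower Upper;
   format Total Lower Upper comma18.2;
run;&lt;/CODE&gt;&lt;/PRE&gt;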
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 11:48:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474031#M121736</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-06-28T11:48:32Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474032#M121737</link>
      <description>&lt;P&gt;Thanks Rick,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would rather be too conservative as opposed to not, and for now I think a ballpark would work. At this point though, I'm just curious about how one would go about calculating the CI for the total properly. I'll keep looking and post a response here if I figure anything out.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 11:52:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474032#M121737</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2018-06-28T11:52:14Z</dc:date>
    </item>
    <item>
      <title>Re: Inferring data about a large population from a small sample</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474110#M121767</link>
      <description>&lt;P&gt;I think it becomes a prediction interval, not a confidence interval, and that would be wider than the confidence interval.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jun 2018 15:10:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Inferring-data-about-a-large-population-from-a-small-sample/m-p/474110#M121767</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-06-28T15:10:52Z</dc:date>
    </item>
  </channel>
</rss>

