<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sample for linear regression in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392367#M20469</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/71683"&gt;@Ps8813&lt;/a&gt; wrote:&lt;BR /&gt;If we can process full population then why we take sample?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;1. Because historically that wasn't possible.&lt;/P&gt;
&lt;P&gt;2. Because when you have a lot of data you're working on different principles: almost everything will be statistically significant at that point, even if the effect size is too small to matter in practice.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Because having full population data is rare. If I measure the hoof length of all the zebras in my zoo, that isn't the full population of zebras; it's a sample, even though it's my full population. Terminology&amp;nbsp;is important here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you're in the rare case where you do have the full population, and it's a manageable size, i.e. 10,000, then going ahead and using the full population makes sense.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 31 Aug 2017 20:34:40 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2017-08-31T20:34:40Z</dc:date>
    <item>
      <title>Sample for linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392342#M20465</link>
      <description>If I have population data of 10,000 observations and I want to perform linear regression, do I still need to take a sample, or can I perform it on the full population? If we can process the full population, then why do we take a sample?</description>
      <pubDate>Thu, 31 Aug 2017 19:34:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392342#M20465</guid>
      <dc:creator>Ps8813</dc:creator>
      <dc:date>2017-08-31T19:34:45Z</dc:date>
    </item>
    <item>
      <title>Re: Sample for linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392343#M20466</link>
      <description>&lt;P&gt;I suppose the answer depends on how fast your computer is and how long you want to wait for the answer. I would go ahead and do the regression on the entire population; in other words, the decision has nothing to do with statistics.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;10,000 records doesn't sound big to me at all; I'm sure I have done bigger.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 19:39:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392343#M20466</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-08-31T19:39:16Z</dc:date>
    </item>
    <item>
      <title>Re: Sample for linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392357#M20467</link>
      <description>&lt;P&gt;One thing to note is that most of the regression procedures will, by default, exclude from the analysis any record that has a missing value for any of the variables on the MODEL statement. So the regression may actually use many fewer records than you expect if you have many missing values.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 20:18:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392357#M20467</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-08-31T20:18:30Z</dc:date>
    </item>
    <item>
      <title>Re: Sample for linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392367#M20469</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/71683"&gt;@Ps8813&lt;/a&gt; wrote:&lt;BR /&gt;If we can process full population then why we take sample?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;1. Because historically that wasn't possible.&lt;/P&gt;
&lt;P&gt;2. Because when you have a lot of data you're working on different principles: almost everything will be statistically significant at that point, even if the effect size is too small to matter in practice.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Because having full population data is rare. If I measure the hoof length of all the zebras in my zoo, that isn't the full population of zebras; it's a sample, even though it's my full population. Terminology&amp;nbsp;is important here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you're in the rare case where you do have the full population, and it's a manageable size, i.e. 10,000, then going ahead and using the full population makes sense.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 20:34:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392367#M20469</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-08-31T20:34:40Z</dc:date>
    </item>
    <item>
      <title>Re: Sample for linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392397#M20470</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It depends on the objective of the study. If you are building predictive models, then it is usually suggested to split the data into training and test data sets. Training data is used to train the model, while test data is used to see how stable the model is. On the other hand, if the goal is to draw inferences, then the full data set can be used.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 23:33:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392397#M20470</guid>
      <dc:creator>stat_sas</dc:creator>
      <dc:date>2017-08-31T23:33:15Z</dc:date>
    </item>
    <item>
      <title>Re: Sample for linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392428#M20471</link>
      <description>&lt;P&gt;Computation speed or available memory is not an issue anymore. But remember that sampling is usually performed before measurement and measurement often has a high cost per unit. That's why it is often preferable to measure only a sample of the population.&lt;BR /&gt;Once we have the data (sample or full population), sampling is mostly useful for validating the structure or the performance of statistical models. Also, some statistical estimation methods (e.g. bootstrap) rely entirely on repeated sampling.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Sep 2017 04:44:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-for-linear-regression/m-p/392428#M20471</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2017-09-01T04:44:18Z</dc:date>
    </item>
  </channel>
</rss>

