<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sample size n and significance in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25595#M925</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You are right that as sample size increases, the ability to resolve small effects increases, and you can split hairs if you have a huge data volume. Regressions on multi-million-record datasets are now done routinely, something that was unimaginable not long ago. At some point one needs to stop asking only about statistical significance (is the effect really there or not?) and start asking about practical significance (that 0.00001% difference is truly there, but so what?). The last sentence of Reeza's point #4 is absolutely bang-on.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 05 Jul 2011 14:45:51 GMT</pubDate>
    <dc:creator>DLing</dc:creator>
    <dc:date>2011-07-05T14:45:51Z</dc:date>
    <item>
      <title>Sample size n and significance</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25591#M921</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Sir, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a very large dataset (&amp;gt;500k records) and will use it to run a linear regression. With such a large sample, the statistical tests have very high power, so even a tiny effect will cause the null hypothesis H0 to be rejected. Judging the significance of a variable based on its p-value alone therefore becomes impractical.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is there any reference or rule that provides a guideline relating sample size to the significance level alpha? For example, rather than using the 5% significance level that is common for small samples, should I use 0.1%? A variable with a p-value of 2% would then be treated as not significant.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your clarification and suggestions.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 04 Jul 2011 18:14:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25591#M921</guid>
      <dc:creator>bncoxuk</dc:creator>
      <dc:date>2011-07-04T18:14:33Z</dc:date>
    </item>
    <item>
      <title>Sample size n and significance</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25592#M922</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I recommend using confidence intervals.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 05 Jul 2011 02:51:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25592#M922</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-07-05T02:51:39Z</dc:date>
    </item>
    <item>
      <title>Sample size n and significance</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25593#M923</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Confidence intervals and p-value cutoffs are related, so they amount to essentially the same test. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I recommend a multistep model-building approach. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1. Hold back a set of data, i.e., use 250k records for model fitting and 250k for model testing.&lt;/P&gt;&lt;P&gt;2. Use a cutoff of 0.025 or less to test whether a variable is significant.&lt;/P&gt;&lt;P&gt;3. Test for significance in the held-back dataset.&lt;/P&gt;&lt;P&gt;4. Repeat with different held-back samples to ensure there is actually an effect. You'll also want to differentiate between statistically significant results and practically significant ones. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 05 Jul 2011 04:51:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25593#M923</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2011-07-05T04:51:27Z</dc:date>
    </item>
    <item>
      <title>Re: Sample size n and significance</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25594#M924</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Ksharp and Reeza, I would also like to understand this point. It is useful knowledge.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can I ask why a confidence interval is a good option when the sample size is large? As I understand it, a confidence interval only tells you whether it contains zero, and if the sample size is large, the interval should be very narrow. Is there any further information a confidence interval provides that I am missing?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 05 Jul 2011 09:41:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25594#M924</guid>
      <dc:creator>Ruth</dc:creator>
      <dc:date>2011-07-05T09:41:15Z</dc:date>
    </item>
    <item>
      <title>Re: Sample size n and significance</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25595#M925</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You are right that as sample size increases, the ability to resolve small effects increases, and you can split hairs if you have a huge data volume. Regressions on multi-million-record datasets are now done routinely, something that was unimaginable not long ago. At some point one needs to stop asking only about statistical significance (is the effect really there or not?) and start asking about practical significance (that 0.00001% difference is truly there, but so what?). The last sentence of Reeza's point #4 is absolutely bang-on.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 05 Jul 2011 14:45:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Sample-size-n-and-significance/m-p/25595#M925</guid>
      <dc:creator>DLing</dc:creator>
      <dc:date>2011-07-05T14:45:51Z</dc:date>
    </item>
  </channel>
</rss>

