<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Appropriate method of limiting analyses? in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737039#M35792</link>
    <description>&lt;P&gt;P-values will change for almost any analysis that uses a subset of data.&lt;/P&gt;
&lt;P&gt;Consider:&lt;/P&gt;
&lt;PRE&gt;proc freq data=sashelp.class;
   tables age*sex/chisq;
run;

proc freq data=sashelp.class (obs=18);
   tables age*sex/chisq;
run;&lt;/PRE&gt;
&lt;P&gt;Removing 1 observation from this data set changes the chi-square statistic from 1.4848 to 1.8667, and the p-value of the test changes with it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The main concern with subsetting survey data is that anything that actually uses the variance in its calculations, such as standard errors, confidence limits and test statistics, may be off considerably. The secondary concern arises if you want to project your results back to the population, such as saying something like "Approximately 68,000 individuals have condition X". With a subset of the data that estimate would likely be way off, because the sums of the weights no longer represent the original population.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The approach is to use DOMAIN analysis for the procs that support it directly. That computes the estimates for "all" levels of the domain variable, using the variance as needed. Then you only look at the levels you are concerned with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 26 Apr 2021 17:31:22 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2021-04-26T17:31:22Z</dc:date>
    <item>
      <title>Appropriate method of limiting analyses?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737021#M35791</link>
      <description>&lt;P&gt;I've run into conflicting methods of&lt;STRONG&gt; limiting analysis&lt;/STRONG&gt;; I work with complex survey data, which naturally deals with descriptive analysis and general measures of association. Since starting this particular job a year ago, I was under the impression that it's wrong to "limit" or subset datasets like NHANES, BRFSS, YRBSS, etc. because it messes up the way the dataset represents the entire population respondents are sampled from. Instead, I and others I've worked with have usually created dichotomous "limiting" variables that essentially separate the "excluded" and "included" groups, then we only look at the "included" output.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A new co-worker who recently started did an analysis and, instead of creating the limiting variable, &lt;STRONG&gt;limited the dataset itself&lt;/STRONG&gt;. The prevalence and all other estimates were the same; the only things that differed were the p-values of the&amp;nbsp;&lt;STRONG&gt;chi-square&lt;/STRONG&gt; tests.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What exactly about the data (or how SAS works) would make the p-values different? Or could this have been a random fluke in the program at the time? We've had some similar instances where every single measurement was the same, except that the 95% CIs were slightly different between 2 or 3 people coding the same analysis.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anyway...what are your thoughts? Is there anything that I seem to be misled on here?&lt;/P&gt;</description>
      <pubDate>Mon, 26 Apr 2021 15:32:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737021#M35791</guid>
      <dc:creator>SAS93</dc:creator>
      <dc:date>2021-04-26T15:32:56Z</dc:date>
    </item>
    <item>
      <title>Re: Appropriate method of limiting analyses?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737039#M35792</link>
      <description>&lt;P&gt;P-values will change for almost any analysis that uses a subset of data.&lt;/P&gt;
&lt;P&gt;Consider:&lt;/P&gt;
&lt;PRE&gt;proc freq data=sashelp.class;
   tables age*sex/chisq;
run;

proc freq data=sashelp.class (obs=18);
   tables age*sex/chisq;
run;&lt;/PRE&gt;
&lt;P&gt;Removing 1 observation from this data set changes the chi-square statistic from 1.4848 to 1.8667, and the p-value of the test changes with it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The main concern with subsetting survey data is that anything that actually uses the variance in its calculations, such as standard errors, confidence limits and test statistics, may be off considerably. The secondary concern arises if you want to project your results back to the population, such as saying something like "Approximately 68,000 individuals have condition X". With a subset of the data that estimate would likely be way off, because the sums of the weights no longer represent the original population.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The approach is to use DOMAIN analysis for the procs that support it directly. That computes the estimates for "all" levels of the domain variable, using the variance as needed. Then you only look at the levels you are concerned with.&lt;/P&gt;
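&lt;P&gt;A minimal sketch of what that can look like, assuming a hypothetical data set work.survey with design variables stratum, psu and wt, a 0/1 limiting variable named included, and analysis variables exposure, outcome and condition_x. None of these names come from the original question; substitute the ones your survey documentation specifies.&lt;/P&gt;
&lt;PRE&gt;/* All data set and variable names below are hypothetical placeholders. */

/* Means, confidence limits and estimated totals by domain:       */
/* the full sample stays in the data set, so the design-based     */
/* variance still reflects every stratum and cluster.             */
proc surveymeans data=work.survey mean clm sum;
   strata  stratum;
   cluster psu;
   weight  wt;
   domain  included;      /* 1 = included group, 0 = excluded group */
   var     condition_x;
run;

/* PROC SURVEYFREQ has no DOMAIN statement; putting the limiting  */
/* variable first in the TABLES request has the same effect.      */
proc surveyfreq data=work.survey;
   strata  stratum;
   cluster psu;
   weight  wt;
   tables  included*exposure*outcome / chisq;
run;&lt;/PRE&gt;
&lt;P&gt;In both outputs you would read only the results for included=1. A WHERE statement or a subsetting data step before the procedure would instead drop the excluded rows, which is the situation described above.&lt;/P&gt;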
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 26 Apr 2021 17:31:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737039#M35792</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-04-26T17:31:22Z</dc:date>
    </item>
    <item>
      <title>Re: Appropriate method of limiting analyses?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737072#M35794</link>
      <description>&lt;P&gt;Thank you! That clears up my confusion.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 26 Apr 2021 17:56:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Appropriate-method-of-limiting-analyses/m-p/737072#M35794</guid>
      <dc:creator>SAS93</dc:creator>
      <dc:date>2021-04-26T17:56:45Z</dc:date>
    </item>
  </channel>
</rss>

